Flashcards for the Dwarkesh Podcast.
Blackboard lectures
Blog posts & side projects
Pretraining parallelisms and failed runs
18 cards. Why pretraining runs fail, and the chain of problems and fixes behind FSDP / pipeline / tensor parallelism.
Transistors
16 cards. Diodes, BJTs, MOSFETs, Dennard scaling, and why silicon won.
Chips
9 cards. Multiplexers, adders, MACs, RTL-to-GDS, and the SRAM vs. register file trade-off.