🦞🌯 Lobster Roll

Stories by timmyd

Highly efficient matrix transpose in Mojo (veitner.bearblog.dev)
Inworld TTS: 20x cheaper, state-of-the-art, text-to-speech (inworld.ai)
GPU Comic: When GPU programming hurts so much, all we can do is laugh (comic.modular.com)
Matrix Multiplication on Nvidia's Blackwell: Part 1 – Introduction (modular.com)
How to Beat Unsloth's CUDA Kernel Using Mojo–With Zero GPU Experience (modular.com)
The Five Eras of KVCache (modular.com)
Scale or Surrender: When watts determine freedom (timdavis.com)
Fast vector sum without CUDA (veitner.bearblog.dev)
Modular Platform 25.3: 450K+ Lines of Open Source Code and Pip Packaging (modular.com)