All Posts

  1. How LLMs Generate Tokens in Production

    A walkthrough of the path from prompt text to generated tokens, and why production LLM serving is really about schedu...

  2. Omotenashi, A Week of Noticing in Tokyo

    I spent a week in Tokyo for the ClickHouse offsite and I couldn’t stop noticing the small design choices that made li...

  3. LLM Benchmarks Are Flatlined. Task Horizons Are Not.

    The headline accuracy numbers on standard benchmarks have stagnated. MMLU, TruthfulQA, HellaSwag: the top models have...

  4. Ditch grep and Speed up Claude Code with LSPs

    Grep is a text search. Code is not text. It’s a graph of symbols, types, and call chains. That gap is where Claude wa...

  5. AI B*llsh*tting

    Spotting AI Lies: How to Know When Your LLM is BS-ing

  6. Edit Survival

    Quality metrics for AI coding agents

  7. How To Write A Coding Agent In 169 Lines Of Python

    Writing a minimal coding agent from scratch with no hidden magic. Just prompts, tool calls and a loop.