Posted by Alumni from Substack
May 28, 2026
There is a particular sleight-of-hand at the heart of modern LLM reasoning that bothers me more the longer I look at it. The argument goes like this: Transformers are shallow. A 70-layer stack has fixed depth: it sits in constant-depth circuit classes like AC⁰ or TC⁰, which is a polite way of saying it cannot, in a single forward pass, solve problems that fundamentally require sequential computation. So we paper over this by making the model think out loud. We give it a scratchpad. We call it Chain-of-Thought. We celebrate. But CoT is not reasoning. CoT is the model renting depth from its own output tokens. Every reasoning step has to leave the residual stream, become a discrete token in a vocabulary built for human communication, and come back in through the embedding layer for the next step. It is, mechanically, an absurd way to do internal computation, like a CPU that must spill every intermediate register to disk in plaintext English. Sapient Intelligence's bet, made first with the...
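To make the token round trip concrete, here is a minimal toy sketch, not taken from any real model. The table sizes, the tied `EMBED`/`UNEMBED` weights, and the helper names `to_token` and `round_trip` are all assumptions made for illustration. It shows two different hidden states collapsing to the same token, so the next step receives the same embedding for both.

```python
# Toy illustration only: sizes and weights are made up, not from a real model.
EMBED = [
    [1.0, 0.0, 0.0, 0.0],  # embedding for token 0
    [0.0, 1.0, 0.0, 0.0],  # embedding for token 1
    [0.0, 0.0, 1.0, 0.0],  # embedding for token 2
]
# Assumption: the unembedding matrix is tied to the embedding table.
UNEMBED = EMBED


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def to_token(hidden):
    """Collapse a continuous hidden state to one token id (greedy argmax)."""
    logits = [dot(row, hidden) for row in UNEMBED]
    return max(range(len(logits)), key=logits.__getitem__)


def round_trip(hidden):
    """One chain-of-thought hop: hidden state -> discrete token -> embedding."""
    return list(EMBED[to_token(hidden)])


# Two clearly different internal states...
h_a = [0.9, 0.1, 0.0, 0.7]
h_b = [0.6, 0.5, 0.2, -0.3]

# ...that both decode to token 0, so they re-enter the model identically.
print(to_token(h_a), to_token(h_b))
print(round_trip(h_a) == round_trip(h_b))
```

In this sketch everything that distinguished `h_a` from `h_b`, including the entire fourth dimension, is gone after one hop. A real model samples rather than always taking the argmax and can attend to earlier positions, so the loss is less total than in this toy, but the per-step channel is still a single vocabulary index.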