There is a recurring fantasy in interpretability work, somewhere between a wish and an embarrassment. You stare at a residual stream activation (twelve thousand floats) and you want to ask it, in plain English: what are you thinking about? Sparse autoencoders give you a thousand sparse latents that you then label by inspecting top-activating examples. Attribution graphs give you sprawling diagrams a researcher spends an afternoon parsing. Probes give you a yes/no. All useful. None of them talk back. Anthropic's new paper, Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations, is the first interpretability artifact in a while where the activation talks back. Literally. You point an NLA at a token in a Claude Opus 4.6 transcript and it produces a few bullet points of English describing what the model is thinking. That's the deliverable. The paper is mostly an investigation of whether you should believe it.