
Mira Murati’s Thinking Machines Lab has been quiet since launch, and this is their first real public release. They call the idea interaction models, and the point is to design the model around how humans actually collaborate rather than around how long an agent can run alone. Audio, video, and text all flow continuously instead of turn-by-turn: 200ms time-aligned micro-turns, concurrent input and output, and a separate background reasoning model for the heavier thinking.
But here’s the question I keep coming back to: don’t OpenAI and Anthropic already do this? GPT-4o is literally “omni”: real-time speech in and out, vision, interruption handling, sub-300ms latency. Anthropic has Claude with vision, tool use, and strong streaming. The user experience of talking to GPT-4o already feels natural and real-time. So what is TML actually doing that’s different?
Their answer is technical: interaction is built into the model architecture rather than layered on top. The 200ms micro-turn loop is native. They split the system into a real-time interaction model and a separate background reasoning model, so the fast path never stalls on heavy thinking. In theory, that’s a genuinely different architectural choice. In practice, the question is whether “we built it native” produces a noticeably different product, or whether OpenAI’s “we made the streaming really good” already gets you 95% of the way there.
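To make the architectural claim concrete, here’s a minimal sketch of what a time-aligned micro-turn loop with an offloaded reasoning model could look like. To be clear, this is my illustration, not TML’s code: the 200ms figure is theirs, but every name here (`micro_turn_loop`, `background_reason`, the queue-based I/O) and the asyncio structure are assumptions I’m making for the sake of the example.

```python
import asyncio
import time

TICK_S = 0.200  # one micro-turn per 200 ms, matching TML's stated cadence


async def background_reason(context: str) -> str:
    """Stand-in for the slower background reasoning model (hypothetical)."""
    await asyncio.sleep(1.0)  # pretend heavy reasoning takes ~1 s
    return f"deliberated over {len(context)} chars of context"


async def micro_turn_loop(incoming: asyncio.Queue, outgoing: asyncio.Queue) -> None:
    """Fixed-cadence loop: each tick drains whatever input has arrived,
    emits a fast partial response, and offloads heavy work to the
    background model without ever blocking the 200 ms rhythm."""
    pending: set[asyncio.Task] = set()
    context = ""
    next_tick = time.monotonic()
    for _ in range(10):  # ten micro-turns (~2 s) for the demo
        # 1. Drain input frames that arrived since the last tick.
        while not incoming.empty():
            context += incoming.get_nowait()
        # 2. Harvest any background thoughts that finished.
        done = {t for t in pending if t.done()}
        for t in done:
            await outgoing.put(f"[deep] {t.result()}")
        pending -= done
        # 3. Emit a shallow, low-latency response for this micro-turn.
        await outgoing.put(f"[fast] heard {len(context)} chars so far")
        # 4. Kick off heavier reasoning concurrently, at most one at a time.
        if context and not pending:
            pending.add(asyncio.create_task(background_reason(context)))
        # 5. Sleep until the next 200 ms boundary (time-aligned, not "after work").
        next_tick += TICK_S
        await asyncio.sleep(max(0.0, next_tick - time.monotonic()))
    for t in pending:  # don't leak unfinished background work
        t.cancel()


async def main() -> None:
    incoming: asyncio.Queue = asyncio.Queue()
    outgoing: asyncio.Queue = asyncio.Queue()

    async def speaker() -> None:
        # Simulated user input arriving *while* the model is responding.
        for word in ["hello ", "there ", "model "]:
            await incoming.put(word)
            await asyncio.sleep(0.3)

    async def listener() -> None:
        while True:
            print(await outgoing.get())

    listen = asyncio.create_task(listener())
    await asyncio.gather(micro_turn_loop(incoming, outgoing), speaker())
    listen.cancel()


asyncio.run(main())
```

The design point the sketch tries to capture is that the cadence is the contract: the fast path never waits on the slow path, it just harvests whatever the background model has finished by the next tick. Whether that’s meaningfully different from a really good streaming pipeline bolted onto a conventional model is exactly the question.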
My read is that the differentiation is thinner than the marketing suggests. The architectural distinction is real, but it’s subtle, and consumers won’t feel it directly. What they will feel is whether the latency is lower, the interruptions more graceful, and the multimodal experience more integrated. And in those user-facing terms, the frontier labs are already there.
There’s also the bigger question I keep waiting for someone to answer. This is still an LLM. A faster, better-stitched-together, more responsive LLM, but the underlying architecture hasn’t moved. What I was actually hoping for from a lab built by ex-frontier researchers with a clean sheet was a rethink of the primitives. Something that doesn’t just polish the transformer, but questions it.