China's Kimi K2.6 Is Closing the Gap Faster Than Anyone Expected

Moonshot AI just open-sourced K2.6, and the benchmarks are hard to ignore. It beats or matches GPT-5.4, Opus 4.6, and Gemini 3.1 Pro on Humanity’s Last Exam with tools and on SWE-Bench Pro, two of the more credible tests for reasoning and coding. It can run for 12 hours straight across 4,000 tool calls. One internal agent apparently ran autonomously for five days. And it can spin up 300 sub-agents in parallel. ...

April 21, 2026 · 2 min · 298 words · bjr

An AI Ran a Real Store for Three Years. Here's What Happened.

Andon Labs put an AI called Luna in charge of a real retail store in San Francisco. Not a simulation, not a sandbox. A real shop, real money, real decisions. Luna hired human staff, selected inventory, set prices, and ran marketing outreach, all on her own, for three years. What I find genuinely impressive is not that it worked perfectly (it didn’t) but that it worked at all at this level. Luna was doing things that require judgment: reading job applicants in brief interviews, deciding which products fit the store’s identity, reaching out to suppliers. She picked books on AI risk and handmade art prints for the shelves. She hired about half the people she met, on the spot. ...

April 19, 2026 · 2 min · 264 words · bjr