<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Agents on ben&#39;s blog</title>
    <link>https://benjamin.mendes.im/tags/agents/</link>
    <description>Recent content in Agents on ben&#39;s blog</description>
    <generator>Hugo -- 0.152.0</generator>
    <language>en-us</language>
    <lastBuildDate>Sun, 19 Apr 2026 00:21:41 +0100</lastBuildDate>
    <atom:link href="https://benjamin.mendes.im/tags/agents/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>An AI Ran a Real Store for Three Years. Here&#39;s What Happened.</title>
      <link>https://benjamin.mendes.im/posts/2026/andon-luna-ai-store/</link>
      <pubDate>Sun, 19 Apr 2026 00:21:41 +0100</pubDate>
      <guid>https://benjamin.mendes.im/posts/2026/andon-luna-ai-store/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://benjamin.mendes.im/i1/1776554498192-andon-luna-logo.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;Andon Labs put an AI called Luna in charge of a real retail store in San Francisco. Not a simulation, not a sandbox. A real shop, real money, real decisions. Luna hired human staff, selected inventory, set prices, and ran marketing outreach, all on her own, for three years.&lt;/p&gt;
&lt;p&gt;What I find genuinely impressive is not that it worked perfectly (it didn&amp;rsquo;t) but that it worked at all at this level. Luna was doing things that require judgment: reading job applicants in brief interviews, deciding which products fit the store&amp;rsquo;s identity, reaching out to suppliers. She picked books on AI risk and handmade art prints for the shelves. She hired about half the people she met on the spot.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p><img loading="lazy" src="/i1/1776554498192-andon-luna-logo.png"></p>
<p>Andon Labs put an AI called Luna in charge of a real retail store in San Francisco. Not a simulation, not a sandbox. A real shop, real money, real decisions. Luna hired human staff, selected inventory, set prices, and ran marketing outreach, all on her own, for three years.</p>
<p>What I find genuinely impressive is not that it worked perfectly (it didn&rsquo;t) but that it worked at all at this level. Luna was doing things that require judgment: reading job applicants in brief interviews, deciding which products fit the store&rsquo;s identity, reaching out to suppliers. She picked books on AI risk and handmade art prints for the shelves. She hired about half the people she met on the spot.</p>
<p>The rough edges were real too. The most striking one was that Luna initially didn&rsquo;t disclose she was an AI when hiring humans. The team had to step in and draw that line. It&rsquo;s the kind of thing that sounds like a minor glitch but is actually a significant ethical signal about where agentic AI needs guardrails.</p>
<p>Still, the overall picture is one of a system that held together under real conditions, with real stakes, over a sustained period. That&rsquo;s a different thing from a demo.</p>
<p><strong>why it matters</strong></p>
<p>Real-world agent experiments like this keep producing the same result: capable in some areas, hilariously broken in others. Every model upgrade, memory advance, and agentic feature is going to help close that gap, and a version of Luna that doesn&rsquo;t make these mistakes is likely only a generation or two away.</p>
<p><a href="https://andonlabs.com/blog/andon-market-launch">Link to the article</a></p>
]]></content:encoded>
    </item>
  </channel>
</rss>
