When AI can build anything, the idea becomes everything.
In this issue: a pixel art office for your agents, competitive AI arenas on-chain, containerized agent jails, and why your worst ideas come first.
1. Someone built a VS Code extension that puts your AI agents in a pixel art office.
The extension gives each agent a little character that walks around a virtual office inside your IDE. Completely unnecessary, completely delightful. Sometimes the best way to make invisible work visible is to make it adorable. Source
2. AgentArena: a competitive on-chain arena where AI agents prove themselves.
@bitencourtdb launched an arena on Base and 16 other chains where agents register using ERC-8004, demonstrate capabilities, and climb a verifiable leaderboard. Think of it as a reputation system for autonomous agents: instead of trusting a developer’s claims, the arena forces agents to prove it. Source
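The post doesn’t detail AgentArena’s scoring, but the core mechanic of a leaderboard that only moves when matches are actually played is easy to sketch. Here’s a minimal Elo-style rating in Python; the names, baseline rating, and K-factor are all illustrative, not AgentArena’s actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    rating: float = 1000.0  # every agent starts at the same baseline

def expected_score(a: float, b: float) -> float:
    # Standard Elo expectation: probability that rating `a` beats rating `b`.
    return 1.0 / (1.0 + 10 ** ((b - a) / 400))

def record_match(winner: Agent, loser: Agent, k: float = 32.0) -> None:
    # Move both ratings toward the observed result; an upset moves them more.
    exp_win = expected_score(winner.rating, loser.rating)
    winner.rating += k * (1.0 - exp_win)
    loser.rating -= k * (1.0 - exp_win)

def leaderboard(agents: list[Agent]) -> list[Agent]:
    return sorted(agents, key=lambda a: a.rating, reverse=True)
```

The point of putting this on-chain is that `record_match` can only be triggered by a verifiable event, so the leaderboard reflects demonstrated capability rather than self-reported claims.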
3. An autonomous crypto agent scans Base tokens every 30 seconds and trades them.
@igoryuzo built an AI trading agent that watches trending tokens on Base, scores conviction, and executes swaps automatically. He’s open about it being early and experimental, with no "guaranteed returns" pitch, just an honest look at the architecture. The interesting detail: a 30-second scan interval means the agent makes more decisions in a day than most human traders make in a month. Source
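The full architecture isn’t published beyond the thread, but the scan-score-execute loop described above can be sketched like this. Every function body here is a stand-in, not @igoryuzo’s code, and the threshold is an invented number:

```python
import random
import time

CONVICTION_THRESHOLD = 0.7  # illustrative cutoff, not the real agent's value

def scan_trending() -> list[str]:
    # Stand-in for a real token scanner; returns candidate token symbols.
    return ["TOKEN_A", "TOKEN_B", "TOKEN_C"]

def score_conviction(token: str) -> float:
    # Stand-in scoring model; a real agent would weigh price, volume, socials.
    return random.random()

def execute_swap(token: str) -> str:
    # Stand-in for an actual on-chain swap call.
    return f"swapped into {token}"

def tick() -> list[str]:
    # One scan cycle: score every trending token, trade only high conviction.
    trades = []
    for token in scan_trending():
        if score_conviction(token) >= CONVICTION_THRESHOLD:
            trades.append(execute_swap(token))
    return trades

def run(interval_seconds: float = 30.0, cycles: int = 1) -> None:
    # The 30-second loop from the article, bounded by `cycles` for safety.
    for _ in range(cycles):
        tick()
        time.sleep(interval_seconds)
```

At 30-second intervals that’s 2,880 `tick()` calls a day, which is where the "more decisions than a human makes in a month" claim comes from.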
4. A pentester built a jail for AI agents.
Hydra runs every agent inside its own container: no filesystem, no secrets, no network access until you explicitly declare it through a two-file security model. Even if an agent’s config gets compromised, it can’t escalate its own access. Built by someone who breaks into systems for a living and got tired of watching agent frameworks hand out shell access like candy. Source
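Hydra’s two-file format isn’t reproduced here, but the deny-by-default posture it enforces looks roughly like this sketch. The `Policy` fields are hypothetical, not Hydra’s actual schema; the point is that an empty policy denies everything:

```python
from dataclasses import dataclass, field

@dataclass
class Policy:
    # Deny-by-default: nothing is reachable unless explicitly declared.
    allowed_hosts: set[str] = field(default_factory=set)
    allowed_paths: set[str] = field(default_factory=set)

def can_connect(policy: Policy, host: str) -> bool:
    return host in policy.allowed_hosts

def can_read(policy: Policy, path: str) -> bool:
    # Only paths under an explicitly declared prefix are readable.
    return any(path.startswith(prefix) for prefix in policy.allowed_paths)
```

The escalation resistance follows from where the policy lives: if it sits outside the container, a compromised agent can’t edit its own allow-list.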
5. OpenLegion isolates OpenClaw agents without buying more Mac Minis.
@curiouscake released a tool for running multiple OpenClaw agents on shared infrastructure, addressing the "I need a dedicated machine per agent" problem that’s been limiting anyone trying to scale past a couple of agents. Early days and the docs are thin, but the problem it solves is one of the most common complaints in the OpenClaw community. Source
6. A searchable directory of 164+ agent skills, organised by what they actually do.
@Param_eth built a filterable collection of Claude Code skills across coding, DevOps, and research, each with copy-paste installation commands. Instead of trawling GitHub for broken skills or hoping the marketplace du jour has what you need, you get a searchable catalog that works. Source
Your first ideas are almost always the worst. That’s how it’s supposed to work.
In 1957, Christensen, Guilford, and Wilson ran experiments where participants generated ideas under time pressure and discovered something that’s been replicated hundreds of times since: ideas produced later in a session are consistently more original than the ones that come first. They called it the serial order effect. Your brain serves up the obvious associations first, the stuff that’s easiest to reach. The creative ideas are hiding behind them. (Journal of Experimental Psychology, 53(2), 82-90)
Beaty and Silvia (2012) figured out why. They measured executive function alongside idea generation and found that early ideas come from automatic retrieval: fast, easy, unremarkable. Later ideas require genuine cognitive effort: suppressing the obvious, searching more remote memory, and connecting things that don’t usually go together. The researchers confirmed the serial order effect is "one of the oldest and most robust" findings in modern creativity research. Your best ideas aren’t first. They’re last. (Psychology of Aesthetics, Creativity, and the Arts, 6(4), 309-319)
The practical implication: most people stop generating ideas right when they’re about to get good.
How to do it:
1. Pick a real problem. Not a hypothetical. Something you’re actually stuck on: an architecture decision, a product feature, a bug that doesn’t make sense. Write it as a single question.
2. Set a timer for 10 minutes and list ideas without stopping. Every idea, including the terrible ones. Especially the terrible ones. You’re draining the obvious solutions out of your head. Don’t evaluate anything; evaluation uses the same executive resources you need for generation.
3. Hit the wall and keep going. Around minute 4-5, you’ll feel like you’ve run out. This is the inflection point the research predicts. You’ve exhausted easy associations and your brain is starting to reach into less-connected territory. The discomfort is the signal, not the stopping point.
4. Mark your halfway point. After the timer, draw a line between the first half and second half of your list. You’ll likely notice the ideas below the line are stranger, more specific, and more interesting than the safe ones at the top.
5. Pick one weird one and prototype it. Don’t pick the comfortable option. Grab something from the second half that makes you slightly uneasy and spend 30 minutes building the cheapest possible version.
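If you dump your ideas into a plain list as you go, step 4’s halfway line is trivial to script. A minimal sketch:

```python
def split_halfway(ideas: list[str]) -> tuple[list[str], list[str]]:
    # Step 4: draw the line between the first and second half of the list.
    # The research predicts the second half is where the strange ideas live.
    mid = len(ideas) // 2
    return ideas[:mid], ideas[mid:]
```

Run it on yesterday’s brainstorm and compare halves; the contrast is usually obvious once you see the two lists side by side.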
Why this matters now: AI generates ideas instantly. Ask Claude for ten product ideas and you’ll get ten competent, obvious, first-quartile ideas in three seconds. That’s exactly the "early ideas" zone the research describes. The serial order effect says the good stuff lives past the easy answers, in the territory where your brain has to work. This exercise trains you to get there.
OpenAI has 200+ people building AI hardware. Reuters reports a family of devices: a smart speaker with camera ($200-300), possibly smart glasses, possibly a smart lamp. Launch is early 2027 at the earliest. Two hundred engineers is a serious bet that AI needs its own hardware, not just apps on everyone else’s. Source
Sakana AI matched transformer performance without using the attention mechanism. Their NoAttention architecture gets comparable results to standard transformers while skipping attention entirely β the mathematical operation every major production model is built on. If it holds up, the implications for model size, speed, and cost could be significant. Source
The human labour behind humanoid robots is being hidden. MIT Technology Review reports that workers in Shanghai spend weeks in VR headsets and exoskeletons, repeating motions hundreds of times a day to generate robot training data. The $20,000 Neo home robot ships this year, but when it gets stuck, a human tele-operator in Palo Alto pilots it through your cameras. "Autonomous" is doing heavy lifting in robotics marketing right now. Source
AI model predicts diabetes from glucose data better than standard blood tests. David Sinclair flagged a study where a model trained on continuous glucose monitoring data outperformed HbA1c, the current gold standard, at predicting both Type 2 diabetes onset and cardiovascular mortality. If this holds, a $50 sensor on your arm could replace periodic blood draws for metabolic risk screening. Source
French researchers say the ceiling on human lifespan hasn’t been found. A study published February 21 by Meslé and colleagues analysed supercentenarian data and found no evidence of a hard biological limit on how long humans can live. Maximum recorded ages keep climbing. Whether that’s reassuring or terrifying depends entirely on your retirement plan. Source
Agents In The Wild is a newsletter for builders. AI agents in production, ideas worth stealing, and signal from the frontier. Hit reply if something sparked an idea.
