AI agents in production. Ideas worth building. Signal, not noise.
Some years back, you signed up for my Adventures in Machine Learning email list, where I was teaching deep learning. A lot has changed in AI since then.
I’m starting something new. With agent frameworks like OpenClaw and the explosion of skills in Codex and Claude Code, the distance between idea and execution has shrunk drastically. The future belongs to those who can generate ideas.
This newsletter goes out 3x a week and is split into three parts:
🔥 Sparks — interesting things builders are shipping with agent frameworks right now.
🪵 Kindling — exercises to sharpen your creativity.
🔭 Horizon — an inspiring look at the future, from AI breakthroughs to longevity research.
If that doesn’t interest you — please hit unsubscribe in the footer below, definitely no hard feelings (it’s been a while!). If it does — welcome to Agents In The Wild.
In this issue: self-deploying agents, marketing swarms, an autonomous crypto trader, and a creativity technique backed by a 117-study meta-analysis.
1. Someone prompted Claude Opus 4.6 to deploy itself. It did.
A developer gave Claude Code a simple brief: write yourself, deploy to a $5 VPS, wire up Telegram for control. The agent wrote its own codebase, stood up the server, and now sits there with 30+ callable tools. You can install it with npm install -g kernelbot. An AI setting up its own production infrastructure used to be a thought experiment. Now it’s a Tuesday. Source
2. A dev automated half his company’s marketing with agents that talk to each other.
Jeel Patel at FieldCamp built a swarm of specialised agents — competitive intel, analytics, SEO, writer — wired into 26 Discord channels with a shared Supabase database. The agents coordinate: SEO flags a keyword, writer drafts content, analytics measures, intel adjusts strategy. Handles 50%+ of marketing ops and runs 24/7. The architecture is messy (Discord as an agent bus?) but it works. Source
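The underlying pattern — specialised agents passing structured messages through shared channels — doesn't actually require Discord or Supabase. A minimal publish/subscribe sketch of the coordination loop (all names and messages are illustrative, not FieldCamp's actual code):

```python
from collections import defaultdict

class Bus:
    """Toy stand-in for the Discord-channels-as-agent-bus pattern."""
    def __init__(self):
        self.handlers = defaultdict(list)  # channel name -> subscriber callbacks

    def subscribe(self, channel, handler):
        self.handlers[channel].append(handler)

    def publish(self, channel, msg):
        for handler in self.handlers[channel]:
            handler(msg)

bus = Bus()
content_drafts = []

# The SEO agent flags a keyword; the writer agent reacts to the channel.
bus.subscribe("keywords", lambda kw: content_drafts.append(f"Draft targeting '{kw}'"))
bus.publish("keywords", "ai agents in production")

print(content_drafts)  # ["Draft targeting 'ai agents in production'"]
```

The "Discord as an agent bus" choice is exactly this structure: channels are topics, agents are subscribers, and the shared database plays the role of persistent state between publishes.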
3. Someone automated an entire crude oil trading operation with Claude Code’s team mode.
A trader wired up Claude Code to research academic papers, write NinjaTrader/MT5 strategies, compile, debug, and backtest — all without a human touching the code. The agents work in "team mode," bouncing between research and implementation until they produce profitable systems. Whether it stays profitable is anyone’s guess, but the pipeline itself is genuinely autonomous. Source
4. An autonomous agent that needs nothing but a GitHub repo.
Nikshep Svn built a setup where an agent runs entirely through GitHub Actions. No servers. No infrastructure. You push a goal, the agent works through it, commits the results. Every decision is transparent in the commit history. It solves the "where does the agent live?" problem by making it homeless on purpose — it lives in CI/CD, the one piece of infrastructure every developer already has. Source
5. Someone built an autonomous crypto trader on Solana. It manages its own wallet.
A developer on r/SideProject built Milo, an AI portfolio manager with its own Solana smart wallet. It watches markets, executes trades, manages risk, and rebalances — all without a human staring at charts. The hardest part wasn’t trading logic, it was guardrails: position limits, drawdown thresholds, kill switches for when the model gets weird. Source
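Those guardrails are the part worth copying. The shape is simple: hard limits checked before every trade, with a one-way kill switch that needs a human to reset. A minimal sketch with made-up thresholds (not Milo's actual code):

```python
from dataclasses import dataclass

@dataclass
class Guardrails:
    """Hard limits checked before every trade. Thresholds are illustrative."""
    max_position_pct: float = 0.10   # no single trade above 10% of portfolio
    max_drawdown_pct: float = 0.20   # kill switch trips past 20% drawdown from peak
    killed: bool = False

    def check_trade(self, trade_value, portfolio_value, peak_value):
        if self.killed:
            return False
        drawdown = 1 - portfolio_value / peak_value
        if drawdown >= self.max_drawdown_pct:
            self.killed = True           # one-way: only a human reset re-enables trading
            return False
        return trade_value <= self.max_position_pct * portfolio_value

g = Guardrails()
print(g.check_trade(500, 10_000, 10_000))    # True: within the position limit
print(g.check_trade(2_000, 10_000, 10_000))  # False: 20% position > 10% cap
print(g.check_trade(500, 7_500, 10_000))     # False: 25% drawdown trips the kill switch
print(g.killed)                              # True: everything is blocked until reset
```

The point of the design: the model can be as creative as it likes inside the box, but the box itself is dumb, deterministic code that never asks the model's opinion.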
6. A dev is running 30+ AI agents that generate $140K/month on a $230 server.
An n8n setup on a single Hetzner box (Redis + Postgres) handles Fortune 500 lead gen and outreach with 30+ specialised agents using Serper, Apollo, and Claude’s API. The economics are wild: $230/month in infra supporting six figures in revenue. It’s the kind of boring, stitched-together automation that nobody demos on stage but actually pays the bills. Source
7. A fully automated SEO pipeline built in n8n.
An n8n workflow that connects to GA4, pulls ranking data, crawls competitor FAQs, identifies content gaps, and auto-rewrites underperforming articles. Runs on schedule, pushes updates, saves reports. Nobody’s posting demos with dramatic music — but this is what most production agents actually look like: stitched-together API calls doing boring work reliably. Source
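Stripped of the n8n nodes, the pipeline is just a few functions run on a schedule. A toy sketch of its shape — the data and function names are invented for illustration, and the real workflow calls GA4 and an LLM where these stubs return canned values:

```python
def pull_rankings():
    # Stand-in for a GA4 / Search Console export: keyword -> current position.
    return {"how to build ai agents": 4, "agent frameworks compared": 31}

def find_gaps(rankings, threshold=10):
    # Anything ranking worse than the threshold is a rewrite candidate.
    return [kw for kw, pos in rankings.items() if pos > threshold]

def rewrite(keyword):
    # In the real workflow this step calls an LLM; here it just records the job.
    return f"rewrite queued: {keyword}"

def run_pipeline():
    return [rewrite(kw) for kw in find_gaps(pull_rankings())]

print(run_pipeline())  # ['rewrite queued: agent frameworks compared']
```

That's the whole trick: no agent loop, no planning step, just pull → filter → act, scheduled. Which is why it runs reliably.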
Stop working on the problem. Seriously.
In 1926, Graham Wallas published The Art of Thought, breaking creative thinking into four stages: preparation (load the problem), incubation (walk away), illumination (the answer appears), and verification (check it).
The incubation stage got the least respect for about 80 years. Managers don’t love "go for a walk" as a deliverable. Then researchers started testing it properly.
Sio and Ormerod (2009) ran a meta-analysis across 117 studies and found incubation produces a reliable, medium-sized effect on creative problem solving. But the findings were specific: the effect was strongest for divergent thinking tasks, and low-demand activities (walking, showering, tidying) outperformed high-demand activities during the incubation period. (Psychological Bulletin, 135, 94-120)
Baird et al. (2012) found that people who did a simple, boring task during incubation — one that maximised mind-wandering — performed 41% better on creative problems they’d already been exposed to. The key: mind-wandering only helped for problems you’d already started thinking about. Your unconscious isn’t a general-purpose idea generator. It’s a background process that keeps working on whatever you loaded before you walked away. (Psychological Science, 23(10), 1117-1122)
How to do it:
Load the problem. Spend 10-20 minutes genuinely working on it. Write down what you know, what you don’t, and where you’re stuck. Don’t skip this — incubation without preparation is just procrastination.
Pick a boring physical task. Walk. Shower. Wash dishes. Fold laundry. The activity needs to occupy your hands and eyes but leave your mind free. Scrolling your phone doesn’t count — it fills the channel your unconscious needs.
Set a timer for 20 minutes. Sio and Ormerod found the sweet spot in the data. Too short and spreading activation doesn’t complete. Too long and you lose the thread.
Don’t chase the insight. If a thought about the problem surfaces, let it exist without grabbing it. You’re not brainstorming. You’re running a background process.
Sit back down and write freely for 5 minutes. Don’t evaluate, just capture. The illumination stage doesn’t always feel like a lightning bolt — often it arrives as "wait, what if…" while you’re writing the first thing that comes to mind.
Why this matters now: Builders live in a world of instant feedback loops. Type a prompt, get a response. Push code, see the build. The incubation research says the hardest problems — system architecture, product direction, that bug that doesn’t make sense — benefit from the opposite rhythm. Load the problem, then leave it alone. The 41% improvement isn’t marginal. It’s the difference between the obvious solution and the one nobody else sees.
GPT-5.2 spent 12 hours reasoning and discovered a new physics formula. OpenAI published a preprint with researchers from the Institute for Advanced Study, Vanderbilt, Cambridge, and Harvard. The model conjectured a formula showing that a specific gluon interaction, one physicists had assumed would always equal zero, can actually be nonzero. A separate model then formally proved the conjecture correct. First time an AI has independently produced a novel mathematical result in theoretical physics. Source
Anthropic shipped Claude Sonnet 4.6 — flagship performance at a fifth of the cost. 1-million-token context window, better coding, improved computer use, now the default for free and Pro users. VentureBeat reports it matches Opus-class performance on most benchmarks. The model most people actually use just got much better, and the price gap between "good enough" and "best available" is almost nothing. Source
Google’s Gemini 3 Deep Think scored 84.6% on ARC-AGI-2, lapping the field. For context: GPT-5.2 scored 52.9%, Claude scored 68.8%. It’s computationally heavy ($13.62 per task) but verified by the ARC Prize Foundation. Also hit 48.4% on Humanity’s Last Exam without tools. A genuine 16-point gap over the next best result. Source
Scientists found a protein that rejuvenates aging neural stem cells. A team at the National University of Singapore identified DMTF1 — a transcription factor that declines significantly in aging brains. When they restored DMTF1 levels in aged neural stem cells, the cells regained their ability to regenerate. Published in Science Advances on Feb 12. Early-stage (mouse models), but a specific, targetable mechanism rather than the usual "aging is complicated" handwave. Source
SpaceX Crew-12 arrived at the ISS on Valentine’s Day. Four astronauts launched aboard Falcon 9 from Cape Canaveral on Feb 14, docking the following day. The 12th crew rotation under NASA’s Commercial Crew Program. Less dramatic than Starship tests, but the cadence is the story — routine crewed spaceflight is now genuinely routine. Three more Falcon 9 launches from Vandenberg in February alone. Source
Agents In The Wild is a newsletter for builders. AI agents in production, ideas worth stealing, and signal from the frontier. Hit reply if something sparked an idea.
