An agent that researches its own improvements.
Learn Loop is an autonomous experimentation engine. Each cycle, it proposes an idea, rewrites the customer-support agent, evaluates the new version, and scores it on held-out tasks it has never seen, then feeds what it learned into the next cycle. You can nudge it with ideas, but it keeps improving on its own.
Airline customer support
τ-bench airline domain: the agent follows a written policy and uses booking-DB tools to resolve a simulated user's request.
Graded on the unseen
Every change is scored by pass^1 (the fraction of tasks solved in a single attempt) on 20 held-out tasks the agent never trains on, so the score directly measures generalization.
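For readers who want the metric spelled out, here is a minimal sketch of the pass^k estimator as τ-bench defines it: for each task, the chance that k trials drawn from the recorded ones all succeed, averaged over tasks. With k = 1 it reduces to the plain success rate. The function name `pass_hat_k` and the number of trials per task are assumptions for illustration; this page does not state how many trials Learn Loop runs.

```python
from math import comb


def pass_hat_k(successes_per_task, trials_per_task, k=1):
    """Estimate pass^k from per-task success counts.

    successes_per_task: number of successful trials for each task.
    trials_per_task:    trials run per task (an assumed, uniform count).
    k:                  number of trials that must all succeed.
    """
    if not successes_per_task:
        raise ValueError("need at least one task")
    if not 1 <= k <= trials_per_task:
        raise ValueError("k must be between 1 and trials_per_task")
    total = 0.0
    for c in successes_per_task:
        # Probability that k trials drawn without replacement all succeed.
        total += comb(c, k) / comb(trials_per_task, k)
    return total / len(successes_per_task)


# Hypothetical run: 20 held-out tasks, one trial each, 15 solved.
example_score = pass_hat_k([1] * 15 + [0] * 5, trials_per_task=1, k=1)
```

With one trial per task, pass^1 is simply solved tasks divided by total tasks, which is why it is a convenient headline number for a held-out set.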
kapso engine, daily
An ideate→implement→evaluate loop (Claude Opus) runs on a schedule, each cycle building on the last best version.
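The loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the kapso implementation: `run_cycles`, `Candidate`, and the three callables are hypothetical names, the agent is represented as a plain string, the separate evaluate and held-out scoring steps are collapsed into a single `evaluate` call, and keeping a candidate only when it beats the current best is one plausible reading of "building on the last best version".

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    agent: str    # stand-in for the agent's code or prompt
    score: float  # held-out score for this version


def run_cycles(initial_agent, ideate, implement, evaluate, cycles):
    """Run ideate -> implement -> evaluate for a fixed number of cycles.

    ideate(agent, history)  -> an idea to try next
    implement(agent, idea)  -> a rewritten agent
    evaluate(agent)         -> a numeric score (higher is better)
    """
    best = Candidate(initial_agent, evaluate(initial_agent))
    history = []  # (idea, score) pairs fed back into later cycles
    for _ in range(cycles):
        idea = ideate(best.agent, history)
        candidate_agent = implement(best.agent, idea)
        score = evaluate(candidate_agent)
        history.append((idea, score))
        # Assumption: only a strict improvement replaces the best version.
        if score > best.score:
            best = Candidate(candidate_agent, score)
    return best, history
```

In this sketch a rejected idea still lands in `history`, so the next `ideate` call can see what was tried and how it scored.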
Nudge the loop
Queue up ideas for the agent to try. Each pending idea is fed in as extra context on the next cycle and turns green once it has been applied.
No ideas yet — be the first to nudge the loop.
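One way to picture how nudges reach the loop is a small queue whose pending entries are appended to the next cycle's context and then marked as applied. The names `Idea`, `build_context`, and `mark_applied`, and the "pending"/"applied" status strings, are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    text: str
    status: str = "pending"  # becomes "applied" once a cycle has used it


def build_context(base_context, ideas):
    """Append every pending idea to the context for the next cycle."""
    pending = [idea for idea in ideas if idea.status == "pending"]
    lines = [base_context] + [f"- {idea.text}" for idea in pending]
    return "\n".join(lines), pending


def mark_applied(ideas):
    """Flag ideas as applied (shown in green on the page)."""
    for idea in ideas:
        idea.status = "applied"


# Hypothetical usage: two queued ideas, one of them already applied.
queue = [Idea("Confirm before cancelling"), Idea("Check the fare class", "applied")]
context, used = build_context("Ideas to consider this cycle:", queue)
mark_applied(used)
```

After `mark_applied`, a second call to `build_context` would return only the base context, since no idea is pending any more.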