Learn Loop
Autonomous · self-improving · daily

An agent that researches its own improvements.

Learn Loop is an autonomous experimentation engine. Each cycle it proposes an idea, rewrites the customer-support agent, evaluates the new version, and scores it on held-out tasks, then feeds what it learned into the next cycle. You can nudge it with ideas; it keeps improving on its own.

Ideate: research + propose
Implement: rewrite the agent
Evaluate: train signal
Held-out score: 20 unseen tasks
Feedback: what to fix next
Autonomous · Learn Loop

Airline customer support

τ-bench airline domain: the agent follows a written policy and uses booking-DB tools to resolve a simulated user's request.

Graded on the unseen

Every change is scored by pass^1 on 20 held-out tasks the agent never trains on — the honest measure of generalization.
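
To make the metric concrete, here is a minimal sketch of a pass^k estimator. It assumes pass^k follows the τ-bench definition (the chance that all k independent trials of a task succeed, averaged over tasks), which for k = 1 reduces to the plain success rate. The function name `pass_hat_k` and the sample outcomes are hypothetical and are not taken from Learn Loop itself.

```python
from math import comb


def pass_hat_k(trials_per_task, k=1):
    """Estimate pass^k as the mean over tasks of C(c, k) / C(n, k).

    trials_per_task: one list of booleans per task, one boolean per trial.
    n is the number of trials for a task, c the number that succeeded.
    """
    if not trials_per_task:
        raise ValueError("need at least one task")
    total = 0.0
    for outcomes in trials_per_task:
        n = len(outcomes)
        c = sum(outcomes)
        if n < k:
            raise ValueError("each task needs at least k trials")
        total += comb(c, k) / comb(n, k)
    return total / len(trials_per_task)


# Hypothetical outcomes: 20 held-out tasks, one trial each, 12 of them passed.
held_out = [[True]] * 12 + [[False]] * 8
score = pass_hat_k(held_out, k=1)
print(f"{score:.0%}")
```

With one trial per task, a 60% score on 20 tasks corresponds to 12 tasks resolved.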

kapso engine, daily

An ideate→implement→evaluate loop (Claude Opus) runs on a schedule, each cycle building on the last best version.
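
The loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the kapso engine's code: `Agent`, `learn_loop`, and the three stub functions are hypothetical names, the scores are made up, and using a single score to pick the best version is an assumption, since the page does not say how "best" is chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    prompt: str
    score: float = 0.0  # assumed selection score for this version


def learn_loop(baseline, ideate, implement, evaluate, cycles):
    """Run ideate -> implement -> evaluate, always building on the best version."""
    best = baseline
    feedback = "none yet"
    history = []
    for _ in range(cycles):
        idea = ideate(best, feedback)            # research + propose
        candidate = implement(best, idea)        # rewrite the agent
        score, feedback = evaluate(candidate)    # score + what to fix next
        history.append(score)
        if score > best.score:                   # keep only improvements
            best = Agent(candidate, score)
    return best, history


# Stubs standing in for the LLM calls and the benchmark run (made-up scores).
_scores = iter([0.50, 0.60, 0.55])


def ideate(best, feedback):
    return f"address: {feedback}"


def implement(best, idea):
    return best.prompt + "\n# " + idea


def evaluate(prompt):
    s = next(_scores)
    return s, f"failures seen at score {s:.2f}"


best, history = learn_loop(
    Agent("You are an airline support agent."), ideate, implement, evaluate, cycles=3
)
print(best.score, history)
```

In this sketch the third cycle scores lower than the second, so it is recorded in the history but the second version stays the base for whatever comes next.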

Nudge the loop

Stack ideas for the agent to try. Each pending idea is fed in as extra context on the next cycle — then turns green once applied.
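
One way to picture the idea queue is a small sketch like the one below. The `Idea` and `IdeaQueue` classes, the status strings, and the two sample ideas are hypothetical; the page only says that pending ideas become extra context on the next cycle and are marked as applied afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class Idea:
    text: str
    status: str = "pending"  # switched to "applied" once a cycle consumes it


@dataclass
class IdeaQueue:
    ideas: list = field(default_factory=list)

    def submit(self, text):
        self.ideas.append(Idea(text))

    def pending(self):
        return [idea for idea in self.ideas if idea.status == "pending"]

    def consume(self):
        """Return all pending ideas as one context string and mark them applied."""
        batch = self.pending()
        context = "\n".join(f"- {idea.text}" for idea in batch)
        for idea in batch:
            idea.status = "applied"
        return context


queue = IdeaQueue()
queue.submit("Confirm the booking ID before calling any tool")
queue.submit("Quote the policy clause when refusing a refund")
extra_context = queue.consume()  # would be appended to the next cycle's prompt
print(extra_context)
```

After `consume()` runs, the queue has no pending ideas left, which matches the "turns green once applied" behaviour described above.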

Auto · daily 09:00


Experiment stats

airline_3iter

Chart: held-out pass^1 over iterations (series: held-out, train)
Best held-out: 60% (pass^1, 20 tasks)
Latest: 60% (train 53%)
Iterations: 3 completed
Total cost: $0.23 (metered LLM spend)

Experiments

3 iterations · newest first