Autonomous · self-improving · daily

An agent that researches its own improvements.

Learn Loop is an autonomous experimentation engine. Each cycle it proposes an idea, rewrites the customer-support agent, and scores the new version on held-out tasks the agent never trains on. What it learns feeds into the next cycle. You can nudge it with ideas, and it keeps improving on its own.


Airline customer support

τ-bench airline domain: the agent follows a written policy and uses booking-DB tools to resolve a simulated user's request.

Graded on the unseen

Every change is scored by pass^1 on 20 held-out tasks the agent never trains on — the honest measure of generalization.
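As a rough illustration of the metric, the sketch below estimates pass^k the way the τ-bench paper describes it, as far as I understand it: the chance that k independent trials of a task all succeed, averaged over tasks. For k = 1 this reduces to the plain fraction of tasks solved. The function name, its parameters, and the sample results are assumptions made for this example, not Learn Loop's actual scoring code.

```python
from math import comb


def pass_hat_k(successes_per_task, trials_per_task, k=1):
    """Estimate pass^k, averaged over tasks.

    successes_per_task: number of successful trials for each task.
    trials_per_task:    trials run per task (assumed equal for all tasks).
    """
    if k > trials_per_task:
        raise ValueError("k cannot exceed the number of trials per task")
    total = 0.0
    for c in successes_per_task:
        # Fraction of size-k subsets of the trials in which every trial succeeded.
        total += comb(c, k) / comb(trials_per_task, k)
    return total / len(successes_per_task)


# Hypothetical results: 20 held-out tasks, one trial each, 12 solved.
held_out = [1] * 12 + [0] * 8
score = pass_hat_k(held_out, trials_per_task=1, k=1)
print(f"pass^1 = {score:.0%}")
```

With a single trial per task, pass^1 is just solved tasks divided by total tasks, so one extra task solved out of 20 moves the score by 5 points.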

kapso engine, daily

An ideate→implement→evaluate loop (Claude Opus) runs on a schedule, each cycle building on the last best version.
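The loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the kapso engine's code: `run_loop`, the three callables, and the toy scores are all hypothetical, and the rule "keep a candidate only if it beats the current best" is my reading of "building on the last best version".

```python
def run_loop(baseline, baseline_score, cycles, ideate, implement, evaluate):
    """Run ideate -> implement -> evaluate cycles, always building on the best agent so far."""
    best_agent, best_score = baseline, baseline_score
    history = []  # (idea, score) pairs, passed back to ideate as context
    for _ in range(cycles):
        idea = ideate(best_agent, history)
        candidate = implement(best_agent, idea)
        score = evaluate(candidate)
        history.append((idea, score))
        if score > best_score:  # assumption: candidates that do not improve are discarded
            best_agent, best_score = candidate, score
    return best_agent, best_score, history


# Toy stand-ins for the LLM-driven steps; names and scores are invented for illustration.
def toy_ideate(agent, history):
    return f"idea-{len(history) + 1}"


def toy_implement(agent, idea):
    return f"{agent}+{idea}"


TOY_SCORES = {"v0+idea-1": 0.50, "v0+idea-2": 0.60, "v0+idea-2+idea-3": 0.55}


def toy_evaluate(agent):
    return TOY_SCORES[agent]


best_agent, best_score, history = run_loop(
    "v0", 0.55, 3, toy_ideate, toy_implement, toy_evaluate
)
print(best_agent, best_score)
```

In this toy run the first and third candidates score no higher than the current best, so only the second one is kept and later cycles build on it.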

Nudge the loop

Stack ideas for the agent to try. Each pending idea is fed in as extra context on the next cycle — then turns green once applied.
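One way to picture the idea queue is the sketch below. It is a guess at the behavior described, not the actual implementation: the `Idea` and `IdeaQueue` classes, the status strings, and the sample ideas are all hypothetical, and marking an idea "applied" as soon as it is handed to a cycle is a simplification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Idea:
    text: str
    status: str = "pending"  # becomes "applied" once a cycle has consumed it


@dataclass
class IdeaQueue:
    ideas: List[Idea] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.ideas.append(Idea(text))

    def take_pending_context(self) -> str:
        """Return all pending ideas as extra prompt context and mark them applied."""
        pending = [idea for idea in self.ideas if idea.status == "pending"]
        context = "\n".join(f"- {idea.text}" for idea in pending)
        for idea in pending:
            idea.status = "applied"
        return context


queue = IdeaQueue()
queue.add("Confirm the booking ID before calling any tool")
queue.add("Quote the relevant policy clause when refusing a request")
context = queue.take_pending_context()
print(context)
```

A second call to `take_pending_context()` returns an empty string until new ideas are added, so each idea is fed to exactly one cycle.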

Auto · daily 09:00


Experiment stats

airline_3iter

[Chart: held-out pass^1 over iterations; series: held-out, train]

Best held-out: 60% (pass^1, 20 tasks)

Latest: 60% (train 53%)

Iterations completed

Total cost: $0.23 (metered LLM spend)

Experiments

3 iterations · newest first
