>_
$

>Peak Progressive Discovery

Author: Miguel Castro | Date: 2/15/2026

TL;DR

Most agentic systems fail the same way: they start “over-exposed”: too many tools, too many subagents, too much context. The model thrashes, misuses capabilities, and burns tokens. Peak Progressive Discovery flips the default: every agent begins with only the minimum surface area required for correctness, and optional capabilities stay fully invisible until a relevant skill is explicitly loaded. This post lays out a platform-agnostic blueprint, plus where it’s weak today and what I’d add next.


Reaching Progressive Discovery

You start with a capable agent. You give it five tools. It works great. You give it twenty tools, ten distinct sub-agents, and a 40-page documentation corpus, and suddenly your “intelligent” system starts hallucinating parameters, confusing distinct APIs, or getting stuck in analysis-paralysis loops.

This isn’t a failure of the model (entirely). It’s a failure of exposure. We’ve been building agentic systems that hand the AI a Swiss Army knife with 100 blades and expect it to intuit which three blades actually matter for cutting this particular rope. The result is predictable: confused agents, bloated token counts, and a peculiar form of technical debt where the system appears capable of everything while reliably doing nothing well.

The answer isn’t simply a larger context window. We can use millions of tokens now, and it hasn’t solved the problem, because attention is not an infinite resource; it is noisy. The answer, for now, is structural.

A common approach to this today is progressive discovery, popularized (in part) by Anthropic’s Skills: markdown files an agent loads on demand. This introduced a simple, potent way to organize useful but not always needed information. However, I don’t think it goes far enough. So I’ll describe the system I want, one that would let progressive discovery reach its peak potential. Let’s humbly call it Peak Progressive Discovery.

Always-Needed vs. Optional

The foundation of Peak Progressive Discovery is a simple but strict classification: for every capability your system offers, you must decide if it’s always-needed or optional. There is no middle ground.

Always-needed capabilities are primitives so fundamental that hiding them behind a skill would add friction without benefit. For a dinner party planner, this might be:

  • The ability to ask for date, time, and guest count
  • A basic menu proposal workflow
  • Shopping list generation

These are the core business rules of the domain. If every dinner party requires them, they belong in the agent’s base context.

Optional capabilities are everything else. These are the “specialty tools” that only matter in specific scenarios:

  • Allergy-safe menu planning
  • Wine pairing recommendations
  • Outdoor venue contingency planning
  • Dietary restriction verification

Optional means invisible: not listed, not hinted at, not described, until the relevant skill is loaded.
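This binary classification can be made concrete as a small capability registry, where optional capabilities simply do not appear in the visible surface until a skill reveals them. This is a minimal sketch, not a prescribed implementation; all names (`Capability`, `skill_map`, the dinner-party capabilities) are hypothetical:

```python
from dataclasses import dataclass, field


@dataclass
class Capability:
    name: str
    description: str
    always_needed: bool = False  # strict binary: base context, or invisible


@dataclass
class CapabilityRegistry:
    capabilities: list[Capability] = field(default_factory=list)
    # skill name -> capability names that skill reveals when loaded
    skill_map: dict[str, list[str]] = field(default_factory=dict)
    loaded_skills: set[str] = field(default_factory=set)

    def load_skill(self, skill: str) -> None:
        self.loaded_skills.add(skill)

    def visible(self) -> list[str]:
        """Always-needed capabilities, plus anything revealed by loaded skills.

        Everything else is not listed, not hinted at, not described.
        """
        revealed = {name
                    for skill in self.loaded_skills
                    for name in self.skill_map.get(skill, [])}
        return [c.name for c in self.capabilities
                if c.always_needed or c.name in revealed]
```

Used for the dinner party planner, `propose_menu` would sit in base context while `wine_pairing` stays invisible until its skill loads.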

Adding Deterministic Guardrails

We rely too much on the model “remembering” to behave. “Don’t run this if X is missing” is a weak prompt.

The architecture must enforce Programmatic Prerequisites. We need a policy layer with rigid checks and reliable checkpoints to drive the process. For example:

  • Precondition: “Cannot invoke deploy_production without user_confirmation artifact.”
  • Action: Agent tries to deploy.
  • System Response: 403 Forbidden: Missing artifact ‘user_confirmation’.

The model doesn’t need to reason about safety; it just needs to handle the error. This separates “business logic” (the AI) from “safety logic” (the code).
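A minimal policy layer for this can live entirely in code: a prerequisite table checked before any tool handler runs. This is a sketch under the assumptions above; the tool name, artifact name, and `PolicyError` shape are hypothetical:

```python
class PolicyError(Exception):
    """Deterministic refusal, surfaced to the agent as an ordinary tool error."""


# Hypothetical prerequisite table: tool name -> artifacts that must exist first.
PREREQUISITES = {
    "deploy_production": {"user_confirmation"},
}


def invoke(tool: str, artifacts: set[str], handlers: dict) -> str:
    """Policy gate: enforce prerequisites in code, before the tool ever runs."""
    missing = PREREQUISITES.get(tool, set()) - artifacts
    if missing:
        # Safety logic lives here, not in the prompt. The model just
        # receives this error and handles it.
        raise PolicyError(f"403 Forbidden: Missing artifact {sorted(missing)}")
    return handlers[tool]()
```

The agent’s prompt never needs a “remember to confirm first” clause; the deploy call simply fails deterministically until the artifact exists.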

Da Rules

  • Explicit Classification
  • Skill Load Reveals Capabilities
  • No Mirror Skills
  • Reuse Without Global Visibility
  • Prerequisites are Enforced Deterministically
  • Artifact Exposure Is Minimized

Current Limitations

Now for the part where I tell you what doesn’t work yet. Because any system that claims to be perfect is either lying or doesn’t recognize its trade-offs.

1. Nesting Hell

It is tempting to make this fractal: skills loading skills loading skills. But nested structures in AI (and in general) can be fragile. The deeper you stack the context, the more “telephone” distortion occurs, and each layer adds more points of failure (not that you can always avoid nesting, especially in complex systems).

A pure Peak Progressive Discovery system risks becoming a labyrinth. In practice, a two-tier system (e.g., Orchestrator + Executor, Executor + Reviewer, Planner + Implementer) is often more reliable than a deep tree for agentic systems. Ideally we don’t build Russian dolls, but avoiding them is in tension with how this system wants to grow.

2. Skill Issues

Agents are currently bad at knowing when to use a Skill. Without a highly specified process or rubric, an agent might stare at a problem, not realizing it has a debug_skill available to load.

We currently have to “hard-code” the triggers (e.g., “If you see an error log, load the Debug Skill”). True semantic autonomy, where the agent intuitively knows which latent skill to grab, is still an open problem.
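Hard-coded triggers can be as blunt as a regex table over the agent’s observations. A sketch, with hypothetical trigger patterns and skill names:

```python
import re

# Hypothetical trigger table: pattern over observations -> skill to load.
# Deterministic dispatch, so we don't wait for the model to remember
# that it has a latent debug_skill available.
TRIGGERS = [
    (re.compile(r"Traceback|ERROR|stack trace", re.IGNORECASE), "debug_skill"),
    (re.compile(r"allerg", re.IGNORECASE), "allergy_safe_menu"),
]


def skills_to_load(observation: str) -> list[str]:
    """Return every skill whose trigger fires on this observation."""
    return [skill for pattern, skill in TRIGGERS if pattern.search(observation)]
```

Crude, but it guarantees the debug skill is in context before the agent has to “notice” it should be.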

3. No “Out of The Box” Solutions

I’ve tried using tools like Claude Code and OpenCode with this system without (straightforward) success. OpenCode came closest, but limitations (at the time of writing), such as subagents not being able to spawn subagents, meant it couldn’t support the system from the get-go.

4. Only for Complex, Clearly Defined Tasks

A system like this only becomes necessary on higher-complexity tasks, where overloading the agent with information has a real cost, and on clearly defined ones. Simpler or more open-ended tasks are not a good match.

Closure

Progressive Discovery is about structure. It’s about accepting that hiding information from your AI is just as important as giving it information.

Progressive discovery is my current best answer to a systems question: how do we build agentic behavior that stays coherent as capability surfaces grow? Hide what isn’t needed. Reveal what is. Enforce prerequisites in policy, not prose. Keep artifacts session-scoped.

If you’re building this, start with a small implementation. For example:

  • one agent,
  • two skills,
  • one policy gate,
  • and a trace log for visibility changes.
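Those four pieces fit in one file. A minimal sketch, assuming hypothetical names throughout (the skills, the `menu_confirmed` artifact, the gated tool):

```python
from datetime import datetime, timezone

# Two optional skills; their bodies stay out of context until loaded.
SKILLS = {
    "wine_pairing": "How to pair wines with each course...",
    "allergy_safe_menu": "How to plan around dietary restrictions...",
}


class Agent:
    def __init__(self):
        self.context = ["Base: dates, guest counts, menus, shopping lists."]
        self.artifacts: set[str] = set()
        self.trace: list[str] = []  # trace log for visibility changes

    def _log(self, event: str) -> None:
        self.trace.append(f"{datetime.now(timezone.utc).isoformat()} {event}")

    def load_skill(self, name: str) -> None:
        """The only path by which optional content becomes visible."""
        self.context.append(SKILLS[name])
        self._log(f"skill_loaded:{name}")

    def act(self, tool: str) -> str:
        # One policy gate: confirm the menu before finalizing the list.
        if tool == "finalize_shopping_list" and "menu_confirmed" not in self.artifacts:
            self._log(f"denied:{tool}")
            return "403 Forbidden: Missing artifact 'menu_confirmed'"
        if tool == "confirm_menu":
            self.artifacts.add("menu_confirmed")
        self._log(f"ran:{tool}")
        return "ok"
```

One agent, two skills, one gate, one trace. Everything past that is elaboration.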

Everything else can evolve, progressively.
