A capability-bound LLM agent
capa_agent_demo is an LLM agent harness in about 650 lines of Capa across four files. The model can call a small set of tools (read a file, list a directory, fetch a GET-only URL, read the clock); each tool's authority is statically narrowed by the type system; and the capability manifest is the audit contract. This page describes the real repository; every manifest shown here was regenerated from its source with capa --manifest in June 2026.
The problem
Mainstream agent harnesses (LangChain, OpenAI function calling, MCP servers) ship tools as arbitrary functions with no permission system. Whether a prompt injection that says "now delete the home directory" succeeds depends on whether some tool happens to reach subprocess.run or an unrestricted filesystem handle somewhere deep in its body. The blast radius is implicit, dynamic, and auditable only by reading every line of every tool.
Industry guidance on this (the "lethal trifecta" framing, model-vendor security guidelines) ultimately rests on convention: this tool only reads, trust me. The demo replaces the convention with a type-system guarantee: a tool's capability surface is its signature, the agent loop's surface is the union of its tools' signatures, and the compiler refuses to type-check a call to any authority a function did not name.
What the demo is
A four-tool agent loop against the Anthropic Messages API. The model runs for up to 5 turns: it can issue tool calls, read the results, and emit a final answer. Every tool call is forwarded through one function, dispatch_tool, which is the narrow waist: its signature holds the entire authority union the loop is allowed to exercise.
| Tool | Capability | What it does |
|---|---|---|
| read_file | ReadOnlyFs | Read a file's contents. |
| list_dir | ReadOnlyFs | List a directory's entries. |
| get_url | GetOnlyHttp | HTTP GET against an allow-listed host (no POST, PUT, or DELETE). |
| current_time | Clock | Return Unix seconds. |
ReadOnlyFs and GetOnlyHttp are not built-ins. They are user-defined capabilities in attenuated.capa, implemented as cap-bearing structs: ReadOnlyFsImpl holds the underlying Fs internally and exposes only read_file and list_dir; GetOnlyHttpImpl holds the raw HTTP client plus a host allow-list and exposes only fetch. A holder of ReadOnlyFs cannot call Fs.write because the wrapper never exposes it. The wrapper legitimately holds the raw Fs inside; what the analyzer guarantees is that the implementation stays within the capability surface it declares, and that consumers can never reach past that surface to the authority underneath.
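The wrapper shape described above can be approximated in Python. This is only a runtime analogy, not Capa source: `RawFs`, its method names, and `demo` are hypothetical stand-ins, and Python hides the inner handle by convention only, whereas the page states that Capa's analyzer enforces the boundary statically.

```python
import os
import tempfile


class RawFs:
    """Hypothetical stand-in for an unrestricted filesystem authority."""

    def read(self, path: str) -> str:
        with open(path, encoding="utf-8") as handle:
            return handle.read()

    def write(self, path: str, data: str) -> None:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(data)

    def listdir(self, path: str) -> list[str]:
        return sorted(os.listdir(path))


class ReadOnlyFsImpl:
    """Holds the raw authority inside and exposes only two read operations."""

    def __init__(self, fs: RawFs) -> None:
        # Private by convention only in Python; in Capa the analyzer is what
        # stops consumers from reaching past the declared surface.
        self._fs = fs

    def read_file(self, path: str) -> str:
        return self._fs.read(path)

    def list_dir(self, path: str) -> list[str]:
        return self._fs.listdir(path)


def demo() -> tuple[str, list[str], bool]:
    raw = RawFs()
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "note.txt")
        raw.write(target, "hello")   # only the wiring code holds write authority
        ro = ReadOnlyFsImpl(raw)      # tools would receive this narrowed view
        return ro.read_file(target), ro.list_dir(root), hasattr(ro, "write")
```

Calling `demo()` reads the file back through the wrapper and reports that the wrapper has no `write` method at all, which is the property a holder of ReadOnlyFs relies on.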
The audit moment
One command answers the question every agent-security review starts with: what is the worst this agent can do?
$ capa --manifest agent.capa
tool_read_file -> [ReadOnlyFs]
tool_list_dir -> [ReadOnlyFs]
tool_get_url -> [GetOnlyHttp]
tool_current_time -> [Clock]
dispatch_tool -> [ReadOnlyFs, GetOnlyHttp, Clock]
run_agent_loop -> [Stdio, Logger, LlmClient, ReadOnlyFs, GetOnlyHttp, Clock]
main -> [Stdio, Env, Clock, Fs, Unsafe]
The real command emits a JSON manifest; the listing above is a condensed function-to-capabilities view of it. The function names and capability sets are taken verbatim from the actual output.
Read the run_agent_loop line and you are done. The model is in control of that function for up to 5 turns, and its blast radius is exactly those six capabilities: no write-capable Fs, no POST-capable HTTP, no Net, no Db, no Unsafe. A prompt injection cannot widen the surface, because the compiler refuses to type-check a call to an authority the function did not declare; there is nothing for the injected text to reach.
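The same reading can be automated. The sketch below is a hedged illustration that parses the condensed listing shown on this page (not the real JSON manifest, whose schema is not reproduced here) and checks two things: that dispatch_tool's surface is the union of the four tools' surfaces, and that none of the authorities this page names as excluded appear in the loop. `parse_listing` and `FORBIDDEN` are hypothetical names.

```python
LISTING = """\
tool_read_file -> [ReadOnlyFs]
tool_list_dir -> [ReadOnlyFs]
tool_get_url -> [GetOnlyHttp]
tool_current_time -> [Clock]
dispatch_tool -> [ReadOnlyFs, GetOnlyHttp, Clock]
run_agent_loop -> [Stdio, Logger, LlmClient, ReadOnlyFs, GetOnlyHttp, Clock]
main -> [Stdio, Env, Clock, Fs, Unsafe]
"""

# Authorities the page says the loop must not hold.
FORBIDDEN = {"Fs", "Net", "Db", "Unsafe"}


def parse_listing(text: str) -> dict[str, set[str]]:
    """Turn 'name -> [A, B]' lines into {name: {A, B}}."""
    surfaces: dict[str, set[str]] = {}
    for line in text.splitlines():
        if "->" not in line:
            continue
        name, _, caps = line.partition("->")
        items = caps.strip().strip("[]").split(",")
        surfaces[name.strip()] = {item.strip() for item in items if item.strip()}
    return surfaces


surfaces = parse_listing(LISTING)

# Union of every tool_* surface; this should equal dispatch_tool's surface.
tool_union = set().union(
    *(caps for name, caps in surfaces.items() if name.startswith("tool_"))
)

loop_violations = surfaces["run_agent_loop"] & FORBIDDEN
main_raw = surfaces["main"] & FORBIDDEN
```

With the listing above, `loop_violations` is empty while `main_raw` contains Fs and Unsafe, matching the split between the loop and the wiring point.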
main is the wiring point. It legitimately holds Unsafe and the raw Fs, passes them through three factories (make_readonly_fs, make_get_only_http, make_anthropic_client), and hands the loop only the attenuated versions. Past main, the unattenuated authorities are unreachable. That split is visible in the manifest itself: the dangerous line is the one line a human has to review, and the function behind it is under twenty lines of wiring.
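A rough Python analogy of the narrow waist that main hands its attenuated capabilities to: every authority the dispatcher can use arrives as an explicit parameter. The protocol definitions, the fake implementations, and the error string for an unknown tool are assumptions made for illustration; the real dispatch_tool is Capa code and may differ in all of these details.

```python
from typing import Callable, Protocol


class ReadOnlyFs(Protocol):
    def read_file(self, path: str) -> str: ...
    def list_dir(self, path: str) -> list[str]: ...


class GetOnlyHttp(Protocol):
    def fetch(self, url: str) -> str: ...


Clock = Callable[[], float]  # returns Unix time in seconds


def dispatch_tool(
    name: str, arg: str, fs: ReadOnlyFs, http: GetOnlyHttp, clock: Clock
) -> str:
    """Route one tool call; the parameters are the only authority available."""
    if name == "read_file":
        return fs.read_file(arg)
    if name == "list_dir":
        return "\n".join(fs.list_dir(arg))
    if name == "get_url":
        return http.fetch(arg)
    if name == "current_time":
        return str(int(clock()))
    # Assumed behaviour: unknown tool names become an error string for the model.
    return f"error: unknown tool {name!r}"


class FakeFs:
    """Hypothetical in-memory stand-in used only for this example."""

    def read_file(self, path: str) -> str:
        return f"<contents of {path}>"

    def list_dir(self, path: str) -> list[str]:
        return ["a.txt", "b.txt"]


class FakeHttp:
    """Hypothetical stand-in that records the URL instead of fetching it."""

    def fetch(self, url: str) -> str:
        return f"<GET {url}>"
```

A request for a tool that does not exist, such as `dispatch_tool("delete_home", "~", FakeFs(), FakeHttp(), lambda: 0.0)`, has nothing to route to: no parameter in scope carries write or process authority.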
The manifest as the audit contract
The same source feeds capa --cyclonedx, so the per-function authority bound above ships inside a standard CycloneDX SBOM. A reviewer assessing an agent product built this way reads the agent's full authority boundary statically, from the artifact, without reading the agent's source and without trusting an attestation about its behaviour. The bound holds by construction: the compiler will not produce the artifact if the code exceeds it. The manifest page covers the format; the regulatory page covers where such evidence fits in CRA-style frameworks.
The demo's own supply chain
The harness consumes two Capa seed libraries through capa.toml, both pinned and key-verified:
[dependencies.capa_http]
git = "https://github.com/nelsonduarte/capa_http"
tag = "v0.1.3"
verify_key = "6C1D222D491FB88031E041A536CFB426101AA24B"
[dependencies.capa_log]
git = "https://github.com/nelsonduarte/capa_log"
tag = "v0.1.2"
verify_key = "6C1D222D491FB88031E041A536CFB426101AA24B"
capa install runs three independent checks on each dependency. First, the SHA in capa.lock must match the resolved commit, which catches a tag that has been moved to a different commit. Second, the tag signature checked by git verify-tag must match the pinned GPG fingerprint, which catches account compromise. Third, when the dependency declares a verify_key and lives on GitHub, the SLSA L2 build-provenance attestation is verified through Sigstore Rekor via gh attestation verify.
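The decision logic of those three checks might look roughly like the following. This is a sketch of the logic only: it does not call git or gh, the `DependencyEvidence` fields and the failure labels are hypothetical, and gating the signature check on a declared verify_key is an assumption rather than documented capa install behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DependencyEvidence:
    """Hypothetical record of what an installer observed for one dependency."""

    locked_sha: str                        # commit recorded in capa.lock
    resolved_sha: str                      # commit the tag resolves to now
    pinned_fingerprint: Optional[str]      # verify_key from capa.toml, if any
    tag_signer_fingerprint: Optional[str]  # signer of the tag, if it is signed
    hosted_on_github: bool
    attestation_ok: Optional[bool]         # None when no attestation was checked


def normalize(fingerprint: str) -> str:
    """Compare fingerprints without regard to spacing or letter case."""
    return fingerprint.replace(" ", "").upper()


def failed_checks(ev: DependencyEvidence) -> list[str]:
    """Return the names of the checks that did not pass (empty means accept)."""
    failures: list[str] = []
    if ev.locked_sha != ev.resolved_sha:
        failures.append("lock-sha")          # tag now points somewhere else
    if ev.pinned_fingerprint is not None:
        signer = ev.tag_signer_fingerprint
        if signer is None or normalize(signer) != normalize(ev.pinned_fingerprint):
            failures.append("tag-signature")  # signed by a different key, or unsigned
        if ev.hosted_on_github and ev.attestation_ok is not True:
            failures.append("provenance")     # attestation missing or invalid
    return failures
```

Because the checks are independent, a retagged release that is still validly signed fails only the first check, and a correct commit signed by the wrong key fails only the second.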
Run it
You need an Anthropic API key; the tool calls themselves run locally.
$ git clone https://github.com/nelsonduarte/capa_agent_demo
$ cd capa_agent_demo
$ capa install
$ export ANTHROPIC_API_KEY=sk-ant-...
$ capa --run agent.capa -- "What's the weather in Lisbon today? \
Use get_url with https://wttr.in/Lisbon?format=3"
The get_url allow-list is hard-coded in agent.capa (wttr.in and api.github.com); the weather prompt works because wttr.in is on it. Point the model at any other host and the wrapper refuses before a request is made.
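A minimal Python sketch of a host check that refuses before any request is made. The two host names come from this page; the exact-match rule, the `check_url` name, and the use of `PermissionError` are assumptions, and the real wrapper in attenuated.capa may match hosts differently.

```python
from urllib.parse import urlsplit

# Hosts this page says are hard-coded in agent.capa.
ALLOWED_HOSTS = frozenset({"wttr.in", "api.github.com"})


def check_url(url: str, allowed: frozenset[str] = ALLOWED_HOSTS) -> str:
    """Return the host if it is allow-listed; raise before any network I/O."""
    host = urlsplit(url).hostname  # lowercased; None if the URL has no host
    if host is None or host not in allowed:
        raise PermissionError(f"host not on allow-list: {host!r}")
    return host
```

Exact matching on the parsed hostname means look-alikes such as `wttr.in.evil.example`, or a URL that places an allowed name in the userinfo part (`https://api.github.com@evil.example/`), are rejected rather than fetched.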
Honest limits (v0.1)
- Single user prompt per run. Conversation history does not persist across capa --run invocations; the loop already supports multi-turn, the entry point does not expose it.
- Anthropic Messages API only. Adding another provider is one new LlmClient implementor; the loop and the dispatcher do not change.
- One string input per tool. Multi-argument tool schemas would extend dispatch_tool's JSON extraction; the point of the demo is the capability claim, not a schema language.
- Hard-coded host allow-list. Production would source it from configuration or a restrict_to call.
- No streaming. v0.1 is request-response.
Where to go next
Read the repository →
Four .capa files: the loop, the attenuated wrappers, the LLM client, the tools. Dual-licensed under MIT or Apache-2.0.
Define your own capability →
The tutorial chapter behind ReadOnlyFs and GetOnlyHttp: capability + impl + a cap-bearing struct.
The manifest format →
What capa --manifest emits, and how it maps onto CycloneDX, SPDX, VEX, and SLSA provenance.
How Capa compares →
Why convention-based tool permissions cannot give an auditor this property, and which languages get close.