Migrating from Python

You do not have to rewrite a Python program to get a Capa authority manifest out of it. Wrap the Python file in a thin Capa shell that delegates everything, then move one function at a time into typed Capa. At every step the Unsafe capability shrinks and the per-function manifest gets more honest. The Python file itself never changes; only the .capa file does.

Why migrate at all

Capa programs declare their authority in function signatures. A function that opens a network socket takes a Net parameter; one that reads a file takes Fs; one that does neither takes neither. The compiler reads those signatures and emits a manifest that a consumer of SBOM-style formats (CycloneDX, SPDX, VEX, SLSA provenance) can audit at per-function granularity.

Python has no equivalent. pip freeze lists packages, not functions; a static analyser is heuristic, and its granularity bottoms out at the import boundary. If you have a Python program whose authority surface you want to make auditable, the cheapest path is not a rewrite. Keep the Python file intact, build a thin Capa shell that does nothing but delegate, and then migrate the program into typed Capa function by function.

The running example throughout this page is migrate_logfetcher_naive.py, a ~60-line Python program that touches the filesystem, the environment, and the network (Fs, Env, and Net in Capa terms). It is paired with three .capa files that show the same behaviour at three stages of hardening. All four live in the examples directory; the snippets and manifests below are taken straight from them.
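The real migrate_logfetcher_naive.py is not reproduced on this page. For orientation only, here is a hypothetical Python sketch of the shape such a program might take. The function names are borrowed from the stage 3 manifest further down, and config.json, base_url and LOGS_API_KEY appear elsewhere on this page; the function bodies, the "status" resource, the status.txt output name and the X-Api-Key header are assumptions made up for illustration.

```python
# Hypothetical stand-in for migrate_logfetcher_naive.py; NOT the real file.
import json
import os
import urllib.request


def config_field(cfg, name):
    """Pure: pull one required string field out of a parsed config dict."""
    value = cfg.get(name)
    if not isinstance(value, str):
        raise KeyError(f"config field {name!r} is missing or not a string")
    return value


def load_config(path):
    """Filesystem: read and parse the JSON config."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def get_api_key():
    """Environment: read the API key."""
    return os.environ["LOGS_API_KEY"]


def build_url(base_url, resource):
    """Pure: join the base URL and a resource path."""
    return base_url.rstrip("/") + "/" + resource.lstrip("/")


def fetch_status(url, api_key):
    """Network: GET the URL and return the body as text (header name assumed)."""
    req = urllib.request.Request(url, headers={"X-Api-Key": api_key})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8")


def save_response(path, content):
    """Filesystem: write the fetched text to disk."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(content)


def main():
    cfg = load_config("config.json")
    url = build_url(config_field(cfg, "base_url"), "status")
    save_response("status.txt", fetch_status(url, get_api_key()))
```

Nothing in these Python signatures says which function touches the filesystem, the environment, or the network; that information lives only in the docstrings and the bodies. Recovering it in a checkable form is what the three stages below are for.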

Stage 1: all Unsafe, behaviour preserved

Write a Capa file with one entry point that imports the original Python module and calls into it. Everything happens through py_import and py_invoke, which together require the Unsafe capability.

fun main(stdio: Stdio, u: Unsafe)
    bootstrap_path(u)
    let mod = py_import(u, "migrate_logfetcher_naive")
    stdio.println("step1: delegating to migrate_logfetcher_naive.main() via py_invoke")
    py_invoke(u, mod.main, [])

What capa --manifest reports about this stage:

bootstrap_path -> [Unsafe]
main           -> [Stdio, Unsafe]

The Unsafe is the audit signal. An SBOM consumer reading this manifest sees: this program escapes Capa's analysis, so I cannot make claims about its true authority surface. That is honest reporting of the not-yet-migrated state, which is exactly the point.

Stage 2: move one function at a time

Pick the function whose Capa equivalent is simplest. The first easy win is usually one that needs a single built-in capability and does not need to read structured data back from Python. In the example, save_response(path, content) is that function: two strings in, one file written, mapping cleanly to Fs.write.

fun save_response(fs: Fs, path: String, content: String) -> Result<Unit, IoError>
    return fs.write(path, content)

The rest of main still calls back into the Python module for the fetch, parse, and env-read work; only the file write is now typed Capa. The Python file is unchanged. The manifest now reports:

bootstrap_path -> [Unsafe]
save_response  -> [Fs]                  <- new, typed, no Unsafe
main           -> [Stdio, Fs, Unsafe]   <- Fs is now visible

The win is Fs becoming explicit in main's signature. The SBOM consumer can now see that the file-write authority is exercised by a typed function, not buried inside an Unsafe block.
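One way to see that change without reading two manifests side by side is to diff them. The following is a minimal Python sketch, assuming the JSON form of the manifest is an object with a functions array whose entries carry name and declared_capabilities (the two fields the jq filter under "Reproduce it" selects), and that declared_capabilities is a list of strings. The inline data mirrors the stage 1 and stage 2 listings above; diff_manifests is a hypothetical helper, not part of the capa tool.

```python
# Inline copies of the stage 1 and stage 2 manifests shown above.
# Assumed JSON shape: {"functions": [{"name": ..., "declared_capabilities": [...]}]}
STAGE1 = {"functions": [
    {"name": "bootstrap_path", "declared_capabilities": ["Unsafe"]},
    {"name": "main", "declared_capabilities": ["Stdio", "Unsafe"]},
]}
STAGE2 = {"functions": [
    {"name": "bootstrap_path", "declared_capabilities": ["Unsafe"]},
    {"name": "save_response", "declared_capabilities": ["Fs"]},
    {"name": "main", "declared_capabilities": ["Stdio", "Fs", "Unsafe"]},
]}


def caps_by_function(manifest):
    """Map each function name to the set of capabilities it declares."""
    return {fn["name"]: set(fn["declared_capabilities"])
            for fn in manifest["functions"]}


def diff_manifests(before, after):
    """Report functions that were added, removed, or changed capabilities."""
    old, new = caps_by_function(before), caps_by_function(after)
    changes = {}
    for name in sorted(old.keys() | new.keys()):
        gained = sorted(new.get(name, set()) - old.get(name, set()))
        lost = sorted(old.get(name, set()) - new.get(name, set()))
        added, removed = name not in old, name not in new
        if added or removed or gained or lost:
            changes[name] = {"added": added, "removed": removed,
                             "gained": gained, "lost": lost}
    return changes


for name, change in diff_manifests(STAGE1, STAGE2).items():
    print(name, change)
```

For these two inputs the diff reports save_response as a new function holding Fs and main as having gained Fs, while bootstrap_path is unchanged and therefore absent. Unsafe does not disappear from main at this stage; what changes is that Fs now appears next to it.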

Stage 3: fully typed, Unsafe gone

Move every remaining function into typed Capa. Once the last py_invoke is gone, the Python file is unreferenced and can be deleted. main now threads the exact capabilities the program uses and nothing more:

fun main(stdio: Stdio, fs: Fs, env: Env, net: Net)
    match load_config(fs, "config.json")
        ...

And the manifest is a clean per-function authority bound:

config_field   -> []                      pure
load_config    -> [Fs]
get_api_key    -> [Env]
build_url      -> []                      pure
fetch_status   -> [Net]
save_response  -> [Fs]
main           -> [Stdio, Fs, Env, Net]   no Unsafe

Compare this to stage 1, where the only honest thing the manifest could say about main was [Stdio, Unsafe]. The supply-chain audit now has something concrete to rest on: the SBOM carries a true per-function authority bound rather than a single Unsafe blob.
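A per-function bound also lets an auditor ask the question the other way round: which functions can touch a given capability at all? The Python sketch below inverts the stage 3 manifest into that view. It assumes the manifest JSON is an object with a functions array whose entries carry name and declared_capabilities (the fields the jq filter under "Reproduce it" selects); holders and pure_functions are hypothetical helper names, and the inline data mirrors the stage 3 listing above.

```python
from collections import defaultdict

# Inline copy of the stage 3 manifest shown above (JSON shape is an assumption).
STAGE3 = {"functions": [
    {"name": "config_field", "declared_capabilities": []},
    {"name": "load_config", "declared_capabilities": ["Fs"]},
    {"name": "get_api_key", "declared_capabilities": ["Env"]},
    {"name": "build_url", "declared_capabilities": []},
    {"name": "fetch_status", "declared_capabilities": ["Net"]},
    {"name": "save_response", "declared_capabilities": ["Fs"]},
    {"name": "main", "declared_capabilities": ["Stdio", "Fs", "Env", "Net"]},
]}


def holders(manifest):
    """Invert the manifest: capability name -> functions that declare it."""
    index = defaultdict(list)
    for fn in manifest["functions"]:
        for cap in fn["declared_capabilities"]:
            index[cap].append(fn["name"])
    return dict(index)


def pure_functions(manifest):
    """Functions that declare no capabilities at all."""
    return [fn["name"] for fn in manifest["functions"]
            if not fn["declared_capabilities"]]


index = holders(STAGE3)
print("Net:", index["Net"])
print("pure:", pure_functions(STAGE3))
print("Unsafe present:", "Unsafe" in index)
```

With the stage 3 data, only fetch_status and main hold Net, config_field and build_url are pure, and Unsafe is absent from the index altogether. The same query against the stage 1 manifest could only answer "Unsafe" for every function.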

Bridging tricks for the middle stage

The awkward part of the middle stage is moving values back and forth across the Capa-to-Python boundary. A few patterns make it tractable.

py_invoke returns Unknown, and Capa accepts it anywhere

A py_invoke call returns Unknown at the type-system level. Capa lets Unknown stand wherever a concrete type is expected, so you can pass the result of py_invoke straight into a typed Capa function without an explicit cast. That is how stage 2 feeds a Python-side response string into save_response.

The trade-off is real: passing Unknown everywhere is unsafe in the type sense, because nothing is checked until runtime. It works for the middle stage, but it is a reason to keep moving toward fully typed Capa rather than declaring victory early.

Field access on Python dicts via builtins.dict.get

If a Python function returns a dict and you need a single field, pull it through py_invoke against builtins.dict.get:

let py_builtins = py_import(u, "builtins")
let base = py_invoke(u, py_builtins.dict.get, [cfg, "base_url"])

This stays Unknown and serves the transitional stage. By stage 3 you replace it with typed JsonValue navigation, for example cfg.as_object()?.get("base_url")?.as_string()?.
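On the Python side, the bridged dict.get call is ordinary Python. builtins.dict.get is the unbound form of the method, so invoking it with the arguments [cfg, "base_url"] is the same as writing cfg.get("base_url"). The short check below shows the equivalence in plain Python; the cfg value is made-up sample data.

```python
import builtins

cfg = {"base_url": "https://logs.example.test"}  # made-up sample config

# Calling the unbound method with the dict as first argument...
via_builtins = builtins.dict.get(cfg, "base_url")
# ...is the same lookup as the usual bound-method call.
via_method = cfg.get("base_url")
assert via_builtins == via_method

# A missing key returns None instead of raising KeyError.
missing = builtins.dict.get(cfg, "timeout")
assert missing is None
```

Because a missing key comes back as None rather than as an exception, a typo in the field name will not fail at the lookup itself. That is one more reason to replace this bridge with typed navigation once the surrounding function is migrated.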

Result and Option chaining in fully-typed Capa

Stage 3's config_field uses the explicit-match style: pattern-match on the Option, return Err with a descriptive message on None. The ? operator is also available for Result and Option chaining if you prefer the terser form.

Track your progress

You do not have to read manifests by eye to know how far a migration has come. capa migrate <file.capa> reports it directly:

$ capa migrate examples/migrate_logfetcher_step2_mixed.capa
Migration progress for examples/migrate_logfetcher_step2_mixed.capa
  [########----------------] 33% Unsafe-free
  1/3 function(s) are Unsafe-free; 2 still use Unsafe.

Next, consider hardening (fewest bridge calls first):
  - bootstrap_path  ...:26:1  (5 bridge calls)
  - main            ...:39:1  (9 bridge calls)

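It surfaces three things, all visible in the sample output above: the percentage of functions that are Unsafe-free, the count of Unsafe-free functions against those that still use Unsafe, and a list of hardening candidates ordered by fewest bridge calls.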

Add --json for the machine-readable form, which is useful in a CI gate that watches the percentage trend upward over a migration.
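The --json schema is not shown on this page, so the following Python sketch gates on the human-readable report instead, using only the two summary lines visible in the sample output above. The regular expressions, the parse_progress and gate helpers, and the "must not regress" policy are assumptions for illustration; a real gate would more likely read the --json form.

```python
import re

# Sample text copied from the capa migrate output shown above.
SAMPLE = """\
Migration progress for examples/migrate_logfetcher_step2_mixed.capa
  [########----------------] 33% Unsafe-free
  1/3 function(s) are Unsafe-free; 2 still use Unsafe.
"""

PERCENT_RE = re.compile(r"(\d+)% Unsafe-free")
COUNT_RE = re.compile(r"(\d+)/(\d+) function\(s\) are Unsafe-free")


def parse_progress(report):
    """Extract the percentage and function counts from a text report."""
    percent = PERCENT_RE.search(report)
    counts = COUNT_RE.search(report)
    if percent is None or counts is None:
        raise ValueError("unrecognised capa migrate report")
    return {
        "percent": int(percent.group(1)),
        "unsafe_free": int(counts.group(1)),
        "total": int(counts.group(2)),
    }


def gate(report, previous_percent):
    """Return True when progress has not regressed (hypothetical policy)."""
    return parse_progress(report)["percent"] >= previous_percent


print(parse_progress(SAMPLE))
print("ok" if gate(SAMPLE, previous_percent=33) else "regressed")
```

In a CI job the report text would come from running capa migrate on the file and capturing its standard output, and previous_percent from wherever the pipeline stores the last accepted value.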

When to stop

The migration is complete when the last py_invoke is gone and no function in the manifest declares Unsafe, as in stage 3.

You can also stop sooner. If a single function genuinely needs a Python library Capa does not cover, leave that one function with Unsafe; the rest of the program still benefits from typing. The Unsafe is then a precise audit signal pointing at the one place that needs human review.
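If you do stop early, it helps to pin down exactly which functions are allowed to keep Unsafe, so that it cannot spread unnoticed. The Python sketch below does this under the assumption that the manifest JSON is an object with a functions array whose entries carry name and declared_capabilities (the fields the jq filter under "Reproduce it" selects). The allow-list, the unexpected_unsafe helper and the call_legacy_parser function are all hypothetical.

```python
# Hypothetical manifest for a program that stopped one function short.
# Assumed JSON shape: {"functions": [{"name": ..., "declared_capabilities": [...]}]}
MANIFEST = {"functions": [
    {"name": "call_legacy_parser", "declared_capabilities": ["Unsafe"]},
    {"name": "save_response", "declared_capabilities": ["Fs"]},
    {"name": "main", "declared_capabilities": ["Stdio", "Fs", "Unsafe"]},
]}

# Functions that reviewers have agreed may keep Unsafe (hypothetical list).
ALLOWED_UNSAFE = {"call_legacy_parser", "main"}


def unexpected_unsafe(manifest, allowed):
    """Names of functions that declare Unsafe without being on the allow-list."""
    return sorted(
        fn["name"]
        for fn in manifest["functions"]
        if "Unsafe" in fn["declared_capabilities"] and fn["name"] not in allowed
    )


offenders = unexpected_unsafe(MANIFEST, ALLOWED_UNSAFE)
print("unexpected Unsafe:", offenders)
```

main is on the list because capabilities are threaded through parameters: as in stage 2, any caller that hands Unsafe down has to declare it too. A check like this fails as soon as a function outside the list starts declaring Unsafe.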

Honest limits

Reproduce it

The three Capa files compile and emit honest manifests:

# Each stage type-checks
$ capa --check examples/migrate_logfetcher_step1_unsafe.capa
$ capa --check examples/migrate_logfetcher_step2_mixed.capa
$ capa --check examples/migrate_logfetcher_step3_typed.capa

# The manifest progression: Unsafe shrinks, real capabilities appear
$ capa --manifest examples/migrate_logfetcher_step1_unsafe.capa \
    | jq '.functions[] | {name, declared_capabilities}'
$ capa --manifest examples/migrate_logfetcher_step2_mixed.capa \
    | jq '.functions[] | {name, declared_capabilities}'
$ capa --manifest examples/migrate_logfetcher_step3_typed.capa \
    | jq '.functions[] | {name, declared_capabilities}'

Running any of the three actually fetches data and writes a file, so you need a config.json in the directory you run it from and a LOGS_API_KEY environment variable set. The full setup is in the docstring of migrate_logfetcher_naive.py. The full writeup is in docs/migration.md.

Where to go next