Language reference

Full specification of the syntax and semantics of the Capa language. For a guided introduction, see the Learn track. For the built-in APIs, see the standard library page.

1. Lexical structure

1.1. Encoding

Source files must be encoded in UTF-8. Identifiers may contain Unicode letters, digits, and _, but must start with a letter or _.

1.2. Comments

// Line comment (runs to the end of the line)
/// Doc comment (attaches to the next declaration)
/** Block doc comment (same role) */

Regular block comments /* ... */ are also accepted by the lexer (and ignored). Only the doc variants are attached to AST nodes.

1.3. Indentation

Capa is indentation-sensitive, like Python. The lexer produces implicit INDENT, DEDENT, and NEWLINE tokens.
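As an illustration of how layout tokens of this kind are usually derived, here is a minimal Python sketch of the indent-stack technique. It is a model under stated assumptions, not Capa's lexer: the function name layout_tokens, the LINE pseudo-token, space-only indentation, and the skipping of blank lines are all choices made for the example.

```python
def layout_tokens(source: str) -> list[tuple[str, str]]:
    """Turn indented text into (kind, text) tokens using an indent stack.

    Hypothetical model: each non-blank line becomes a LINE token followed by
    NEWLINE; INDENT/DEDENT are emitted when the leading-space count changes.
    """
    tokens: list[tuple[str, str]] = []
    stack = [0]  # indentation widths of the blocks that are currently open
    for raw in source.splitlines():
        if not raw.strip():
            continue  # assumption: blank lines produce no tokens
        width = len(raw) - len(raw.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            tokens.append(("INDENT", ""))
        while width < stack[-1]:
            stack.pop()
            tokens.append(("DEDENT", ""))
        if width != stack[-1]:
            raise SyntaxError(f"inconsistent dedent to column {width}")
        tokens.append(("LINE", raw.strip()))
        tokens.append(("NEWLINE", ""))
    # Close every block that is still open at end of input.
    while len(stack) > 1:
        stack.pop()
        tokens.append(("DEDENT", ""))
    return tokens


kinds = [kind for kind, _ in layout_tokens("while cond\n    body\nbreak\n")]
print(kinds)
```

For the three-line input above, the body line is bracketed by one INDENT and one DEDENT, which is the shape a block-structured parser expects.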

1.4. Implicit continuation by leading dot

For multi-line method chaining, a line beginning with . is treated as a continuation of the previous line:

let r = xs
    .filter(...)
    .map(...)
    .fold(...)
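The rule can be modelled as a pre-pass that glues such lines onto the preceding one. The Python sketch below is illustrative only: join_leading_dot is a hypothetical helper, and it does not address how a real lexer would treat a line that begins with the .. range operator.

```python
def join_leading_dot(lines: list[str]) -> list[str]:
    """Merge physical lines into logical lines (hypothetical model).

    A line whose first non-blank character is '.' is appended to the
    previous logical line; every other line starts a new logical line.
    """
    logical: list[str] = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith(".") and logical:
            logical[-1] += stripped
        else:
            logical.append(line.rstrip())
    return logical


print(join_leading_dot(["let r = xs", "    .filter(f)", "    .map(g)"]))
# ['let r = xs.filter(f).map(g)']
```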

1.5. Keywords

fun let var if then elif else match while for in
break continue return import const type trait impl capability
true false and or not consume self Self
async await yield defer where mut

The last row lists reserved-for-future-use keywords. The lexer recognises them; the parser rejects their use.

1.6. Literals

Type     Examples
Integer  42, -7, 0, 1_000_000, 0xff, 0o755, 0b1010
Float    3.14, 2.0, 1e10
String   "hello", "a\nb", "x = ${x}"
Char     'a', '\n'
Bool     true, false
List     [1, 2, 3], []
Tuple    (1, "a"), (x,), ()
Range    a..b (exclusive), a..=b (inclusive)

1.7. Interpolated strings

${expr} inside a string literal is parsed as a Capa expression:

let n = 7
"value = ${n * 2}"  // "value = 14"
"len = ${xs.length()}"

$$ is the literal-$ escape. Nested string literals inside interpolation are not supported.
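To make the two rules concrete, the following Python sketch splits a string body into literal and expression parts. It is a simplified model, not the Capa lexer: split_interpolation is a hypothetical name, and the sketch assumes the embedded expression contains no } character.

```python
def split_interpolation(text: str) -> list[tuple[str, str]]:
    """Split a string body into ("lit", text) and ("expr", source) parts."""
    parts: list[tuple[str, str]] = []
    literal: list[str] = []
    i = 0
    while i < len(text):
        if text.startswith("$$", i):
            literal.append("$")  # $$ is the escape for a literal dollar sign
            i += 2
        elif text.startswith("${", i):
            end = text.find("}", i + 2)  # assumption: no '}' inside the expression
            if end == -1:
                raise SyntaxError("unterminated ${ in string literal")
            if literal:
                parts.append(("lit", "".join(literal)))
                literal = []
            parts.append(("expr", text[i + 2:end]))
            i = end + 1
        else:
            literal.append(text[i])
            i += 1
    if literal:
        parts.append(("lit", "".join(literal)))
    return parts


print(split_interpolation("value = ${n * 2}"))
# [('lit', 'value = '), ('expr', 'n * 2')]
```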

2. Type system

2.1. Primitive types

Int, Float, String, Bool, Char, Unit. See the standard library for the methods on each.

2.2. Compound types

Construct  Syntax
List       List<T>
Tuple      (T1, T2, ..., Tn)
Function   Fun(T1, T2) -> Ret
Map        Map<K, V>
Set        Set<T>
Option     Option<T>
Result     Result<T, E>

2.3. User-defined types

Structs:

type Person { name: String, age: Int }

Sum types (nominal variants):

type Shape =
    Circle(Float)
    Rectangle(Float, Float)
    Square(Float)

Variants may have zero or more payloads. Variants without a payload (type X = A) are constants, used without ().

The variant names Ok, Err, Some, and None are reserved: a user-defined sum type cannot redeclare any of them. They belong to the built-in Result and Option; shadowing them would silently change the meaning of pattern matches across a module.

2.4. Generics

Functions and types can take type parameters delimited by <>:

fun first<T>(xs: List<T>) -> Option<T>
    return xs.first()

type Pair<A, B> { first: A, second: B }

Type arguments are inferred locally, so the caller rarely needs to supply them explicitly: first<Int>([1,2,3]) is equivalent to first([1,2,3]).

2.5. Cross-statement inference

let xs = [] produces List<TyVar>. The first use pins the type parameter:

let xs = []
xs.push(42)        // OK, infers List<Int>
xs.push("oops")    // error: expects Int, got String

TyVar sharing propagates through aliases (let ys = xs) and into calls to typed functions (process(xs) where process: List<Int> -> ...).
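The pinning and sharing behaviour can be modelled with a mutable type variable that several list types point at. The Python sketch below is an illustration of the idea, not the checker's implementation; the class names and the use of plain strings for types are assumptions.

```python
class TyVar:
    """A placeholder element type that is pinned by its first use."""

    def __init__(self) -> None:
        self.bound = None  # None until some use fixes the type

    def unify(self, ty: str) -> None:
        if self.bound is None:
            self.bound = ty
        elif self.bound != ty:
            raise TypeError(f"expects {self.bound}, got {ty}")


class ListType:
    """Model of List<T> whose element type may still be an unbound TyVar."""

    def __init__(self, elem: TyVar) -> None:
        self.elem = elem

    def push(self, value_type: str) -> None:
        self.elem.unify(value_type)

    def __str__(self) -> str:
        return f"List<{self.elem.bound or 'TyVar'}>"


xs = ListType(TyVar())   # let xs = []
ys = ListType(xs.elem)   # let ys = xs  (the alias shares the same TyVar)
ys.push("Int")           # the first use pins the element type for both names
print(xs)                # List<Int>
```

Because xs and ys share one TyVar object, a later xs.push("String") in this model raises TypeError, mirroring the error in the example above.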

2.6. Compatibility

compatible(expected, actual) is structural with exceptions:

3. Statements

3.1. Bindings

let name = "Ana"               // immutable, type inferred
let age: Int = 30              // immutable, explicit type
var counter = 0                // mutable
counter = counter + 1          // assignment (only for var)

Pattern matching in bindings:

let (a, b) = pair()            // tuple destructuring
let Person { name, age } = p   // struct destructuring

3.2. Control flow

// if-statement
if cond
    body1
elif cond2
    body2
else
    body3

// while
while cond
    body

// for
for x in iter
    body

// match (statement)
match scrutinee
    pat1 -> body1
    pat2 -> body2

// match (expression, multi-line)
let r = match scrutinee
    pat1 -> expr1
    pat2 -> expr2

// match (expression, inline single-line)
let r = match scrutinee { pat1 -> expr1, pat2 -> expr2 }

// break / continue (only inside loops)
break
continue

// return
return                  // returns ()
return expr             // returns a value

3.3. Expressions as statements

Any expression can be a statement (value discarded):

stdio.println("hello")      // call with side effect
xs.push(42)                 // mutation
1 + 2                       // value discarded (valid but useless)

4. Expressions

4.1. Operators

In decreasing precedence:

Operator         Description
() [] .          Call, index, field access
not -            Unary
* / %            Multiplicative
+ -              Additive
.. ..=           Range
< <= > >= == !=  Comparison
and              Short-circuit conjunction
or               Short-circuit disjunction
?                Try (Err propagation)

4.2. if as an expression

let cat = if cond then e1 else e2

The then keyword is the discriminator: without it, if is a statement.

4.3. match as an expression

match is the same production whether used as a statement or as an expression: the value is consumed in expression position and discarded in statement position. Two surface forms exist:

// Multi-line (indented arms, expression OR block body)
let r = match scrutinee
    pat1 -> expr1
    pat2 -> expr2

// Inline (single-line, comma-separated, expression body only)
let r = match scrutinee { pat1 -> expr1, pat2 -> expr2 }

Both forms accept guards and or-patterns. All arms must produce compatible types.

The inline form's { ... } opens immediately after the scrutinee. This collides syntactically with the struct-literal heuristic; to force a struct literal as the scrutinee, wrap it in parentheses:

match (Point { x: 1.0, y: 2.0 })
    Point { x, y } -> stdio.println("${x}, ${y}")

4.4. Lambdas

fun (x: Int) -> Int => x * 2                    // single-expression
fun (x: Int) -> Int =>                          // block body
    let y = x * 2
    return y + 1
fun () -> Int => 42                             // no params
fun (a: Int, b: Int) -> Int => a + b            // multiple params

Lambdas capture the lexical environment. If a single-line lambda contains a nested match, the transpiler automatically promotes it to a nested function.

4.5. The ? operator

Propagates Err in functions that return Result:

fun read_two(fs: Fs) -> Result<(String, String), IoError>
    let a = fs.read("a")?      // if Err, returns immediately
    let b = fs.read("b")?
    return Ok((a, b))

5. Pattern matching

5.1. Available patterns

Pattern                  Syntax                Matches
Wildcard                 _                     Any value
Identifier               x                     Binds to x
Literal                  42, "x", true         Equality
Variant without payload  None                  Singleton variant
Variant with payload     Some(x), Ok(v)        Match + bind
Struct                   Person { name, age }  Match + bind fields
Tuple                    (a, b), (x, _, z)     Tuple of the same arity
Or-pattern               a | b | c             Any alternative

5.2. Or-patterns with bindings

Each alternative can bind variables, provided all of them bind the same set of names with compatible types:

match op
    Add(n) | Sub(n) | Mul(n) -> n   // n is Int in all

5.3. Guards

match n
    x if x > 0 -> "positive"
    x if x < 0 -> "negative"
    _ -> "zero"

5.4. Exhaustiveness

The checker requires full coverage:

type Color = Red | Green | Blue

match c
    Red -> "r"
    Green -> "g"
    // error: missing variant Blue
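For a sum type whose arms are bare variant names, the coverage check reduces to a set difference. The Python sketch below shows that reduced case only; missing_variants is a hypothetical helper, and payload patterns, guards, and nested patterns are deliberately out of scope.

```python
def missing_variants(variants: list[str], arm_patterns: list[str]) -> list[str]:
    """Return the variants that no arm covers, in declaration order.

    Simplified model: each arm is either a bare variant name or the
    wildcard "_", which covers everything that is left.
    """
    if "_" in arm_patterns:
        return []
    covered = set(arm_patterns)
    return [v for v in variants if v not in covered]


print(missing_variants(["Red", "Green", "Blue"], ["Red", "Green"]))  # ['Blue']
```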

5.5. Type-parameter substitution

match m.get(k) where m: Map<String, Int> infers Some(n) with n: Int, not n: T. The owner's type parameters are substituted by the scrutinee's type arguments.

6. Capabilities

6.1. What they are

Capabilities are primitive types representing access to system resources (Stdio, Fs, Env, Net, Clock, Random, Unsafe). They are only accessible via function parameters; there are no global instances.

6.2. The capability discipline (3 layers)

Structural (v1): capabilities cannot appear in struct fields, variant payloads, function return types, constants, let/var bindings, generic args, or tuples. They only flow through parameters. One relaxation: cap-bearing structs that implement a user-defined capability may hold built-in caps as fields.

Flow (v2):

Linearity (v3): the consume keyword indicates ownership transfer:

fun close(consume f: File)
    // f cannot be used after this call

"Consumed" variables are tracked across fork/merge in if/elif/else and match. In loops, the analysis uses dry-run + redo to discover consumes in the first iteration.

6.3. Capability in the signature

fun main(stdio: Stdio, fs: Fs)            // multiple
fun pure(x: Int) -> Int                   // no capabilities (pure)
fun with_consume(consume cap: MyCap)      // ownership transfer

6.4. Attenuation

Every built-in capability has an attenuator that returns a fresh, narrower instance:

Capability  Attenuator                            Semantics
Net         restrict_to(host: String)             Allowed host set, monotonic intersection
Fs          restrict_to(prefix: String)           Allowed path prefix, monotonic
Env         restrict_to_keys(keys: List<String>)  Allowed key set, monotonic intersection
Clock       restrict_to_after(t: Float)           Active only after timestamp
Random      with_seed(seed: Int)                  Deterministic sequence (no denied state)

Attenuated capabilities are also recorded in the --manifest output via the args_flow field. See the manifest page.
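One way to read "monotonic intersection" is that each restriction can only shrink the allowed set and never widen it. The Python sketch below models that reading for Net; NetModel and permits are hypothetical names, and treating an unrestricted capability as "no allow-list yet" is an assumption of the example rather than documented runtime behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NetModel:
    """Hypothetical stand-in for a Net capability; None means unrestricted."""

    allowed: Optional[frozenset] = None

    def restrict_to(self, host: str) -> "NetModel":
        narrowed = frozenset({host})
        if self.allowed is not None:
            narrowed &= self.allowed  # intersection: the set can only shrink
        return NetModel(narrowed)  # fresh instance; the original is untouched

    def permits(self, host: str) -> bool:
        return self.allowed is None or host in self.allowed


net = NetModel()
api = net.restrict_to("api.example.com")
other = api.restrict_to("other.example.com")  # cannot re-widen: empty set

print(api.permits("api.example.com"))      # True
print(other.permits("other.example.com"))  # False
print(net.permits("other.example.com"))    # True
```

In this model a second restriction to a different host yields a capability that permits nothing, which is the monotonic outcome: narrowing twice never grants access that the first narrowing removed.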

7. Information-flow control

7.1. Security labels

Capa tracks a two-point security lattice over values: @public (the default) sits below @secret. A label attaches to a type expression, so it can appear anywhere a type is written: parameters, let/var bindings, return types, and struct fields.

fun handle(token: @secret String)              // labelled parameter
let xs: @secret List<Int> = collect()          // labelled binding

type Card { pan: @secret String, brand: String }   // labelled field

An unlabelled type is @public. The label is part of the flow analysis, not the runtime representation: a @secret String is an ordinary String at run time.

7.2. Propagation

A value's label is the join of every label flowing into it: if any input is @secret, the result is @secret. The join propagates through:

A function call returns the join of its argument labels: a pure function of a tainted input is itself tainted.
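The two-point lattice and its join can be modelled in a few lines of Python. This is a sketch of the rule as described, not the analyzer's code; Label, join, and call_label are names invented for the example.

```python
from enum import IntEnum


class Label(IntEnum):
    PUBLIC = 0  # the default label, bottom of the lattice
    SECRET = 1  # sits above PUBLIC


def join(*labels: Label) -> Label:
    """Least upper bound: SECRET if any input is SECRET, else PUBLIC."""
    return max(labels, default=Label.PUBLIC)


def call_label(argument_labels: list[Label]) -> Label:
    """Label of a call result: the join of the argument labels."""
    return join(*argument_labels)


print(call_label([Label.PUBLIC, Label.SECRET]).name)  # SECRET
print(call_label([Label.PUBLIC, Label.PUBLIC]).name)  # PUBLIC
```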

7.3. Sources and sinks

One source is secret by default: env.get(...) produces a @secret result with no annotation, because environment variables routinely carry credentials. fs.read(...) is intentionally not a source: config and data files are usually public, so annotate the binding @secret yourself when it holds a secret.

Public sinks are the exfiltration points: a @secret value that reaches a sink-position argument of one of the methods below is an information-flow violation.

Capability  Sink methods
Stdio       print, println, eprintln
Net         get, post
Fs          write
Db          exec, query

7.4. Enforcement (warn-then-enforce)

By default a violation is a compile-time warning: existing, unlabelled code is unaffected, while labelled code surfaces disclosures without breaking the build. The @strict_ifc() function attribute promotes those warnings into hard errors for that function.

The leak below is caught: the value bound by the match arm inherits the secret label of the scrutinee, then reaches a Stdio sink.

@strict_ifc()
fun dump(stdio: Stdio, env: Env)
    match env.get("API_KEY")           // env.get is @secret
        Some(key) -> stdio.println(key)  // error: @secret reaches a sink
        None -> stdio.println("no key")
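The combination of the sink table and the two enforcement tiers can be sketched as a small lookup. The Python below is a model for illustration: check_call, its parameters, and the message wording are assumptions, and only the sink names are taken from the table in 7.3.

```python
# Sink methods per capability, copied from the table in 7.3.
SINKS = {
    "Stdio": {"print", "println", "eprintln"},
    "Net": {"get", "post"},
    "Fs": {"write"},
    "Db": {"exec", "query"},
}


def check_call(capability: str, method: str, arg_is_secret: bool, strict: bool):
    """Return None, or (severity, message) when a secret reaches a sink.

    strict models a function carrying @strict_ifc(): the same finding is
    reported as an error instead of a warning.
    """
    if not arg_is_secret or method not in SINKS.get(capability, set()):
        return None
    severity = "error" if strict else "warning"
    return (severity, f"@secret value reaches sink {capability}.{method}")


print(check_call("Stdio", "println", arg_is_secret=True, strict=True))
# ('error', '@secret value reaches sink Stdio.println')
```

The same call with strict=False returns a warning, and a non-sink method such as Fs read returns None regardless of the label.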

7.5. Declassification

declassify(value, reason: "...") is the single auditable bridge from @secret to @public. It is the identity at run time and relabels its result @public. The reason must be a named string literal, so the SBOM can record it. Declassifying a value that is not @secret is reported as a no-op warning.

@strict_ifc()
fun dump(stdio: Stdio, env: Env)
    match env.get("API_KEY")
        Some(key) -> stdio.println(declassify(mask(key), reason: "show masked key in logs"))
        None -> stdio.println("no key")

Each call is recorded in the SBOM: per-function declassifications and a declassification_sites count in the summary.

7.6. Pattern binding

A match or let destructure of a secret scrutinee taints the names it binds. After match env.get("K") { Some(key) -> ... }, key is @secret.

7.7. Anti-laundering

Labels cannot be shed by repackaging a value:

7.8. Implicit flow

A sink inside a branch (if/match) guarded by a @secret condition is an implicit flow. It is reported only under @strict_ifc: the default tier focuses on explicit data flows.

7.9. Boundaries

Two limits are by design:

8. Imports

8.1. Import forms

import util                     // sibling: ./util.capa
import sinks.csv_sink           // nested: ./sinks/csv_sink.capa
import capa_log.log             // package dep: <vendor or path>/capa_log/log.capa
import util as U                // alias the module name
import util (greet as hi, Color) // selective import with optional rename

After import util, every pub name from util.capa is reachable unqualified (greet(...)) or by qualified call (util.greet(...)). With import util as U, qualified calls take the alias.

8.1.1. Selective import (and renaming)

import foo (a, b as c) brings only the listed pub symbols into scope: a under its own name, b under the alias c. Every other pub item of foo stays hidden. This is the hygienic form, and the way to resolve a symbol collision between two dependencies that export the same pub name:

import capa_csv (parse as csv_parse)
import capa_cli (parse as cli_parse)

fun main(stdio: Stdio)
    stdio.println(csv_parse("a,b"))
    stdio.println(cli_parse("--flag"))

Only one side needs a rename; the other may keep the bare name. Selectors work for functions, types, consts, and capabilities; selecting an unrenamed pub sum type carries its variants along. A selector that names a symbol the target does not declare, or declares without pub, is a load-time error (module 'foo' has no public symbol 'X'). Renaming a sum type via as in a selective import is rejected (its variants would be orphaned); import it without as to bring its constructors. Selective import is strictly additive: import foo and import foo as bar are unchanged.
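The scoping effect of a selector list can be modelled as building a fresh name table from the module's public symbols. The Python sketch below is illustrative: selective_import and its data shapes are hypothetical, and only the error text follows the diagnostic quoted above.

```python
from typing import Optional


def selective_import(module: str, public: dict,
                     selectors: list[tuple[str, Optional[str]]]) -> dict:
    """Return the names an 'import module (a, b as c)' brings into scope.

    public maps each pub symbol of the module to its definition; private
    symbols are simply absent from it. selectors holds (name, alias) pairs,
    with alias None when the symbol keeps its own name.
    """
    scope = {}
    for name, alias in selectors:
        if name not in public:
            raise ImportError(f"module '{module}' has no public symbol '{name}'")
        scope[alias or name] = public[name]
    return scope


util_public = {"greet": "<fun greet>", "Color": "<type Color>", "outer": "<fun outer>"}
print(selective_import("util", util_public, [("greet", "hi"), ("Color", None)]))
# {'hi': '<fun greet>', 'Color': '<type Color>'}
```

Note that outer stays out of scope even though it is public, because it was not listed.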

8.2. Visibility

Top-level items are private by default. Mark a function, constant, type, trait, or capability with pub to expose it to importers; anything without pub stays callable only from inside the same file.

// util.capa
fun helper(x: Int) -> Int          // private to util
    return x + 1
pub fun outer(x: Int) -> Int       // visible to importers
    return helper(x)

// main.capa
import util
fun main(stdio: Stdio)
    stdio.println("${outer(3)}")   // works: 4
    stdio.println("${helper(3)}")  // error: undefined name 'helper'

The same rule applies to qualified access: util.outer(...) works because outer is public; util.helper(...) does not. pub on root-file items is accepted but has no effect (root callers see one another regardless).

8.3. Module search paths

The loader resolves import x.y in this order: importing-file directory, CAPA_PATH entries, ./vendor/ (when capa.toml declares a git dep), the parent of every path = "..." entry in capa.toml, ./libraries/, and finally the directory of the root file. See docs/packages.md for the package manager's role.

To pull in modules that live elsewhere on disk (stdlib-style libraries, shared internal modules), set CAPA_PATH to one or more additional roots separated by your platform's path separator (; on Windows, : elsewhere). The importer-relative path always wins when the same module name exists in both places, so the env var never silently shadows a project-local file.

$ export CAPA_PATH=/usr/local/share/capa:./libs
$ capa --run app.capa
# 'import greeter' first tries ./greeter.capa, then
# /usr/local/share/capa/greeter.capa, then ./libs/greeter.capa.

If no candidate exists, the diagnostic lists every path that was tried so the right next step (install the dependency, fix the import, adjust CAPA_PATH) is obvious.
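The candidate list behind such a diagnostic can be sketched as follows. This Python model covers only three of the documented roots (importing-file directory, CAPA_PATH entries, root-file directory) and omits ./vendor/, capa.toml path entries, and ./libraries/. The name candidate_paths is hypothetical, and trying CAPA_PATH entries left to right is an assumption of the sketch.

```python
import os
from pathlib import PurePosixPath


def candidate_paths(import_name: str, importer_dir: str, capa_path: str,
                    root_dir: str, sep: str = os.pathsep) -> list[str]:
    """List the files a loader would try for 'import x.y', in order."""
    # 'sinks.csv_sink' -> sinks/csv_sink.capa
    relative = PurePosixPath(*import_name.split(".")).with_suffix(".capa")
    roots = [importer_dir]
    roots += [entry for entry in capa_path.split(sep) if entry]  # assumed left to right
    roots.append(root_dir)
    seen, ordered = set(), []
    for root in roots:
        path = str(PurePosixPath(root) / relative)
        if path not in seen:  # skip roots that resolve to the same file
            seen.add(path)
            ordered.append(path)
    return ordered


for path in candidate_paths("sinks.csv_sink", "src", "/usr/local/share/capa:libs",
                            "app", sep=":"):
    print(path)
```

The importer-relative candidate is always first in this model, which matches the rule that CAPA_PATH never shadows a project-local file.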

8.4. Python interop

For Python interop, use the typed builtins py_import(unsafe, name) and py_invoke(unsafe, callable, args); both require the Unsafe capability. See the standard library page.

9. The main program

The entry point is a function called main that may take one or more capabilities as parameters. The capabilities are instantiated by the runtime at boot:

fun main(stdio: Stdio, fs: Fs, env: Env)
    let argv = env.args()
    stdio.println("received ${argv.length()} arguments")

If main returns Result<(), E>, an Err causes a non-zero exit code.

10. Attributes

Functions can carry static, source-level metadata via attribute syntax. The analyzer rejects unknown names, unknown keys, and duplicates; the schema is fixed. v1 recognises four attributes:

Attribute    Keys                                       Role
@security    cve, cwe, severity, fixed_in, description  Link a function to a known security history.
@deprecated  reason, since, use, removed_in             Mark an API as superseded.
@audited     date, by, scope, notes                     Record a manual security audit.
@vex         cve, status, justification, detail         Per-function CycloneDX VEX exploitability claim. Embeds in --cyclonedx output and surfaces in --vex.

@vex status accepts the CycloneDX VEX vocabulary (not_affected, exploitable, in_triage, resolved, false_positive); justification accepts the CycloneDX justification vocabulary (code_not_reachable, requires_configuration, and so on). See the manifest page for full output examples.
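Because the schema is fixed, validation amounts to lookups against a static table. The Python sketch below encodes the keys from the table above; validate_attribute is a hypothetical helper, the message wording is invented, and reading "duplicates" as a key repeated within one attribute is an assumption.

```python
# Attribute names and their permitted keys, copied from the table above.
SCHEMA = {
    "security": {"cve", "cwe", "severity", "fixed_in", "description"},
    "deprecated": {"reason", "since", "use", "removed_in"},
    "audited": {"date", "by", "scope", "notes"},
    "vex": {"cve", "status", "justification", "detail"},
}


def validate_attribute(name: str, keys: list[str]) -> list[str]:
    """Return the problems found in one attribute; an empty list means valid."""
    if name not in SCHEMA:
        return [f"unknown attribute @{name}"]
    errors, seen = [], set()
    for key in keys:
        if key not in SCHEMA[name]:
            errors.append(f"unknown key '{key}' in @{name}")
        elif key in seen:
            errors.append(f"duplicate key '{key}' in @{name}")  # assumed meaning
        seen.add(key)
    return errors


print(validate_attribute("deprecated", ["reason", "since"]))  # []
print(validate_attribute("audited", ["who"]))
# ["unknown key 'who' in @audited"]
```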

11. Compiler CLI

Flag-order reference. All flags take one or more .capa source files. Output goes to stdout unless noted.

Flag                Output
repl                Start the Capa REPL with every standard capability pre-bound (subcommand, not a flag).
test                Discover and run tests/test_*.capa (subcommand). --wasm runs them on the Wasm backend; --both runs both backends and diffs their stdout for cross-backend parity. A test passes on exit 0; a panic fails it.
--run               Transpile to Python and execute in-process.
--transpile         Transpile to Python and print the generated code to stdout.
--watch             Re-run the program every time it (or any of its imported modules) changes on disk. Implies --run.
--wasm              Compile through CIR to WebAssembly text (WAT). With --run, assembles and executes on a wasmtime-backed host that provides the Capa capability interfaces.
--wasm --component  With --output, wrap the core module in a Component Model component (WIT embedded). Consumable by any Component-Model-aware runtime.
--prefer-wasm       With --run: try the Wasm backend first and fall back to the Python pipeline only when CIR or Wasm emission rejects a construct. Also honoured via CAPA_PREFER_WASM=1.
--doc               Self-contained HTML documentation: signatures, doc comments, attributes, per-function call lists.
--manifest          Capa-native JSON manifest: per-function declared capabilities, attributes, Unsafe crossings, call sites, args_flow.
--cyclonedx         CycloneDX 1.5 SBOM (JSON). Capability metadata embedded as properties[] under the capa:* namespace. When any function carries @vex, the VEX block is embedded under vulnerabilities[].
--spdx              SPDX 2.3 SBOM (JSON). Per-function capability metadata exposed via SPDX annotations[].
--vex               CycloneDX VEX (JSON). One vulnerabilities[] entry per @vex attribute, with affects[] pointing at the function's bom-ref.
--provenance        SLSA Build L1 provenance attestation: in-toto Statement v1 envelope wrapping a SLSA Provenance v1.0 predicate, subject = SHA-256 of the source.

Full output examples and per-flag schemas are on the manifest page. The examples/sbom_diff.capa and examples/vex_demo.capa auditor programs demonstrate how to consume these outputs.

12. Differences from Python

Capa transpiles to Python 3.10+, but the semantics differ:

Capa                                     Python
Capabilities required for I/O            Globals such as print, open
Types checked at compile time            Duck typing
Exhaustive match checked                 match at runtime, no exhaustiveness
Or-patterns with consistent bindings     Or-patterns bind the same names, but binding types are unchecked
let x: List<Int> = [] valid              Python equivalent has no checks
Mutation only with var or consume        Everything mutable
Manifest, SBOM, doc emitted by compiler  Manual via external tools

13. Known limitations

For the full roadmap, see the roadmap page and TODO.md.