
Coverage levels

What each harness coverage tier actually claims — and what it doesn't.

A harness's coverage level is a one-word promise about how much of an agent's behavior Treeship can observe through that harness. Each level documents both what it captures and what it doesn't, so readers of a session report aren't surprised by gaps.

| Level | What's captured | Backstops | Typical harnesses |
| --- | --- | --- | --- |
| High | files.read, files.write, commands.run, mcp.call, model/provider | git-reconcile | Claude Code (native hook + MCP) |
| Medium | files.write, commands.run, mcp.call (no files.read, no model attribution by default) | git-reconcile | Cursor, Cline, Codex, Hermes, OpenClaw, Ninja Dev |
| Basic | files.write (via backstop), command boundary only | git-reconcile is doing most of the work | Shell-wrap custom agents, SuperNinja remote |
| Backstop only | files.write only, after the fact | git-reconcile alone | Manual / unknown agents working in the same workspace |
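As a rough illustration of the table above, the levels can be modeled as an ordered enum with a lookup of the signals each tier captures by default. This is a minimal sketch: `CoverageLevel`, `default_signals`, and `captures` are hypothetical names chosen for this example, not Treeship's actual types.

```rust
/// Hypothetical model of the coverage tiers in the table above.
/// Variant order gives the derived ordering: BackstopOnly < Basic < Medium < High.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum CoverageLevel {
    BackstopOnly,
    Basic,
    Medium,
    High,
}

/// Signals a harness at `level` captures by default, transcribed from the table.
pub fn default_signals(level: CoverageLevel) -> &'static [&'static str] {
    match level {
        CoverageLevel::High => &[
            "files.read",
            "files.write",
            "commands.run",
            "mcp.call",
            "model/provider",
        ],
        CoverageLevel::Medium => &["files.write", "commands.run", "mcp.call"],
        // files.write arrives via the git-reconcile backstop;
        // commands.run is the command boundary only.
        CoverageLevel::Basic => &["files.write", "commands.run"],
        CoverageLevel::BackstopOnly => &["files.write"],
    }
}

/// True when `signal` is part of the tier's default capture set.
pub fn captures(level: CoverageLevel, signal: &str) -> bool {
    default_signals(level).iter().any(|s| *s == signal)
}
```

With this shape, `captures(CoverageLevel::Medium, "files.read")` is false, which matches the "no files.read by default" note in the Medium row.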

Potential vs verified

A harness manifest's coverage level is a potential: what the harness could capture if attached and working as designed. It is not a claim that the harness has actually captured anything in your workspace.

Run treeship harness inspect <id> and you'll see two distinct rows:

potential captures (when attached and working):
  files.read     yes
  files.write    yes
  commands.run   yes
  mcp.call       yes
  model/provider yes

verified captures (proven by harness-specific smoke):
  (none yet -- run a real session through this harness)

The verified captures row only fills in when a smoke session has exercised that specific signal end-to-end through the harness's own capture path. v0.9.8's treeship setup smoke is generic — it proves Treeship's signing pipeline works, not that Claude's native hook fired. Setup leaves verified captures empty and promotes harnesses to Instrumented rather than Verified.

The semantic split is enforced at the type level. PotentialCaptures is bool per signal; VerifiedCaptures is Option<bool>. Nothing in the codebase can render one in the other's slot.
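The type-level split might look roughly like the sketch below. Only the type names `PotentialCaptures` and `VerifiedCaptures` and the `bool` versus `Option<bool>` distinction come from the text; the field names, the `render` and `unverified` helpers, and the reading of `Some(false)` as "exercised but not observed" are assumptions made for illustration.

```rust
/// Design-time promise: every signal is a definite yes or no.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PotentialCaptures {
    pub files_read: bool,
    pub files_write: bool,
    pub commands_run: bool,
    pub mcp_call: bool,
    pub model_provider: bool,
}

/// Workspace evidence: `None` means no smoke has exercised the signal yet.
/// Assumption: `Some(false)` means it was exercised but not observed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct VerifiedCaptures {
    pub files_read: Option<bool>,
    pub files_write: Option<bool>,
    pub commands_run: Option<bool>,
    pub mcp_call: Option<bool>,
    pub model_provider: Option<bool>,
}

impl PotentialCaptures {
    pub fn signals(&self) -> [(&'static str, bool); 5] {
        [
            ("files.read", self.files_read),
            ("files.write", self.files_write),
            ("commands.run", self.commands_run),
            ("mcp.call", self.mcp_call),
            ("model/provider", self.model_provider),
        ]
    }
}

impl VerifiedCaptures {
    pub fn signals(&self) -> [(&'static str, Option<bool>); 5] {
        [
            ("files.read", self.files_read),
            ("files.write", self.files_write),
            ("commands.run", self.commands_run),
            ("mcp.call", self.mcp_call),
            ("model/provider", self.model_provider),
        ]
    }

    /// Renders only signals that have evidence; otherwise the "none yet" line.
    pub fn render(&self) -> String {
        let proven: Vec<String> = self
            .signals()
            .iter()
            .filter_map(|&(name, v)| {
                v.map(|ok| format!("  {} {}", name, if ok { "yes" } else { "no" }))
            })
            .collect();
        if proven.is_empty() {
            "  (none yet -- run a real session through this harness)".to_string()
        } else {
            proven.join("\n")
        }
    }
}

/// Signals the design promises that no smoke has proven yet.
pub fn unverified(p: &PotentialCaptures, v: &VerifiedCaptures) -> Vec<&'static str> {
    let promised = p.signals();
    let proven = v.signals();
    let mut out = Vec::new();
    for ((name, want), (_, got)) in promised.iter().zip(proven.iter()) {
        if *want && *got != Some(true) {
            out.push(*name);
        }
    }
    out
}
```

Because the two structs have different field types, a function that expects one cannot be handed the other, and "never exercised" (`None`) stays distinct from a definite answer.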

High coverage in detail

The Claude Code native hook harness is currently the only high-coverage integration. It captures:

| Signal | How |
| --- | --- |
| files.read | PostToolUse hook fires on Read; agent.read_file event records path + digest. |
| files.write | PostToolUse hook fires on Write/Edit; agent.wrote_file records path, digest, op (created / modified / deleted), additions/deletions. |
| commands.run | PostToolUse hook fires on Bash; agent.ran_process records command (sanitized), exit code, duration. |
| mcp.call | The bundled @treeship/mcp server records every MCP-routed tool call with sanitized inputs. |
| model/provider | SessionEnd hook records the model that produced the session (claude-opus-4-7, etc.) and provider (anthropic). |

Known gap: tools the user invokes manually inside a Bash command (e.g. sed -i ...) don't fire a per-tool hook. The git-reconcile backstop catches the resulting file changes at session close, tagging them with [git-reconcile] instead of [hook].
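The attribution described above can be pictured as a set comparison at session close: paths a hook already recorded keep the `[hook]` tag, and any other changed path falls to the backstop. The sketch below is illustrative only; `Source`, `attribute_writes`, and the idea of passing both path lists as slices are assumptions, not Treeship's implementation.

```rust
use std::collections::BTreeSet;

/// Where a files.write record came from (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Hook,
    GitReconcile,
}

impl Source {
    /// The tag shown next to the record, as named in the text above.
    pub fn tag(self) -> &'static str {
        match self {
            Source::Hook => "[hook]",
            Source::GitReconcile => "[git-reconcile]",
        }
    }
}

/// `hooked`: paths recorded by per-tool hooks during the session.
/// `changed`: paths the workspace diff reports at session close.
/// Returns each changed path once, sorted, with its attribution.
/// Paths that were hooked but no longer differ are ignored in this sketch.
pub fn attribute_writes(hooked: &[&str], changed: &[&str]) -> Vec<(String, Source)> {
    let seen: BTreeSet<&str> = hooked.iter().copied().collect();
    // Collecting into a BTreeSet both de-duplicates and sorts the paths.
    let changed: BTreeSet<&str> = changed.iter().copied().collect();
    changed
        .into_iter()
        .map(|path| {
            let source = if seen.contains(path) {
                Source::Hook
            } else {
                Source::GitReconcile
            };
            (path.to_string(), source)
        })
        .collect()
}
```

For example, if a hook recorded `src/main.rs` and a `sed -i` inside Bash also changed `config.toml`, the second path would come back tagged `[git-reconcile]`.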

Medium coverage in detail

MCP harnesses (Cursor, Codex, Cline, Ninja Dev, generic-mcp) capture what passes through MCP:

| Signal | How |
| --- | --- |
| files.write | MCP tool calls that touch files (write_file, edit_file) record path + digest. |
| commands.run | MCP tool calls that exec processes record command + exit code. |
| mcp.call | Every MCP-routed tool invocation is recorded with tool name and sanitized inputs. |
| files.read | Not captured by default. Some clients route reads through MCP (Cursor's editor read). Most don't. |
| model/provider | Not captured by default. Set via meta.model if the client emits it. |

Known gap: anything the agent does outside the MCP channel (built-in editor actions, direct shell commands the IDE never routes through MCP) doesn't appear unless treeship wrap was used or git-reconcile catches the file change.

Skill harnesses (Hermes, OpenClaw) work differently: they install a SKILL.md that tells the agent what to capture. Coverage depends on whether the agent reads the skill and follows its instructions; Treeship measures actual capture honestly via verified_captures.

Basic coverage in detail

Shell-wrap and SuperNinja-remote harnesses see only the command boundary:

| Signal | How |
| --- | --- |
| files.write | git-reconcile at session close: the agent's edits are visible only as a git diff against HEAD. |
| commands.run | treeship wrap -- <cmd> records start/end + exit code. Not the command line itself by default; Treeship sanitizes that to avoid leaking secrets. |
| All other signals | Not captured. |

Known gap: an attacker (or a confused agent) could write files outside the workspace, set environment variables, or make network calls; only files.write under the workspace is recovered.

Backstop only

git-reconcile in isolation. The session is essentially a "what did the workspace look like before vs after" diff. Useful as a last resort when nothing else is wired; not enough to prove what an agent did, only what happened in the directory.

How to raise coverage

| Have | Want | How |
| --- | --- | --- |
| Cursor (medium) | high | Cursor doesn't currently expose a native hook surface. The MCP-routed signals are the practical ceiling. |
| Codex (medium) | high | Same: Codex's tool surface is MCP. |
| SuperNinja (basic) | medium / high | Wait for v0.9.9's treeship agent invite / treeship join --invite so Treeship can run inside the remote VM. |
| Custom shell-wrap (basic) | medium | Add an MCP server in front of your custom agent and route tool calls through it; treeship harness inspect generic-mcp for the shape. |

Reading coverage in a session report

treeship package inspect <pkg> renders the harness coverage panel as part of the report:

harness coverage (4 attached)
  Claude Code Native Hook Harness  (instrumented, coverage: high)
    potential: read=y write=y cmd=y mcp=y model=y
    verified:  (none yet -- run a real session through this harness)
    last smoke (pass): setup generic trust-fabric smoke ok (does not prove harness-specific capture)
    gap: Built-in tools the user invokes outside hooks (manual sed inside Bash) rely on git-reconcile.

Read top-to-bottom: potential is the harness's design, verified is your workspace's evidence. last smoke says exactly what was tested. gap lists the honest limits.
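If you want to check the panel from a script, the `potential:` line is simple enough to parse by hand. The following is a minimal sketch based only on the sample output above; `parse_potential` is a hypothetical helper, and treating `n` as the negative value is an assumption, since the sample only shows `y`.

```rust
use std::collections::BTreeMap;

/// Parses a line such as "potential: read=y write=y cmd=y mcp=y model=y"
/// into a map from short signal name to flag.
/// Returns None for any other line or for an unrecognized value.
/// Assumption: "n" is the negative counterpart of "y".
pub fn parse_potential(line: &str) -> Option<BTreeMap<String, bool>> {
    let rest = line.trim().strip_prefix("potential:")?;
    let mut out = BTreeMap::new();
    for pair in rest.split_whitespace() {
        let (key, value) = pair.split_once('=')?;
        let flag = match value {
            "y" => true,
            "n" => false,
            _ => return None,
        };
        out.insert(key.to_string(), flag);
    }
    Some(out)
}
```

Calling it on the `potential:` line of the panel above yields five entries, all true; the `verified:` line is rejected because it carries a different prefix.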
