# Dogfood checklist
Post-ship review template — drive a real session through each surface and report what was useful, confusing, or broken.
When a release ships, run a real session through Treeship from a fresh install and answer the questions below. The goal is to catch what `cargo test` and the acceptance smoke can't: what the human (or AI agent) actually experiences when they hold the product in their hands. File the result as a comment on the release PR or as a markdown file under `docs/dogfood/<version>.md`.
This template is deliberately short. The lesson from the v0.9.6 → v0.9.10 release path: the things that broke weren't the things tests were watching. A 60-second dogfood from a fresh install would have caught the agent-bootstrap install timeout, the broken docs section roots, and the missing receipt page SSR before they shipped.
## Setup (one-time per dogfood run)

```sh
# Fresh tmpdir to keep the run isolated from any existing workspace
TMP=$(mktemp -d -t treeship-dogfood.XXXX)
cd "$TMP"
treeship init --config "$TMP/.treeship/config.json" --name dogfood
```

Capture the version under test:

```sh
treeship --version
# treeship X.Y.Z
```

## Run a session

```sh
treeship session start --config "$TMP/.treeship/config.json" --name dogfood
treeship wrap --config "$TMP/.treeship/config.json" --action dogfood.write -- \
  sh -c 'echo hello > a.txt'
treeship session close --config "$TMP/.treeship/config.json" --summary "dogfood run"
PKG=$(find "$TMP/.treeship/sessions" -name 'ssn_*.treeship' -print -quit)
treeship package inspect --config "$TMP/.treeship/config.json" "$PKG"
```

Then push and grab the shareable URL:

```sh
treeship session report --config "$TMP/.treeship/config.json" "$PKG"
# → captures receipt_url, raw_json_url, package_download_url
```

## The checklist
Fill in each line. Honest one-line answers are better than long ones.
### Identity

- Treeship version: ____
- Date / time: ____
- Host: ____ (`uname -srm` + node version + python version)
- Receipt URL: ____ (the shareable URL `treeship session report` returned)
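The "Host" line can be captured in one go. A convenience sketch, not part of the treeship CLI — the `host_line` helper is a name invented here, and it prints `n/a` for tools that aren't installed:

```shell
# host_line: print the Host answer as a single line for the checklist.
host_line() {
  printf 'host: %s | %s | %s\n' \
    "$(uname -srm)" \
    "node $(node --version 2>/dev/null || echo n/a)" \
    "$(python3 --version 2>/dev/null || echo 'Python n/a')"
}
host_line
```

Paste the output straight into the Host blank above.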
### Agents and harnesses

- Agent under test: ____ (claude-code, cursor, codex, hermes, openclaw, ninjatech, generic-mcp, shell-wrap, custom)
- Harness used: ____ (from `treeship harness list`)
- Harness status before this run: Detected | Available | Instrumented | Verified
- Harness status after this run: Detected | Available | Instrumented | Verified
### What was useful

Three things — be specific. Examples: "the panel's per-grant breakdown made the consume chain obvious," "JSON output of `harness list` was easy to pipe into `jq`," "the receipt URL rendered the Merkle tree without me asking for it."
____________
### What was confusing

Three things — be specific. Examples: "I had to guess that `treeship attest action` needed `--parent <grant_artifact_id>` even though the help text just said `--parent <ID>`," "the docs said `setup --format json` returned a rich shape but it returned `{}`," "I couldn't tell whether `replay-hub-org` was supposed to fire."
____________
### What broke
Anything that errored, hung, or produced an obviously-wrong result. Include the exact command and the error message. If it didn't break, write "nothing broke."
____
### Agent-readable surfaces

- `treeship --version` returned a version string: ☐ yes / ☐ no
- `treeship add --discover --format json` parsed cleanly: ☐ yes / ☐ no / ☐ NA
- `treeship harness list --format json` parsed cleanly: ☐ yes / ☐ no
- `treeship package inspect <pkg> --format json` parsed cleanly: ☐ yes / ☐ no / ☐ NA
- Raw receipt JSON downloaded from the receipt URL: ☐ yes / ☐ no / ☐ NA
- `.treeship` package downloaded from the receipt URL: ☐ yes / ☐ no / ☐ NA
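The "parsed cleanly" rows can be scripted rather than eyeballed. A minimal sketch: `check_json` is a helper invented here (not a treeship command), and the treeship invocations assume the flags exactly as the checklist lists them:

```shell
# check_json CMD...: run CMD and report whether its stdout parses as JSON.
check_json() {
  if "$@" | python3 -c 'import json, sys; json.load(sys.stdin)' 2>/dev/null; then
    echo "ok: $*"
  else
    echo "FAIL: $*"
  fi
}

# The JSON surfaces from the checklist above; only meaningful on a
# machine where treeship is installed, so skip otherwise.
if command -v treeship >/dev/null 2>&1; then
  check_json treeship add --discover --format json
  check_json treeship harness list --format json
  check_json treeship package inspect "$PKG" --format json
fi
```

Any `FAIL:` line maps directly to a "☐ no" box above.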
### Verification

- `treeship verify <receipt_url>` returned `outcome: pass`: ☐ yes / ☐ no
- `treeship package verify --strict <pkg>` returned exit 0: ☐ yes / ☐ no
- Every approval-authority replay row matched expectations (no overclaim, no silent pass): ☐ yes / ☐ no / ☐ NA
### Install / bootstrap (only if you started from zero)

- `npm install -g treeship` completed in under 60s on a clean machine: ☐ yes / ☐ no / ☐ NA
- `pip install treeship-sdk` completed without compiling Rust from source: ☐ yes / ☐ no / ☐ NA
- An AI agent (Claude, Codex, Kimi, Cursor) was able to set up Treeship without asking the human a clarifying question: ☐ yes / ☐ no / ☐ NA
- An AI agent was confused by the `treeship-cli` 404 on crates.io: ☐ yes / ☐ no / ☐ NA (if yes, point them at /guides/install)
## What to do with the answers
The pattern from past releases:
| Finding | Triage |
|---|---|
| A surface returned wrong/empty data, or a panel overclaimed | Hotfix in next patch |
| A docs route 404'd or a sidebar link was broken | Docs PR; the docs-route-health CI gate should also fire |
| Onboarding took more than 5 minutes from a fresh install | Investigate; agent-native bootstrap is the durable fix |
| AI agent had to ask a clarifying question to install | Update /guides/install with the missing context |
| Confusing copy in the panel or CLI output | Copy-only fix; can ride next release |
| Test gap (something broke that no test would have caught) | Add regression test in next hardening PR |
File a dogfood result for every release that changes the user-facing surface — a CHANGELOG-only release doesn't need one, but a release with new commands, new panel rows, or new install paths absolutely does. Save the file under `docs/dogfood/<version>.md` so the trail is durable.
This checklist is a sibling of release-adversarial review — they cover different failure modes. Adversarial review hunts trust bypasses an attacker would exploit; dogfood hunts UX gaps a real user would hit. A release passes both gates before it's actually done.
## Mini-template (copy-paste into a comment)
For drive-by dogfood runs that don't need the full structure:
```markdown
## Dogfood: vX.Y.Z (date)
host: ...
agent: ...
harness: ...
receipt: <url>
useful:
- ...
- ...
- ...
confusing:
- ...
- ...
- ...
broken:
- ... (or "nothing")
```

That's enough to drive the next hardening PR.