hevn — CLI Agent Evaluation
Evaluated against the CLI Agent Spec critical failure modes for CLI tools used under AI agent orchestration.
CLI version: hevn-cli 0.1.0 Evaluated: 2026-06-01 Scope: critical (22 of 71 failure modes)
Scores
| Metric | Result |
|---|---|
| Failure mode score | 0.9/3 — 2 passing · 12 partial · 8 failing |
| Readiness score | 7/15 [C] |
| Observed bugs | 4 confirmed during live evaluation |
| Worst gaps | §43 output size, §45 headless auth, §64 headless GUI, §74 credential scopes |
Key Findings
hevn agent-skillis advertised but crashes becausehevn_cli/res/CLAUDE.mdis missing from the wheel.hevn --versionis absent, so agents cannot use the conventional binary health check.- Browser login is not safe for unattended headless execution; even
--no-open --jsonwaits for callback and times out with prose output. - JSON output exists on many commands, but there is no global schema or consistent
ok/data/error/metaenvelope. - Mutating operations have some confirmations and
--yes, but no dry-run/effect/danger contract.
Files
| File | What it is |
|---|---|
| report-index.md | Full scorecard, readiness breakdown, and links to reports |
| report-issues.md | Concrete bugs and gaps agents will hit when using this CLI as-is |
| report-runtime.md | Compact operational brief for agents invoking this CLI |
| report-agent-dev.md | Integration guide and per-gap workarounds for agent developers |
| report-dev.md | Fix list for CLI authors |
| findings.md | Raw scorecard, one row per evaluated failure mode |
| issues.md | Observed bugs recorded during live evaluation |
| trace.md | Audit trail with commands, exits, stdout, and stderr |
| environment.md | Environment profile for this audit |
| readiness.md | Proactive readiness score |
Generated by cli-agent-audit · CLI Agent Spec