Skip to content

hevn — CLI Agent Evaluation

Evaluated against the CLI Agent Spec critical failure modes for CLI tools used under AI agent orchestration.

CLI version: hevn-cli 0.1.0 Evaluated: 2026-06-01 Scope: critical (22 of 71 failure modes)

Scores

Metric Result
Failure mode score 0.9/3 — 2 passing · 12 partial · 8 failing
Readiness score 7/15 [C]
Observed bugs 4 confirmed during live evaluation
Worst gaps §43 output size, §45 headless auth, §64 headless GUI, §74 credential scopes

Key Findings

  • hevn agent-skill is advertised but crashes because hevn_cli/res/CLAUDE.md is missing from the wheel.
  • hevn --version is absent, so agents cannot use the conventional binary health check.
  • Browser login is not safe for unattended headless execution; even --no-open --json waits for callback and times out with prose output.
  • JSON output exists on many commands, but there is no global schema or consistent ok/data/error/meta envelope.
  • Mutating operations have some confirmations and --yes, but no dry-run/effect/danger contract.

Files

File What it is
report-index.md Full scorecard, readiness breakdown, and links to reports
report-issues.md Concrete bugs and gaps agents will hit when using this CLI as-is
report-runtime.md Compact operational brief for agents invoking this CLI
report-agent-dev.md Integration guide and per-gap workarounds for agent developers
report-dev.md Fix list for CLI authors
findings.md Raw scorecard, one row per evaluated failure mode
issues.md Observed bugs recorded during live evaluation
trace.md Audit trail with commands, exits, stdout, and stderr
environment.md Environment profile for this audit
readiness.md Proactive readiness score

Generated by cli-agent-audit · CLI Agent Spec