link-cli — CLI Agent Evaluation
Evaluated against the CLI Agent Spec, a specification defining 71 failure modes for CLI tools used under AI agent orchestration.
CLI version: 0.7.1 Evaluated: 2026-06-08 Scope: Critical (22 of 71 failure modes)
Scores
| Metric | Result |
|---|---|
| Failure mode score | 1.2/3 — 3 passing · 11 partial · 8 failing |
| Readiness score | 10/15 [B] |
| Observed bugs | 6 confirmed during live evaluation |
| Worst gaps | §1 Exit Codes & Status Signaling, §11 Timeouts & Hanging Processes, §12 Idempotency & Safe Retries, §13 Partial Failure & Atomicity, §23 Side Effects & Destructive Operations, §25 Prompt Injection via Output, §60 OS Output Buffer Deadlock, §74 Credential Scope Declaration Absence |
Key Findings
- Validation, auth, unknown-flag, network, and expired-token failures all exited 1 with no structured
exit_code. --format jsonworks, but success output is not consistently wrapped inok/dataunless--full-outputis also passed.- API calls lack a general timeout contract and return
UNKNOWNfor transport failures. - Polling auth login buffered output until command exit and returned pending states inside an
ok: trueenvelope. - Mutating commands lack idempotency, dry-run, effect, danger, and scope declarations.
Files
| File | What it is |
|---|---|
| report-index.md | Full scorecard — all failure modes, readiness breakdown, links to all reports |
| report-issues.md | Concrete bugs and gaps agents will hit when using this CLI as-is |
| report-runtime.md | Compact operational brief — what to set, what to avoid, what to watch for |
| report-agent-dev.md | Integration guide — invocation invariants and per-gap workarounds for agent developers |
| report-dev.md | Fix list for CLI authors — what to implement |
| findings.md | Raw scorecard — one row per evaluated failure mode |
| issues.md | Observed bugs recorded during live evaluation |
| trace.md | Audit trail — exact check commands, exit codes, stdout/stderr per §N |
| environment.md | CLI environment profile — binary path, version, flags, timeout method |
| readiness.md | Proactive readiness scores across 5 dimensions |
Generated by cli-agent-audit · CLI Agent Spec