link-cli — CLI Agent Evaluation

Evaluated against the CLI Agent Spec, a specification defining 71 failure modes for CLI tools used under AI agent orchestration.

CLI version: 0.7.1 Evaluated: 2026-06-08 Scope: Critical (22 of 71 failure modes)

Scores

Metric	Result
Failure mode score	1.2/3 — 3 passing · 11 partial · 8 failing
Readiness score	10/15 [B]
Observed bugs	6 confirmed during live evaluation
Worst gaps	§1 Exit Codes & Status Signaling, §11 Timeouts & Hanging Processes, §12 Idempotency & Safe Retries, §13 Partial Failure & Atomicity, §23 Side Effects & Destructive Operations, §25 Prompt Injection via Output, §60 OS Output Buffer Deadlock, §74 Credential Scope Declaration Absence

Validation, auth, unknown-flag, network, and expired-token failures all exited 1 with no structured exit_code.
--format json works, but success output is not consistently wrapped in ok/data unless --full-output is also passed.
API calls lack a general timeout contract and return UNKNOWN for transport failures.
Polling auth login buffered output until command exit and returned pending states inside an ok: true envelope.
Mutating commands lack idempotency, dry-run, effect, danger, and scope declarations.

File	What it is
report-index.md	Full scorecard — all failure modes, readiness breakdown, links to all reports
report-issues.md	Concrete bugs and gaps agents will hit when using this CLI as-is
report-runtime.md	Compact operational brief — what to set, what to avoid, what to watch for
report-agent-dev.md	Integration guide — invocation invariants and per-gap workarounds for agent developers
report-dev.md	Fix list for CLI authors — what to implement
findings.md	Raw scorecard — one row per evaluated failure mode
issues.md	Observed bugs recorded during live evaluation
trace.md	Audit trail — exact check commands, exit codes, stdout/stderr per §N
environment.md	CLI environment profile — binary path, version, flags, timeout method
readiness.md	Proactive readiness scores across 5 dimensions

Generated by cli-agent-audit · CLI Agent Spec