link-cli — CLI Agent Evaluation

Evaluated against the CLI Agent Spec, a specification defining 71 failure modes for CLI tools used under AI agent orchestration.

CLI version: 0.7.1 · Evaluated: 2026-06-08 · Scope: Critical (22 of 71 failure modes)

Scores

| Metric | Result |
| --- | --- |
| Failure mode score | 1.2/3 (3 passing · 11 partial · 8 failing) |
| Readiness score | 10/15 [B] |
| Observed bugs | 6 confirmed during live evaluation |
| Worst gaps | §1 Exit Codes & Status Signaling, §11 Timeouts & Hanging Processes, §12 Idempotency & Safe Retries, §13 Partial Failure & Atomicity, §23 Side Effects & Destructive Operations, §25 Prompt Injection via Output, §60 OS Output Buffer Deadlock, §74 Credential Scope Declaration Absence |

Key Findings

  • Validation, auth, unknown-flag, network, and expired-token failures all exited 1 with no structured exit_code.
  • --format json works, but success output is not consistently wrapped in ok/data unless --full-output is also passed.
  • API calls lack a general timeout contract and return UNKNOWN for transport failures.
  • While polling, auth login buffered its output until the command exited, and it reported pending states inside an ok: true envelope.
  • Mutating commands lack idempotency, dry-run, effect, danger, and scope declarations.
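
Several of the findings above (a single exit code for every failure, inconsistent ok/data wrapping, no timeout contract) can be partly absorbed on the caller's side. The following is a minimal sketch of such a wrapper, not part of link-cli or the CLI Agent Spec. The function names run_cli and normalize_envelope, the error codes TIMEOUT, EXIT_NONZERO, and BAD_JSON, and the shape of the error object are all hypothetical; only the ok and data keys come from the report.

```python
import json
import subprocess
import sys


def normalize_envelope(payload):
    """Pass existing envelopes through; wrap bare success output in ok/data."""
    if isinstance(payload, dict) and "ok" in payload:
        return payload
    return {"ok": True, "data": payload}


def run_cli(argv, timeout_s=30.0):
    """Run a command with a hard timeout and return an ok/data style envelope.

    The error codes below are hypothetical labels chosen for this sketch.
    """
    try:
        # capture_output drains stdout and stderr concurrently, so a full
        # pipe buffer cannot stall the child while the caller waits.
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left hanging.
        return {"ok": False, "error": {"code": "TIMEOUT", "timeout_s": timeout_s}}

    if proc.returncode != 0:
        # Every failure class exited 1 in the evaluation, so the exit code
        # alone cannot classify the error; keep stderr for the caller.
        return {
            "ok": False,
            "error": {
                "code": "EXIT_NONZERO",
                "exit_code": proc.returncode,
                "message": proc.stderr.strip(),
            },
        }

    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        return {"ok": False, "error": {"code": "BAD_JSON", "message": proc.stdout[:200]}}
    return normalize_envelope(payload)


# Stand-in command so the sketch runs without link-cli installed.
demo = run_cli([sys.executable, "-c", "print('{\"id\": 7}')"])
print(demo)  # {'ok': True, 'data': {'id': 7}}
```

In real use, argv would hold the link-cli invocation together with --format json. The sketch does not address pending states reported inside an ok: true envelope, since the report does not name the field that carries them; callers would still need to inspect data for that case.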

Files

| File | What it is |
| --- | --- |
| report-index.md | Full scorecard: all failure modes, readiness breakdown, links to all reports |
| report-issues.md | Concrete bugs and gaps agents will hit when using this CLI as-is |
| report-runtime.md | Compact operational brief: what to set, what to avoid, what to watch for |
| report-agent-dev.md | Integration guide: invocation invariants and per-gap workarounds for agent developers |
| report-dev.md | Fix list for CLI authors: what to implement |
| findings.md | Raw scorecard: one row per evaluated failure mode |
| issues.md | Observed bugs recorded during live evaluation |
| trace.md | Audit trail: exact check commands, exit codes, stdout/stderr per §N |
| environment.md | CLI environment profile: binary path, version, flags, timeout method |
| readiness.md | Proactive readiness scores across 5 dimensions |

Generated by cli-agent-audit · CLI Agent Spec