docuseal-cli - CLI Agent Evaluation

Evaluated against the CLI Agent Spec - a specification defining 71 failure modes for CLI tools used under AI agent orchestration.

CLI version: 1.0.3
Evaluated: 2026-05-20
Scope: all (71 of 71 failure modes)

Scores

Metric              Result
Failure mode score  0.72/3 - 9 passing · 20 partial · 42 failing
Readiness score     7/15 [C]
Observed bugs       4 confirmed during live evaluation
Worst gaps          §1 Exit Codes & Status Signaling (0/3), §2 Output Format & Parseability (1/3), §18 Error Message Quality (0/3)

Key Findings

  • Common failures produce stack traces instead of structured JSON, so agents cannot parse errors consistently.
  • Generic exit code 1 is used across validation, auth, network, and runtime failures.
  • The configure prompt path can exit 0 under non-TTY stdin without writing configuration.
  • Async command handlers are dispatched through program.parse() without being awaited, contributing to unhandled async failures.
  • The bundled skill metadata version drifts from the installed CLI version.
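
The first two findings suggest that a caller cannot rely on either the exit code or the error output shape alone. As a rough illustration, the sketch below shows one way an agent-side wrapper might classify a captured invocation. It is a minimal sketch, not part of docuseal-cli or the CLI Agent Spec: the `CliResult` shape, the `classify` function, and the classification labels are all hypothetical, and the stack-trace check assumes Node.js-style (V8) stack frames.

```typescript
// Hypothetical shape of a captured CLI invocation; field names are illustrative.
interface CliResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

// Hypothetical labels for what the wrapper could make of the result.
type Classification =
  | { kind: "ok"; data: unknown }
  | { kind: "ok_unparsed"; raw: string }
  | { kind: "structured_error"; error: unknown }
  | { kind: "unstructured_error"; hasStackTrace: boolean; firstLine: string };

// Matches V8-style stack frames such as "    at foo (/path/file.js:10:5)".
// Assumption: the CLI runs on Node.js and prints frames in this format.
const STACK_FRAME = /^\s+at .+:\d+:\d+\)?$/m;

// Parse text as JSON without throwing; empty output counts as "not JSON".
function tryParseJson(text: string): { ok: true; value: unknown } | { ok: false } {
  const trimmed = text.trim();
  if (trimmed === "") return { ok: false };
  try {
    return { ok: true, value: JSON.parse(trimmed) };
  } catch {
    return { ok: false };
  }
}

function classify(result: CliResult): Classification {
  if (result.exitCode === 0) {
    const parsed = tryParseJson(result.stdout);
    return parsed.ok
      ? { kind: "ok", data: parsed.value }
      : { kind: "ok_unparsed", raw: result.stdout };
  }

  // Non-zero exit: a generic code does not distinguish validation, auth,
  // network, or runtime failures, so inspect stderr instead.
  const parsedErr = tryParseJson(result.stderr);
  if (parsedErr.ok) return { kind: "structured_error", error: parsedErr.value };

  const firstLine = result.stderr.split("\n").find((l) => l.trim() !== "") ?? "";
  return {
    kind: "unstructured_error",
    hasStackTrace: STACK_FRAME.test(result.stderr),
    firstLine: firstLine.trim(),
  };
}
```

A zero exit code is treated as success here. Given the configure finding above, a caller would presumably also want to verify the expected side effect (for example, that configuration was actually written) rather than trusting exit 0 alone.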

Files

File                 What it is
report-index.md      Full scorecard - all failure modes, readiness breakdown, links to all reports
report-issues.md     Concrete bugs and gaps agents will hit when using this CLI as-is
report-runtime.md    Compact operational brief - what to set, what to avoid, what to watch for
report-agent-dev.md  Integration guide - invocation invariants and per-gap workarounds for agent developers
report-dev.md        Fix list for CLI authors - what to implement, mapped to spec requirements
findings.md          Raw scorecard - one row per evaluated failure mode
issues.md            Observed bugs recorded during live evaluation
trace.md             Audit trail - exact check commands, exit codes, stdout/stderr per §N
environment.md       CLI environment profile - binary path, version, flags, timeout method
readiness.md         Proactive readiness scores across 5 dimensions

Generated by cli-agent-audit · CLI Agent Spec