docuseal-cli - CLI Agent Evaluation
Evaluated against the CLI Agent Spec - a specification defining 71 failure modes for CLI tools used under AI agent orchestration.
CLI version: 1.0.3
Evaluated: 2026-05-20
Scope: all (71 of 71 failure modes)
Scores
| Metric | Result |
|---|---|
| Failure mode score | 0.72/3 - 9 passing · 20 partial · 42 failing |
| Readiness score | 7/15 [C] |
| Observed bugs | 4 confirmed during live evaluation |
| Worst gaps | §1 Exit Codes & Status Signaling (0/3), §2 Output Format & Parseability (1/3), §18 Error Message Quality (0/3) |
Key Findings
- Common failures produce stack traces instead of structured JSON, so agents cannot parse errors consistently.
- Generic exit code 1 is used across validation, auth, network, and runtime failures.
- The configure prompt path can exit 0 under non-TTY stdin without writing configuration.
- Async command handlers are registered under program.parse(), contributing to unhandled async failures.
- The bundled skill metadata version drifts from the installed CLI version.
Files
| File | What it is |
|---|---|
| report-index.md | Full scorecard - all failure modes, readiness breakdown, links to all reports |
| report-issues.md | Concrete bugs and gaps agents will hit when using this CLI as-is |
| report-runtime.md | Compact operational brief - what to set, what to avoid, what to watch for |
| report-agent-dev.md | Integration guide - invocation invariants and per-gap workarounds for agent developers |
| report-dev.md | Fix list for CLI authors - what to implement, mapped to spec requirements |
| findings.md | Raw scorecard - one row per evaluated failure mode |
| issues.md | Observed bugs recorded during live evaluation |
| trace.md | Audit trail - exact check commands, exit codes, stdout/stderr per §N |
| environment.md | CLI environment profile - binary path, version, flags, timeout method |
| readiness.md | Proactive readiness scores across 5 dimensions |
Generated by cli-agent-audit · CLI Agent Spec