Skip to content

Execution And Reliability

Execution flow, blocking behavior, atomicity, and reliability under agent orchestration.

Failure modes: 8 active  |  🔴 4 critical · 🟠 3 high · 🟡 1 medium


File Severity Summary
10-critical-interactivity.md 🔴 Critical Agents run in non-interactive environments
11-critical-timeouts.md 🔴 Critical Agents have finite time budgets per tool call
12-critical-idempotency.md 🔴 Critical Agents retry on failure
13-critical-partial-failure.md 🔴 Critical Multi-step commands can fail mid-execution, leaving the system in an unknown intermediate state
14-high-arg-validation.md 🟠 High Many CLI tools begin executing — creating files, sending requests, modifying state — before validating all their argu...
15-high-race-conditions.md 🟠 High Agents may invoke multiple tool calls in parallel
16-high-signal-handling.md 🟠 High Agents enforce time budgets by killing processes (SIGTERM, then SIGKILL)
17-medium-child-process-leakage.md 🟡 Medium CLI tools frequently spawn background processes — log forwarders, file watchers, health monitors, connection pools

Detailed Metrics

Challenge Severity Frequency Detectability Token Spend Time Context
§10 🔴 Critical Common Hard High Critical Low
§11 🔴 Critical Common Hard High Critical Low
§12 🔴 Critical Common Hard High High Medium
§13 🔴 Critical Common Hard High High Medium
§14 🟠 High Common Medium Medium Medium Low
§15 🟠 High Situational Hard Medium Medium Low
§16 🟠 High Situational Hard Medium Medium Low
§17 🟡 Medium Situational Hard Low Low Low