Skip to content

14 high arg validation

Part II: Execution & Reliability | Challenge §14

14. Argument Validation Before Side Effects

Severity: High | Frequency: Common | Detectability: Medium | Token Spend: Medium | Time: Medium | Context: Low

The Problem

Many CLI tools begin executing — creating files, sending requests, modifying state — before validating all their arguments. When validation fails mid-execution, the tool has already caused partial side effects that must be undone.

Validation scattered through execution:

$ tool deploy --env prod --version 1.2.3 --notify-slack "#invalid channel"
# Step 1: connect to prod ✓
# Step 2: deploy version 1.2.3 ✓  ← side effect happened
# Step 3: notify slack → Error: invalid channel name
exit 1
# Deployment happened, notification didn't. Partial success with exit 1.
# Agent sees failure, may retry → redeploys unnecessarily

Required arg not validated until needed:

$ tool backup --destination s3://my-bucket --encrypt --key-file /missing.pem
# Starts backup (writes 2GB to temp)
# Reaches encryption step → Error: key file not found
# 2GB of temp data must be cleaned up
# Wasted time, disk I/O, and the agent's turn budget

Type validation only on use:

$ tool process --workers "abc"
# Starts processing
# Spawns worker pool → Error: invalid literal for int(): 'abc'
# Work already partially started

Impact

  • Partial side effects on validation failure require manual cleanup
  • Agent retries cause duplicate side effects (deploy ran twice)
  • Wasted execution time before hitting a predictable error
  • Error message points to validation issue, not the side effect that already occurred

Solutions

Two-phase execution: validate-then-execute:

def run(args):
    # Phase 1: validate ALL args before touching anything
    errors = validate(args)
    if errors:
        emit_validation_errors(errors)
        sys.exit(2)  # exit 2 = bad args, no side effects occurred

    # Phase 2: execute (side effects start here)
    execute(args)

Validation result in structured output:

{
  "ok": false,
  "phase": "validation",          // side effects: none
  "errors": [
    {
      "param": "--key-file",
      "code": "FILE_NOT_FOUND",
      "message": "Key file '/missing.pem' does not exist",
      "value": "/missing.pem"
    },
    {
      "param": "--workers",
      "code": "TYPE_ERROR",
      "message": "Expected integer, got 'abc'",
      "value": "abc"
    }
  ]
}

Preflight flag:

tool deploy --env prod --version 1.2.3 --validate-only
# Runs all validation, reports errors, exits without deploying
# exit 0 = would succeed
# exit 2 = validation errors (listed in JSON)

For framework design: - Framework enforces: all @validate hooks run before any @execute hooks - Exit code 2 reserved exclusively for validation failures (no side effects) - --validate-only is a framework-level flag available on all commands - Validation errors always list all problems at once (not just the first one)

Evaluation

Score Condition
0 Validation fails mid-execution after side effects have occurred; exit 1 with text error
1 Most args validated upfront but some checked lazily; exit code not distinguished from execution failure
2 All args validated before execution; exit 2 for validation failures; structured error list
3 Exit 2 with structured JSON validation errors listing all problems at once; --validate-only flag available

Check: Pass an invalid argument alongside a valid destructive flag — verify exit 2 is returned immediately with a JSON error, and that no side effect has occurred (no file created, no network call made).


Agent Workaround

Use --validate-only before executing mutating commands when available:

# Dry-run validation first — no side effects
validate_result = run([*cmd, "--validate-only"])
if validate_result.returncode == 2:
    errors = json.loads(validate_result.stdout).get("errors", [])
    # Fix argument errors before executing
    raise ValueError(f"Argument errors: {errors}")

# Only execute after validation passes
result = run(cmd)

Detect validation failure by exit code:

result = run(cmd)
if result.returncode == 2:
    # Validation failure — no side effects occurred, safe to fix and retry
    parsed = json.loads(result.stdout)
    bad_params = [e["param"] for e in parsed.get("errors", [])]
elif result.returncode != 0:
    # Execution failure — side effects may have occurred, check state before retrying
    pass

Limitation: If the tool does not distinguish exit 2 (validation) from exit 1 (execution failure), the agent cannot safely determine whether a retry would cause duplicate side effects — treat any non-zero exit from a mutating command as potentially having caused partial side effects