shopify — Issues Report
Generated: 2026-05-28
CLI version: @shopify/cli/4.1.0 darwin-arm64 node-v25.9.0
Scope: Critical
Findings in scope: 22 failure modes
Observed Bugs (from evaluation notes)
These were witnessed directly when running checks against this CLI.
§45 candidate — auth login blocks non-interactive agents
Discovered during: §45 evaluation — 2026-05-28
Symptom: shopify auth login printed a device-code URL and kept running until manually terminated.
Impact: An agent can hang indefinitely waiting for a human browser flow.
Trigger: shopify auth login
§37 candidate — Liquid REPL path hangs under non-TTY
Discovered during: §37 evaluation — 2026-05-28
Symptom: shopify theme console emitted release notes and had not exited when a 3-second alarm killed it.
Impact: Agents can lose a run to an interactive command path without a machine-readable recovery hint.
Trigger: perl -e 'alarm 3; exec @ARGV' -- shopify theme console
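The perl alarm trick in the trigger above can be approximated from an agent harness. The following is a minimal sketch, not part of the evaluation itself: `run_bounded` and its result keys are hypothetical names, and the 3-second default only mirrors the alarm used in the trigger. It closes stdin so the child cannot wait on a human, and it enforces a wall-clock limit.

```python
import subprocess
import sys


def _as_text(data):
    """Normalize captured output, which may be None, bytes, or str."""
    if data is None:
        return ""
    if isinstance(data, bytes):
        return data.decode("utf-8", errors="replace")
    return data


def run_bounded(argv, timeout_s=3.0):
    """Run argv with stdin closed and a hard wall-clock limit.

    Hypothetical helper: returns a dict describing what happened
    instead of letting the caller block indefinitely.
    """
    try:
        proc = subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,  # an interactive prompt sees EOF at once
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the direct child before re-raising.
        return {
            "timed_out": True,
            "exit_code": None,
            "stdout": _as_text(exc.stdout),
            "stderr": _as_text(exc.stderr),
        }
    return {
        "timed_out": False,
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


if __name__ == "__main__":
    # Stand-in for a hanging command; swap in the real argv when probing.
    result = run_bounded([sys.executable, "-c", "import time; time.sleep(30)"], 0.5)
    print(result["timed_out"])
```

Only the direct child is killed on timeout; a command that spawns helpers may need process-group handling on top of this.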
§2 candidate — release notes and preference errors pollute command output
Discovered during: §2 evaluation — 2026-05-28
Symptom: Theme commands emitted release notes, analytics/storage errors, and stack traces before command-specific output.
Impact: Agents cannot safely parse stdout/stderr without defensive filtering.
Trigger: shopify theme pull --store invalid.myshopify.com --password [REDACTED] --theme 123 --verbose
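One form the defensive filtering mentioned above could take is to scan the captured text for the first substring that parses as a JSON object and ignore everything around it. This is a sketch under assumptions: `extract_json` is a hypothetical helper, the sample text is invented for illustration and is not real CLI output, and it only helps for commands that emit JSON at all.

```python
import json


def extract_json(text):
    """Return the first JSON object embedded in text, or None.

    Tries every '{' as a possible start, so banners, warnings, and
    stack traces before or after the payload are skipped.
    """
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char != "{":
            continue
        try:
            value, _end = decoder.raw_decode(text, index)
        except json.JSONDecodeError:
            continue  # not valid JSON from this brace; keep scanning
        return value
    return None


if __name__ == "__main__":
    # Invented sample: noise first, then a payload.
    noisy = 'release notes ...\nError: EPERM\n{"themes": [1, 2]}\n'
    print(extract_json(noisy))
```

A brace-delimited fragment inside a log line that happens to be valid JSON would be returned first, so this is a heuristic and not a substitute for clean stdout.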
§24 candidate — secret material can be supplied as command-line flags
Discovered during: §24 evaluation — 2026-05-28
Symptom: Theme commands accept --password=<value>.
Impact: Credentials can enter shell history or process listings on shared agent hosts.
Trigger: shopify theme pull --store invalid.myshopify.com --password [REDACTED] --theme 123 --verbose
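Until secrets can be supplied another way, a harness can at least keep them out of its own logs by masking flag values before an argv is recorded. The sketch below assumes `--password` is the only sensitive flag, which is an assumption to extend per command; `redact_argv` is a hypothetical helper. It does not address exposure through shell history or process listings.

```python
SECRET_FLAGS = frozenset({"--password"})  # assumption: extend as needed


def redact_argv(argv, secret_flags=SECRET_FLAGS, mask="[REDACTED]"):
    """Return a copy of argv with secret flag values replaced by mask.

    Handles both '--flag value' and '--flag=value' spellings.
    """
    redacted = []
    hide_next = False
    for arg in argv:
        if hide_next:
            redacted.append(mask)
            hide_next = False
            continue
        flag, sep, _value = arg.partition("=")
        if flag in secret_flags:
            if sep:
                redacted.append(f"{flag}={mask}")
            else:
                redacted.append(arg)
                hide_next = True  # the value is the following argument
            continue
        redacted.append(arg)
    return redacted


if __name__ == "__main__":
    argv = ["shopify", "theme", "pull", "--password", "s3cret", "--theme", "123"]
    print(" ".join(redact_argv(argv)))
```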
§10 candidate — undeclared writes to user preferences
Discovered during: §10 evaluation — 2026-05-28
Symptom: Several commands attempted writes under /Users/roman/Library/Preferences/..., causing sandbox EPERM stack traces.
Impact: Restricted agent environments can fail before command-specific validation or structured errors are reached.
Trigger: theme command probes under the workspace sandbox
Failure-Mode Gaps (score 0–2, sorted: score asc, severity desc; ?/3 entries listed last)
§10 — Interactivity & TTY Requirements [Critical · score 0/3]
What fails: Auth and REPL paths block or wait when run without a TTY; there is no universal --non-interactive/--yes flag and no automatic structured failure.
Frequency: Common
Token/time cost when it triggers: Token Spend: High · Time: Critical
Workaround exists: Partial
§11 — Timeouts & Hanging Processes [Critical · score 0/3]
What fails: No generic --timeout, JSON timeout error, defined timeout exit code, heartbeat interval, or resume token was found.
Frequency: Common
Token/time cost when it triggers: Token Spend: High · Time: Critical
Workaround exists: Partial
§12 — Idempotency & Safe Retries [Critical · score 0/3]
What fails: Mutating commands do not expose --idempotency-key, universal --dry-run, or effect fields in structured responses.
Frequency: Common
Token/time cost when it triggers: Token Spend: High · Time: High
Workaround exists: Partial
§45 — Headless Authentication / OAuth Browser Flow Blocking [Critical · score 0/3]
What fails: In a non-TTY session, shopify auth login printed a device-code URL and kept running until terminated; it returned no structured AUTH_REQUIRED response.
Frequency: Common
Token/time cost when it triggers: Token Spend: High · Time: Critical
Workaround exists: Partial
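Because the CLI returns no structured AUTH_REQUIRED response, a harness can synthesize one by watching captured output for a URL and stopping early. The following is a rough sketch: `classify_auth_output` is a hypothetical helper, the result shape is invented, and treating any URL as an auth prompt is an assumption that will misfire if other output (release notes, for example) also contains links.

```python
import re

# Assumption: the device-code prompt is the first line containing a URL.
URL_RE = re.compile(r"https?://\S+")


def classify_auth_output(lines):
    """Return a synthetic AUTH_REQUIRED record for the first URL seen.

    Returns None when no line contains a URL.
    """
    for line in lines:
        match = URL_RE.search(line)
        if match:
            url = match.group(0).rstrip(".,)")  # drop trailing punctuation
            return {"code": "AUTH_REQUIRED", "url": url}
    return None


if __name__ == "__main__":
    # Invented sample lines, not real CLI output.
    sample = ["Open this link: https://example.com/activate?code=ABCD.", "waiting"]
    print(classify_auth_output(sample))
```

Paired with a wall-clock limit, this lets an agent report that human login is needed instead of hanging.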
§74 — Credential Scope Declaration Absence [Critical · score 0/3]
What fails: No --schema/manifest required-scope declaration or check-permissions machine-readable preflight exists.
Frequency: Common
Token/time cost when it triggers: Token Spend: Low · Time: Medium
Workaround exists: Partial
Passing (score 3/3 — safe to use without special handling)
§62 $EDITOR and $VISUAL Trap
Risk Summary
| Category | Count | §N list |
|---|---|---|
| Observed bugs | 5 | §45, §37, §2, §24, §10 |
| Score 0 — complete failure | 11 | §10, §11, §12, §13, §25, §37, §43, §45, §50, §60, §74 |
| Score 1 — major gap | 8 | §1, §2, §23, §24, §34, §42, §61, §64 |
| Score 2 — minor gap | 1 | §71 |
| Score 3 — passing | 1 | §62 |
| Indeterminate (?/3) | 1 | §53 |
Highest-risk combination: headless auth and interactive command paths can block agents, while output pollution prevents simple parsers from reliably recognizing that a block has occurred.