Skip to content

57 medium locale errors

Part VII: Ecosystem, Runtime & Agent-Specific | Challenge §57

57. Locale-Dependent Error Messages

Severity: Medium | Frequency: Situational | Detectability: Easy | Token Spend: High | Time: Low | Context: Medium

The Problem

Distinct from §2 (locale-invariant serialization of numbers/dates), many CLI tools embed raw OS or runtime error messages directly in error.message. These come from errno, OS exceptions, database drivers — and they are locale-translated. On a French server, "Permission denied" becomes "Permission refusée". Agents that pattern-match on error message text fail silently in non-English environments.

try:
    os.rename(src, dst)
except OSError as e:
    return {"ok": False, "error": {"message": str(e)}}
# en_US: "Permission denied: '/etc/hosts'"
# fr_FR: "Permission refusée: '/etc/hosts'"
# Agent checking for "Permission denied" fails on French systems

Impact

  • Error-handling logic that pattern-matches on message text fails on non-English systems
  • Agents may retry indefinitely when error classification fails

Solutions

Separate machine-readable code from locale message:

{
  "error": {
    "code": "PERMISSION_DENIED",
    "message": "Permission denied: '/etc/hosts'",
    "locale_message": "Permission refusée: '/etc/hosts'",
    "locale": "fr_FR"
  }
}

Framework normalizes OS errors to English (LC_MESSAGES=C) before serialization.

For framework design: - The framework's exception handler MUST normalize all OS/runtime error messages to English before placing them in error.message - error.code is the ONLY field agents should use for error classification; error.message is human-readable context only

Evaluation

Score Condition
0 Error messages contain locale-translated OS errors; agent pattern-matching on text fails on non-English systems
1 Most errors use structured codes; some OS errors passed through as raw locale strings in message
2 Framework sets LC_MESSAGES=C for OS calls; error.code present for all errors; message may still be locale text
3 error.code is the only field used for classification; error.message normalized to English; error.locale_message available for humans

Check: Run the tool on a non-English locale (LANG=fr_FR.UTF-8 tool <cmd>) and trigger an OS error — verify error.code is present and error.message is English regardless of locale.


Agent Workaround

Always classify errors by error.code, never by error.message text; set LC_MESSAGES=C in the subprocess environment:

import subprocess, json, os

env = {
    **os.environ,
    "LC_ALL": "C",           # normalize all locale output to English
    "LC_MESSAGES": "C",      # especially error messages
    "LANG": "C.UTF-8",       # UTF-8 safe but English messages
}

result = subprocess.run(
    cmd, capture_output=True, text=True, env=env
)
parsed = json.loads(result.stdout)

if not parsed.get("ok"):
    error = parsed.get("error", {})

    # ALWAYS use code for classification — never message text
    code = error.get("code", "UNKNOWN")

    # These code checks work on any locale
    if code == "PERMISSION_DENIED":
        raise PermissionError(error.get("message"))
    elif code == "FILE_NOT_FOUND":
        raise FileNotFoundError(error.get("message"))
    else:
        raise RuntimeError(f"[{code}] {error.get('message')}")

Limitation: LC_MESSAGES=C in the subprocess environment normalizes shell and Python runtime messages but does not affect messages from tools that have already translated errors internally — if the tool wraps OS errors without normalization, error.message may still be locale-translated; use only error.code for branching logic