Part I: Output & Parsing | Challenge §5

5. Pagination & Large Output

Severity: High | Frequency: Common | Detectability: Hard | Token Spend: High | Time: High | Context: Critical

Impact

  • Unbounded output exhausts the agent's context window and fills pipe buffers, so the call fails or stalls, or the agent processes incomplete data
  • No truncation indicator means the agent believes it has all records when it has only the first page
  • Page-number pagination requires the agent to track its position across calls; with cursor-based pagination, each response carries the token for the next call

The Problem

Commands that return large datasets in a single response create multiple problems: the output may be too large to parse, may exceed pipe buffers, or may contain more data than the agent can process in its context.

Unbounded output:

$ tool list-logs
[returns 50,000 lines of JSON]
# Pipe buffer fills, agent context overflows, parsing degrades

No indication that results are truncated:

$ tool list-users
{"users": [...100 items...]}
# Is this all users? Or first 100? Agent can't tell.

Pagination that requires a stateful session:

$ tool list-users --page 2
# Requires knowing that page 1 was fetched first
# No cursor-based alternative

Solutions

Always indicate truncation and total:

{
  "ok": true,
  "data": [...],
  "pagination": {
    "total": 50000,
    "returned": 100,
    "truncated": true,
    "next_cursor": "eyJpZCI6MTAwfQ==",
    "has_more": true
  }
}
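
As a rough illustration of how a tool might assemble this envelope, here is a minimal Python sketch. The names `encode_cursor`, `decode_cursor`, and `list_page` are hypothetical, and the sketch assumes records are dicts sorted by an ascending integer `id`. The cursor format (base64 of a small JSON object) is also an assumption; it matches the example token above, which is the base64 encoding of `{"id":100}`.

```python
import base64
import json
from typing import Optional


def encode_cursor(last_id: int) -> str:
    """Pack the last returned id into an opaque token (assumed format)."""
    raw = json.dumps({"id": last_id}, separators=(",", ":"))
    return base64.b64encode(raw.encode()).decode()


def decode_cursor(cursor: Optional[str]) -> int:
    """Return the id to resume after; 0 means start from the beginning."""
    if not cursor:
        return 0
    return json.loads(base64.b64decode(cursor))["id"]


def list_page(records: list, limit: int = 20, cursor: Optional[str] = None) -> dict:
    """Return one page plus pagination metadata. limit=0 means no limit."""
    after_id = decode_cursor(cursor)
    remaining = [r for r in records if r["id"] > after_id]
    page = remaining if limit == 0 else remaining[:limit]
    has_more = len(page) < len(remaining)
    return {
        "ok": True,
        "data": page,
        "pagination": {
            "total": len(records),
            "returned": len(page),
            "truncated": has_more,
            "next_cursor": encode_cursor(page[-1]["id"]) if has_more else None,
            "has_more": has_more,
        },
    }
```

Because the cursor carries everything needed to resume, the tool keeps no per-caller session, and the caller only has to echo back the token it was given.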

Cursor-based pagination (stateless):

tool list-users --limit 100 --cursor "eyJpZCI6MTAwfQ=="

Streaming output (JSONL):

tool list-logs --output jsonl --stream
# Emits one JSON object per line
# Agent can process incrementally
{"timestamp": "...", "level": "error", "message": "..."}
{"timestamp": "...", "level": "info",  "message": "..."}
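
On the consuming side, an agent could read such a stream line by line instead of buffering the whole output. The sketch below is one possible approach, not a prescribed client: `stream_jsonl` is a hypothetical helper, and it assumes the command writes exactly one JSON object per line to stdout.

```python
import json
import subprocess
from typing import Iterator


def stream_jsonl(cmd: list) -> Iterator[dict]:
    """Yield one parsed object per output line as the command produces it."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        assert proc.stdout is not None
        # Iterating the pipe drains it continuously, so the writer is never
        # blocked on a full pipe buffer while we hold only one line in memory.
        for line in proc.stdout:
            line = line.strip()
            if line:
                yield json.loads(line)
    # Reached only when the stream was read to the end.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)


# Hypothetical usage, assuming the flags shown above exist on the tool:
# for record in stream_jsonl(["tool", "list-logs", "--output", "jsonl", "--stream"]):
#     if record["level"] == "error":
#         print(record["message"])
```

The caller can stop iterating as soon as it has what it needs, which keeps both memory and context usage bounded.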

Sensible default limits:

tool list-users           # default: --limit 20
tool list-users --limit 0 # explicit: no limit

For framework design:

  • All list commands have --limit (default: 20) and --cursor
  • Response always includes pagination metadata
  • --stream flag for JSONL output when processing large sets
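
A framework could attach these flags to every list command through one shared helper. The following is a minimal sketch using Python's standard `argparse`; the helper names `non_negative_int` and `build_list_parser` are hypothetical, and the defaults simply mirror the ones suggested above.

```python
import argparse


def non_negative_int(value: str) -> int:
    """Accept 0 (meaning no limit) or any positive integer."""
    number = int(value)
    if number < 0:
        raise argparse.ArgumentTypeError("--limit must be 0 or greater")
    return number


def build_list_parser(prog: str = "tool list-users") -> argparse.ArgumentParser:
    """Build the shared flag set for a list command."""
    parser = argparse.ArgumentParser(prog=prog)
    parser.add_argument("--limit", type=non_negative_int, default=20,
                        help="maximum items to return; 0 disables the limit")
    parser.add_argument("--cursor", default=None,
                        help="opaque token from a previous response's next_cursor")
    parser.add_argument("--stream", action="store_true",
                        help="emit JSONL, one object per line")
    return parser
```

With this in place, `build_list_parser().parse_args([])` yields a limit of 20 without the caller asking for it, while `--limit 0` remains an explicit opt-out.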

Evaluation

Score | Condition
0 | List commands return all results unbounded; no has_more, no total, no next_cursor
1 | --limit flag exists but response contains no pagination metadata; agent cannot tell if results were truncated
2 | pagination object in response with total, has_more, next_cursor; --cursor accepted for subsequent pages
3 | Default limit applied automatically (≤100); --stream / JSONL mode available; truncated: true field present whenever output is cut

Check: Run a list command without --limit on a dataset with more than 100 items — verify the response includes has_more: true and a next_cursor value.
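
This check can be automated once the command's JSON output has been parsed. The sketch below is one way to do it; `check_truncation_signals` is a hypothetical helper, and it assumes the metadata lives under a top-level `pagination` key as in the example response earlier.

```python
def check_truncation_signals(response: dict) -> list:
    """Return a list of problems; an empty list means the check passes."""
    pagination = response.get("pagination")
    if not isinstance(pagination, dict):
        return ["response has no pagination object"]

    problems = []
    if pagination.get("has_more") is not True:
        problems.append("has_more is not true")
    if not pagination.get("next_cursor"):
        problems.append("next_cursor is missing or empty")
    return problems
```

Run it against the output of a list command invoked without --limit on a dataset of more than 100 items; any returned message points to the missing signal.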


Agent Workaround

Always specify --limit and loop with next_cursor until has_more is false:

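import json
import subprocess


def paginate(base_cmd: list[str], limit: int = 50) -> list:
    """Collect every item from a list command by following next_cursor."""
    all_items = []
    cursor = None

    while True:
        cmd = [*base_cmd, "--limit", str(limit), "--output", "json"]
        if cursor:
            cmd += ["--cursor", cursor]

        # check=True raises CalledProcessError on a non-zero exit instead of
        # letting json.loads fail on empty or partial stdout.
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        parsed = json.loads(result.stdout)
        # Tools differ on the envelope key; accept "data" or "items".
        data = parsed.get("data") or parsed.get("items") or []
        all_items.extend(data if isinstance(data, list) else [data])

        # The trailing "or {}" also covers a metadata key that is present but null.
        pagination = parsed.get("pagination") or parsed.get("meta") or {}
        if not pagination.get("has_more"):
            break
        cursor = pagination.get("next_cursor")
        if not cursor:
            break  # no cursor provided, cannot paginate further

    return all_items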

Limitation: If the tool provides no has_more or next_cursor field, the agent cannot determine whether the results are complete. Always apply an explicit --limit to prevent unbounded output, and document that the results may be a subset of the full dataset.
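
One way for an agent to make this uncertainty explicit is to label each result set rather than assume it is complete. The sketch below is an assumption-laden illustration: `completeness` is a hypothetical helper, and it only recognizes the `pagination.has_more` and `pagination.truncated` fields described in this section.

```python
def completeness(parsed: dict) -> str:
    """Classify a parsed response as "complete", "truncated", or "unknown"."""
    pagination = parsed.get("pagination")
    if not isinstance(pagination, dict):
        return "unknown"  # no metadata: the agent cannot tell either way
    if pagination.get("has_more") is True or pagination.get("truncated") is True:
        return "truncated"
    if pagination.get("has_more") is False:
        return "complete"
    return "unknown"
```

Treating "unknown" as "possibly a subset" in any downstream summary keeps the agent from reporting partial data as the full dataset.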