Time-Travel Debugging

Time-travel debugging lets you replay any agent pipeline execution step by step, inspecting the exact state at each stage. Because the system captures a checkpoint at every agent boundary, you can trace a root cause from recorded state instead of reproducing the original conditions.

How It Works

Every pipeline execution automatically creates checkpoints at each stage transition. A checkpoint captures:

Input data passed to the agent
Output data returned by the agent
Agent configuration at the time of execution (model, prompt version, settings)
Timing information (start time, duration, queue wait time)
Token usage and cost for the step
Error details if the step failed

These checkpoints are persisted in the checkpoint store, scoped to the runner (runner_id) and session (session_id) that produced them, creating a complete audit trail of every execution.
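To make the captured fields concrete, here is a minimal sketch of what one per-step record could look like. The class name StepCheckpoint, every field name, and the first_failure helper are assumptions made for illustration; the actual stored schema may differ.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass
class StepCheckpoint:
    """Hypothetical record for one agent boundary (names are illustrative)."""

    runner_id: str
    session_id: str
    agent: str
    input_data: Dict[str, Any]     # data passed to the agent
    output_data: Dict[str, Any]    # data returned by the agent
    agent_config: Dict[str, Any]   # model, prompt version, settings
    started_at: datetime
    duration_ms: int
    queue_wait_ms: int
    tokens_used: int
    cost_usd: float
    error: Optional[str] = None    # set only if the step failed

    @property
    def failed(self) -> bool:
        return self.error is not None


def first_failure(trail: List[StepCheckpoint]) -> Optional[StepCheckpoint]:
    """Return the earliest failed step in an execution trail, or None."""
    for checkpoint in sorted(trail, key=lambda cp: cp.started_at):
        if checkpoint.failed:
            return checkpoint
    return None
```

Keeping the error on the same record as the input means a failed step can be examined without joining against a separate log.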

Replay Interface

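The replay interface presents a timeline of the pipeline execution, with each agent represented as a step. From the timeline you can step through the execution, inspect the input and output of each step, and compare two runs side by side.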

Step Through Execution

Navigate forward and backward through the pipeline stages using the step controls. At each step, the interface displays:

The agent that executed at this stage
The complete input payload the agent received
The complete output payload the agent produced
Duration and cost of the step
Any warnings or errors encountered
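The step controls behave like a cursor over an ordered list of checkpoints. The sketch below shows that idea only; ReplayCursor and the dictionary keys are hypothetical and are not part of the product's API.

```python
from typing import Any, Dict, List


class ReplayCursor:
    """Hypothetical cursor over the ordered steps of one pipeline run."""

    def __init__(self, steps: List[Dict[str, Any]]) -> None:
        if not steps:
            raise ValueError("a replay needs at least one step")
        self._steps = steps
        self._index = 0

    @property
    def current(self) -> Dict[str, Any]:
        return self._steps[self._index]

    def forward(self) -> Dict[str, Any]:
        # Clamp at the last step instead of running off the end.
        self._index = min(self._index + 1, len(self._steps) - 1)
        return self.current

    def back(self) -> Dict[str, Any]:
        # Clamp at the first step.
        self._index = max(self._index - 1, 0)
        return self.current


cursor = ReplayCursor([
    {"agent": "planner", "duration_ms": 1200},
    {"agent": "reviewer", "duration_ms": 800},
])
print(cursor.forward()["agent"])
```

Clamping at both ends keeps the controls safe to press repeatedly; an implementation could equally raise or disable the button at the boundary.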

Inspect Input and Output

Each checkpoint’s input and output data is displayed in a structured JSON viewer with syntax highlighting, collapsible sections, and search. For large payloads, the viewer supports:

Diff mode: highlight what changed between input and output
Path copy: click any field to copy its JSON path
Raw mode: view the unformatted JSON for copy/paste
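Diff mode and path copy both depend on addressing a field by its JSON path. As a rough sketch of the idea, the function below walks two payloads and yields the path of every value that differs. The function name, the MISSING sentinel, and the `$.a.b[0]` path notation are assumptions; the viewer's own notation may differ.

```python
from typing import Any, Iterator, Tuple

MISSING = object()  # marks a field present on only one side


def diff_paths(before: Any, after: Any, path: str = "$") -> Iterator[Tuple[str, Any, Any]]:
    """Yield (json_path, before_value, after_value) for every difference."""
    if isinstance(before, dict) and isinstance(after, dict):
        for key in sorted(set(before) | set(after)):
            yield from diff_paths(
                before.get(key, MISSING), after.get(key, MISSING), f"{path}.{key}"
            )
    elif isinstance(before, list) and isinstance(after, list):
        for i in range(max(len(before), len(after))):
            left = before[i] if i < len(before) else MISSING
            right = after[i] if i < len(after) else MISSING
            yield from diff_paths(left, right, f"{path}[{i}]")
    elif before != after:
        # Leaf values differ, or the two sides have different types.
        yield path, before, after


step_input = {"query": "refund", "items": [1, 2]}
step_output = {"query": "refund", "items": [1, 3], "status": "ok"}
for json_path, old, new in diff_paths(step_input, step_output):
    print(json_path, "added" if old is MISSING else f"{old!r} -> {new!r}")
```

Lists are compared by position, which is simple but reports a shift as many changes; a real viewer might use a sequence-alignment diff instead.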

Compare Runs Side-by-Side

Select two pipeline runs to view them side by side in a comparison layout. This is useful for:

Understanding why two runs produced different results
Comparing performance before and after a configuration change
Identifying which agent in the pipeline introduced a regression
Validating that a fix resolved the issue without side effects

The comparison view highlights differences in inputs, outputs, timing, and cost at each corresponding step.
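The step-by-step comparison can be approximated as pairing the steps of two runs by position and recording which fields differ. This is only a sketch under that assumption: compare_runs and the keys input, output, duration_ms, and cost_usd are hypothetical names, and steps are matched by index rather than by agent identity.

```python
from typing import Any, Dict, List, Sequence


def compare_runs(
    run_a: List[Dict[str, Any]],
    run_b: List[Dict[str, Any]],
    fields: Sequence[str] = ("input", "output", "duration_ms", "cost_usd"),
) -> List[Dict[str, Any]]:
    """Report, per corresponding step, which of the given fields differ."""
    report = []
    # zip() stops at the shorter run, so extra trailing steps are not compared.
    for index, (a, b) in enumerate(zip(run_a, run_b)):
        changed = [name for name in fields if a.get(name) != b.get(name)]
        if changed:
            report.append({"step": index, "agent": a.get("agent"), "changed": changed})
    return report


good = [
    {"agent": "planner", "input": "q", "output": "plan-1", "duration_ms": 900, "cost_usd": 0.01},
    {"agent": "reviewer", "input": "plan-1", "output": "ok", "duration_ms": 700, "cost_usd": 0.02},
]
bad = [
    {"agent": "planner", "input": "q", "output": "plan-1", "duration_ms": 900, "cost_usd": 0.01},
    {"agent": "reviewer", "input": "plan-1", "output": "reject", "duration_ms": 700, "cost_usd": 0.02},
]
print(compare_runs(good, bad))
```

The first entry in the report is the earliest step where the runs diverge, which is usually where a regression hunt should start.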

Checkpoint Store

Checkpoints are stored using the MongoCheckpointStore abstraction, which supports tenant-scoped access (org_id) to ensure organizations can only view their own execution data.


import asyncio

from src.checkpoint.store import MongoCheckpointStore


async def main() -> None:
    # await is only valid inside a coroutine, so the store calls live in main()
    store = MongoCheckpointStore()

    # List checkpoints for a session, scoped to your org
    checkpoints = await store.list_by_session(session_id="sess_abc123", org_id="org_456")

    # Or fetch the most recent checkpoint for a session
    latest = await store.get_latest(session_id="sess_abc123", org_id="org_456")

    # Each checkpoint contains the full pipeline state
    for cp in checkpoints:
        print(f"Agent: {cp.metadata.completed_agent}")
        print(f"Phase: {cp.state.phase}")
        print(f"Input: {cp.state.input_data}")
        print(f"Results: {cp.state.results}")
        print(f"Total cost: ${cp.metadata.total_cost:.4f}")
        print(f"Total latency: {cp.metadata.total_latency_ms}ms")


asyncio.run(main())

Checkpoints are retained according to your organization’s retention policy. The default retention period is 30 days, configurable per organization.
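The retention rule reduces to simple date arithmetic. The sketch below assumes expiry is computed as creation time plus the retention period; the helper names are hypothetical, and how the service actually enforces retention is not specified here.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION_DAYS = 30  # the default stated above; configurable per organization


def expiry_for(created_at: datetime, retention_days: int = DEFAULT_RETENTION_DAYS) -> datetime:
    """Assumed rule: a checkpoint expires retention_days after it was created."""
    return created_at + timedelta(days=retention_days)


def is_expired(created_at: datetime, now: datetime,
               retention_days: int = DEFAULT_RETENTION_DAYS) -> bool:
    return now >= expiry_for(created_at, retention_days)


created = datetime(2026, 2, 8, 14, 23, 11, tzinfo=timezone.utc)
print(expiry_for(created).isoformat())
print(expiry_for(created, retention_days=1).isoformat())
```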

Use Cases

Scenario	How Time-Travel Helps
Debugging failures	Step to the exact agent that failed and inspect its input to understand the cause
Quality regression	Compare a good run with a bad run to find where output quality diverged
Cost investigation	Inspect token usage at each step to find unexpectedly expensive operations
Prompt tuning	Compare outputs across prompt versions for the same input data
Incident response	Replay the exact execution path that led to a production issue
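For the cost investigation scenario, one possible approach is to rank steps by cost and look at each step's share of the run total. The function and the keys agent, tokens, and cost_usd are illustrative assumptions, not fields confirmed by the checkpoint schema.

```python
from typing import Any, Dict, List


def cost_breakdown(steps: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Rank steps by cost and attach each step's share of the run total."""
    total = sum(step["cost_usd"] for step in steps)
    ranked = sorted(steps, key=lambda step: step["cost_usd"], reverse=True)
    return [
        {
            "agent": step["agent"],
            "tokens": step["tokens"],
            "cost_usd": step["cost_usd"],
            # Guard against a zero-cost run to avoid dividing by zero.
            "share": step["cost_usd"] / total if total else 0.0,
        }
        for step in ranked
    ]


run = [
    {"agent": "planner", "tokens": 1000, "cost_usd": 0.25},
    {"agent": "reviewer", "tokens": 3000, "cost_usd": 0.75},
]
for row in cost_breakdown(run):
    print(f"{row['agent']}: {row['share']:.0%} of run cost")
```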

API Endpoints

Checkpoints are runner-scoped and served by the gateway admin API. List the checkpoints for a runner, then fetch a single checkpoint by its ID.

List Checkpoints for a Runner


GET /gateway/admin/runners/{runner_id}/checkpoints


{
  "runner_id": "runner_abc123",
  "total": 2,
  "checkpoints": [
    {
      "checkpoint_id": "ckpt_0a1b2c3d4e5f",
      "runner_id": "runner_abc123",
      "org_id": "org_456",
      "name": "after-planner",
      "description": "State captured after the planning step",
      "session_id": "sess_abc123",
      "state_summary": "12 files, 3 env vars, 1 skill",
      "state_size_bytes": 48213,
      "file_count": 12,
      "created_at": "2026-02-08T14:23:11.204Z",
      "created_by": "key_live_abc",
      "expires_at": "2026-02-09T14:23:11.204Z",
      "fork_count": 0
    },
    {
      "checkpoint_id": "ckpt_5f4e3d2c1b0a",
      "runner_id": "runner_abc123",
      "org_id": "org_456",
      "name": "after-reviewer",
      "description": "State captured after the review step",
      "session_id": "sess_abc123",
      "state_summary": "14 files, 3 env vars, 1 skill",
      "state_size_bytes": 51902,
      "file_count": 14,
      "created_at": "2026-02-08T14:23:13.544Z",
      "created_by": "key_live_abc",
      "expires_at": "2026-02-09T14:23:13.544Z",
      "fork_count": 2
    }
  ]
}
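A client might post-process this response as sketched below. The payload is a trimmed copy of the sample above; the helper names are assumptions. In the sample, each expires_at is exactly one day after created_at.

```python
import json
from datetime import datetime, timedelta
from typing import Any, Dict, List

# Trimmed copy of the sample response shown above.
SAMPLE = json.loads("""
{
  "runner_id": "runner_abc123",
  "total": 2,
  "checkpoints": [
    {"checkpoint_id": "ckpt_0a1b2c3d4e5f", "name": "after-planner",
     "created_at": "2026-02-08T14:23:11.204Z",
     "expires_at": "2026-02-09T14:23:11.204Z", "fork_count": 0},
    {"checkpoint_id": "ckpt_5f4e3d2c1b0a", "name": "after-reviewer",
     "created_at": "2026-02-08T14:23:13.544Z",
     "expires_at": "2026-02-09T14:23:13.544Z", "fork_count": 2}
  ]
}
""")


def parse_ts(value: str) -> datetime:
    # fromisoformat() accepts a trailing "Z" only on Python 3.11+, so normalise it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def lifetime(checkpoint: Dict[str, Any]) -> timedelta:
    """Time between creation and expiry of one checkpoint."""
    return parse_ts(checkpoint["expires_at"]) - parse_ts(checkpoint["created_at"])


def forked_ids(payload: Dict[str, Any]) -> List[str]:
    """IDs of checkpoints that have at least one fork."""
    return [cp["checkpoint_id"] for cp in payload["checkpoints"] if cp["fork_count"] > 0]


print(forked_ids(SAMPLE))
print(lifetime(SAMPLE["checkpoints"][0]))
```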

Get a Single Checkpoint


GET /gateway/admin/runners/checkpoints/{checkpoint_id}

Returns the full checkpoint detail, including the captured pipeline state and any forks created from it.

These endpoints require a gateway API key (X-CM-API-Key) with runner read permission; results are scoped to the calling key’s organization.
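As a sketch of how a client could call the two endpoints, the code below builds the requests with the standard library and attaches the X-CM-API-Key header. The base URL https://gateway.example.com and the key value are placeholders, and the function names are hypothetical; nothing is sent until the request is passed to urllib.request.urlopen.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://gateway.example.com"  # placeholder; use your gateway's address


def list_checkpoints_request(runner_id: str, api_key: str, base_url: str = BASE_URL) -> Request:
    """Build GET /gateway/admin/runners/{runner_id}/checkpoints."""
    url = f"{base_url}/gateway/admin/runners/{quote(runner_id, safe='')}/checkpoints"
    return Request(url, headers={"X-CM-API-Key": api_key}, method="GET")


def get_checkpoint_request(checkpoint_id: str, api_key: str, base_url: str = BASE_URL) -> Request:
    """Build GET /gateway/admin/runners/checkpoints/{checkpoint_id}."""
    url = f"{base_url}/gateway/admin/runners/checkpoints/{quote(checkpoint_id, safe='')}"
    return Request(url, headers={"X-CM-API-Key": api_key}, method="GET")


request = list_checkpoints_request("runner_abc123", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

To execute a request, pass it to urllib.request.urlopen and decode the body with json.load. Percent-encoding the path segments keeps an unexpected character in an ID from altering the URL structure.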