# Execution Records

Every run produces a single canonical artifact: an ExecutionRecord. It is a structured JSON file containing the complete execution timeline, inputs, outputs, and metadata.
## The Artifact

When you run an agent, Paprika creates a JSON file:

```
~/.paprika/traces/
├── abc123def456.json
├── xyz789abc123.json
└── ...
```

Each file is a complete ExecutionRecord. Open one:

```bash
cat ~/.paprika/traces/abc123def456.json | jq .
```

Example (abbreviated):
```json
{
  "schema_version": "1.0",
  "record_id": "abc123def456",
  "agent": {
    "name": "researcher",
    "version": null
  },
  "execution": {
    "started_at": "2024-01-15T14:32:10.123456Z",
    "ended_at": "2024-01-15T14:32:10.248456Z",
    "duration_ms": 125.0,
    "status": "success",
    "termination_reason": null
  },
  "policy": {
    "config": {
      "max_steps": 10,
      "max_tokens": 10000,
      "max_repeat_hashes": 3
    },
    "violation": null
  },
  "totals": {
    "step_count": 3,
    "llm_calls": 2,
    "tool_calls": 1,
    "total_tokens": 142,
    "prompt_tokens": 95,
    "completion_tokens": 47
  },
  "input": {},
  "output": {
    "question": "What is AI?",
    "search_result": "...",
    "summary": "..."
  },
  "error": null,
  "steps": [
    { "step_type": "llm_call", ... },
    { "step_type": "tool_call", ... },
    { "step_type": "llm_call", ... }
  ]
}
```

## Top-Level Fields
| Field | Type | Purpose |
|-------|------|---------|
| `schema_version` | string | Always `"1.0"` |
| `record_id` | string | Unique run identifier (UUID format) |
| `parent_record_id` | string \| null | For derived runs (future use) |
| `replay_of` | string \| null | If this is a replay, the original run's `record_id` |
| `agent` | object | Agent metadata: name, version |
| `execution` | object | Execution timeline and status |
| `policy` | object | Policy config snapshot and violation (if any) |
| `totals` | object | Aggregate counts: steps, tokens, calls |
| `input` | any | Original input to the agent |
| `output` | any | Final output from the agent |
| `error` | string \| null | Error message if execution failed |
| `environment` | object \| null | Environment metadata (reserved) |
| `steps[]` | array | Typed steps (LLM calls, tool calls, policy violations) |
| `extensions` | object | Reserved for future extensions |
## Execution Status

`execution.status` is one of:
- `"success"` — Agent completed normally
- `"error"` — Agent raised an exception
- `"policy_violation"` — A runtime policy was violated (agent halted mid-execution)
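For a quick health check across many runs, these status values can be tallied straight from the raw JSON files. A minimal sketch using only the standard library — the helper names are ours, not part of Paprika's API:

```python
import json
from collections import Counter
from pathlib import Path

def load_all(trace_dir):
    """Load every ExecutionRecord JSON file in a trace directory."""
    return [json.loads(p.read_text()) for p in sorted(Path(trace_dir).glob("*.json"))]

def status_counts(records):
    """Tally execution.status across a list of loaded record dicts."""
    return Counter(r["execution"]["status"] for r in records)

# Usage against the default trace location:
# print(status_counts(load_all(Path.home() / ".paprika" / "traces")))
```

Because `Counter` returns 0 for unseen keys, missing statuses read naturally (e.g., `counts["policy_violation"]` is 0 when no run was halted).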
## Steps

The `steps[]` array contains typed execution steps. Each step has:

```json
{
  "step_type": "llm_call" | "tool_call" | "policy_violation",
  "step_index": 0,
  "timestamp": "2024-01-15T14:32:10.130000Z",
  "event_id": "uuid"
}
```

### LLM Call Step
```json
{
  "step_type": "llm_call",
  "step_index": 0,
  "timestamp": "2024-01-15T14:32:10.130000Z",
  "event_id": "abc123",
  "provider": "openai",
  "model": "gpt-4o",
  "input_data": {
    "messages": [
      { "role": "user", "content": "Your prompt" }
    ]
  },
  "input_hash": "a1b2c3d4e5f6g7h8",
  "output_data": {
    "choices": [
      {
        "message": {
          "role": "assistant",
          "content": "Response text"
        }
      }
    ]
  },
  "token_usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20
  },
  "duration_ms": 150.0,
  "side_effect": "pure",
  "error": null
}
```

Fields:

- `provider` — LLM provider: `"openai"`, `"mock"`, or custom
- `model` — Model identifier
- `input_data` — Full input dict (exactly what was passed to `ctx.llm.call()`)
- `input_hash` — Deterministic hash of the input (for mismatch detection)
- `output_data` — Full output dict from the LLM
- `token_usage` — Token counts, if available
- `duration_ms` — Wall-clock duration
- `side_effect` — `"pure"` (LLM calls have no side effects)
- `error` — Error message if the call failed
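Because each llm_call step carries its own token_usage, a run's token total can be recomputed from the steps alone. A hedged sketch — the helper name is ours, and `token_usage` may be null or absent when a provider reports no usage:

```python
def total_llm_tokens(record):
    """Sum total_tokens over every llm_call step in a record dict."""
    total = 0
    for step in record.get("steps", []):
        if step.get("step_type") == "llm_call":
            usage = step.get("token_usage") or {}  # tolerate null/missing usage
            total += usage.get("total_tokens", 0)
    return total
```

In a well-formed record this should agree with `totals.total_tokens`, which makes it a handy consistency check when processing traces downstream.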
### Tool Call Step

```json
{
  "step_type": "tool_call",
  "step_index": 1,
  "timestamp": "2024-01-15T14:32:10.180000Z",
  "event_id": "def456",
  "tool_name": "search",
  "args": {
    "query": "AI trends"
  },
  "input_hash": "i9j0k1l2m3n4o5p6",
  "output_data": "Search results...",
  "duration_ms": 45.0,
  "side_effect": null,
  "error": null
}
```

Fields:

- `tool_name` — Name of the registered tool
- `args` — Arguments dict (exactly what was passed to `ctx.tools.call()`)
- `input_hash` — Deterministic hash of the args (for repeat detection)
- `output_data` — Return value from the tool
- `duration_ms` — Wall-clock duration
- `side_effect` — Currently null (may become `"read_only"`, `"write"`, or `"irreversible"` in the future)
- `error` — Error message if the tool raised an exception
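Since both LLM and tool steps carry an input_hash, scanning a finished record for repeats is straightforward. This sketches the idea behind the max_repeat_hashes policy after the fact — the runtime enforces the limit live during execution, and the helper below is purely illustrative:

```python
from collections import Counter

def repeated_hashes(steps, threshold=3):
    """Return {input_hash: count} for hashes seen at least `threshold` times."""
    counts = Counter(s["input_hash"] for s in steps if "input_hash" in s)
    return {h: n for h, n in counts.items() if n >= threshold}
```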
### Policy Violation Step

```json
{
  "step_type": "policy_violation",
  "step_index": 5,
  "timestamp": "2024-01-15T14:32:10.240000Z",
  "event_id": "ghi789",
  "policy_name": "max_steps",
  "message": "Maximum step count (10) exceeded",
  "details": {
    "limit": 10,
    "current": 11
  }
}
```

Fields:

- `policy_name` — Name of the violated policy: `"max_steps"`, `"max_tokens"`, `"max_repeat_hashes"`
- `message` — Human-readable violation description
- `details` — Policy-specific details (limit, current value, etc.)
When a policy violation occurs:

- Execution halts immediately (remaining steps are not executed)
- `execution.status` is set to `"policy_violation"`
- `policy.violation` contains the violation details
- The `PolicyViolationStep` is added to `steps[]`
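Putting these pieces together, a run halted by a policy can be summarized from its raw record JSON. A minimal sketch — the helper name is ours, not Paprika's:

```python
def describe_violation(record):
    """Return 'policy_name: message' if the run was halted by a policy, else None."""
    if record["execution"]["status"] != "policy_violation":
        return None
    step = next(s for s in record["steps"] if s["step_type"] == "policy_violation")
    return f'{step["policy_name"]}: {step["message"]}'
```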
## Input Hash

The `input_hash` field is critical for mismatch detection and repeat detection.

Algorithm:

1. Take the input dict (e.g., `{"messages": [...]}`)
2. Recursively sort all keys alphabetically
3. Serialize to compact JSON (no whitespace, sorted keys)
4. Compute the SHA-256 hash
5. Take the first 16 hex characters
Example:

```python
input_dict = {"messages": [{"role": "user", "content": "hi"}]}
# Sorted and serialized: '{"messages":[{"content":"hi","role":"user"}]}'
# SHA256: 'a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6...'
# First 16 chars: 'a1b2c3d4e5f6g7h8'
```

The hash is deterministic:

- Same input → same hash
- Different input → different hash (with overwhelming probability)
Used for:

- Replay mismatch detection — if a replayed input hash differs from the original, a `ReplayMismatchError` is raised
- Repeat detection — if the same input hash appears too many times, the `max_repeat_hashes` policy fires
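The whole algorithm fits in a few lines of Python. A sketch under the steps described above — Paprika's actual implementation may differ in detail, but note that `json.dumps(..., sort_keys=True)` already sorts keys at every nesting level:

```python
import hashlib
import json

def input_hash(input_dict):
    """Deterministic 16-hex-char hash of an input dict (illustrative)."""
    # Compact JSON: no whitespace, recursively sorted keys.
    canonical = json.dumps(input_dict, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

payload = {"messages": [{"role": "user", "content": "hi"}]}
assert input_hash(payload) == input_hash(dict(payload))  # same input, same hash
```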
## Storage

### Default Location

```
~/.paprika/traces/
```

The directory is created automatically on first run.

### Override Directory

Use an environment variable:

```bash
PAPRIKA_TRACE_DIR=/tmp/paprika python agent.py
```

Or a CLI flag:

```bash
paprika runs list --trace-dir /tmp/paprika
```

### File Naming

Files are named by `record_id`:

```
abc123def456.json
```

Run IDs must match the pattern `^[A-Za-z0-9][A-Za-z0-9._-]*$`
(alphanumeric, dots, dashes, underscores; no path traversal risk).
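The naming rule can be checked with a single regular expression. A sketch that mirrors the documented pattern — the helper name is ours, for illustration only:

```python
import re

# Pattern from the rule above: alphanumeric first character, then
# alphanumerics, dots, underscores, or dashes.
RUN_ID_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")

def is_valid_run_id(run_id):
    """Illustrative check mirroring Paprika's run-ID validation."""
    return RUN_ID_RE.fullmatch(run_id) is not None

print(is_valid_run_id("abc123def456"))         # True
print(is_valid_run_id("../../../etc/passwd"))  # False: leading '.' is rejected
```

Rejecting a leading dot (and `/`, which is outside the character class entirely) is what closes the door on path traversal.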
## Security

Path Traversal Prevention:

Run IDs are validated. You cannot escape the trace directory via a run ID like `../../../etc/passwd`; invalid run IDs raise `InvalidRunIdError`.

No Secrets Storage:

An ExecutionRecord stores full inputs and outputs. If your LLM calls or tool calls include sensitive data (API keys, passwords, PII), that data will be written verbatim to the trace file. Avoid sending sensitive data through Paprika traces whenever possible; use the input and output fields only for non-sensitive structured data.
## Versions

`schema_version` is always `"1.0"`.

If Paprika updates the schema in a breaking way, the version number will increment (e.g., `"2.0"`), and old traces will be migrated automatically on load.
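Tooling that reads trace files directly may still want a defensive check before trusting the field layout. A minimal sketch — the supported-version set and helper name are assumptions for illustration:

```python
SUPPORTED_SCHEMAS = {"1.0"}  # assumed: "1.0" is the only version today

def check_schema(record):
    """Raise if a record's schema_version is not one this tool can read."""
    version = record.get("schema_version")
    if version not in SUPPORTED_SCHEMAS:
        raise ValueError(f"unsupported ExecutionRecord schema: {version!r}")
```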
## Accessing Records Programmatically

```python
from paprika import PaprikaRuntime

runtime = PaprikaRuntime()

# Load a record
record = runtime.trace_store.load_record(run_id="abc123def456")

# Access fields
print(record.record_id)
print(record.agent.name)
print(record.execution.status)
print(record.totals.step_count)

# Iterate steps
for step in record.steps:
    if step.step_type == "llm_call":
        print(f"LLM: {step.model} in {step.duration_ms}ms")
    elif step.step_type == "tool_call":
        print(f"Tool: {step.tool_name}")
    elif step.step_type == "policy_violation":
        print(f"Violation: {step.policy_name}")

# Serialize to JSON
json_string = record.model_dump_json_pretty()
```

## Next Steps
- Inspect records via CLI: CLI
- Replay records: Replay Engine
- Set policies that affect execution status: Policies