# Cost Tracking
The CostTracker class provides real-time LLM cost recording and querying. Every model call made by any agent is tracked with its token usage, calculated cost, and metadata. The tracker integrates with the dashboard cost views to give teams full visibility into their AI spend.
## CostTracker
The CostTracker is the core interface for recording and querying LLM costs:

```python
from src.utils.cost_tracker import CostTracker

tracker = CostTracker()
```

In production, the tracker is initialized once and shared across all agents in a pipeline execution. It accumulates costs throughout the run and can be queried at any point.
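The sharing pattern can be illustrated with a stub tracker and a flat token rate. Everything below (the stub class, the agent names, the $1-per-1M rate) is an assumption for illustration, not the real CostTracker:

```python
class StubTracker:
    """Toy stand-in for CostTracker: one shared instance accumulates all costs."""

    def __init__(self):
        self.costs_by_agent = {}

    def record(self, model, input_tokens, output_tokens, agent):
        # Flat illustrative rate of $1 per 1M tokens; the real tracker
        # looks up per-model pricing instead.
        cost = (input_tokens + output_tokens) / 1_000_000
        self.costs_by_agent[agent] = self.costs_by_agent.get(agent, 0.0) + cost

tracker = StubTracker()  # created once per pipeline run
for agent in ("vision_analysis", "style_analysis"):
    tracker.record("gpt-4o", 1000, 500, agent)  # every agent shares the tracker
```

Because each agent records into the same instance, per-agent and total figures stay in one place for the whole run.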
## Recording Costs
Record a cost event after every LLM call using `tracker.record()`:

```python
tracker.record(
    model="gpt-4o",
    input_tokens=1200,
    output_tokens=450,
    agent="style_analysis"
)
```

| Parameter | Type | Description |
|---|---|---|
| `model` | `str` | The model identifier (e.g., `gpt-4o`, `claude-sonnet-4-5-20250929`) |
| `input_tokens` | `int` | Number of input (prompt) tokens consumed |
| `output_tokens` | `int` | Number of output (completion) tokens generated |
| `agent` | `str` | The agent that made the call; used for per-agent breakdowns |
The tracker automatically looks up the model’s pricing and calculates the cost. If a model is not in the pricing table, the cost is recorded as zero with a warning.
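The lookup-and-calculate step amounts to a few lines. The sketch below is an assumed implementation with illustrative rates, not the library's actual pricing table or code:

```python
import warnings

# Illustrative per-model rates ($ per 1M input tokens, $ per 1M output tokens);
# not the tracker's real pricing table.
PRICING = {
    "gpt-5.1": (2.00, 8.00),
    "claude-sonnet-4": (3.00, 15.00),
}

def calculate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call; unknown models cost 0.0 with a warning."""
    if model not in PRICING:
        warnings.warn(f"No pricing for {model!r}; recording cost as 0.0")
        return 0.0
    input_rate, output_rate = PRICING[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
```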
## Querying Costs
### Total Cost
Get the total accumulated cost across all agents and models:
```python
total = tracker.total_cost
# 0.0834
```

### Cost by Agent
Break down costs by agent to identify which agents are most expensive:
```python
by_agent = tracker.by_agent()
# {
#     "vision_analysis": 0.0412,
#     "style_analysis": 0.0298,
#     "style_critic": 0.0124
# }
```

### Cost by Model
Break down costs by model to understand provider-level spend:
```python
by_model = tracker.by_model()
# {
#     "gemini-2.5-pro": 0.0412,
#     "gpt-5.1": 0.0298,
#     "deepseek-v3": 0.0124
# }
```

### Cost Summary
Generate a comprehensive cost summary with usage metrics and projections:
```python
summary = tracker.get_summary()
```

The summary includes:

```json
{
  "total_cost_usd": 0.0834,
  "total_calls": 3,
  "total_input_tokens": 4800,
  "total_output_tokens": 1850,
  "avg_input_tokens_per_call": 1600,
  "avg_output_tokens_per_call": 617,
  "avg_cost_per_call": 0.0278,
  "by_agent": {
    "vision_analysis": {"cost": 0.0412, "calls": 1, "input_tokens": 2400, "output_tokens": 800},
    "style_analysis": {"cost": 0.0298, "calls": 1, "input_tokens": 1600, "output_tokens": 650},
    "style_critic": {"cost": 0.0124, "calls": 1, "input_tokens": 800, "output_tokens": 400}
  },
  "by_model": {
    "gemini-2.5-pro": {"cost": 0.0412, "calls": 1},
    "gpt-5.1": {"cost": 0.0298, "calls": 1},
    "deepseek-v3": {"cost": 0.0124, "calls": 1}
  }
}
```

### Usage Metrics
Beyond raw cost, the tracker provides efficiency metrics:
```python
metrics = tracker.get_usage_metrics()
# {
#     "total_calls": 3,
#     "total_input_tokens": 4800,
#     "total_output_tokens": 1850,
#     "avg_tokens_per_call": 2217,
#     "token_efficiency": 0.278,  # output_tokens / total_tokens
#     "cost_per_1k_output_tokens": 0.045
# }
```

Token efficiency measures the ratio of output tokens to total tokens. A low efficiency score may indicate prompts that are too verbose relative to their output.
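The efficiency figure follows directly from the token counts. This one-line helper (a sketch, not part of the library) reproduces the output-to-total ratio for the example run above:

```python
def token_efficiency(input_tokens: int, output_tokens: int) -> float:
    """Fraction of all tokens that were output (completion) tokens."""
    total = input_tokens + output_tokens
    return output_tokens / total if total else 0.0

# 1850 output tokens out of 4800 + 1850 = 6650 total
print(round(token_efficiency(4800, 1850), 3))  # 0.278
```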
## Supported Models
The tracker includes built-in pricing for the following models:
| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|---|
| GPT-5.1 | OpenAI | $2.00 | $8.00 |
| GPT-4.1 | OpenAI | $2.00 | $8.00 |
| GPT-4.1 mini | OpenAI | $0.40 | $1.60 |
| GPT-4.1 nano | OpenAI | $0.10 | $0.40 |
| Claude Opus 4 | Anthropic | $15.00 | $75.00 |
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 |
| Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 |
| Gemini 2.5 Flash | Google | $0.15 | $0.60 |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 |
| DeepSeek-V3 | DeepSeek | $0.27 | $1.10 |
| DeepSeek-R1 | DeepSeek | $0.55 | $2.19 |
| Llama 4 Scout | Meta | $0.17 | $0.40 |
| Llama 4 Maverick | Meta | $0.27 | $0.85 |
Pricing as of January 2026. Update the pricing table in `cost_tracker.py` when providers change rates.
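A call's cost follows from the table as input_tokens / 1M × input rate + output_tokens / 1M × output rate. For instance, comparing GPT-5.1 with GPT-4.1 nano on the same 1,200-input / 450-output call:

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_per_million: float, output_per_million: float) -> float:
    """USD cost of one call at the given per-1M-token rates."""
    return (input_tokens * input_per_million
            + output_tokens * output_per_million) / 1_000_000

# Rates taken from the table above
gpt51 = call_cost(1200, 450, 2.00, 8.00)   # $0.0024 input + $0.0036 output = $0.0060
nano = call_cost(1200, 450, 0.10, 0.40)    # $0.0003, a 20x saving on this call
```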
## Adding Custom Model Pricing
If you use a model not in the default pricing table, register it before recording costs:
```python
tracker.register_model_pricing(
    model="my-custom-model",
    input_price_per_million=1.50,
    output_price_per_million=5.00,
    provider="custom"
)

# Now costs will be calculated correctly
tracker.record(
    model="my-custom-model",
    input_tokens=1000,
    output_tokens=500,
    agent="custom_agent"
)
```

## Integration with Dashboard
The cost tracker feeds data to the dashboard cost tracking views. In the B2B API, cost data is aggregated per organization and exposed through the admin metrics endpoint:
```python
# services/backend/src/api/routes/admin/metrics.py
@router.get("/cost-optimization")
async def get_cost_optimization(request: Request):
    org_id = request.state.org_id
    costs = await cost_service.get_org_costs(org_id)
    return {
        "total_cost_today": costs.today,
        "total_cost_month": costs.month,
        "projected_monthly": costs.projection,
        "by_provider": costs.by_provider,
        "by_agent": costs.by_agent,
        "budget": costs.budget_status
    }
```

The dashboard polls this endpoint and displays the data in the cost tracking views. See Dashboard Cost Tracking for the frontend experience.
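A `projected_monthly` figure can be produced by linear extrapolation from month-to-date spend. The helper below is a sketch of one plausible approach, not the service's actual implementation:

```python
import calendar
from datetime import date

def project_monthly_cost(month_to_date_cost: float, today: date) -> float:
    """Linearly extrapolate month-to-date spend to the full month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return month_to_date_cost / today.day * days_in_month

# $15 spent by Jan 10 projects to $46.50 for the 31-day month
print(project_monthly_cost(15.0, date(2026, 1, 10)))  # 46.5
```

A real projection would likely weight recent days more heavily, but the linear form is a reasonable first cut.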
## Best Practices
- Record every LLM call: even cached or retry calls should be recorded for accurate cost visibility.
- Use specific agent names: generic names like `agent_1` make cost breakdowns hard to interpret.
- Set budget alerts: use the `TenantCostTracker` to enforce budget limits and prevent cost overruns.
- Review token efficiency: agents with low token efficiency may benefit from prompt optimization.
- Compare model costs: the pricing table makes it easy to estimate savings from switching to a cheaper model for non-critical agents.
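Budget enforcement in the spirit of `TenantCostTracker` can be sketched as follows; the class and exception below are hypothetical stand-ins, not the real API:

```python
class BudgetExceededError(RuntimeError):
    """Raised when recording a cost would push spend past the budget."""

class BudgetedTracker:
    """Minimal sketch of budget-capped cost recording (hypothetical)."""

    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.total_cost = 0.0

    def record_cost(self, cost_usd: float) -> None:
        # Refuse the call rather than silently overrun the budget
        if self.total_cost + cost_usd > self.budget_usd:
            raise BudgetExceededError(
                f"${cost_usd:.4f} would exceed the ${self.budget_usd:.2f} budget"
            )
        self.total_cost += cost_usd
```

Raising before recording lets the pipeline decide whether to abort, degrade to a cheaper model, or alert an operator.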