
Cost Tracking

The CostTracker class provides real-time LLM cost recording and querying. Every model call made by any agent is tracked with its token usage, calculated cost, and metadata. The tracker integrates with the dashboard cost views to give teams full visibility into their AI spend.

CostTracker

The CostTracker is the core interface for recording and querying LLM costs:

```python
from src.utils.cost_tracker import CostTracker

tracker = CostTracker()
```

In production, the tracker is initialized once and shared across all agents in a pipeline execution. It accumulates costs throughout the run and can be queried at any point.

Recording Costs

Record a cost event after every LLM call using tracker.record():

```python
tracker.record(
    model="gpt-5.1",
    input_tokens=1200,
    output_tokens=450,
    agent="style_analysis",
)
```
| Parameter | Type | Description |
| --- | --- | --- |
| model | str | The model identifier (e.g., gpt-4o, claude-sonnet-4-5-20250929) |
| input_tokens | int | Number of input (prompt) tokens consumed |
| output_tokens | int | Number of output (completion) tokens generated |
| agent | str | The agent that made the call, used for per-agent breakdowns |

The tracker automatically looks up the model’s pricing and calculates the cost. If a model is not in the pricing table, the cost is recorded as zero with a warning.
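The lookup-and-fallback behavior can be sketched roughly as follows. This is a simplified illustration, not the actual implementation: the pricing dictionary, its keys, and the warning message are assumptions.

```python
import logging

logger = logging.getLogger(__name__)

# Illustrative per-1M-token USD prices; see the Supported Models table for the full set.
PRICING = {
    "gpt-5.1": {"input": 2.00, "output": 8.00},
    "gemini-2.5-pro": {"input": 1.25, "output": 10.00},
}

def calculate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost for a call, or 0.0 with a warning for unknown models."""
    prices = PRICING.get(model)
    if prices is None:
        logger.warning("No pricing for model %s; recording cost as 0.0", model)
        return 0.0
    return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000
```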

Querying Costs

Total Cost

Get the total accumulated cost across all agents and models:

```python
total = tracker.total_cost  # 0.0834
```

Cost by Agent

Break down costs by agent to identify which agents are most expensive:

```python
by_agent = tracker.by_agent()
# {
#     "vision_analysis": 0.0412,
#     "style_analysis": 0.0298,
#     "style_critic": 0.0124
# }
```

Cost by Model

Break down costs by model to understand provider-level spend:

```python
by_model = tracker.by_model()
# {
#     "gemini-2.5-pro": 0.0412,
#     "gpt-5.1": 0.0298,
#     "deepseek-v3": 0.0124
# }
```

Cost Summary

Generate a comprehensive cost summary with usage metrics and projections:

```python
summary = tracker.get_summary()
```

The summary includes:

```json
{
  "total_cost_usd": 0.0834,
  "total_calls": 3,
  "total_input_tokens": 4800,
  "total_output_tokens": 1850,
  "avg_input_tokens_per_call": 1600,
  "avg_output_tokens_per_call": 617,
  "avg_cost_per_call": 0.0278,
  "by_agent": {
    "vision_analysis": {"cost": 0.0412, "calls": 1, "input_tokens": 2400, "output_tokens": 800},
    "style_analysis": {"cost": 0.0298, "calls": 1, "input_tokens": 1600, "output_tokens": 650},
    "style_critic": {"cost": 0.0124, "calls": 1, "input_tokens": 800, "output_tokens": 400}
  },
  "by_model": {
    "gemini-2.5-pro": {"cost": 0.0412, "calls": 1},
    "gpt-5.1": {"cost": 0.0298, "calls": 1},
    "deepseek-v3": {"cost": 0.0124, "calls": 1}
  }
}
```

Usage Metrics

Beyond raw cost, the tracker provides efficiency metrics:

```python
metrics = tracker.get_usage_metrics()
# {
#     "total_calls": 3,
#     "total_input_tokens": 4800,
#     "total_output_tokens": 1850,
#     "avg_tokens_per_call": 2217,
#     "token_efficiency": 0.278,  # output_tokens / total_tokens
#     "cost_per_1k_output_tokens": 0.045
# }
```

Token efficiency measures the ratio of output tokens to total tokens. A low efficiency score may indicate prompts that are too verbose relative to their output.
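These metrics follow directly from the raw totals. A standalone arithmetic check against the example run (values rounded as in the summary):

```python
# Totals from the example summary above.
total_input = 4800
total_output = 1850
total_calls = 3
total_cost = 0.0834

avg_tokens_per_call = round((total_input + total_output) / total_calls)  # 2217
token_efficiency = total_output / (total_input + total_output)           # ~0.278
cost_per_1k_output = total_cost / (total_output / 1000)                  # ~0.045
```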

Supported Models

The tracker includes built-in pricing for the following models:

| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) |
| --- | --- | --- | --- |
| GPT-5.1 | OpenAI | $2.00 | $8.00 |
| GPT-4.1 | OpenAI | $2.00 | $8.00 |
| GPT-4.1 mini | OpenAI | $0.40 | $1.60 |
| GPT-4.1 nano | OpenAI | $0.10 | $0.40 |
| Claude Opus 4 | Anthropic | $15.00 | $75.00 |
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 |
| Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 |
| Gemini 2.5 Flash | Google | $0.15 | $0.60 |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 |
| DeepSeek-V3 | DeepSeek | $0.27 | $1.10 |
| DeepSeek-R1 | DeepSeek | $0.55 | $2.19 |
| Llama 4 Scout | Meta | $0.17 | $0.40 |
| Llama 4 Maverick | Meta | $0.27 | $0.85 |

Pricing as of January 2026. Update the pricing table in cost_tracker.py when providers change rates.
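As a concrete check against the table, a call's cost is input tokens times the input rate plus output tokens times the output rate, each divided by one million:

```python
# Cost of a 1,200-input / 450-output token call on GPT-5.1,
# using the table rates of $2.00 input / $8.00 output per 1M tokens.
input_cost = 1200 * 2.00 / 1_000_000   # 0.0024
output_cost = 450 * 8.00 / 1_000_000   # 0.0036
total = input_cost + output_cost       # ~0.0060
```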

Adding Custom Model Pricing

If you use a model not in the default pricing table, register it before recording costs:

```python
tracker.register_model_pricing(
    model="my-custom-model",
    input_price_per_million=1.50,
    output_price_per_million=5.00,
    provider="custom",
)

# Now costs will be calculated correctly
tracker.record(
    model="my-custom-model",
    input_tokens=1000,
    output_tokens=500,
    agent="custom_agent",
)
```

Integration with Dashboard

The cost tracker feeds data to the dashboard cost tracking views. In the B2B API, cost data is aggregated per organization and exposed through the admin metrics endpoint:

```python
# services/backend/src/api/routes/admin/metrics.py
@router.get("/cost-optimization")
async def get_cost_optimization(request: Request):
    org_id = request.state.org_id
    costs = await cost_service.get_org_costs(org_id)
    return {
        "total_cost_today": costs.today,
        "total_cost_month": costs.month,
        "projected_monthly": costs.projection,
        "by_provider": costs.by_provider,
        "by_agent": costs.by_agent,
        "budget": costs.budget_status,
    }
```

The dashboard polls this endpoint and displays the data in the cost tracking views. See Dashboard Cost Tracking for the frontend experience.

Best Practices

  1. Record every LLM call — even cached or retry calls should be recorded for accurate cost visibility.
  2. Use specific agent names — generic names like agent_1 make cost breakdowns hard to interpret.
  3. Set budget alerts — use the TenantCostTracker to enforce budget limits and prevent cost overruns.
  4. Review token efficiency — agents with low token efficiency may benefit from prompt optimization.
  5. Compare model costs — the pricing table makes it easy to estimate savings from switching to a cheaper model for non-critical agents.
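The budget-alert pattern from point 3 might look roughly like this. This is a hypothetical sketch only: TenantCostTracker's real API may differ, and the SimpleBudgetGuard class and BudgetExceededError exception are invented names for illustration.

```python
class BudgetExceededError(Exception):
    """Raised when accumulated spend crosses the budget limit (assumed name)."""

class SimpleBudgetGuard:
    """Hypothetical budget guard in the spirit of TenantCostTracker."""

    def __init__(self, budget_usd: float, alert_threshold: float = 0.8):
        self.budget_usd = budget_usd
        self.alert_threshold = alert_threshold
        self.spent_usd = 0.0
        self.alerted = False

    def add_cost(self, cost_usd: float) -> None:
        self.spent_usd += cost_usd
        if not self.alerted and self.spent_usd >= self.budget_usd * self.alert_threshold:
            self.alerted = True  # a real implementation would send a notification here
        if self.spent_usd > self.budget_usd:
            raise BudgetExceededError(
                f"spent ${self.spent_usd:.4f} of ${self.budget_usd:.2f} budget"
            )

guard = SimpleBudgetGuard(budget_usd=1.00)
guard.add_cost(0.85)  # crosses the 80% alert threshold, sets guard.alerted
```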