Cost Tracking

See also: For gateway-level cost enforcement (budget caps, per-org aggregation, cost attribution applied in the governance chain), see Gateway Cost Tracking. This page documents the SDK’s client.costs surface for querying recorded costs and managing budgets programmatically.

Curate-Me records cost automatically. Every LLM call you route through the gateway is metered server-side: token usage, calculated cost, model, and the agent/workflow that made the call are all recorded without any per-call bookkeeping in your code. The SDK’s client.costs sub-client lets you read those costs back, break them down, and manage budgets.

Note: There is no client-side cost recorder in the SDK. Recording happens in the gateway’s cost recorder when a request is proxied. The SDK is a read/query + budget-management surface over that recorded data.

Accessing the costs client

costs is a sub-client of CurateMe. In the Python SDK, every method is async and has a matching _sync variant (for example, get_summary and get_summary_sync).

Python:

    from curate_me import CurateMe

    client = CurateMe(api_key="cm_...", org_id="org_...")
    # or: client = CurateMe.from_env()

    costs = client.costs

TypeScript:

    import { CurateMe } from "@curate-me/sdk";

    const client = new CurateMe({ apiKey: "cm_...", orgId: "org_..." });
    const costs = client.costs;

Cost summary

Get an aggregated summary for a period. period accepts the CostPeriod enum (HOUR, DAY, WEEK, MONTH) or the equivalent string.

    from curate_me import CostPeriod

    summary = await client.costs.get_summary(period=CostPeriod.DAY)

    print(summary.total_cost_usd)  # e.g. 0.0834
    print(summary.total_tokens)
    print(summary.total_requests)
    print(summary.by_agent)        # {"dev_team": 0.0412, "security_audit": 0.0298, ...}
    print(summary.by_model)        # {"claude-sonnet-4-6": 0.0412, "gpt-4o": 0.0298, ...}

The returned CostSummary has these fields:

| Field | Type | Description |
| --- | --- | --- |
| period | CostPeriod | The aggregation period |
| start_date / end_date | datetime | Window covered by the summary |
| total_cost_usd | float | Total cost across all calls in the window |
| total_tokens | int | Total tokens consumed |
| total_requests | int | Number of recorded requests |
| by_agent | dict[str, float] | Cost keyed by agent id |
| by_model | dict[str, float] | Cost keyed by model id |
| by_workflow | dict[str, float] | Cost keyed by workflow id |

You can also pass an explicit start_date / end_date (both datetime) instead of relying on the period window:

    from datetime import datetime

    summary = await client.costs.get_summary(
        period=CostPeriod.MONTH,
        start_date=datetime(2026, 5, 1),
        end_date=datetime(2026, 5, 31),
    )
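This page does not say whether end_date is inclusive or whether a midnight value covers the whole final day. As a minimal sketch, assuming the window is compared against exact timestamps, a small helper can build bounds that span a full calendar month. The name month_window is hypothetical and not part of the SDK.

```python
import calendar
from datetime import datetime


def month_window(year: int, month: int) -> tuple[datetime, datetime]:
    """Return (start, end) datetimes spanning one whole calendar month."""
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, month)[1]
    start = datetime(year, month, 1)
    # End on the last microsecond of the final day. This assumes end_date is
    # treated as an exact timestamp; the page does not confirm that.
    end = datetime(year, month, last_day, 23, 59, 59, 999999)
    return start, end


start, end = month_window(2026, 5)
print(start.isoformat())  # 2026-05-01T00:00:00
print(end.isoformat())    # 2026-05-31T23:59:59.999999
```

The resulting pair can be passed as start_date and end_date to get_summary. If the API turns out to treat end_date as an inclusive calendar day, the simpler datetime(2026, 5, 31) form is enough.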

In the TypeScript SDK the summary is returned in camelCase. byAgent / byModel are arrays of { name, cost, percentage, requests } rather than maps:

    const summary = await client.costs.getSummary({ period: "day" });

    console.log(summary.totalCost);
    console.log(summary.totalRequests);
    for (const item of summary.byModel) {
      console.log(`${item.name}: $${item.cost} (${item.percentage}%)`);
    }

Cost breakdown by model

Get cost grouped by model id for a period:

    by_model = await client.costs.get_by_model(period=CostPeriod.DAY)
    # {
    #     "claude-sonnet-4-6": 0.0412,
    #     "gpt-4o": 0.0298,
    #     "gemini-2.5-flash": 0.0124,
    # }

In the TypeScript SDK, model breakdown comes from the summary’s byModel array (there is no separate getByModel method):

    const summary = await client.costs.getSummary({ period: "day" });
    const byModel = summary.byModel; // [{ name, cost, percentage, requests }, ...]
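The Python mapping carries only costs, while the TypeScript array also includes a percentage per model. A rough sketch of computing those shares locally from a by_model style dict follows. The helper name model_shares is hypothetical, the one-decimal rounding is an arbitrary choice, and the sample figures are the illustrative values used earlier on this page.

```python
def model_shares(by_model: dict[str, float]) -> list[dict]:
    """Turn a {model: cost} mapping into rows sorted by cost, with a share."""
    total = sum(by_model.values())
    rows = []
    # Most expensive model first.
    for name, cost in sorted(by_model.items(), key=lambda kv: kv[1], reverse=True):
        # Guard against an empty or all-zero window.
        percentage = round(cost / total * 100, 1) if total else 0.0
        rows.append({"name": name, "cost": cost, "percentage": percentage})
    return rows


sample = {"claude-sonnet-4-6": 0.0412, "gpt-4o": 0.0298, "gemini-2.5-flash": 0.0124}
for row in model_shares(sample):
    print(f'{row["name"]}: ${row["cost"]:.4f} ({row["percentage"]}%)')
```

With the real client, the input would be the dict returned by get_by_model, assuming it has the shape shown above.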

Cost breakdown by agent or workflow

Drill into a single agent or workflow. agent_id / workflow_id are opaque ids — for autopilot templates this is the template name (e.g. "dev_team", "security_audit").

    agent_costs = await client.costs.get_by_agent("dev_team", period=CostPeriod.WEEK)
    workflow_costs = await client.costs.get_by_workflow("wf_123", period=CostPeriod.WEEK)

Listing individual cost entries

Each recorded call is a CostEntry. List them with optional filters and pagination:

    entries = await client.costs.list_entries(
        agent_id="dev_team",        # optional
        model="claude-sonnet-4-6",  # optional
        limit=100,
        offset=0,
    )
    for entry in entries:
        print(entry.model, entry.input_tokens, entry.output_tokens, entry.cost_usd)

A CostEntry exposes: id, agent_id, workflow_id, model, input_tokens, output_tokens, total_tokens, cost_usd, timestamp, trace_id, and metadata.
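Reading more than one page of entries means advancing offset until the results run out. The sketch below shows one way to do that with an async generator. It assumes that the fetch function accepts limit and offset keywords, returns a list, and signals the end with a page shorter than limit; none of this is confirmed by this page. iter_entries and fake_list_entries are hypothetical names, and the fake stands in for client.costs.list_entries so the example runs without the SDK.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable

FetchPage = Callable[..., Awaitable[list]]


async def iter_entries(
    fetch_page: FetchPage, page_size: int = 100, **filters: Any
) -> AsyncIterator[Any]:
    """Yield entries one by one, requesting pages until a short page arrives."""
    offset = 0
    while True:
        page = await fetch_page(limit=page_size, offset=offset, **filters)
        for entry in page:
            yield entry
        # A page shorter than page_size (including empty) is taken as the end.
        if len(page) < page_size:
            break
        offset += page_size


async def _demo() -> list:
    data = list(range(250))  # stand-in for 250 recorded entries

    async def fake_list_entries(limit: int, offset: int, **_: Any) -> list:
        return data[offset:offset + limit]

    return [e async for e in iter_entries(fake_list_entries, page_size=100)]


entries = asyncio.run(_demo())
print(len(entries))  # 250
```

Under the same assumptions, passing client.costs.list_entries as fetch_page, together with filters such as agent_id="dev_team", would walk every matching entry.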

Budgets

Budgets are managed through client.costs and enforced server-side by the gateway’s governance chain. Create, check, update, delete, and list them with create_budget, check_budget, update_budget, delete_budget, and list_budgets.

    from curate_me import CostPeriod

    budget = await client.costs.create_budget(
        name="Engineering monthly",
        limit_usd=500.0,
        period=CostPeriod.MONTH,
        alert_threshold=0.8,  # alert at 80% of the limit
        enabled=True,
    )

    # Check current spend against the budget
    status = await client.costs.check_budget(budget.id)
    # {"current_spend_usd": ..., "limit_usd": ..., "alert": bool, ...}

    # Update the limit later, or delete the budget when it is no longer needed
    await client.costs.update_budget(budget.id, limit_usd=750.0)
    await client.costs.delete_budget(budget.id)

    # List all budgets
    budgets = await client.costs.list_budgets()

The TypeScript createBudget request shape differs from Python: it takes type (daily | weekly | monthly), a limit, optional alerts, and optional autoStop:

    const budget = await client.costs.createBudget({
      name: "Engineering monthly",
      type: "monthly",
      limit: 500,
      alerts: [{ threshold: 0.8, channels: ["email"] }],
      autoStop: false,
    });

A CostBudget has: id, name, limit_usd, period, current_spend_usd, alert_threshold, and enabled.
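To make the relationship between limit_usd, current_spend_usd, and alert_threshold concrete, here is a local sketch that classifies a budget from those fields. Enforcement and alerting happen in the gateway; this page does not document the exact comparison it uses, so the greater-than-or-equal checks below are an assumption. BudgetSnapshot and budget_state are hypothetical names, not SDK types.

```python
from dataclasses import dataclass


@dataclass
class BudgetSnapshot:
    """Local stand-in mirroring a subset of the CostBudget fields."""
    limit_usd: float
    current_spend_usd: float
    alert_threshold: float  # fraction of the limit, e.g. 0.8 for 80%
    enabled: bool = True


def budget_state(budget: BudgetSnapshot) -> str:
    """Classify a budget as 'disabled', 'exceeded', 'alert', or 'ok'."""
    if not budget.enabled:
        return "disabled"
    if budget.current_spend_usd >= budget.limit_usd:
        return "exceeded"
    # Assumed rule: alert once spend reaches threshold * limit.
    if budget.current_spend_usd >= budget.alert_threshold * budget.limit_usd:
        return "alert"
    return "ok"


print(budget_state(BudgetSnapshot(500.0, 410.0, 0.8)))  # alert
```

For authoritative status, rely on check_budget, which reflects what the server actually recorded.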

Projections and optimization

Project spend forward, and fetch server-generated optimization suggestions:

    projection = await client.costs.get_projection(period=CostPeriod.MONTH)
    # {"estimated_cost_usd": ..., ...}

    suggestions = await client.costs.get_optimization_suggestions()
    # [{"type": "model_downgrade", "estimated_savings_usd": ..., ...}, ...]
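This page does not describe how the server computes its projection. For a quick sanity check against the returned estimate, a simple linear run-rate extrapolation can be computed locally. The sketch below is not the gateway's method, and run_rate_projection is a hypothetical helper.

```python
import calendar
from datetime import datetime


def run_rate_projection(spend_so_far_usd: float, now: datetime) -> float:
    """Extrapolate month-to-date spend linearly to the end of the month."""
    days_in_month = calendar.monthrange(now.year, now.month)[1]
    month_start = datetime(now.year, now.month, 1)
    elapsed_days = (now - month_start).total_seconds() / 86400
    if elapsed_days <= 0:
        # Nothing has elapsed yet, so there is no rate to extrapolate.
        return spend_so_far_usd
    return spend_so_far_usd / elapsed_days * days_in_month


# 10 full days elapsed in a 31-day month: 120 / 10 * 31
print(round(run_rate_projection(120.0, datetime(2026, 5, 11)), 2))  # 372.0
```

A large gap between this figure and get_projection may simply mean the server accounts for trends or seasonality that a flat run rate ignores.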

Synchronous usage

Every method has a _sync counterpart for non-async code:

    from curate_me import CurateMe

    client = CurateMe(api_key="cm_...", org_id="org_...")

    summary = client.costs.get_summary_sync(period="day")
    budget = client.costs.create_budget_sync(name="Team", limit_usd=200.0)

Where cost data comes from

Costs are recorded by the gateway when it proxies your LLM calls. Point your existing OpenAI / Anthropic SDK at the gateway base URL with your Curate-Me key, and every call is metered:

    # OpenAI SDK -> Curate-Me gateway
    OPENAI_BASE_URL=https://api.curate-me.ai/v1/openai
    # header on each request: X-CM-API-Key: cm_sk_xxx

The gateway looks up each model’s pricing, computes the cost, and attributes it to your org (and to the agent/workflow when that context is supplied). client.costs then reads that data back. See Gateway Cost Tracking for the recording side and Dashboard Cost Tracking for the UI.
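The pricing lookup and cost computation described above can be pictured roughly as follows. This is an illustrative sketch only: the price table, the per-million-token unit, the model name, and call_cost_usd are all assumptions for the example, not the gateway's actual pricing data or code.

```python
from decimal import Decimal

# Hypothetical prices in USD per million tokens (illustrative, not real pricing).
PRICES = {
    "example-model": {"input": Decimal("3.00"), "output": Decimal("15.00")},
}
MILLION = Decimal(1_000_000)


def call_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Compute the cost of one call from token counts and per-token prices."""
    price = PRICES[model]  # raises KeyError for a model with no known price
    return (
        price["input"] * input_tokens + price["output"] * output_tokens
    ) / MILLION


cost = call_cost_usd("example-model", input_tokens=1200, output_tokens=350)
print(cost)
```

Decimal is used here so that small per-call amounts add up without floating-point drift. The SDK itself reports cost_usd as a float, as the CostSummary fields show.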

Best Practices

  1. Tag calls with agent/workflow context: supplying an agent or workflow id when you proxy a call makes the by_agent / by_workflow breakdowns useful.
  2. Use specific agent ids: generic names like agent_1 make cost breakdowns hard to interpret.
  3. Set budgets with alert thresholds: create_budget(..., alert_threshold=0.8) creates a budget that the gateway enforces server-side and that alerts before you hit the cap.
  4. Review optimization suggestions: get_optimization_suggestions() surfaces model-downgrade and other savings opportunities computed from your actual usage.
  5. Compare model costs: use get_by_model() to see per-model spend and estimate savings from switching a non-critical agent to a cheaper model.