Cost Tracking
See also: For gateway-level cost enforcement (budget caps, per-org aggregation, cost attribution applied in the governance chain), see Gateway Cost Tracking. This page documents the SDK’s
client.costs surface for querying recorded costs and managing budgets programmatically.
The Curate-Me SDK records cost automatically. Every LLM call you route through the gateway is
metered server-side — token usage, calculated cost, model, and the agent/workflow that made the
call are all attributed without any per-call bookkeeping in your code. The SDK’s client.costs
sub-client lets you read those costs back, break them down, and manage budgets.
Note: There is no client-side cost recorder in the SDK. Recording happens in the gateway’s cost recorder when a request is proxied. The SDK is a read/query + budget-management surface over that recorded data.
Accessing the costs client
costs is a sub-client of CurateMe. Every method has an async version and a _sync variant.
from curate_me import CurateMe
client = CurateMe(api_key="cm_...", org_id="org_...")
# or: client = CurateMe.from_env()
costs = client.costs

import { CurateMe } from "@curate-me/sdk";
const client = new CurateMe({ apiKey: "cm_...", orgId: "org_..." });
const costs = client.costs;
Cost summary
Get an aggregated summary for a period. period accepts the CostPeriod enum
(HOUR, DAY, WEEK, MONTH) or the equivalent string.
from curate_me import CostPeriod
summary = await client.costs.get_summary(period=CostPeriod.DAY)
print(summary.total_cost_usd) # e.g. 0.0834
print(summary.total_tokens)
print(summary.total_requests)
print(summary.by_agent) # {"dev_team": 0.0412, "security_audit": 0.0298, ...}
print(summary.by_model) # {"claude-sonnet-4-6": 0.0412, "gpt-4o": 0.0298, ...}
The returned CostSummary has these fields:
| Field | Type | Description |
|---|---|---|
| period | CostPeriod | The aggregation period |
| start_date / end_date | datetime | Window covered by the summary |
| total_cost_usd | float | Total cost (USD) across all calls in the window |
| total_tokens | int | Total tokens consumed |
| total_requests | int | Number of recorded requests |
| by_agent | dict[str, float] | Cost (USD) keyed by agent id |
| by_model | dict[str, float] | Cost (USD) keyed by model id |
| by_workflow | dict[str, float] | Cost (USD) keyed by workflow id |
You can also pass an explicit start_date / end_date (both datetime) instead of relying on the
period window:
from datetime import datetime
summary = await client.costs.get_summary(
    period=CostPeriod.MONTH,
    start_date=datetime(2026, 5, 1),
    end_date=datetime(2026, 5, 31),
)
In the TypeScript SDK the summary is returned in camelCase. byAgent / byModel are arrays of
{ name, cost, percentage, requests } rather than maps:
const summary = await client.costs.getSummary({ period: "day" });
console.log(summary.totalCost);
console.log(summary.totalRequests);
for (const item of summary.byModel) {
  console.log(`${item.name}: $${item.cost} (${item.percentage}%)`);
}
Cost breakdown by model
Get cost grouped by model id for a period:
by_model = await client.costs.get_by_model(period=CostPeriod.DAY)
# {
# "claude-sonnet-4-6": 0.0412,
# "gpt-4o": 0.0298,
# "gemini-2.5-flash": 0.0124,
# }
In the TypeScript SDK, model breakdown comes from the summary’s byModel array (there is no
separate getByModel method):
const summary = await client.costs.getSummary({ period: "day" });
const byModel = summary.byModel; // [{ name, cost, percentage, requests }, ...]
Cost breakdown by agent or workflow
Drill into a single agent or workflow. agent_id and workflow_id are opaque ids; for autopilot
templates, the id is the template name (e.g. "dev_team", "security_audit").
agent_costs = await client.costs.get_by_agent("dev_team", period=CostPeriod.WEEK)
workflow_costs = await client.costs.get_by_workflow("wf_123", period=CostPeriod.WEEK)
Listing individual cost entries
Each recorded call is a CostEntry. List them with optional filters and pagination:
entries = await client.costs.list_entries(
    agent_id="dev_team", # optional
    model="claude-sonnet-4-6", # optional
    limit=100,
    offset=0,
)
for entry in entries:
    print(entry.model, entry.input_tokens, entry.output_tokens, entry.cost_usd)
A CostEntry exposes: id, agent_id, workflow_id, model, input_tokens, output_tokens,
total_tokens, cost_usd, timestamp, trace_id, and metadata.
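To walk a result set larger than one page, advance offset by limit until a short page comes back. The sketch below illustrates that loop only and is not SDK code: fetch_page is a hypothetical stand-in for a call such as client.costs.list_entries(limit=..., offset=...), and it assumes (this page does not state it) that a page shorter than limit marks the end of the data.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical page-fetcher signature: (limit, offset) -> list of entries.
FetchPage = Callable[[int, int], Awaitable[list[Any]]]


async def iter_all_entries(fetch_page: FetchPage, page_size: int = 100) -> list[Any]:
    """Collect every entry by advancing offset until a short page is returned."""
    collected: list[Any] = []
    offset = 0
    while True:
        page = await fetch_page(page_size, offset)
        collected.extend(page)
        if len(page) < page_size:  # assumption: a short page means no more data
            return collected
        offset += page_size


# In-memory stand-in for the real endpoint, used only to exercise the loop.
_FAKE_ENTRIES = [{"id": f"entry_{i}", "cost_usd": 0.001 * i} for i in range(250)]


async def fake_fetch_page(limit: int, offset: int) -> list[Any]:
    return _FAKE_ENTRIES[offset : offset + limit]


all_entries = asyncio.run(iter_all_entries(fake_fetch_page, page_size=100))
print(len(all_entries))
```

With the real client, the stand-in could be swapped for something like `lambda limit, offset: client.costs.list_entries(limit=limit, offset=offset)`.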
Budgets
Budgets are managed entirely through the SDK and enforced server-side by the gateway’s governance
chain. Create, update, and check budgets with the *_budget methods.
from curate_me import CostPeriod
budget = await client.costs.create_budget(
    name="Engineering monthly",
    limit_usd=500.0,
    period=CostPeriod.MONTH,
    alert_threshold=0.8, # alert at 80% of the limit
    enabled=True,
)
# Check current spend against the budget
status = await client.costs.check_budget(budget.id)
# {"current_spend_usd": ..., "limit_usd": ..., "alert": bool, ...}
# Update the limit, or delete the budget, later
await client.costs.update_budget(budget.id, limit_usd=750.0)
await client.costs.delete_budget(budget.id)
# List all budgets
budgets = await client.costs.list_budgets()
The TypeScript createBudget request shape differs from Python: it takes type
(daily | weekly | monthly), a limit, optional alerts, and optional autoStop:
const budget = await client.costs.createBudget({
  name: "Engineering monthly",
  type: "monthly",
  limit: 500,
  alerts: [{ threshold: 0.8, channels: ["email"] }],
  autoStop: false,
});
A CostBudget has: id, name, limit_usd, period, current_spend_usd, alert_threshold,
and enabled.
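The relationship between limit_usd, current_spend_usd, and alert_threshold can be illustrated with a few lines of arithmetic. This is a client-side sketch that assumes alert_threshold is a fraction of limit_usd (as the "alert at 80% of the limit" comment above suggests). BudgetSnapshot and budget_state are hypothetical names, and the gateway's actual enforcement logic is not shown here.

```python
from dataclasses import dataclass


@dataclass
class BudgetSnapshot:
    """Hypothetical local mirror of the CostBudget fields used below."""
    limit_usd: float
    current_spend_usd: float
    alert_threshold: float  # assumed to be a fraction of limit_usd, e.g. 0.8


def budget_state(b: BudgetSnapshot) -> dict:
    """Derive utilisation, remaining headroom, and alert/over-limit flags."""
    used_fraction = b.current_spend_usd / b.limit_usd if b.limit_usd > 0 else 0.0
    return {
        "used_fraction": used_fraction,
        "remaining_usd": max(b.limit_usd - b.current_spend_usd, 0.0),
        "alert": used_fraction >= b.alert_threshold,
        "over_limit": b.current_spend_usd >= b.limit_usd,
    }


state = budget_state(
    BudgetSnapshot(limit_usd=500.0, current_spend_usd=425.0, alert_threshold=0.8)
)
print(state)
```

Here 425 of 500 USD is spent, so the budget is past the 80% alert line but still under the limit.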
Projections and optimization
Project spend forward, and fetch server-generated optimization suggestions:
projection = await client.costs.get_projection(period=CostPeriod.MONTH)
# {"estimated_cost_usd": ..., ...}
suggestions = await client.costs.get_optimization_suggestions()
# [{"type": "model_downgrade", "estimated_savings_usd": ..., ...}, ...]
Synchronous usage
Every method has a _sync counterpart for non-async code:
client = CurateMe(api_key="cm_...", org_id="org_...")
summary = client.costs.get_summary_sync(period="day")
budget = client.costs.create_budget_sync(name="Team", limit_usd=200.0)
Where cost data comes from
Costs are recorded by the gateway when it proxies your LLM calls. Point your existing OpenAI / Anthropic SDK at the gateway base URL with your Curate-Me key, and every call is metered:
# OpenAI SDK -> Curate-Me gateway
OPENAI_BASE_URL=https://api.curate-me.ai/v1/openai
# header on each request:
X-CM-API-Key: cm_sk_xxx
The gateway looks up each model’s pricing, computes the cost, and attributes it to your org (and to
the agent/workflow when that context is supplied). client.costs then reads that data back. See
Gateway Cost Tracking for the recording side and
Dashboard Cost Tracking for the UI.
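The "look up pricing, compute the cost" step amounts to multiplying token counts by per-token rates. The sketch below shows that arithmetic only. The PRICING table, its per-million-token rates, and the model names example-large / example-small are invented for illustration; they are not Curate-Me's or any provider's real prices, and the gateway's recorder may work differently.

```python
from decimal import Decimal

# Hypothetical USD prices per one million tokens: (input, output).
PRICING = {
    "example-large": (Decimal("3.00"), Decimal("15.00")),
    "example-small": (Decimal("0.25"), Decimal("1.25")),
}
PER_MILLION = Decimal(1_000_000)


def compute_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost = input_tokens * input_rate + output_tokens * output_rate."""
    input_price, output_price = PRICING[model]  # raises KeyError for unknown models
    return (input_tokens * input_price + output_tokens * output_price) / PER_MILLION


cost = compute_cost_usd("example-large", input_tokens=1_200, output_tokens=400)
print(cost)
```

Decimal is used here so that summing many small per-call amounts does not accumulate binary floating-point error; the SDK itself reports costs as float.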
Best Practices
- Tag calls with agent/workflow context: supplying an agent or workflow id when you proxy a call makes the by_agent / by_workflow breakdowns useful.
- Use specific agent ids: generic names like agent_1 make cost breakdowns hard to interpret.
- Set budgets with alert thresholds: create_budget(..., alert_threshold=0.8) sets a limit that the gateway enforces server-side and alerts you before you hit the cap.
- Review optimization suggestions: get_optimization_suggestions() surfaces model-downgrade and other savings opportunities computed from your actual usage.
- Compare model costs: use get_by_model() to see per-model spend and estimate savings from switching a non-critical agent to a cheaper model.
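As a sketch of the "compare model costs" practice, the helper below turns a by_model-style mapping into a ranked list with each model's share of spend, similar in shape to the TypeScript byModel array. The input reuses the illustrative numbers from the get_by_model example earlier on this page; rank_by_share is a hypothetical helper, not part of the SDK.

```python
def rank_by_share(by_model: dict[str, float]) -> list[dict]:
    """Return models sorted by cost, each with its percentage of the total."""
    total = sum(by_model.values())
    ranked = sorted(by_model.items(), key=lambda item: item[1], reverse=True)
    return [
        {
            "name": name,
            "cost": cost,
            "percentage": round(100 * cost / total, 1) if total else 0.0,
        }
        for name, cost in ranked
    ]


by_model = {"claude-sonnet-4-6": 0.0412, "gpt-4o": 0.0298, "gemini-2.5-flash": 0.0124}
rows = rank_by_share(by_model)
for row in rows:
    print(f'{row["name"]}: ${row["cost"]:.4f} ({row["percentage"]}%)')
```

Models at the top of the list are where a switch to a cheaper model would move total spend the most.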