AI Gateway Overview
The Curate-Me AI Gateway is the control plane between your application and LLM providers. You keep your existing SDK, swap the base URL, add a Curate-Me key, and every request gets governance, routing, cost tracking, and auditability before it reaches the model.
How it works
```shell
# Before
OPENAI_BASE_URL=https://api.openai.com/v1

# After
OPENAI_BASE_URL=https://api.curate-me.ai/v1/openai
X-CM-API-Key: cm_sk_xxx
```

The gateway accepts provider-native payloads, runs them through policy and routing, then proxies the request upstream. Responses stay compatible with the SDK you already use, including streaming.
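Streaming compatibility means the gateway forwards the provider's server-sent-event frames as-is. As a sketch, a client can decode each streamed line like this (assuming OpenAI-style `data:` framing, which OpenAI-compatible endpoints use; the function name is illustrative):

```python
import json

def parse_sse_chunk(line: str):
    """Parse one server-sent-event line from a streamed response.

    Returns the decoded JSON payload, or None for blank lines and the
    terminal "data: [DONE]" sentinel. Assumes OpenAI-style SSE framing.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None
    body = line[len("data:"):].strip()
    if body == "[DONE]":
        return None
    return json.loads(body)
```

Because the frames are passed through unchanged, this works identically against the gateway and against the provider directly.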
What the gateway adds
| Layer | What it does |
|---|---|
| Governance | Applies rate limits, budget controls, model access checks, PII scanning, content safety, and HITL approval gates |
| Routing | Resolves model aliases, matches the correct provider, and supports provider-scoped base URLs |
| Observability | Adds request IDs, spend headers, retry metadata, usage logs, and health visibility |
| Resilience | Retries transient upstream failures, exposes idempotency keys, and protects against provider outages |
| Operations | Supports stored provider secrets, admin APIs, usage dashboards, approval queues, and runner-aware cost controls |
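The resilience layer retries transient upstream failures on the gateway side; a client can apply the same policy to gateway-level errors. A minimal sketch (the retryable status set and backoff schedule here are assumptions, not gateway guarantees):

```python
import random

# Assumed retryable codes: rate limiting plus common transient 5xx errors.
TRANSIENT_STATUSES = {429, 500, 502, 503, 504}

def should_retry(status: int, attempt: int, max_attempts: int = 3) -> bool:
    """Retry only transient statuses, up to max_attempts tries."""
    return status in TRANSIENT_STATUSES and attempt < max_attempts

def backoff_seconds(attempt: int, base: float = 0.5) -> float:
    """Exponential backoff (base * 2^attempt) plus up to 100 ms of jitter."""
    return base * (2 ** attempt) + random.uniform(0, 0.1)
```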
Endpoint patterns
| Pattern | Example | Best for |
|---|---|---|
| Provider-namespaced base URL | https://api.curate-me.ai/v1/openai | OpenAI-compatible SDKs |
| Anthropic base URL | https://api.curate-me.ai/v1/anthropic | Anthropic SDKs |
| Generic OpenAI-compatible endpoint | POST /v1/chat/completions | Direct HTTP or custom clients |
| Generic Anthropic endpoint | POST /v1/messages | Direct HTTP or Anthropic-style payloads |
| Model discovery | GET /v1/models | Listing routable models through the gateway |
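For the generic endpoints, any HTTP client works. A sketch using only the standard library, building (but not sending) a request to the generic OpenAI-compatible endpoint — the keys and model name are placeholders:

```python
import json
import urllib.request

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
}

req = urllib.request.Request(
    "https://api.curate-me.ai/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "X-CM-API-Key": "cm_sk_xxx",                   # gateway credential
        "Authorization": "Bearer sk-your-openai-key",  # upstream credential
    },
    method="POST",
)
# Send with urllib.request.urlopen(req) when ready.
```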
Supported providers
| Provider family | Built-in providers |
|---|---|
| Core | OpenAI, Anthropic, Google, DeepSeek, Perplexity |
| OpenClaw favorites | Moonshot, MiniMax, ZAI, Cerebras, Qwen |
| Developer staples | Groq, Mistral, xAI, Together, Fireworks, Cohere, OpenRouter |
Authentication
| Credential | Purpose |
|---|---|
| X-CM-API-Key | Authenticates your request to Curate-Me |
| X-Provider-Key or Authorization: Bearer <provider-key> | Authenticates the upstream call to the model provider |
Stored provider secrets are supported when you do not want provider keys living in application code.
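Both credentials travel as headers on the same request. A sketch of the Anthropic-style variant using X-Provider-Key (all header values and the model name are illustrative placeholders):

```python
import json
import urllib.request

# Build (but do not send) a request to the generic Anthropic endpoint.
req = urllib.request.Request(
    "https://api.curate-me.ai/v1/messages",
    data=json.dumps({
        "model": "claude-sonnet-4-20250514",  # illustrative model name
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Hello"}],
    }).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "X-CM-API-Key": "cm_sk_xxx",     # authenticates to Curate-Me
        "X-Provider-Key": "sk-ant-xxx",  # authenticates the upstream call
    },
    method="POST",
)
```

With stored provider secrets enabled, the X-Provider-Key header is omitted and the gateway injects the upstream credential itself.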
Common response headers
| Header | Meaning |
|---|---|
| X-CM-Request-ID | Gateway request ID for tracing, support, and log lookup |
| X-CM-Cost | Estimated request cost in USD |
| X-CM-Daily-Cost | Current daily spend for the org |
| X-CM-Daily-Budget | Active daily budget used for governance decisions |
| X-RateLimit-Limit | Requests allowed in the current rate-limit window |
| X-RateLimit-Remaining | Requests left in the window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| X-Idempotency-Key | Stable key used for retry-safe upstream execution |
| X-Process-Time | Gateway processing time in seconds |
Quick start
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.curate-me.ai/v1/openai",
    api_key="sk-your-openai-key",
    default_headers={"X-CM-API-Key": "cm_sk_xxx"},
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain what the gateway does."}],
)
```

Local development
```shell
cd services/backend
poetry install
poetry run uvicorn src.main_gateway:app --reload --port 8002
```

Then point your SDK at http://localhost:8002/v1/openai, http://localhost:8002/v1/anthropic, or another provider path.