Providers and Routing
The gateway routes requests to the correct LLM provider based on the endpoint you call and the model you request. It supports 51 built-in providers across 7 tiers, plus org-scoped provider targets for custom routing.
Supported providers
Tier 1 — Core (first-class namespaced routes)
| Provider | Typical base URL | Upstream |
|---|---|---|
| OpenAI | /v1/openai | https://api.openai.com |
| Anthropic | /v1/anthropic | https://api.anthropic.com |
| Google (Gemini) | /v1/google | https://generativelanguage.googleapis.com |
| DeepSeek | /v1/deepseek | https://api.deepseek.com |
| Perplexity | /v1/perplexity | https://api.perplexity.ai |
Tier 2 — OpenClaw Favorites
| Provider | Upstream |
|---|---|
| Moonshot | https://api.moonshot.ai |
| MiniMax | https://api.minimax.io |
| ZAI | https://api.z.ai |
| Cerebras | https://api.cerebras.ai |
| Qwen | https://dashscope-intl.aliyuncs.com/compatible-mode |
Tier 3 — Developer Staples
| Provider | Upstream |
|---|---|
| Groq | https://api.groq.com/openai |
| Mistral | https://api.mistral.ai |
| xAI | https://api.x.ai |
| Together | https://api.together.xyz |
| Fireworks | https://api.fireworks.ai/inference |
| Cohere | https://api.cohere.com/compatibility |
| OpenRouter | https://openrouter.ai/api |
Tiers 4-7 — Extended Providers
An additional 34 providers are supported via auto-detection and custom targets, including AI21, Aleph Alpha, Anyscale, AWS Bedrock, Azure OpenAI, Baseten, Cloudflare Workers AI, Databricks, Hugging Face, Lambda, Lepton, NVIDIA NIM, OctoAI, Ollama, Replicate, Sambanova, and more.
Use GET /v1/models for the full catalog available in your environment.
Common model examples
These are representative examples of models the router understands today. For the current catalog in your environment, use GET /v1/models.
| Provider | Examples |
|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, o1, o3 |
| Anthropic | claude-sonnet-4-5-20250929, claude-haiku-3-5-20241022 |
| Google (Gemini) | gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash |
| DeepSeek | deepseek-chat, deepseek-reasoner |
| Perplexity | sonar-pro, sonar-deep-research |
| OpenRouter | anthropic/claude-sonnet-4-6, google/gemini-2.0-flash-001 |
Provider auto-detection
When you use the generic gateway endpoints, the provider is determined from the model name or alias:
| Prefix | Provider |
|---|---|
| gpt-*, o1-*, o3-*, o4-*, chatgpt-*, ft:gpt-* | OpenAI |
| claude-* | Anthropic |
| gemini-* | Google (Gemini) |
| deepseek-* | DeepSeek |
| sonar* | Perplexity |
| kimi-*, moonshot-* | Moonshot |
| minimax-*, abab-* | MiniMax |
| glm-* | ZAI |
| cerebras-* | Cerebras |
| qwen* | Qwen |
| llama-* | Groq |
| mistral-*, mixtral-*, codestral-*, pixtral-* | Mistral |
| grok-* | xAI |
| command-* | Cohere |
| vendor/model | OpenRouter |
If the model name does not match any known prefix, the request is rejected with an error indicating the supported prefixes.
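The detection rules above amount to a longest-wins prefix match, with slash-separated names short-circuiting to OpenRouter. The sketch below is illustrative only (the function and table names are assumptions, not the gateway's actual source); the prefixes are taken from the table above.

```python
# Illustrative prefix-based provider detection (not the actual gateway code).
PREFIX_MAP = [
    (("gpt-", "o1-", "o3-", "o4-", "chatgpt-", "ft:gpt-"), "openai"),
    (("claude-",), "anthropic"),
    (("gemini-",), "google"),
    (("deepseek-",), "deepseek"),
    (("sonar",), "perplexity"),
    (("kimi-", "moonshot-"), "moonshot"),
    (("minimax-", "abab-"), "minimax"),
    (("glm-",), "zai"),
    (("cerebras-",), "cerebras"),
    (("qwen",), "qwen"),
    (("llama-",), "groq"),
    (("mistral-", "mixtral-", "codestral-", "pixtral-"), "mistral"),
    (("grok-",), "xai"),
    (("command-",), "cohere"),
]

def detect_provider(model: str) -> str:
    # vendor/model names (e.g. anthropic/claude-...) route to OpenRouter.
    if "/" in model:
        return "openrouter"
    lowered = model.lower()
    for prefixes, provider in PREFIX_MAP:
        if lowered.startswith(prefixes):
            return provider
    raise ValueError(f"unknown model {model!r}: no matching provider prefix")
```

Matching on the lowercased name keeps casing variants such as MiniMax-M2.5 routable.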
Model aliases
The gateway ships with built-in convenience aliases so teams can pin stable names in application code:
| Alias | Resolves to |
|---|---|
| gpt-4 | gpt-4o |
| gpt-4-turbo | gpt-4o |
| claude-3 | claude-sonnet-4-20250514 |
| claude-3.5-sonnet | claude-sonnet-4-20250514 |
| claude-sonnet | claude-sonnet-4-6-20250918 |
| claude-opus | claude-opus-4-6-20250918 |
| claude-haiku | claude-haiku-4-5-20251001 |
| gemini-pro | gemini-2.5-pro |
| gemini-flash | gemini-2.5-flash |
| deepseek | deepseek-chat |
| deepseek-r1 | deepseek-reasoner |
| perplexity | sonar-pro |
| kimi | kimi-k2.5 |
| minimax | MiniMax-M2.5 |
| groq | llama-3.3-70b-versatile |
| mistral | mistral-large-latest |
| grok | grok-3 |
Aliases are resolved before provider detection and governance checks.
Org-scoped model aliases
Organizations can define custom aliases through the dashboard or admin APIs. This lets teams keep a stable alias such as production-model while switching the underlying model later.
```json
{
  "alias": "production-model",
  "target_model_ref": "gpt-4o",
  "description": "Production model for customer-facing features",
  "enabled": true
}
```

Org-scoped aliases resolve before built-in aliases.
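The resolution order described above (org-scoped aliases first, then built-in aliases, then the literal model name) can be sketched as a layered lookup. This is illustrative only; the function name and the trimmed alias table are assumptions, not the gateway's actual code.

```python
# Illustrative alias resolution: org-scoped aliases take precedence over
# built-ins, and both resolve before provider detection runs.
BUILTIN_ALIASES = {
    # Subset of the built-in alias table, for illustration.
    "gpt-4": "gpt-4o",
    "claude-haiku": "claude-haiku-4-5-20251001",
    "deepseek-r1": "deepseek-reasoner",
}

def resolve_model(model: str, org_aliases: dict) -> str:
    # An org-scoped alias shadows a built-in alias of the same name;
    # names that match no alias pass through unchanged.
    if model in org_aliases:
        return org_aliases[model]
    return BUILTIN_ALIASES.get(model, model)
```

Because org aliases resolve first, an org can even repoint a built-in alias like gpt-4 without touching application code.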
Provider key handling
The gateway needs access to the provider’s API key to forward requests. There are two ways to supply it:
Option 1: Pass the key per request
Include the provider key in the request alongside your gateway key:
```bash
curl https://api.curate-me.ai/v1/openai/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "X-CM-API-Key: cm_sk_xxx" \
  -d '{"model": "gpt-4o", "messages": [...]}'
```

Or use the explicit provider key header:

```bash
curl https://api.curate-me.ai/v1/openai/chat/completions \
  -H "X-Provider-Key: $OPENAI_API_KEY" \
  -H "X-CM-API-Key: cm_sk_xxx" \
  -d '{"model": "gpt-4o", "messages": [...]}'
```

Option 2: Store keys in the dashboard
Configure provider API keys as org-scoped secrets in the dashboard under Settings > Provider Secrets. When a request arrives without an explicit provider key, the gateway retrieves the stored key from encrypted secret custody storage.
This approach is recommended for production deployments because it keeps provider keys out of application code and client-side configurations.
Provider-specific auth formats
Each provider uses a different authentication mechanism upstream:
| Provider | Auth Method |
|---|---|
| OpenAI | Authorization: Bearer {key} header |
| Anthropic | x-api-key: {key} header with anthropic-version: 2023-06-01 |
| Google (Gemini) | ?key={key} query parameter |
| DeepSeek | Authorization: Bearer {key} header (OpenAI-compatible) |
| OpenRouter | Authorization: Bearer {key} header |
The gateway handles this translation automatically. You always pass your provider key in the same way regardless of which provider you are targeting.
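The translation the gateway performs can be sketched as a small mapping from provider to upstream auth material. This is an illustrative sketch, not the proxy's actual code; the function name and return shape are assumptions, while the header names and the anthropic-version value come from the table above.

```python
# Illustrative per-provider auth translation (not the actual proxy code).
def upstream_auth(provider: str, key: str):
    """Return (headers, query_params) to attach to the upstream request."""
    if provider == "anthropic":
        # Anthropic uses a custom header plus a pinned API version.
        return ({"x-api-key": key, "anthropic-version": "2023-06-01"}, {})
    if provider == "google":
        # Gemini authenticates via a query parameter instead of a header.
        return ({}, {"key": key})
    # OpenAI, DeepSeek, OpenRouter, and other OpenAI-compatible providers
    # all take a standard bearer token.
    return ({"Authorization": f"Bearer {key}"}, {})
```

The caller always supplies the key the same way; only this last hop differs per provider.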
Upstream resilience
The gateway includes automatic retry logic for transient provider errors. Each provider has a tuned resilience policy:
| Provider | Max Retries | Base Delay | Max Delay | Retryable Status Codes |
|---|---|---|---|---|
| OpenAI | 3 | 2.0s | 60s | 429, 500, 502, 503 |
| Anthropic | 3 | 1.0s | 30s | 429, 500, 502, 503, 529 |
| Google (Gemini) | 3 | 1.5s | 45s | 429, 500, 502, 503 |
| DeepSeek | 3 | 2.0s | 60s | 429, 500, 502, 503 |
Retry behavior:
- Uses jittered exponential backoff to avoid thundering herd effects
- Respects
Retry-Afterheaders from providers (up to 60 seconds) - Governance checks are NOT re-evaluated on retries — once a request passes governance, retries are transparent
- After all retries are exhausted, the gateway returns the last error with retry metadata in the response
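A jittered exponential backoff of the kind described above can be sketched as follows. This is illustrative (the function name and full-jitter variant are assumptions, not necessarily the gateway's exact strategy); the base and cap values match the resilience table.

```python
import random

# Illustrative jittered exponential backoff ("full jitter"): the delay
# grows exponentially with the attempt number, is capped at the
# provider's max delay, then a uniform random value in [0, delay]
# is drawn so retrying clients don't synchronize.
def backoff_delay(attempt: int, base: float, cap: float) -> float:
    """Delay in seconds before retry number `attempt` (1-based)."""
    ceiling = min(cap, base * (2 ** (attempt - 1)))
    return random.uniform(0.0, ceiling)
```

With OpenAI's tuning (base 2.0s, cap 60s), attempt 1 waits up to 2s, attempt 2 up to 4s, and later attempts are capped at 60s.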
Circuit breakers:
Each provider has an independent circuit breaker. When a provider returns repeated failures, the circuit breaker opens and subsequent requests are rejected immediately with HTTP 503 instead of waiting for timeouts. The circuit breaker automatically closes after the provider recovers.
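The open/reject/recover cycle can be sketched as a minimal state machine. This is not the gateway's actual circuit breaker; the class name, threshold, and cooldown values are assumptions chosen for illustration.

```python
# Minimal circuit-breaker sketch (illustrative, not the gateway's
# implementation). After `threshold` consecutive failures the circuit
# opens and calls are rejected immediately; once `cooldown` seconds
# pass, a trial request is allowed, and a success closes the circuit.
class CircuitBreaker:
    def __init__(self, threshold: int = 5, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None  # timestamp when the circuit opened

    def allow(self, now: float) -> bool:
        if self.opened_at is None:
            return True  # circuit closed: pass requests through
        # Half-open: permit a trial request after the cooldown elapses.
        return now - self.opened_at >= self.cooldown

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = now  # trip the circuit

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None  # close the circuit again
```

Keeping one instance per provider means a Perplexity outage cannot trip the breaker for OpenAI traffic.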
Dynamic provider routing
The gateway also supports org-scoped provider targets. This allows teams to route requests to custom endpoints such as Azure OpenAI, self-hosted inference, or a private model gateway while preserving the same governance surface.
Backend implementation
Key source files:
| File | Purpose |
|---|---|
| src/gateway/provider_router.py | Model-to-provider routing, alias resolution, SSRF validation |
| src/gateway/model_alias_registry.py | Org-scoped model alias CRUD and resolution |
| src/gateway/provider_registry.py | Org-scoped provider target management |
| src/gateway/upstream_resilience.py | Retry logic and backoff strategies |
| src/gateway/circuit_breaker.py | Per-provider circuit breaker state machine |
| src/gateway/proxy.py | httpx reverse proxy with streaming SSE passthrough |
| src/gateway/providers/google.py | Google Gemini URL construction |