Providers and Routing
The gateway routes requests to the correct LLM provider based on the endpoint you call and the model you request. It supports 51 built-in providers across 7 tiers, plus org-scoped provider targets for custom routing.
Supported providers
Tier 1 — Core (first-class namespaced routes)
| Provider | Typical base URL | Upstream |
|---|---|---|
| OpenAI | /v1/openai | https://api.openai.com |
| Anthropic | /v1/anthropic | https://api.anthropic.com |
| Google (Gemini) | /v1/google | https://generativelanguage.googleapis.com |
| DeepSeek | /v1/deepseek | https://api.deepseek.com |
| Perplexity | /v1/perplexity | https://api.perplexity.ai |
Tier 2 — OpenClaw Favorites
| Provider | Upstream |
|---|---|
| Moonshot | https://api.moonshot.ai |
| MiniMax | https://api.minimax.io |
| ZAI | https://api.z.ai |
| Cerebras | https://api.cerebras.ai |
| Qwen | https://dashscope-intl.aliyuncs.com/compatible-mode |
Tier 3 — Developer Staples
| Provider | Upstream |
|---|---|
| Groq | https://api.groq.com/openai |
| Mistral | https://api.mistral.ai |
| xAI | https://api.x.ai |
| Together | https://api.together.xyz |
| Fireworks | https://api.fireworks.ai/inference |
| Cohere | https://api.cohere.com/compatibility |
| OpenRouter | https://openrouter.ai/api |
Tiers 4-7 — Extended Providers
An additional 34 providers are supported via auto-detection and custom targets, including AI21, Aleph Alpha, Anyscale, AWS Bedrock, Azure OpenAI, Baseten, Cloudflare Workers AI, Databricks, Hugging Face, Lambda, Lepton, NVIDIA NIM, OctoAI, Ollama, Replicate, Sambanova, and more.
Use GET /v1/models for the full catalog available in your environment.
Common model examples
These are representative examples of models the router understands today. For the current catalog in your environment, use GET /v1/models.
| Provider | Examples |
|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, o1, o3 |
| Anthropic | claude-sonnet-4-5-20250929, claude-haiku-3-5-20241022 |
| Google (Gemini) | gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash |
| DeepSeek | deepseek-chat, deepseek-reasoner |
| Perplexity | sonar-pro, sonar-deep-research |
| OpenRouter | anthropic/claude-sonnet-4-6, google/gemini-2.0-flash-001 |
Provider auto-detection
When you use the generic gateway endpoints, the provider is determined from the model name or alias:
| Prefix | Provider |
|---|---|
| gpt-*, o1-*, o3-*, o4-*, chatgpt-*, ft:gpt-* | OpenAI |
| claude-* | Anthropic |
| gemini-* | Google (Gemini) |
| deepseek-* | DeepSeek |
| sonar* | Perplexity |
| kimi-*, moonshot-* | Moonshot |
| minimax-*, abab-* | MiniMax |
| glm-* | ZAI |
| cerebras-* | Cerebras |
| qwen* | Qwen |
| llama-* | Groq |
| mistral-*, mixtral-*, codestral-*, pixtral-* | Mistral |
| grok-* | xAI |
| command-* | Cohere |
| vendor/model | OpenRouter |
If the model name does not match any known prefix, the request is rejected with an error indicating the supported prefixes.
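The detection rules above amount to a longest-wins prefix match, with slash-separated names short-circuiting to OpenRouter. The sketch below is illustrative only (the function and table names are assumptions, not the gateway's actual source); the prefixes are taken from the table above.

```python
# Illustrative prefix-based provider detection (not the actual gateway code).
PREFIX_MAP = [
    (("gpt-", "o1-", "o3-", "o4-", "chatgpt-", "ft:gpt-"), "openai"),
    (("claude-",), "anthropic"),
    (("gemini-",), "google"),
    (("deepseek-",), "deepseek"),
    (("sonar",), "perplexity"),
    (("kimi-", "moonshot-"), "moonshot"),
    (("minimax-", "abab-"), "minimax"),
    (("glm-",), "zai"),
    (("cerebras-",), "cerebras"),
    (("qwen",), "qwen"),
    (("llama-",), "groq"),
    (("mistral-", "mixtral-", "codestral-", "pixtral-"), "mistral"),
    (("grok-",), "xai"),
    (("command-",), "cohere"),
]

def detect_provider(model: str) -> str:
    # vendor/model names (e.g. anthropic/claude-...) route to OpenRouter.
    if "/" in model:
        return "openrouter"
    lowered = model.lower()
    for prefixes, provider in PREFIX_MAP:
        if lowered.startswith(prefixes):
            return provider
    raise ValueError(f"unknown model {model!r}: no matching provider prefix")
```

Matching on the lowercased name keeps casing variants such as MiniMax-M2.5 routable.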
Model aliases
The gateway ships with built-in convenience aliases so teams can pin stable names in application code:
| Alias | Resolves to |
|---|---|
| gpt-4 | gpt-4o |
| gpt-4-turbo | gpt-4o |
| claude-3 | claude-sonnet-4-20250514 |
| claude-3.5-sonnet | claude-sonnet-4-20250514 |
| claude-sonnet | claude-sonnet-4-6-20250918 |
| claude-opus | claude-opus-4-6-20250918 |
| claude-haiku | claude-haiku-4-5-20251001 |
| gemini-pro | gemini-2.5-pro |
| gemini-flash | gemini-2.5-flash |
| deepseek | deepseek-chat |
| deepseek-r1 | deepseek-reasoner |
| perplexity | sonar-pro |
| kimi | kimi-k2.5 |
| minimax | MiniMax-M2.5 |
| groq | llama-3.3-70b-versatile |
| mistral | mistral-large-latest |
| grok | grok-3 |
Aliases are resolved before provider detection and governance checks.
Org-scoped model aliases
Organizations can define custom aliases through the dashboard or admin APIs. This lets teams keep a stable alias such as production-model while switching the underlying model later.
```json
{
  "alias": "production-model",
  "target_model_ref": "gpt-4o",
  "description": "Production model for customer-facing features",
  "enabled": true
}
```

Org-scoped aliases resolve before built-in aliases.
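The resolution order described above (org-scoped aliases first, then built-in aliases, then the literal model name) can be sketched as a layered lookup. This is illustrative only; the function name and the trimmed alias table are assumptions, not the gateway's actual code.

```python
# Illustrative alias resolution: org-scoped aliases take precedence over
# built-ins, and both resolve before provider detection runs.
BUILTIN_ALIASES = {
    # Subset of the built-in alias table, for illustration.
    "gpt-4": "gpt-4o",
    "claude-haiku": "claude-haiku-4-5-20251001",
    "deepseek-r1": "deepseek-reasoner",
}

def resolve_model(model: str, org_aliases: dict) -> str:
    # An org-scoped alias shadows a built-in alias of the same name;
    # names that match no alias pass through unchanged.
    if model in org_aliases:
        return org_aliases[model]
    return BUILTIN_ALIASES.get(model, model)
```

Because org aliases resolve first, an org can even repoint a built-in alias like gpt-4 without touching application code.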
Provider key handling
The gateway needs access to the provider’s API key to forward requests. There are two ways to supply it:
Option 1: Pass the key per request
Include the provider key in the request alongside your gateway key:
```bash
curl https://api.curate-me.ai/v1/openai/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "X-CM-API-Key: cm_sk_xxx" \
  -d '{"model": "gpt-4o", "messages": [...]}'
```

Or use the explicit provider key header:

```bash
curl https://api.curate-me.ai/v1/openai/chat/completions \
  -H "X-Provider-Key: $OPENAI_API_KEY" \
  -H "X-CM-API-Key: cm_sk_xxx" \
  -d '{"model": "gpt-4o", "messages": [...]}'
```

Option 2: Store keys in the dashboard
Configure provider API keys as org-scoped secrets in the dashboard under Settings > Provider Secrets. When a request arrives without an explicit provider key, the gateway retrieves the stored key from encrypted secret custody storage.
This approach is recommended for production deployments because it keeps provider keys out of application code and client-side configurations.
Provider-specific auth formats
Each provider uses a different authentication mechanism upstream:
| Provider | Auth Method |
|---|---|
| OpenAI | Authorization: Bearer {key} header |
| Anthropic | x-api-key: {key} header with anthropic-version: 2023-06-01 |
| Google (Gemini) | ?key={key} query parameter |
| DeepSeek | Authorization: Bearer {key} header (OpenAI-compatible) |
| OpenRouter | Authorization: Bearer {key} header |
The gateway handles this translation automatically. You always pass your provider key in the same way regardless of which provider you are targeting.
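The translation the gateway performs can be sketched as a small mapping from provider to upstream auth material. This is an illustrative sketch, not the proxy's actual code; the function name and return shape are assumptions, while the header names and the anthropic-version value come from the table above.

```python
# Illustrative per-provider auth translation (not the actual proxy code).
def upstream_auth(provider: str, key: str):
    """Return (headers, query_params) to attach to the upstream request."""
    if provider == "anthropic":
        # Anthropic uses a custom header plus a pinned API version.
        return ({"x-api-key": key, "anthropic-version": "2023-06-01"}, {})
    if provider == "google":
        # Gemini authenticates via a query parameter instead of a header.
        return ({}, {"key": key})
    # OpenAI, DeepSeek, OpenRouter, and other OpenAI-compatible providers
    # all take a standard bearer token.
    return ({"Authorization": f"Bearer {key}"}, {})
```

The caller always supplies the key the same way; only this last hop differs per provider.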
Upstream resilience
The gateway includes automatic retry logic for transient provider errors. Each provider has a tuned resilience policy:
| Provider | Max Retries | Base Delay | Max Delay | Retryable Status Codes |
|---|---|---|---|---|
| OpenAI | 3 | 2.0s | 60s | 429, 500, 502, 503 |
| Anthropic | 3 | 1.0s | 30s | 429, 500, 502, 503, 529 |
| Google (Gemini) | 3 | 1.5s | 45s | 429, 500, 502, 503 |
| DeepSeek | 3 | 2.0s | 60s | 429, 500, 502, 503 |
Retry behavior:
- Uses jittered exponential backoff to avoid thundering herd effects
- Respects
Retry-Afterheaders from providers (up to 60 seconds) - Governance checks are NOT re-evaluated on retries — once a request passes governance, retries are transparent
- After all retries are exhausted, the gateway returns the last error with retry metadata in the response
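A jittered exponential backoff of the kind described above can be sketched as follows. This is illustrative (the function name and full-jitter variant are assumptions, not necessarily the gateway's exact strategy); the base and cap values match the resilience table.

```python
import random

# Illustrative jittered exponential backoff ("full jitter"): the delay
# grows exponentially with the attempt number, is capped at the
# provider's max delay, then a uniform random value in [0, delay]
# is drawn so retrying clients don't synchronize.
def backoff_delay(attempt: int, base: float, cap: float) -> float:
    """Delay in seconds before retry number `attempt` (1-based)."""
    ceiling = min(cap, base * (2 ** (attempt - 1)))
    return random.uniform(0.0, ceiling)
```

With OpenAI's tuning (base 2.0s, cap 60s), attempt 1 waits up to 2s, attempt 2 up to 4s, and later attempts are capped at 60s.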
Circuit breakers:
Each provider has an independent circuit breaker. When a provider returns repeated failures, the circuit breaker opens and subsequent requests are rejected immediately with HTTP 503 instead of waiting for timeouts. The circuit breaker automatically closes after the provider recovers.
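The open/reject/recover cycle can be sketched as a minimal state machine. This is not the gateway's actual circuit breaker; the class name, threshold, and cooldown values are assumptions chosen for illustration.

```python
# Minimal circuit-breaker sketch (illustrative, not the gateway's
# implementation). After `threshold` consecutive failures the circuit
# opens and calls are rejected immediately; once `cooldown` seconds
# pass, a trial request is allowed, and a success closes the circuit.
class CircuitBreaker:
    def __init__(self, threshold: int = 5, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None  # timestamp when the circuit opened

    def allow(self, now: float) -> bool:
        if self.opened_at is None:
            return True  # circuit closed: pass requests through
        # Half-open: permit a trial request after the cooldown elapses.
        return now - self.opened_at >= self.cooldown

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = now  # trip the circuit

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None  # close the circuit again
```

Keeping one instance per provider means a Perplexity outage cannot trip the breaker for OpenAI traffic.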
Dynamic provider routing
The gateway also supports org-scoped provider targets. This allows teams to route requests to custom endpoints such as Azure OpenAI, self-hosted inference, or a private model gateway while preserving the same governance surface.
Backend implementation
Key source files:
| File | Purpose |
|---|---|
| src/gateway/provider_router.py | Model-to-provider routing, alias resolution, SSRF validation |
| src/gateway/model_alias_registry.py | Org-scoped model alias CRUD and resolution |
| src/gateway/provider_registry.py | Org-scoped provider target management |
| src/gateway/upstream_resilience.py | Retry logic and backoff strategies |
| src/gateway/circuit_breaker.py | Per-provider circuit breaker state machine |
| src/gateway/proxy.py | httpx reverse proxy with streaming SSE passthrough |
| src/gateway/providers/google.py | Google Gemini URL construction |