5xx Server Errors
5xx errors indicate that the gateway (or an upstream provider) encountered an unexpected condition. The source field in the error body identifies whether the failure originated in Curate-Me or the provider.
Error body
{
  "error": {
    "code": "upstream_error",
    "message": "Upstream provider returned 503. The provider may be experiencing downtime.",
    "request_id": "req_01hwz3kj4p5qm8n9v2t6yt",
    "source": "provider",
    "provider": "openai",
    "provider_status": 503,
    "retryable": true
  }
}
Error codes
upstream_error (source: provider)
The provider returned an error. Curate-Me (CM) received the provider's response but cannot fulfill your request.
retryable: true — The provider is likely experiencing a transient issue (503, 504, 500 with provider error message). Safe to retry with backoff.
retryable: false — The provider returned a 4xx that CM is surfacing (e.g., the model was deprecated, invalid parameters). Fix the request before retrying.
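As a rough sketch of how a client might act on the `retryable` flag, the helper below reads it from a parsed error body of the shape shown above. The name `should_retry` and the choice to treat a missing or malformed body as non-retryable are assumptions for illustration, not part of the Curate-Me API.

```python
def should_retry(body: dict) -> bool:
    """Return True only when the error body explicitly sets retryable to true.

    Hypothetical helper: a missing or malformed body is treated as
    non-retryable (an assumption; adjust to your own policy).
    """
    error = body.get("error")
    if not isinstance(error, dict):
        return False
    return error.get("retryable") is True


# Bodies trimmed to the fields this helper reads.
provider_outage = {
    "error": {
        "code": "upstream_error",
        "source": "provider",
        "provider_status": 503,
        "retryable": True,
    }
}
surfaced_client_error = {
    "error": {"code": "upstream_error", "source": "provider", "retryable": False}
}
```

With these inputs, `should_retry(provider_outage)` is true and `should_retry(surfaced_client_error)` is false, so only the first case would go on to a backoff loop.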
# Check provider status pages
# OpenAI: https://status.openai.com
# Anthropic: https://status.anthropic.com
gateway_error (source: curate-me)
{
  "error": {
    "code": "gateway_error",
    "message": "An internal error occurred. Our team has been notified.",
    "request_id": "req_01hwz3kj4p5qm8n9v2t6yu",
    "source": "curate-me",
    "retryable": true
  }
}
CM itself encountered an unexpected error. These are rare; our error rate target is <0.1% for 5xx.
Always retryable. Wait 1–2 seconds and retry. If the error persists for more than 60 seconds, check status.curate-me.ai and contact support@curate-me.ai with the request_id.
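The guidance above (short fixed wait, give up after about 60 seconds) could be sketched as follows. `GatewayError`, `retry_gateway_error`, and the injectable `sleep` and `clock` parameters are hypothetical names used for illustration; in a real client you would raise or catch whatever exception your HTTP library produces for a `gateway_error` body.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class GatewayError(Exception):
    """Hypothetical stand-in for a response whose code is gateway_error."""

    def __init__(self, request_id: str):
        super().__init__(f"gateway_error ({request_id})")
        self.request_id = request_id


def retry_gateway_error(
    call: Callable[[], T],
    delay_s: float = 1.5,
    give_up_after_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Retry `call` every `delay_s` seconds until it succeeds or the window ends."""
    started = clock()
    while True:
        try:
            return call()
        except GatewayError:
            # Stop once another wait would exceed the window, so the caller
            # can check the status page and contact support with request_id.
            if clock() - started + delay_s > give_up_after_s:
                raise
            sleep(delay_s)
```

The re-raised exception still carries `request_id`, which is the value to include in a support email.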
cost_recorder_error (source: curate-me)
{
  "error": {
    "code": "cost_recorder_error",
    "message": "Failed to record request cost. The request was NOT forwarded to the provider.",
    "request_id": "req_01hwz3kj4p5qm8n9v2t6yv",
    "source": "curate-me",
    "retryable": true
  }
}
This is a fail-closed safety behavior. If CM cannot record the cost, it does not forward the request, which prevents unbounded spend when observability is degraded. Retrying is safe: the request never reached the provider, so no provider cost was incurred.
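To make the fail-closed idea concrete, here is a minimal sketch of the general pattern. It is not Curate-Me's actual implementation, which this page does not describe. `handle_request`, `record_cost`, `forward_to_provider`, and `CostRecorderError` are all hypothetical names.

```python
from typing import Callable


class CostRecorderError(Exception):
    """Hypothetical error raised when the cost record cannot be written."""


def handle_request(
    request: dict,
    record_cost: Callable[[dict], None],
    forward_to_provider: Callable[[dict], dict],
) -> dict:
    """Fail closed: forward the request only if cost recording succeeded."""
    try:
        record_cost(request)
    except Exception as exc:
        # Recording failed, so the provider is never called and no spend occurs.
        raise CostRecorderError("request was not forwarded") from exc
    return forward_to_provider(request)
```

A fail-open design would forward the request anyway and accept untracked spend. The behavior described above chooses the opposite trade-off.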
proxy_timeout
{
  "error": {
    "code": "proxy_timeout",
    "message": "Request to provider timed out after 120s.",
    "request_id": "req_01hwz3kj4p5qm8n9v2t6yw",
    "source": "provider",
    "retryable": true
  }
}
The provider took longer than CM’s proxy timeout (120 seconds). This can happen with very large completions or during provider slowdowns.
Fix options:
- Retry: timeouts are often transient
- Reduce max_tokens to get a shorter (faster) response
- Use a different model or provider during the outage
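The three options above can be combined into an ordered fallback plan. The sketch below is one way to do that, under some assumptions: `ProxyTimeout` and `complete_with_fallbacks` are hypothetical names, `send` stands in for whatever function issues the request, and the model names and token limits in `plan` are placeholders.

```python
from typing import Callable, Iterable, Tuple


class ProxyTimeout(Exception):
    """Hypothetical stand-in for a response whose code is proxy_timeout."""


def complete_with_fallbacks(
    send: Callable[[str, int], str],
    attempts: Iterable[Tuple[str, int]],
) -> str:
    """Try each (model, max_tokens) pair in order until one does not time out."""
    last_error = None
    for model, max_tokens in attempts:
        try:
            return send(model, max_tokens)
        except ProxyTimeout as exc:
            last_error = exc  # remember the timeout and move to the next option
    if last_error is None:
        raise ValueError("attempts must not be empty")
    raise last_error


# Example plan: retry as-is, then a shorter completion, then another model.
plan = [("gpt-4o", 4096), ("gpt-4o", 1024), ("gpt-4o-mini", 1024)]
```

If every entry in the plan times out, the last `ProxyTimeout` is re-raised so the caller can decide whether to back off or surface the failure.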
Retry strategy for 5xx
import time
from openai import OpenAI, APIStatusError

client = OpenAI(
    base_url="https://api.curate-me.ai/v1/openai",
    api_key="cm_sk_your_key",
    max_retries=3,  # Handles 5xx automatically
)

# Or manual with exponential backoff
def call_with_exponential_backoff(messages, max_attempts=4):
    delay = 1.0
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=messages,
            )
        except APIStatusError as e:
            if e.status_code >= 500 and attempt < max_attempts - 1:
                print(f"5xx error (attempt {attempt + 1}). Retrying in {delay:.1f}s...")
                time.sleep(delay)
                delay = min(delay * 2, 30)  # cap at 30s
            else:
                raise

Distinguishing CM errors from provider errors
| Indicator | CM error | Provider error |
|---|---|---|
| source field | "curate-me" | "provider" |
| X-CM-Request-Id header | Present | Present |
| X-CM-Governance-Time-Ms header | Present | Present |
| Response time | <100ms usually | Variable |
| Status page | status.curate-me.ai | Provider’s status page |
Always log X-CM-Request-Id from every response. You’ll need it for support tickets — it maps to the exact trace in our logs.
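One way to make this logging habitual is a small helper that pulls the header out of any response's header mapping. This is a sketch: the function name `log_request_id` and the logger name are assumptions, and the lookup is case-insensitive because HTTP header names are.

```python
import logging
from typing import Mapping, Optional

# Hypothetical logger name; use whatever your application already configures.
logger = logging.getLogger("curate_me_client")


def log_request_id(headers: Mapping[str, str]) -> Optional[str]:
    """Log and return the X-CM-Request-Id header, matching case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-cm-request-id":
            logger.info("curate-me request_id=%s", value)
            return value
    logger.warning("response had no X-CM-Request-Id header")
    return None
```

Call it with the headers object from your HTTP client on both successful and failed responses, so the ID is already in your logs when a ticket is needed.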
Contact support
For persistent 5xx errors:
- Note the request_id from the error body
- Check status.curate-me.ai for active incidents
- Email support@curate-me.ai with the request_id and timestamp