429 Too Many Requests

Your request was blocked by the rate limiter before reaching the provider. No tokens were consumed.

Error body

    {
      "error": {
        "code": "rate_limit_exceeded",
        "message": "Rate limit exceeded: 60 requests/min for key cm_sk_xxx. 46 requests remaining after retry window.",
        "request_id": "req_01hwz3kj4p5qm8n9v2t6ys",
        "governance_stage": "rate_limit",
        "retry_after": 14,
        "limit": 60,
        "remaining": 0,
        "reset_at": "2026-05-25T14:32:14Z"
      }
    }
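A minimal sketch of reading the retry-related fields out of this error body, using only the standard library. The helper name `parse_rate_limit_error` is hypothetical, and the sample body is an abbreviated copy of the example above; it assumes `retry_after` is in seconds and `reset_at` is a UTC ISO 8601 timestamp, as the sample suggests.

```python
import json
from datetime import datetime

# Abbreviated copy of the sample 429 body shown above.
SAMPLE_BODY = """
{"error": {"code": "rate_limit_exceeded",
           "request_id": "req_01hwz3kj4p5qm8n9v2t6ys",
           "governance_stage": "rate_limit",
           "retry_after": 14, "limit": 60, "remaining": 0,
           "reset_at": "2026-05-25T14:32:14Z"}}
"""


def parse_rate_limit_error(body: str) -> dict:
    """Extract the fields a client needs to schedule a retry (hypothetical helper)."""
    err = json.loads(body)["error"]
    # Swap the trailing "Z" for an explicit offset so that
    # datetime.fromisoformat also accepts it on Python versions before 3.11.
    reset_at = datetime.fromisoformat(err["reset_at"].replace("Z", "+00:00"))
    return {
        "code": err["code"],
        "request_id": err["request_id"],  # worth logging for support requests
        "retry_after": int(err["retry_after"]),
        "reset_at": reset_at,
    }


info = parse_rate_limit_error(SAMPLE_BODY)
print(info["retry_after"], info["reset_at"].isoformat())
```

Logging `request_id` alongside the wait time makes it easier to correlate a client-side retry with a specific blocked request.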

Response headers

Every response, whether successful or a 429, includes rate limit headers modeled on the IETF draft for RateLimit header fields:

| Header | Value | Meaning |
| --- | --- | --- |
| RateLimit-Limit | 60 | Max requests per minute for this key |
| RateLimit-Remaining | 0 | Requests remaining in current window |
| RateLimit-Reset | 14 | Seconds until the window resets |
| Retry-After | 14 | Same as RateLimit-Reset for 429 responses |
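As a rough illustration, a client can use these headers to decide how long to pause before its next request. The function name `seconds_to_wait` and the 10-second fallback are assumptions made for this sketch, and it treats the header values as plain seconds, as in the table above.

```python
from typing import Mapping


def seconds_to_wait(status_code: int, headers: Mapping[str, str],
                    default: float = 10.0) -> float:
    """Return how long to pause before the next request, in seconds.

    Hypothetical helper. Pass a case-insensitive mapping (for example
    httpx.Headers) in real code; a plain dict is case-sensitive.
    """
    if status_code == 429:
        # Blocked: prefer Retry-After, fall back to RateLimit-Reset.
        raw = headers.get("Retry-After") or headers.get("RateLimit-Reset")
    elif headers.get("RateLimit-Remaining") == "0":
        # Succeeded, but the window is exhausted: wait for the reset.
        raw = headers.get("RateLimit-Reset")
    else:
        return 0.0
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        # Header missing or not a number of seconds.
        return default


print(seconds_to_wait(429, {"Retry-After": "14"}))
print(seconds_to_wait(200, {"RateLimit-Remaining": "0", "RateLimit-Reset": "14"}))
```

Checking RateLimit-Remaining on successful responses lets a client slow down before it receives a 429 at all.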

Handling 429s in code

    import time

    from openai import OpenAI, RateLimitError

    client = OpenAI(
        base_url="https://api.curate-me.ai/v1/openai",
        api_key="cm_sk_your_key",
        max_retries=3,  # OpenAI SDK retries 429s with backoff automatically
        timeout=30.0,
    )

    # Manual retry with Retry-After header.
    # Note: this loop runs on top of the SDK's own retries; set max_retries=0
    # above if the manual loop should be the only retry mechanism.
    def call_with_retry(messages, max_attempts=3):
        for attempt in range(max_attempts):
            try:
                return client.chat.completions.create(
                    model="gpt-4o",
                    messages=messages,
                )
            except RateLimitError as e:
                if attempt == max_attempts - 1:
                    raise  # out of attempts: surface the 429 to the caller
                # Retry-After is in seconds; fall back to 10s if it is absent
                retry_after = int(e.response.headers.get("Retry-After", 10))
                print(f"Rate limited. Retrying in {retry_after}s...")
                time.sleep(retry_after)

Alternatively, use the Curate-Me Python SDK, which has built-in retry configured through RetryPolicy:

    from curate_me.gateway import CurateGateway, RetryPolicy

    gw = CurateGateway(
        api_key="cm_sk_your_key",
        retry_policy=RetryPolicy(
            max_retries=3,
            initial_delay=1.0,
            backoff_factor=2.0,
            retryable_status_codes={429, 500, 502, 503, 504},
        ),
    )

    client = gw.openai()
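To get a feel for what these settings imply, the sketch below computes a backoff schedule from the same three numbers. It assumes the common convention that the wait before retry n is `initial_delay * backoff_factor ** n`, with no jitter; this is an assumption for illustration, not the documented behavior of RetryPolicy, and `backoff_delays` is a hypothetical helper.

```python
from typing import List


def backoff_delays(max_retries: int, initial_delay: float,
                   backoff_factor: float) -> List[float]:
    """Delays in seconds before each retry under plain exponential backoff.

    Assumed formula, not taken from the SDK: initial_delay * backoff_factor ** n.
    """
    return [initial_delay * backoff_factor ** attempt
            for attempt in range(max_retries)]


delays = backoff_delays(max_retries=3, initial_delay=1.0, backoff_factor=2.0)
print(delays)       # [1.0, 2.0, 4.0]
print(sum(delays))  # 7.0
```

Under that assumption, three retries add at most about 7 seconds of waiting, which is shorter than the 14-second retry window in the sample 429 above. A policy that honors Retry-After, or a larger initial delay, may be needed to outlast a full window.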

Raising the rate limit

Default rate limits by plan:

| Plan | Requests per minute |
| --- | --- |
| Free | 10 RPM |
| Starter | 60 RPM |
| Pro | 300 RPM |
| Enterprise | Custom |

To raise your limit:

    # Check current limit
    curl https://api.curate-me.ai/v1/admin/rate-limits \
      -H "X-CM-API-Key: cm_sk_your_key"

    # Request a higher limit (Starter+ only)
    # Contact support@curate-me.ai or upgrade your plan

Or configure a per-key rate limit lower than the org limit (useful for restricting individual integrations):

    curl -X PATCH https://api.curate-me.ai/v1/admin/api-keys/key_xxx \
      -H "X-CM-API-Key: cm_sk_admin_key" \
      -H "Content-Type: application/json" \
      -d '{"rate_limit_rpm": 30}'

Rate limit scope

Rate limits apply at two levels simultaneously:

  1. Per-key limit — requests from this specific cm_sk_... key (configurable per key)
  2. Per-org limit — total requests across all keys for the organization (plan-level)

A 429 on either level blocks the request. The error body specifies which limit was hit via the message field.
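The two-level rule can be illustrated with a toy model. Everything in this sketch is hypothetical: the class name, the fixed-window counting, and the choice not to count blocked requests are assumptions made for illustration, since this page does not describe the gateway's actual algorithm.

```python
from collections import defaultdict
from typing import Dict, Optional


class TwoLevelLimiter:
    """Toy fixed-window limiter with a per-key and a per-org limit (hypothetical)."""

    def __init__(self, org_limit: int, key_limits: Dict[str, int]) -> None:
        self.org_limit = org_limit
        self.key_limits = key_limits
        self.org_count = 0
        self.key_counts: Dict[str, int] = defaultdict(int)

    def allow(self, key: str) -> Optional[str]:
        """Return None if the request passes, else the name of the limit hit."""
        # Keys without their own limit are bounded only by the org limit.
        key_limit = self.key_limits.get(key, self.org_limit)
        if self.key_counts[key] >= key_limit:
            return "per-key"
        if self.org_count >= self.org_limit:
            return "per-org"
        # Only allowed requests are counted in this sketch.
        self.key_counts[key] += 1
        self.org_count += 1
        return None

    def reset_window(self) -> None:
        """Start a new window, e.g. once per minute."""
        self.org_count = 0
        self.key_counts.clear()


limiter = TwoLevelLimiter(org_limit=3, key_limits={"key_a": 2})
print([limiter.allow("key_a") for _ in range(3)])  # [None, None, 'per-key']
print(limiter.allow("key_b"))                      # None
print(limiter.allow("key_b"))                      # per-org
```

In this model `key_a` is blocked by its own limit after two requests, while `key_b` is blocked once the organization as a whole has used its three requests, even though `key_b` itself has sent only one.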

The OpenAI and Anthropic SDKs both retry 429 errors automatically with exponential backoff. If you use either SDK directly (rather than the CM Python SDK), you get this retry logic without writing any extra code.