Rate limits

API throttling, 429 behavior, and best practices for retries

All endpoints are rate limited. Requests that exceed a limit receive an HTTP 429 response.

Limits

  • Global default: 3 requests/second per IP
  • Some endpoints define stricter, explicit overrides that apply instead of the global default

Example overrides:

Endpoint                 Limit
POST /model-swap         5/minute
POST /flat-2-model       5/minute
POST /identity/upload    5/minute
POST /notifications      12/minute
PUT /webhooks            10/minute
POST /webhooks/test      5/minute
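
Staying under these limits on the client side avoids most 429 responses. The sketch below is one possible approach, not part of the API: a small sliding-window limiter. The class name `SlidingWindowLimiter` and the `LIMITERS` mapping are hypothetical, and the server's exact windowing algorithm is not documented here, so treat the numbers as conservative approximations. Because the global limit is per IP, several processes behind the same address share one budget.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard: at most `max_calls` in any `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested
        self.calls = deque()    # timestamps of calls still inside the window

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def record(self):
        """Note that a request was just sent."""
        self.calls.append(self.clock())


# Hypothetical mapping built from the limits listed above.
LIMITERS = {
    "default": SlidingWindowLimiter(3, 1.0),             # 3 requests/second
    "POST /model-swap": SlidingWindowLimiter(5, 60.0),   # 5/minute
    "POST /notifications": SlidingWindowLimiter(12, 60.0),
    "PUT /webhooks": SlidingWindowLimiter(10, 60.0),
}
```

Before each request, look up the limiter for the endpoint (falling back to `"default"`), wait for `wait_time()` seconds if it is non-zero, then call `record()` once the request is sent.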

Rate-limit response headers

When a rate limit is hit, the response includes these headers:

Header                   Description
X-RateLimit-Limit        Total number of requests allowed for the current window and endpoint
X-RateLimit-Remaining    Number of requests remaining in the current window
X-RateLimit-Reset        Timestamp (epoch) at which the current window resets
Retry-After              Time (in seconds) to wait before making a new request
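
As an illustration, these headers can be collected into a single structure before deciding how long to wait. This is a minimal sketch: the function name `parse_rate_limit_headers` is hypothetical, and it assumes `X-RateLimit-Reset` is expressed in epoch seconds (the unit is not stated above, so adjust if the API returns milliseconds).

```python
import time


def parse_rate_limit_headers(headers, now=None):
    """Return rate-limit details from a mapping of response headers.

    Missing or non-numeric values come back as None.
    """
    def as_number(name):
        try:
            return float(headers[name])
        except (KeyError, TypeError, ValueError):
            return None

    now = time.time() if now is None else now
    reset = as_number("X-RateLimit-Reset")  # assumed to be epoch seconds
    return {
        "limit": as_number("X-RateLimit-Limit"),
        "remaining": as_number("X-RateLimit-Remaining"),
        "reset_epoch": reset,
        "seconds_until_reset": None if reset is None else max(0.0, reset - now),
        "retry_after": as_number("Retry-After"),
    }
```

With the `requests` library, `response.headers` can be passed in directly because its lookups are case-insensitive. A plain `dict` must use the exact header spelling shown in the table.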

Handling 429 responses

Respect Retry-After and retry with backoff:

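import time
import requests

def call_with_retry(url, headers, payload, retries=3):
    """POST to `url`, making up to `retries` attempts while the API returns 429."""
    response = None
    for attempt in range(retries):
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code != 429:
            return response
        if attempt == retries - 1:
            break  # out of attempts, so there is nothing left to wait for
        try:
            # Prefer the server's hint when it is a number of seconds.
            wait_seconds = float(response.headers["Retry-After"])
        except (KeyError, ValueError):
            # Header missing or not numeric: fall back to exponential backoff.
            wait_seconds = 2 ** attempt
        # NOTE: this is an example. Do not sleep in production code!
        # Use an async backoff strategy instead.
        time.sleep(wait_seconds)
    return response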
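
The comment in the example above recommends an async backoff strategy for production use. One way that could look with `asyncio` is sketched below. The function name `call_with_retry_async` and the `send` parameter are hypothetical: `send` stands for any zero-argument coroutine function that performs the request with your async HTTP client and returns an object exposing `.status_code` and `.headers`.

```python
import asyncio


async def call_with_retry_async(send, retries=3, base_delay=1.0, max_delay=30.0):
    """Await `send()` until it returns a non-429 response or attempts run out."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        response = await send()
        # Return on success, or on the last attempt without waiting again.
        if response.status_code != 429 or attempt == retries - 1:
            return response
        try:
            # Respect the server's hint when it is a number of seconds.
            delay = float(response.headers["Retry-After"])
        except (KeyError, ValueError):
            # No usable hint: capped exponential backoff.
            delay = min(base_delay * 2 ** attempt, max_delay)
        # asyncio.sleep yields to the event loop, so other tasks keep running.
        await asyncio.sleep(delay)
```

Unlike `time.sleep`, the wait here does not block the thread, so one slow retry does not stall other in-flight requests. If the final attempt is still a 429, the caller receives that response and can decide how to proceed.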

Note on account caps

Some actions also have account-level limits (for example, the maximum number of identities you can create). These are separate from request-rate limits and depend on your plan.
