Rate Limits

API usage limits and quotas

Overview

The Sora 2 Video API implements rate limiting to ensure fair usage and maintain service stability. Understanding these limits will help you design robust applications.

Default Limits

| Limit Type | Value | Description |
| --- | --- | --- |
| Requests per minute | 20 | Maximum requests in a rolling 60-second window |
| Concurrent tasks | 3 | Maximum generation tasks running simultaneously |
| Daily requests | 500 | Maximum requests per 24-hour period |

These are default limits for standard API keys. Enterprise users can request higher limits.
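
To see which limit binds in practice, the arithmetic below converts the default values from the table into a minimum average spacing between requests. The constant and function names are illustrative and not part of the API.

```javascript
// Default limits for standard API keys, copied from the table above.
const DEFAULT_LIMITS = {
  requestsPerMinute: 20,
  concurrentTasks: 3,
  dailyRequests: 500,
};

// Minimum average spacing (ms) that keeps a client under a per-window quota.
function minSpacingMs(quota, windowMs) {
  return windowMs / quota;
}

const perMinuteSpacing = minSpacingMs(DEFAULT_LIMITS.requestsPerMinute, 60 * 1000);
const perDaySpacing = minSpacingMs(DEFAULT_LIMITS.dailyRequests, 24 * 60 * 60 * 1000);

console.log(perMinuteSpacing); // 3000 (one request every 3 seconds)
console.log(perDaySpacing);    // 172800 (one request every 172.8 seconds)
```

A client that runs around the clock is therefore bound by the daily quota well before the per-minute one: sending 20 requests per minute would use up 500 daily requests in 25 minutes.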

Rate Limit Headers

Every API response includes headers that indicate your current rate limit status:

X-RateLimit-Limit: 20
X-RateLimit-Remaining: 15
X-RateLimit-Reset: 1705312800
X-Concurrent-Limit: 3
X-Concurrent-Active: 1

| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Maximum requests per minute |
| X-RateLimit-Remaining | Remaining requests in current window |
| X-RateLimit-Reset | Unix timestamp when the limit resets |
| X-Concurrent-Limit | Maximum concurrent tasks allowed |
| X-Concurrent-Active | Currently active tasks |
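
One way to consume these headers is to parse them into a plain object. The sketch below accepts anything with a `get(name)` method, such as the `headers` property of a Fetch API response. `parseRateLimitHeaders`, `msUntilReset`, and the returned field names are hypothetical helpers, not part of the API.

```javascript
// Parse the documented rate limit headers into numbers.
// Missing or non-numeric headers become null rather than 0.
function parseRateLimitHeaders(headers) {
  const num = (name) => {
    const raw = headers.get(name);
    if (raw == null || raw === '') return null;
    const value = Number(raw);
    return Number.isFinite(value) ? value : null;
  };
  return {
    limit: num('X-RateLimit-Limit'),
    remaining: num('X-RateLimit-Remaining'),
    resetAt: num('X-RateLimit-Reset'), // Unix timestamp in seconds
    concurrentLimit: num('X-Concurrent-Limit'),
    concurrentActive: num('X-Concurrent-Active'),
  };
}

// Milliseconds until the window resets, never negative.
function msUntilReset(resetAt, nowMs = Date.now()) {
  return Math.max(0, resetAt * 1000 - nowMs);
}

// Example using the header values shown above (a Map stands in for Headers).
const sampleHeaders = new Map([
  ['X-RateLimit-Limit', '20'],
  ['X-RateLimit-Remaining', '15'],
  ['X-RateLimit-Reset', '1705312800'],
  ['X-Concurrent-Limit', '3'],
  ['X-Concurrent-Active', '1'],
]);
const rateStatus = parseRateLimitHeaders(sampleHeaders);
console.log(rateStatus.remaining); // 15
console.log(rateStatus.concurrentLimit - rateStatus.concurrentActive); // 2 free task slots
```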

Rate Limit Error

When you exceed rate limits, you'll receive a 429 Too Many Requests response:

{
  "code": 2001,
  "msg": "Rate limit exceeded. Please slow down.",
  "retryAfter": 15
}

The retryAfter field gives the number of seconds to wait before retrying.
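
A client can read `retryAfter` from the error body to decide how long to pause. The helper below is only a sketch: `retryDelayFromBody` is a hypothetical name, and the 1-second fallback for a missing or malformed field is an assumption.

```javascript
// Convert the retryAfter field (seconds) of a 429 body into milliseconds.
function retryDelayFromBody(body, fallbackMs = 1000) {
  const seconds = Number(body && body.retryAfter);
  return Number.isFinite(seconds) && seconds > 0 ? seconds * 1000 : fallbackMs;
}

const errorBody = {
  code: 2001,
  msg: 'Rate limit exceeded. Please slow down.',
  retryAfter: 15,
};
console.log(retryDelayFromBody(errorBody)); // 15000
console.log(retryDelayFromBody({}));        // 1000 (fallback)
```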

Upstream Service Limits

In addition to our API rate limits, the underlying video generation service may also impose its own rate limits. When this happens, you'll receive a 429 response with an error indicating the upstream service is rate limiting.

Upstream rate limits are separate from our API limits. Even if you haven't reached our limits, the video generation service may temporarily limit requests during peak usage.

When you encounter an upstream rate limit:

  1. Wait 10-30 seconds before retrying
  2. Implement exponential backoff for repeated failures
  3. Consider queuing your requests to avoid overwhelming the service
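
The first two steps above can be combined into a single delay schedule. The sketch below starts at 10 seconds, doubles on each repeated failure, and adds random jitter so that queued clients do not all retry at the same moment. The 10-second base comes from the guidance above; the 120-second cap, the jitter range, and the function name are assumptions.

```javascript
// Delay before retry number `attempt` (0-based) after an upstream 429.
// `random` is injectable so the schedule can be checked deterministically.
function upstreamBackoffMs(attempt, random = Math.random) {
  const baseMs = 10 * 1000; // lower bound of the suggested 10-30 s wait
  const capMs = 120 * 1000; // assumed ceiling, not documented by the API
  const exponential = Math.min(capMs, baseMs * 2 ** attempt);
  const jitter = random() * baseMs; // up to 10 s of extra spread
  return Math.round(exponential + jitter);
}

// With jitter disabled the schedule is 10 s, 20 s, 40 s, 80 s, 120 s.
const schedule = [];
for (let attempt = 0; attempt < 5; attempt++) {
  schedule.push(upstreamBackoffMs(attempt, () => 0));
}
console.log(schedule); // [ 10000, 20000, 40000, 80000, 120000 ]
```

With jitter enabled, the first retry lands between 10 and 20 seconds, which stays inside the suggested 10-30 second range.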

Handling Rate Limits

Implement Exponential Backoff

When you receive a 429 error, wait before retrying:

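// Resolve after the given number of milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function makeRequest(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.status === 429) {
      // Retry-After is expressed in seconds. Fall back to exponential
      // backoff (1 s, 2 s, 4 s, ...) when the header is missing or invalid.
      const retryAfterSec = Number(response.headers.get('Retry-After'));
      const delayMs = retryAfterSec > 0
        ? retryAfterSec * 1000
        : Math.pow(2, attempt) * 1000;
      await sleep(delayMs);
      continue;
    }

    return response;
  }
  throw new Error('Max retries exceeded');
}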

Monitor Rate Limit Headers

Track your usage using response headers:

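function checkRateLimits(response) {
  const remainingHeader = response.headers.get('X-RateLimit-Remaining');
  const resetHeader = response.headers.get('X-RateLimit-Reset');

  // headers.get() returns strings (or null when absent), so convert explicitly.
  if (remainingHeader === null || resetHeader === null) return;
  const remaining = Number(remainingHeader);
  const resetTime = Number(resetHeader); // Unix timestamp in seconds

  if (remaining < 10) {
    console.warn(`Low rate limit: ${remaining} requests remaining`);
    console.warn(`Resets at: ${new Date(resetTime * 1000).toISOString()}`);
  }
}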

Queue Requests

For high-volume applications, implement a request queue:

class RequestQueue {
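  // Default to one request every 3 seconds, which matches the default
  // limit of 20 requests per minute (1 request per second would exceed it).
  constructor(maxRequestsPerSecond = 20 / 60) {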
    this.queue = [];
    this.interval = 1000 / maxRequestsPerSecond;
    this.processing = false;
  }

  async add(requestFn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ requestFn, resolve, reject });
      this.process();
    });
  }

  async process() {
    if (this.processing || this.queue.length === 0) return;

    this.processing = true;
    const { requestFn, resolve, reject } = this.queue.shift();

    try {
      const result = await requestFn();
      resolve(result);
    } catch (error) {
      reject(error);
    }

    setTimeout(() => {
      this.processing = false;
      this.process();
    }, this.interval);
  }
}
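
The queue above spaces requests evenly. Because the per-minute limit is described as a rolling 60-second window, an alternative is to track the timestamps of recent requests and send only when fewer than 20 fall inside the window, which permits short bursts. This is a client-side sketch with hypothetical names, not the service's own algorithm.

```javascript
class SlidingWindowLimiter {
  constructor(maxRequests = 20, windowMs = 60 * 1000) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.timestamps = []; // send times (ms) inside the window, oldest first
  }

  // Drop timestamps that have left the rolling window.
  prune(nowMs) {
    while (this.timestamps.length > 0 &&
           nowMs - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
  }

  // Returns true and records the request if one may be sent now.
  tryAcquire(nowMs = Date.now()) {
    this.prune(nowMs);
    if (this.timestamps.length >= this.maxRequests) return false;
    this.timestamps.push(nowMs);
    return true;
  }

  // Milliseconds until the next slot opens (0 if one is free now).
  msUntilAvailable(nowMs = Date.now()) {
    this.prune(nowMs);
    if (this.timestamps.length < this.maxRequests) return 0;
    return this.timestamps[0] + this.windowMs - nowMs;
  }
}

// Example with explicit timestamps: 20 requests fit, the 21st must wait.
const limiter = new SlidingWindowLimiter();
let accepted = 0;
for (let i = 0; i < 21; i++) {
  if (limiter.tryAcquire(1000 + i)) accepted++;
}
console.log(accepted);                       // 20
console.log(limiter.msUntilAvailable(1020)); // 59980
```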

Concurrent Task Limits

The concurrent task limit (3 by default) applies to generation endpoints:

  • /api/v1/sora2/text-to-video
  • /api/v1/sora2/image-to-video
  • /api/v1/sora2-pro/text-to-video
  • /api/v1/sora2-pro/image-to-video
  • /api/v1/sora2/watermark-remove
  • /api/v1/sora2-pro/storyboard
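
To stay under the concurrent task limit on the client side, submissions can be wrapped in a small semaphore. The sketch below is one possible approach. `ConcurrencyGate` is a hypothetical helper, and the function passed to `run` is expected to resolve only once the generation task has finished, not merely once it has been submitted.

```javascript
class ConcurrencyGate {
  constructor(maxActive = 3) {
    this.maxActive = maxActive;
    this.active = 0;
    this.waiters = []; // resolve callbacks of callers waiting for a slot
  }

  // Resolves once a slot is free and has been reserved for the caller.
  acquire() {
    if (this.active < this.maxActive) {
      this.active++;
      return Promise.resolve();
    }
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  // Frees a slot, handing it directly to the next waiter if there is one.
  release() {
    const next = this.waiters.shift();
    if (next) {
      next(); // the slot stays counted as active for its new owner
    } else {
      this.active--;
    }
  }

  // Runs taskFn while holding a slot; the slot is released even on failure.
  async run(taskFn) {
    await this.acquire();
    try {
      return await taskFn();
    } finally {
      this.release();
    }
  }
}
```

A caller would create one gate per API key and route every generation request through `gate.run(...)`, so that a fourth task waits locally instead of being rejected by the API.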

Checking Active Tasks

Use the credits endpoint to see active tasks:

curl https://soravideo.art/api/v1/credits \
  -H "Authorization: Bearer sk_your_api_key"

Waiting for Task Completion

Before starting new tasks, poll existing ones:

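// Resolve after the given number of milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// API_KEY is assumed to hold your API key.
async function waitForCompletion(taskId, timeoutMs = 10 * 60 * 1000) {
  const deadline = Date.now() + timeoutMs;

  while (Date.now() < deadline) {
    const response = await fetch(`https://soravideo.art/api/v1/tasks/${taskId}`, {
      headers: { 'Authorization': `Bearer ${API_KEY}` }
    });
    if (!response.ok) {
      throw new Error(`Task status request failed: ${response.status}`);
    }

    const { data } = await response.json();

    if (data.status === 'completed' || data.status === 'failed') {
      return data;
    }

    // Polling requests may count toward the 20 requests/minute limit, so
    // poll every 5 seconds (12 requests/minute) rather than every 2.
    await sleep(5000);
  }
  throw new Error(`Task ${taskId} did not finish within ${timeoutMs} ms`);
}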

Daily Limits

The daily limit (500 requests) resets at midnight UTC. To check your current usage:

curl https://soravideo.art/api/v1/credits \
  -H "Authorization: Bearer sk_your_api_key"
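
When the daily quota is exhausted, it can help to know how long remains until the reset. The helper below computes the time until the next midnight UTC; the function name is hypothetical.

```javascript
// Milliseconds from `now` until the next 00:00:00 UTC.
function msUntilUtcMidnight(now = new Date()) {
  const nextMidnight = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1 // Date.UTC rolls over month and year boundaries
  );
  return nextMidnight - now.getTime();
}

const exampleTime = new Date('2024-01-15T22:30:00Z');
console.log(msUntilUtcMidnight(exampleTime)); // 5400000 (1.5 hours)
```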

Best Practices

Batch Operations

Combine multiple operations when possible to reduce request count.

Cache Results

Cache task results and API responses to avoid redundant requests.
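
As one illustration, finished task results can be kept in a small in-memory cache keyed by task ID. The class name, the 10-minute default lifetime, and the injectable clock are all assumptions made for this example.

```javascript
class TtlCache {
  constructor(ttlMs = 10 * 60 * 1000, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, useful for testing
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Returns the cached value, or undefined if absent or expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}
```

Before requesting a task's status, a client would call `get(taskId)` and contact the API only on a miss. Only terminal results (completed or failed) are worth caching, since the status of a running task changes.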

Use Webhooks

If available, use webhooks instead of polling for task completion.

Monitor Usage

Regularly check your API key statistics in the dashboard.

Enterprise Limits

For applications requiring higher limits, contact us for enterprise plans with:

  • Higher requests per minute (up to 100)
  • More concurrent tasks (up to 10)
  • Increased daily limits (up to 10,000)
  • Dedicated support

Redis Fallback

Our rate limiting uses Redis for distributed state. If Redis is unavailable:

| Fallback Strategy | Behavior |
| --- | --- |
| deny (default) | All requests are denied (503 error) |
| allow | All requests are allowed (no rate limiting) |

The default fallback strategy is deny to prevent abuse during outages.
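
From the client's point of view, the default deny strategy means that a Redis outage shows up as 503 responses rather than 429s. A minimal sketch of how a client might tell these cases apart follows. The function names and returned labels are hypothetical, and treating a 503 as retryable is an assumption.

```javascript
// Classify a response status for retry purposes.
function classifyStatus(status) {
  if (status === 429) return 'rate-limited'; // slow down, honor retryAfter
  if (status === 503) return 'unavailable';  // e.g. rate limiter backend down
  if (status >= 200 && status < 300) return 'ok';
  return 'error';                            // not retried in this sketch
}

function shouldRetry(status) {
  const kind = classifyStatus(status);
  return kind === 'rate-limited' || kind === 'unavailable';
}

console.log(shouldRetry(503)); // true
console.log(shouldRetry(400)); // false
```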
