Rate limiting is a mechanism that caps how many requests a client can make to an API within a given time window. If you exceed the limit, the API rejects your request with HTTP 429 Too Many Requests until the window resets.
Rate limits protect APIs from abuse, prevent individual clients from monopolizing shared resources, and ensure fair access for all users. Without them, a single misbehaving script could overwhelm a service and degrade it for everyone.
Rate limits are also a business tool. Free tiers get lower limits. Paid plans get higher ones. The limit signals: "you can use this much before you need to upgrade."
| Type | Example | Common in |
|---|---|---|
| Per second | 10 requests/second | Real-time APIs, search endpoints |
| Per minute | 60 requests/minute | General-purpose APIs |
| Per hour | 1,000 requests/hour | Data-heavy endpoints |
| Per day | 10,000 requests/day | Free tier APIs, email services |
| Concurrent | 5 simultaneous requests | Heavy processing endpoints |
When you exceed a rate limit, the API returns a 429 status code. The response often includes a Retry-After header telling you how long to wait before retrying, either as a number of seconds or as an HTTP date. Some APIs also include rate limit headers on every response:
- X-RateLimit-Limit: the maximum number of requests allowed
- X-RateLimit-Remaining: how many requests you have left
- X-RateLimit-Reset: when the window resets (Unix timestamp or seconds)

Always check these headers. They let you proactively slow down before hitting the limit, rather than reacting after you've been rejected.
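As a rough illustration of checking these headers, the sketch below turns them into a suggested pause between requests. The function names are hypothetical, the headers are assumed to arrive as a plain dict with exactly this capitalization, and the rule for telling a Unix timestamp from a seconds count is a heuristic rather than anything an API guarantees.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return seconds until the window resets, or None if the header is absent."""
    raw = headers.get("X-RateLimit-Reset")
    if raw is None:
        return None
    now = time.time() if now is None else now
    value = float(raw)
    # Heuristic (assumption): very large values are Unix timestamps,
    # small values are "seconds from now".
    if value > 1_000_000_000:
        return max(0.0, value - now)
    return max(0.0, value)


def suggested_pause(headers, now=None):
    """Spread the remaining requests evenly over the rest of the window."""
    remaining = headers.get("X-RateLimit-Remaining")
    wait = seconds_until_reset(headers, now)
    if remaining is None or wait is None:
        return 0.0  # no information, so no pause is suggested
    remaining = int(remaining)
    if remaining <= 0:
        return wait  # quota used up: wait for the reset
    return wait / remaining
```

With 10 requests left and 30 seconds until the reset, `suggested_pause` returns 3.0, which means roughly one request every three seconds keeps you inside the window.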
Configure your job queue to process jobs at a rate below the API's limit. If the API allows 60 requests per minute, process at most one job per second, or slightly fewer to leave headroom. This is the simplest and most reliable approach.
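A minimal sketch of that kind of throttling, assuming jobs are processed sequentially in one worker. `run_throttled` and its parameters are hypothetical names, not part of any particular queue library; most real job queues expose an equivalent rate setting in their configuration.

```python
import time


def run_throttled(jobs, handler, max_per_minute,
                  clock=time.monotonic, sleep=time.sleep):
    """Run handler(job) for each job, starting at most max_per_minute per minute."""
    interval = 60.0 / max_per_minute
    next_allowed = clock()
    results = []
    for job in jobs:
        now = clock()
        if now < next_allowed:
            # Too early for the next job: wait out the rest of the interval.
            sleep(next_allowed - now)
        start = max(now, next_allowed)
        results.append(handler(job))
        # Space jobs by their start times so slow handlers do not cause bursts.
        next_allowed = start + interval
    return results
```

Passing `max_per_minute=50` against a 60 requests/minute limit leaves some headroom for retries and for other clients sharing the same API key.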
When a job receives a 429, retry it with exponential backoff. If the response includes a Retry-After header, use that value as the delay instead of calculating your own.
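The retry logic could look roughly like the sketch below. `RateLimited`, `backoff_delay`, and `call_with_retries` are illustrative names, and the sketch assumes your HTTP layer raises `RateLimited` on a 429 and passes along Retry-After as a number of seconds (the HTTP date form is not handled here).

```python
import time


class RateLimited(Exception):
    """Raised by the caller's HTTP layer when the API answers 429 (assumption)."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited (HTTP 429)")
        self.retry_after = retry_after


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Prefer the server's Retry-After; otherwise double the delay each attempt."""
    if retry_after is not None:
        return float(retry_after)
    return min(cap, base * 2 ** attempt)


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it succeeds or max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the job fail visibly
            sleep(backoff_delay(attempt, exc.retry_after))
```

Without a Retry-After value the delays are 1, 2, 4, 8 seconds and so on, capped at 60. In practice it is common to add a small random jitter to each delay so that many workers do not retry at the same instant.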
If the API supports batch endpoints, combine multiple operations into a single request. One batch request for 100 items uses one rate limit slot instead of 100.
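To illustrate the arithmetic, this sketch splits a list of items into batches. `chunked` is a hypothetical helper, and the batch size of 100 is only an example; the actual maximum depends on the API you are calling.

```python
def chunked(items, size):
    """Yield consecutive slices of items, each with at most size elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 250 items in batches of 100 need 3 requests instead of 250.
batches = list(chunked(list(range(250)), 100))
```

Each batch would then be sent as one request to the API's batch endpoint. Check the API's documentation for whether batches count as one request or are counted per item.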
Instead of enqueuing 10,000 jobs at once, spread them across a longer window. Schedule jobs with small delays between them. This avoids hitting the rate limit in the first place.
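One way to compute those delays is to space the jobs evenly over the window, as in this sketch. `spread_delays` is a hypothetical helper; how you attach the delay to a job depends on your queue, which is assumed here to support delayed or scheduled jobs.

```python
def spread_delays(job_count, window_seconds):
    """Return one start delay (in seconds) per job, evenly spaced over the window."""
    if job_count <= 0:
        return []
    step = window_seconds / job_count
    return [i * step for i in range(job_count)]


# 10,000 jobs over one hour: one job every 0.36 seconds.
delays = spread_delays(10_000, 3600)
```

Choose the window so the resulting rate stays below the API's limit: 10,000 jobs over an hour is about 167 requests per minute, which would still exceed a limit of 60 per minute.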
If you expose an API, implement rate limiting to protect against abuse. Common approaches:
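A token bucket is one widely used approach. Each client has a bucket that refills at a steady rate, and every request spends one token. The sketch below is a single-process, in-memory illustration with hypothetical names; a production limiter would typically keep one bucket per client in shared storage and respond with 429 when `allow()` returns False.

```python
import time


class TokenBucket:
    """Allow bursts up to capacity, refilling at rate tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock
        self.updated = clock()

    def allow(self):
        """Return True and spend a token if the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

For example, `TokenBucket(rate=1, capacity=60)` permits a burst of 60 requests and then a sustained rate of one request per second.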
A rate limit is a cap on how many requests a client can make to an API in a given time period. It protects the API from overload and ensures fair access for all users.
HTTP 429 is the "Too Many Requests" status code. It means you've exceeded the API's rate limit and need to wait before sending more requests. Check the Retry-After header for how long to wait.
Monitor rate limit headers on every response. When you see remaining requests dropping, slow down. If you receive a 429, back off for the duration specified in Retry-After. For background jobs, throttle your queue to stay below the limit proactively.
Use exponential backoff when you hit a rate limit. Queues help you stay within rate limits by controlling throughput. Rate limits and timeouts are the two most common causes of background job failures.