Recuro.

Exponential Backoff

Exponential backoff is a retry strategy where the delay between failed attempts doubles each time. Instead of retrying immediately (which hammers an already-struggling server), you wait 1 second, then 2, then 4, then 8 — giving the failing system time to recover.

Why fixed-interval retries are dangerous

Imagine an API goes down for 30 seconds. If your system retries every second, it fires 30 requests into a server that's already overloaded. Multiply that by every client doing the same thing and you get a thundering herd — a flood of retry traffic that prevents the server from recovering, even after the original problem is fixed.

Exponential backoff spreads retries over time. With a 1-second base delay, the same 30-second outage sees about 5 retries instead of 30, and each gap is twice as long as the last. The server gets breathing room to recover.

The math

The standard formula is: delay = base × 2^(attempt - 1)

With a 1-second base delay:

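| Attempt | Delay | Total elapsed |
|---|---|---|
| 1 | 1 second | 1s |
| 2 | 2 seconds | 3s |
| 3 | 4 seconds | 7s |
| 4 | 8 seconds | 15s |
| 5 | 16 seconds | 31s |
| 6 | 32 seconds | ~1 min |
| 7 | 64 seconds | ~2 min |
| 8 | 128 seconds | ~4 min |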

Delays grow fast. Eight attempts cover about four minutes. Ten attempts reach roughly 17 minutes. This is by design — if a service hasn't recovered in 30 seconds, it probably needs minutes, not more requests.
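The schedule above can be reproduced in a few lines. This is a minimal Python sketch assuming a 1-second base delay; the name `backoff_delay` is illustrative and does not come from any particular library.

```python
from itertools import accumulate


def backoff_delay(attempt: int, base: float = 1.0) -> float:
    """Delay in seconds before retry number `attempt` (1-indexed)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return base * 2 ** (attempt - 1)


# Per-attempt delays and the running total for attempts 1 through 8.
delays = [backoff_delay(n) for n in range(1, 9)]
elapsed = list(accumulate(delays))

for attempt, (delay, total) in enumerate(zip(delays, elapsed), start=1):
    print(f"attempt {attempt}: wait {delay:g}s, total elapsed {total:g}s")
```

The running total after n attempts is base × (2^n - 1), which is why eight attempts add up to 255 seconds and ten add up to 1,023 seconds.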

Jitter: adding randomness

Pure exponential backoff has a problem: if 100 clients all start retrying at the same time, they'll all retry at the same intervals — 1s, 2s, 4s — creating synchronized traffic spikes. Jitter fixes this by adding random variation to each delay.

The most common approach is full jitter: instead of waiting exactly 4 seconds on attempt 3, wait a random duration between 0 and 4 seconds. This spreads the retries across the entire interval, eliminating synchronized spikes.

The formula becomes: delay = random(0, base × 2^(attempt - 1))
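Full jitter might look like the following in Python. This is a sketch, not a reference implementation: `full_jitter_delay` and its `rng` parameter are hypothetical names, and the fixed seed exists only to make the demo repeatable.

```python
import random
from typing import Optional


def full_jitter_delay(
    attempt: int,
    base: float = 1.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Random delay between 0 and base * 2^(attempt - 1) seconds."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if rng is None:
        rng = random.Random()  # unseeded: differs on every client
    ceiling = base * 2 ** (attempt - 1)
    return rng.uniform(0, ceiling)


# Simulate 100 clients that all reach attempt 3 at the same moment.
demo_rng = random.Random(42)  # seeded for a repeatable demo only
client_delays = [full_jitter_delay(3, rng=demo_rng) for _ in range(100)]
print(f"earliest retry {min(client_delays):.2f}s, latest {max(client_delays):.2f}s")
```

Every client still respects the 4-second ceiling for attempt 3, but the retries land at scattered points inside that window rather than all at once.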

When to stop retrying

Practical example

Your app has a background job that sends a Slack notification via webhook. The Slack API returns 503 (service unavailable). Here's what happens with exponential backoff (base 2s, max 5 attempts):

Attempt 1: the request fails with 503. Wait 2 seconds.
Attempt 2: the request fails with 503 again. Wait 4 seconds.
Attempt 3: the request succeeds.

Total elapsed: 6 seconds. The notification was delivered with minimal delay and zero manual intervention.
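This scenario can be simulated without touching the network. In the Python sketch below, `send_notification` is a hypothetical stand-in for the real webhook call. It is hard-coded to fail twice before succeeding, an assumption chosen to match the 6-second total. The `sleep` function is passed in as a parameter so the demo can record the waits instead of actually pausing.

```python
import time
from typing import Callable, List


class ServiceUnavailable(Exception):
    """Stand-in for an HTTP 503 response from the webhook."""


def retry_with_backoff(
    operation: Callable[[], str],
    base: float = 2.0,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `operation`, doubling the wait after each failed attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ServiceUnavailable:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller decide what to do
            sleep(base * 2 ** (attempt - 1))


calls = {"count": 0}


def send_notification() -> str:
    """Fake webhook call: fails on the first two attempts, then succeeds."""
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ServiceUnavailable("503 Service Unavailable")
    return "delivered"


waits: List[float] = []  # collects the delays instead of sleeping
result = retry_with_backoff(send_notification, sleep=waits.append)
print(result, waits, sum(waits))
```

Passing `sleep` as a parameter keeps the retry logic testable. In production the default `time.sleep` applies, and a jittered delay could replace the fixed one.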

FAQ

What is exponential backoff?

Exponential backoff is a retry strategy where the wait time between retries doubles after each failure. It prevents overwhelming a struggling service with retry traffic and gives it time to recover.

What is jitter in retry logic?

Jitter is random variation added to the backoff delay. Without jitter, many clients retrying at the same time create synchronized traffic spikes. With jitter, retries are spread out randomly, reducing contention.

How many retries should I configure?

It depends on the job's criticality and failure mode. For HTTP jobs hitting external APIs, 3 to 5 retries is common. For critical business operations (payments, data sync), 5 to 10 retries with longer base delays give the service more time to recover. Always pair retries with a maximum delay ceiling so the wait cannot keep doubling indefinitely.
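A delay ceiling is usually a small change to the delay calculation. In the Python sketch below, the 60-second `max_delay` and the name `capped_backoff_delay` are assumptions for illustration, not recommended values or a library API.

```python
def capped_backoff_delay(
    attempt: int,
    base: float = 1.0,
    max_delay: float = 60.0,
) -> float:
    """Exponential delay in seconds that never exceeds `max_delay`."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return min(max_delay, base * 2 ** (attempt - 1))


# Ten attempts with a 1-second base and the assumed 60-second ceiling.
schedule = [capped_backoff_delay(n) for n in range(1, 11)]
print(schedule)
```

With these assumed values, attempts 7 through 10 each wait 60 seconds rather than 64 to 512 seconds, so the total wait across ten attempts drops from roughly 17 minutes to about 5.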

Retries only work safely if jobs are idempotent. Jobs that still fail after the maximum number of retries typically move to a dead letter queue. Many job queues implement backoff for you, and backoff is especially important when failures are caused by timeouts or rate limits.
