A webhook 502 Bad Gateway means a reverse proxy in front of your handler couldn't get a response from the upstream. Cloudflare, nginx, ALB, Caddy — the proxy was up, but your application server wasn't. Webhook deliveries during a 502 window are retried by the source, so you usually don't lose events — but the operational cause is worth fixing.
Root Causes
1. Application server restart during a deploy
Your deploy stops the old container, starts the new one, and the proxy sees 502s during the gap. With webhook traffic averaging 100 req/min (about 1.7 req/s), even a 5-second gap means roughly 8 failed deliveries. Most of them retry, but it's noise in dashboards and customer-visible delay.
2. Upstream connection refused / timeout
Your application crashed, or its port isn't accepting connections. The proxy tries, fails, returns 502. Common after OOM kills, panics, or container exit.
3. Idle timeout mismatch between proxy and upstream
nginx's proxy_read_timeout defaults to 60s, and Cloudflare waits 100s for the origin, so a handler that simply runs long (slow business logic, a long DB query) usually produces a 504 from nginx or a 524 from Cloudflare rather than a 502. The 502 flavor of this problem is a keepalive mismatch: the upstream closes idle connections sooner than the proxy expects (Node's default keepAliveTimeout is 5s; an ALB's default idle timeout is 60s), the proxy reuses a connection the app just closed, and the resulting reset surfaces to the client as a 502.
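On the application side, the usual fix is to keep the upstream's keepalive timeout longer than the proxy's idle timeout. A minimal Node sketch, assuming an ALB-style proxy with the default 60s idle timeout (the port and exact margins are illustrative):

// Keep the app's keepalive timeout above the proxy's idle timeout
// so the proxy never reuses a connection the app already closed.
const express = require('express')

const app = express()
const server = app.listen(3000)

// Node's default keepAliveTimeout is 5s; an ALB's default idle timeout is 60s.
// Making the app's timeout longer means the proxy always closes first.
server.keepAliveTimeout = 65_000
// headersTimeout must exceed keepAliveTimeout, or Node may still
// drop the socket while a new request's headers are arriving.
server.headersTimeout = 66_000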
4. Health check failure
The load balancer's health check fails (your /health endpoint is broken or slow), the LB marks every instance unhealthy, and all traffic 502s.
Fix It
Graceful shutdown during deploys
// Express: drain in-flight requests before exiting
const express = require('express')
const pino = require('pino')

const app = express()
const log = pino() // any structured logger works
const server = app.listen(3000)

process.on('SIGTERM', () => {
  log.info('SIGTERM received, draining…')
  // Stop accepting new connections; in-flight requests finish first
  server.close((err) => {
    if (err) log.error({ err }, 'shutdown error')
    process.exit(err ? 1 : 0)
  })
  // Force exit after 10s if the drain hangs
  setTimeout(() => process.exit(1), 10_000).unref()
})
Match proxy timeouts to handler latency
# nginx: give webhook routes a longer timeout
location /webhooks/ {
    proxy_read_timeout 120s;
    proxy_send_timeout 120s;
    proxy_pass         http://upstream;
}
Health check that actually reflects readiness
// /health should fail BEFORE the process exits
let isShuttingDown = false
process.on('SIGTERM', () => { isShuttingDown = true })

app.get('/health', (req, res) => {
  if (isShuttingDown) {
    return res.status(503).send('shutting down')
  }
  // Optional: check DB and Redis pings here
  res.status(200).send('ok')
})
Webhook-Specific Mitigations
- Webhook handlers should return 200 in under 100ms; slow handlers are what push you into the timeout failure modes above. Queue first, process async (see the sketch after this list).
- Run multiple replicas behind the LB so one container restart never takes all traffic offline.
- Use blue-green or rolling deploys — never restart all replicas simultaneously.
- Webhook senders retry, so a brief 502 window is recoverable. But don't ignore a pattern of 502s; it's a signal of operational fragility.
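Here is what queue-first looks like in practice. This is a sketch, assuming BullMQ over a local Redis; handleEvent is a hypothetical stand-in for your real processing logic:

// Queue-first webhook handler: acknowledge fast, process async
const express = require('express')
const { Queue, Worker } = require('bullmq')

const connection = { host: 'localhost', port: 6379 }
const app = express()
const webhookQueue = new Queue('webhooks', { connection })

// Fast path: persist the event and return 200 immediately
app.post('/webhooks/:source', express.json(), async (req, res) => {
  await webhookQueue.add('event', {
    source: req.params.source,
    payload: req.body,
  })
  res.status(200).send('queued')
})

// Slow path: business logic runs out of band, retried by the queue
new Worker('webhooks', async (job) => {
  await handleEvent(job.data) // hypothetical: your processing logic
}, { connection })

app.listen(3000)

The proxy only ever sees the fast path, so handler latency stays far below any timeout, and the queue's retry semantics replace ad-hoc error handling in the request path.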
How to Reproduce
Deliberately kill -9 your application process while a webhook delivery is in flight. Watch the proxy's response; it should be 502. Then add the graceful shutdown handler and send SIGTERM instead (SIGKILL can't be caught, so it represents the worst case); the in-flight request should complete cleanly with no 502.
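If you prefer to script the experiment, here is a rough Node sketch. It assumes the app sits behind the proxy on localhost:8080 and writes its PID to app.pid; both are assumptions, so adjust to your setup. It works best against a deliberately slow handler (say, ~1s):

// repro.js: fire a webhook at the proxy, then SIGKILL the app mid-flight
const { readFileSync } = require('node:fs')

const pid = Number(readFileSync('app.pid', 'utf8')) // assumed PID file

// Node 18+ global fetch
const req = fetch('http://localhost:8080/webhooks/test', {
  method: 'POST',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify({ ping: true }),
})

// Give the request a moment to reach the app, then kill it hard
setTimeout(() => process.kill(pid, 'SIGKILL'), 100)

req
  .then((res) => console.log('status:', res.status)) // expect 502 pre-fix
  .catch((err) => console.error('request failed:', err.message))

// After adding graceful shutdown, swap 'SIGKILL' for 'SIGTERM':
// the in-flight request should complete with a 2xx instead.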
Frequently Asked Questions
How do I deploy without any 502s?
Rolling deploy with health-check-aware load balancing. A new container starts, the LB waits for it to pass health checks, then drains the old one (giving it 30s to finish in-flight requests) before stopping it. No 502 window.
Cloudflare sometimes 502s even when my server is fine. Why?
Cloudflare's edge can intermittently fail to reach your origin even when it's healthy: transient network blips between edge and origin, individual edge node failures. A small baseline of 502s is normal noise. Alert on the rate (say, >1% sustained), not on individual 502s.
Should I worry about webhook events lost during 502s?
Usually no — sources retry. Verify by checking the source's webhook dashboard for 'recent failed deliveries' after a 502 incident. If retries succeed, you didn't lose events.