A webhook 504 Gateway Timeout means a reverse proxy gave up waiting for your handler to respond. Either your handler is too slow, or a proxy in front of it has too short an idle timeout. Webhook handlers that 504 are usually doing too much synchronously — the structural fix is to ack fast and process asynchronously.
Root Causes
1. Handler doing too much inline
Your handler verifies the signature, queries the database, calls a third-party API, sends an email, and finally returns 200. Each step adds latency; the total exceeds your proxy's timeout, and you 504. The fix is structural: do only the must-be-synchronous work inline (verify, idempotency check, queue write) and run everything else asynchronously.
2. Downstream API call without timeout
Your handler calls `fetch(thirdPartyUrl)` with no timeout. The third party is slow that day, so your handler hangs for as long as the connection stays open, and Cloudflare's 100s edge timeout fires before your handler returns (Cloudflare surfaces this as a 524, but to the webhook sender it is still a timeout). Always set explicit timeouts on outbound calls (5s for fast services, 30s for slow ones, never unbounded).
3. Database query is slow
Your idempotency check does a `SELECT` on a 50M-row table with no index on `event_id`, so each request takes 8 seconds. Add the index, then run `EXPLAIN ANALYZE` on the query to confirm it is actually used.
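A sketch of the fix in Postgres, assuming the `webhook_events` table used elsewhere in this guide (the index name is illustrative):

```sql
-- Unique index on event_id: speeds up the idempotency lookup and is also
-- what an ON CONFLICT (event_id) clause requires to work at all.
-- CONCURRENTLY avoids blocking writes while building on a large table.
CREATE UNIQUE INDEX CONCURRENTLY IF NOT EXISTS webhook_events_event_id_idx
  ON webhook_events (event_id);

-- Verify: expect an Index Scan here, not a Seq Scan over 50M rows
EXPLAIN ANALYZE
SELECT 1 FROM webhook_events WHERE event_id = 'evt_123';
```

Note that `CREATE INDEX CONCURRENTLY` cannot run inside a transaction block, so run it as its own statement.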
4. Proxy timeout shorter than handler latency
Your handler legitimately takes 70 seconds (some heavy ML enrichment, say). Cloudflare Free plan has a 100s timeout. Cloudflare Enterprise allows up to 6000s. nginx defaults to 60s for proxy_read_timeout. Match the timeout config to your actual handler latency — but better: don't have a 70-second handler.
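If you do raise the proxy side, match the config to your measured latency rather than guessing. A sketch for nginx (the upstream name and values here are illustrative, not defaults):

```nginx
location /webhooks/ {
    proxy_pass http://app_upstream;   # assumed upstream name; use your own
    proxy_read_timeout 90s;           # must exceed worst-case handler latency
    proxy_connect_timeout 5s;
    proxy_send_timeout 30s;
}
```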
Fix It
The two-tier handler pattern
```javascript
// Tier 1: receive, verify, persist, return 200 (must complete in <1s)
app.post('/webhooks/stripe',
  express.raw({ type: 'application/json' }), // raw body: signature check needs the exact bytes
  async (req, res) => {
    let event
    try {
      event = stripe.webhooks.constructEvent(
        req.body, req.headers['stripe-signature'], process.env.STRIPE_WEBHOOK_SECRET
      )
    } catch (err) {
      return res.status(400).send('invalid signature') // bad signature: reject, don't crash
    }
    // Idempotent insert: ON CONFLICT makes redelivered events harmless
    await db.query(
      'INSERT INTO webhook_events (event_id, type, body) VALUES ($1, $2, $3) ON CONFLICT (event_id) DO NOTHING',
      [event.id, event.type, req.body.toString('utf8')]
    )
    res.status(200).send('queued')
    // Tier 2 trigger: process async, does NOT block the response
    // (in serverless, publish BEFORE responding; see the FAQ below)
    queue.publish('webhook.process', { eventId: event.id })
  }
)
```
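The queue consumer is not shown above. A minimal sketch of the Tier 2 side, assuming a pg-style `db` client, a hypothetical `handlers` map of per-event-type functions, and a `processed_at` column added to `webhook_events` for tracking completion:

```javascript
// Tier 2 worker sketch (assumed interfaces, adapt to your stack):
// claims one stored event, runs the slow work, and marks it processed.
async function processWebhookEvent(db, eventId, handlers) {
  const { rows } = await db.query(
    'SELECT type, body FROM webhook_events WHERE event_id = $1 AND processed_at IS NULL',
    [eventId]
  )
  if (rows.length === 0) return 'skipped' // unknown event, or already processed
  const handler = handlers[rows[0].type]
  if (handler) await handler(rows[0].body) // the slow work: APIs, emails, enrichment
  await db.query(
    'UPDATE webhook_events SET processed_at = now() WHERE event_id = $1',
    [eventId]
  )
  return 'processed'
}
```

No proxy sits in front of a worker, so it can take as long as the job needs; and because already-processed events short-circuit, queue redeliveries and webhook retries stay cheap.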
Outbound call with explicit timeout
```javascript
// Always set a timeout: never call fetch unbounded
const ctrl = new AbortController()
const timeout = setTimeout(() => ctrl.abort(), 5000) // 5s budget for this call
try {
  // an aborted fetch rejects with an AbortError; handle it like any network failure
  const result = await fetch(thirdPartyUrl, { signal: ctrl.signal })
} finally {
  clearTimeout(timeout) // always clear, even when fetch throws
}
```
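On newer runtimes (Node 17.3+ and current browsers), `AbortSignal.timeout()` does the same thing in one line with no manual cleanup; a sketch:

```javascript
// AbortSignal.timeout(ms) aborts the request automatically once ms elapses,
// replacing the manual AbortController + setTimeout pairing
async function fetchWithTimeout(url, ms = 5000) {
  return fetch(url, { signal: AbortSignal.timeout(ms) })
}
```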
Proxy Timeout Reference
| Layer | Default timeout | Adjustable? |
|---|---|---|
| Cloudflare Free / Pro / Business | 100s | No |
| Cloudflare Enterprise | 100s default, up to 6000s | Yes |
| AWS API Gateway | 29s | No (hard cap) |
| AWS ALB | 60s | Yes (1s-4000s) |
| nginx | 60s (proxy_read_timeout) | Yes |
| Caddy | 0 (no timeout, unless set) | Yes |
| Vercel Functions (Hobby) | 10s | Plan-dependent |
How to Reproduce
Deliberately add `await new Promise(r => setTimeout(r, 65_000))` in your handler. Fire a test webhook. Cloudflare or your proxy returns 504 (or 524). Now move the slow work out of the handler into a queue worker, re-fire, and the handler returns 200 in milliseconds.
Frequently Asked Questions
How fast should a webhook handler be?
Under 100ms for the receive path. Anything more and you're risking timeouts under load. Work that takes longer belongs in a worker, not the handler.
Can I use background promises in serverless to extend execution?
On AWS Lambda, no — when you return, the runtime can suspend immediately. On Vercel, no — promises don't extend the function lifetime. Use queues (SQS, Cloud Tasks) instead.
My handler hits Cloudflare's 100s timeout but the work is genuinely slow. Now what?
Either upgrade to Cloudflare Enterprise (extends to 6000s) or restructure: ack fast, process async via a worker that has no proxy timeout.