A webhook endpoint is a public POST that triggers business logic. It's also one of the most under-defended surfaces I see in code reviews — teams that would never expose an unauthenticated REST endpoint cheerfully ship webhook handlers with verification disabled, no rate limiting, no IP allowlist, and a handler that does irreversible work synchronously. Most webhook security incidents I've seen weren't sophisticated; they were attackers walking through the front door because nobody locked it.
This post is the security-specific companion to general webhook best practices. It covers the actual threat model, the layers of defense in depth, and the failure modes each layer prevents. I run WebhookWhisper, which means I've designed both sides of the verification flow (we verify inbound provider signatures and we sign our own outbound forwards — see our HMAC signing docs), so the patterns here are battle-tested.
The actual threat model
Before defenses, the threats. What's an attacker actually trying to do against your webhook endpoint?
Forge legitimate-looking events to trigger business logic. The most common and most consequential. POST a fake payment_intent.succeeded with a real-looking payment intent ID; your fulfilment code ships a product. POST a fake customer.subscription.created with the attacker's email; they get a paid-tier account. POST a fake charge.refunded on a real charge ID; some receivers issue a credit on their own ledger before checking with the provider's API. None of these require sophistication; they require knowing the endpoint URL and the payload shape, both of which leak constantly.
Replay captured legitimate events. Even with signature verification, a captured signed webhook (from a leaked log file, an intercepted dev environment, a screenshot in Slack) stays valid forever unless your verification includes a timestamp tolerance check. The same successful charge gets fulfilled five times, three days apart.
Flood the endpoint with verification-failing requests. Even if every request fails signature verification, your verifier still has to run, which costs CPU. An attacker doesn't need to forge a valid signature to take your endpoint down — they just need enough invalid requests to exhaust your handler's capacity.
Probe for misconfigurations. Endpoints that return different errors for "wrong signature" vs "missing signature" vs "endpoint not found" leak information that helps attackers map your stack. Verbose error messages from misconfigured handlers reveal framework versions, route structures, and sometimes secret prefixes.
Exploit downstream of the handler. A signature-verifying handler that passes raw payloads to a downstream service without re-validation can chain into SQL injection, SSRF, or path-traversal at the next hop. The signature only proves "this came from the legitimate provider" — it doesn't sanitize the payload contents.
Each defense layer below addresses one or more of these. The right setup is layered: no single layer is sufficient.
Layer 1 — HMAC signature verification, done correctly
The foundational defense. Most major webhook providers sign with HMAC-SHA256 and a per-endpoint secret (a few use other schemes, so check the provider's docs). Your handler verifies the signature on every request before doing anything else. If verification fails, return 401 (or 400) and stop. The full troubleshooting reference for that path is signature mismatch.
The canonical generic verifier:
import crypto from 'crypto'

function verifyHmac(rawBody, receivedSig, secret) {
  // A missing signature header arrives as undefined; reject it up front
  if (typeof receivedSig !== 'string') return false
  const expected = crypto
    .createHmac('sha256', secret)
    .update(rawBody)
    .digest('hex')
  // Constant-time comparison: a naive === leaks, through timing, how many
  // leading bytes of the signature matched
  const expectedBuf = Buffer.from(expected, 'hex')
  const receivedBuf = Buffer.from(receivedSig, 'hex')
  // timingSafeEqual throws on unequal lengths, so check first
  if (expectedBuf.length !== receivedBuf.length) return false
  return crypto.timingSafeEqual(expectedBuf, receivedBuf)
}
Three mistakes that look fine and aren't:
=== instead of timingSafeEqual. The first version of every verifier I've ever written, including in WebhookWhisper itself, used ===. The naive comparison returns false on the first differing character, and the time it takes is observable on the wire. A patient attacker can't read the secret itself this way, but can recover the valid signature for a payload of their choosing byte-by-byte over millions of requests, which is all they need to forge that event. The Node primitive is crypto.timingSafeEqual; in Python it's hmac.compare_digest; in Go it's hmac.Equal. Always use them.
JSON-stringifying the body before signing. If you pass JSON.stringify(req.body) instead of the raw bytes, you've re-serialized: keys in different order, whitespace normalized, Unicode escapes expanded. The signature won't match what the provider computed. The fix is framework-specific and is the single most common cause of webhook signature failures in the wild — see the Stripe-specific deep dive for the language-by-language escape hatches.
Skipping the timestamp tolerance. Stripe and a few others include a timestamp in the signature; the SDK enforces a 5-minute tolerance by default. Disabling it leaves you exposed to replay attacks. Keep the default in production.
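To make the timestamp check concrete, here is a minimal sketch of a verifier for a timestamped signature. It assumes a header of the form t=<unix seconds>,v1=<hex HMAC> and a signed payload of "<timestamp>.<raw body>", which is modeled on Stripe's scheme; the function name, the parameter names, and the injectable nowSec are my own illustration, not any provider's API. In production, prefer the provider's SDK where one exists.

```javascript
import crypto from 'node:crypto'

// Hypothetical verifier for a header like "t=1700000000,v1=<hex hmac>".
// If a key repeats in the header, this sketch keeps only the last value.
function verifyTimestampedSignature(
  rawBody,
  header,
  secret,
  toleranceSec = 300,
  nowSec = Math.floor(Date.now() / 1000)
) {
  if (typeof header !== 'string') return false
  const parts = Object.fromEntries(
    header.split(',').map(kv => {
      const i = kv.indexOf('=')
      return [kv.slice(0, i).trim(), kv.slice(i + 1).trim()]
    })
  )
  const timestamp = Number(parts.t)
  if (!Number.isInteger(timestamp) || !parts.v1) return false

  // Reject anything outside the tolerance window before doing crypto work
  if (Math.abs(nowSec - timestamp) > toleranceSec) return false

  // The timestamp is part of the signed payload, so a replayer cannot
  // simply swap in a fresh one without invalidating the signature
  const expected = crypto
    .createHmac('sha256', secret)
    .update(`${parts.t}.`)
    .update(rawBody)
    .digest()
  const received = Buffer.from(parts.v1, 'hex')
  if (received.length !== expected.length) return false
  return crypto.timingSafeEqual(expected, received)
}
```

The point of signing the timestamp together with the body is that the two checks reinforce each other: the tolerance bounds how long a captured request stays useful, and the HMAC stops an attacker from editing the timestamp to reopen the window.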
Layer 2 — HTTPS-only, with a valid certificate
Every major provider refuses to deliver to HTTP. That's enforced by the provider, not by you, but it's worth saying explicitly: never accept webhook traffic over HTTP, not even in development for "convenience." The signature header is computed over the body, not the connection — so HTTP exposes the entire payload to passive interception. An attacker sniffing local traffic can replay anything they captured.
Certificate-validity matters too. A few providers will continue delivering even with an expired or self-signed cert (depending on settings), but most will fail. If your TLS cert silently expires, your webhook delivery silently stops.
Operational defenses: use Let's Encrypt or a similar auto-renewing CA, monitor cert expiry separately from the endpoint health check, and alert at 14-day and 7-day windows before expiry.
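As a rough sketch of the expiry monitoring above, the helper below maps a certificate's expiry date onto the 14-day and 7-day windows. The function name and the level labels are hypothetical. The validTo argument is meant to be a date string such as the valid_to field Node returns from getPeerCertificate() on a TLS socket; the sketch assumes that string parses with new Date().

```javascript
// Hypothetical helper: classify how close a certificate is to expiry.
// `now` is injectable so the function can be tested deterministically.
function certExpiryStatus(validTo, now = new Date()) {
  const expiresAt = new Date(validTo)
  if (Number.isNaN(expiresAt.getTime())) {
    throw new Error(`unparseable expiry date: ${validTo}`)
  }
  const daysLeft = Math.floor((expiresAt - now) / 86_400_000) // ms per day
  if (daysLeft < 0) return { daysLeft, level: 'expired' }
  if (daysLeft <= 7) return { daysLeft, level: 'page' }   // 7-day window
  if (daysLeft <= 14) return { daysLeft, level: 'warn' }  // 14-day window
  return { daysLeft, level: 'ok' }
}
```

Running this from a scheduled job that is separate from the endpoint health check keeps the signal alive even when the endpoint itself looks healthy.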
Layer 3 — IP allowlisting at the edge
Several providers publish their outbound IP ranges. Allow only those at your firewall, ALB, or WAF. Everything else gets rejected before your application code runs.
The current published-IP providers (verify against their docs at deploy time — these change):
- Stripe — publishes a regularly-updated list at their docs site, distributed across their infrastructure.
- GitHub — publishes IP ranges in https://api.github.com/meta under the hooks key.
- Shopify — does not publish a fixed range; use signature verification only.
- Twilio — publishes IP ranges per region; check their docs.
- SendGrid — publishes ranges in their docs.
For providers that don't publish ranges (Shopify, Slack, many others), you can't IP-allowlist — fall back to signature verification and rate limiting alone.
The IP allowlist is defense in depth, not a replacement for signature verification. A spoofed source IP doesn't get past TCP handshake on the modern internet, but a compromised provider IP would. Layer the defenses.
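If you can't enforce the allowlist at a firewall or WAF, an in-process check is a reasonable fallback. The sketch below uses Node's net.BlockList (which, despite the name, is just a set of ranges you can test membership against). The CIDR ranges are placeholders from the RFC 5737 documentation blocks, not any provider's real ranges, and the function names are hypothetical.

```javascript
import net from 'node:net'

// Placeholder ranges (RFC 5737 documentation blocks). Replace these with
// the ranges your provider actually publishes, refreshed at deploy time.
const PROVIDER_RANGES = ['192.0.2.0/24', '198.51.100.0/24']

function buildAllowlist(cidrs) {
  const list = new net.BlockList()
  for (const cidr of cidrs) {
    const [address, prefix] = cidr.split('/')
    list.addSubnet(address, Number(prefix), net.isIPv6(address) ? 'ipv6' : 'ipv4')
  }
  return list
}

const allowlist = buildAllowlist(PROVIDER_RANGES)

// Returns true only for a syntactically valid IP inside a listed range
function isAllowedSource(ip) {
  if (!net.isIP(ip)) return false
  return allowlist.check(ip, net.isIPv6(ip) ? 'ipv6' : 'ipv4')
}
```

One caveat: behind a load balancer or proxy, the address your app sees is the proxy's unless forwarding headers are configured and trusted correctly (in Express, the trust proxy setting), so check which IP you are actually testing.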
Layer 4 — Rate limiting per source and per endpoint
Even with signature verification, an unverified-flood is bad: every request runs the verifier, which costs CPU. An attacker spraying invalid signatures at your endpoint can DoS you without ever passing verification.
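Apply rate limits as early as possible, ideally at the network edge before your application code runs. If the limiter has to live in the app, mount it ahead of the verifier so rejected requests never reach it: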
// Express + express-rate-limit
import rateLimit from 'express-rate-limit'

const webhookLimiter = rateLimit({
  windowMs: 60_000, // 1 minute
  max: 1000, // generous for legitimate provider traffic
  keyGenerator: req => req.ip, // per source IP
  handler: (req, res) => {
    res.status(429).set('Retry-After', '60').json({ error: 'rate_limited' })
  },
  standardHeaders: true,
  legacyHeaders: false,
})

app.post('/webhooks/stripe', webhookLimiter, verifyAndProcess)
The dimensions to consider:
- Per source IP. The default. A single compromised provider IP or a single attacker IP gets rate-capped.
- Per endpoint path. Different webhooks have different expected traffic — Stripe in normal mode is much higher volume than GitHub. Tune limits per route.
- Per event type, after verification. Some events trigger expensive downstream work; cap their rate even within the legitimate provider stream.
Don't set the limit so low that legitimate provider traffic gets rejected. Stripe can deliver hundreds of events per minute during a payment surge. Tune to your real peak plus headroom — for production, I usually set the limit at 5x expected peak.
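The per-event-type dimension doesn't fit an IP-keyed limiter, so here is a minimal in-memory token-bucket sketch for capping expensive event types after verification. Everything in it is illustrative: the class, the allowEvent helper, and the limit values are assumptions, and a multi-instance deployment would need shared state (Redis or similar) rather than a process-local Map.

```javascript
// Hypothetical in-memory token bucket; one bucket per event type.
class TokenBucket {
  constructor(capacity, refillPerSec, now = Date.now()) {
    this.capacity = capacity
    this.refillPerSec = refillPerSec
    this.tokens = capacity
    this.lastRefill = now
  }

  // Refill based on elapsed time, then try to spend one token
  tryTake(now = Date.now()) {
    const elapsedSec = Math.max(0, (now - this.lastRefill) / 1000)
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec)
    this.lastRefill = now
    if (this.tokens < 1) return false
    this.tokens -= 1
    return true
  }
}

// Assumed limits for illustration; tune to your own traffic
const LIMITS = {
  'invoice.finalized': { capacity: 5, refillPerSec: 1 },
}
const buckets = new Map()

function allowEvent(eventType, now = Date.now()) {
  const limit = LIMITS[eventType]
  if (!limit) return true // no cap configured for this event type
  if (!buckets.has(eventType)) {
    buckets.set(eventType, new TokenBucket(limit.capacity, limit.refillPerSec, now))
  }
  return buckets.get(eventType).tryTake(now)
}
```

What to do when allowEvent returns false is a design choice: since these events are already verified, deferring them to a queue is usually safer than dropping them, and returning a retryable status lets the provider's own retry schedule absorb the burst.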
Layer 5 — Idempotency that's also a security feature
Idempotency is most often discussed as a correctness feature: don't fulfill orders twice. It's also a security feature: a captured signed webhook that's replayed only ever processes once. Combined with timestamp validation, the replay window is bounded; combined with idempotency, the replay impact is zero.
The minimal pattern:
CREATE TABLE processed_webhook_events (
  provider     TEXT NOT NULL,
  event_id     TEXT NOT NULL,
  started_at   TIMESTAMPTZ NOT NULL,
  completed_at TIMESTAMPTZ,
  PRIMARY KEY (provider, event_id)
);
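async function handleEvent(provider, eventId, payload) {
  // Atomic insert-if-not-exists: only one delivery can claim a given event
  const result = await db.query(`
    INSERT INTO processed_webhook_events (provider, event_id, started_at)
    VALUES ($1, $2, NOW())
    ON CONFLICT (provider, event_id) DO NOTHING
    RETURNING event_id
  `, [provider, eventId])
  if (result.rowCount === 0) {
    return { received: true, deduplicated: true }
  }
  try {
    await processEvent(payload)
  } catch (err) {
    // Release the claim so the provider's retry isn't silently deduplicated
    await db.query(`DELETE FROM processed_webhook_events
      WHERE provider = $1 AND event_id = $2`, [provider, eventId])
    throw err
  }
  await db.query(`UPDATE processed_webhook_events SET completed_at = NOW()
    WHERE provider = $1 AND event_id = $2`, [provider, eventId])
  return { received: true, deduplicated: false }
}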
The atomic INSERT ... ON CONFLICT DO NOTHING is the safety guarantee. Don't use a check-then-insert pattern in two queries — there's a race window.
Layer 6 — Validate the payload schema, not just the signature
Signature verification proves the request came from the legitimate provider. It does not prove the payload is what you expected. A legitimately-signed event from the provider can still trigger handler bugs if you assume fields exist that don't, or trust nested payload contents without re-validation.
Defensive payload parsing matters: validate with a schema (Zod, Pydantic, struct tags) on entry, log + skip events that don't match the shape, and never pass raw payload values into downstream queries or system calls without sanitization.
// Zod schema for Stripe payment_intent.succeeded
import { z } from 'zod'

const paymentIntentSucceeded = z.object({
  type: z.literal('payment_intent.succeeded'),
  data: z.object({
    object: z.object({
      id: z.string().regex(/^pi_/),
      amount: z.number().int().positive(),
      currency: z.string().length(3),
      customer: z.string().regex(/^cus_/).nullable(),
      metadata: z.record(z.string(), z.string()),
    }),
  }),
})

// In your handler, after signature verification (req.body is the raw body):
let rawEvent = null
try {
  rawEvent = JSON.parse(req.body)
} catch {
  // Unparseable JSON falls through: safeParse(null) fails below
}
const parsed = paymentIntentSucceeded.safeParse(rawEvent)
if (!parsed.success) {
  log.warn({ error: parsed.error, eventId: rawEvent?.id },
    'webhook payload schema mismatch')
  return res.status(202).json({ received: true, ignored: true })
}
// parsed.data is now type-safe and validated
Note the 202 status on schema mismatch — the request is acknowledged (so the provider doesn't retry) but the work is skipped. Returning 4xx would tell the provider to retry, which compounds the issue.
Layer 7 — Per-provider, per-environment secrets
One signing secret per provider per environment. Test mode and live mode have separate secrets at every provider that supports them. CLI tools (Stripe CLI's stripe listen) issue their own secrets distinct from Dashboard endpoints. A leaked dev secret should never grant access to production.
Rotation discipline:
- Rotate on offboarding (any teammate who had access to the secret leaves).
- Rotate on suspected compromise (a commit accidentally captured the secret, a CI log captured it, a screenshot leaked).
- Rotate on a schedule even without a known incident — every 90-180 days is reasonable.
The grace-period rotation pattern (supported natively by Stripe and others):
- Generate the new secret. Both old and new are valid for the configured grace window.
- Deploy your handler to accept either secret. Verify against new first, fall back to old. Log which matched.
- Watch the logs: traffic should shift from "old matched" to "new matched" over the grace window.
- After the window, expire the old secret in the dashboard. Deploy your handler to only accept the new one.
Without the grace period, you drop every event in flight when the rotation hits. That's an outage during a routine maintenance task. The full operational runbook lives in the glossary entry for signing secret rotation.
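The "accept either secret" step of the rotation can be sketched as a loop over an ordered list of secrets. This is a minimal illustration for a plain hex HMAC-SHA256 signature; the function names and the { label, value } shape are hypothetical, and a provider with a timestamped scheme would need its own signed-payload construction.

```javascript
import crypto from 'node:crypto'

// Constant-time check of a hex HMAC-SHA256 signature against one secret
function hmacMatches(rawBody, receivedSigHex, secret) {
  if (typeof receivedSigHex !== 'string') return false
  const expected = crypto.createHmac('sha256', secret).update(rawBody).digest()
  const received = Buffer.from(receivedSigHex, 'hex')
  return received.length === expected.length &&
    crypto.timingSafeEqual(expected, received)
}

// `secrets` is ordered, new first:
//   [{ label: 'new', value: ... }, { label: 'old', value: ... }]
// Returns which secret matched so the caller can log the traffic shift.
function verifyWithRotation(rawBody, receivedSigHex, secrets) {
  for (const { label, value } of secrets) {
    if (hmacMatches(rawBody, receivedSigHex, value)) {
      return { ok: true, matched: label }
    }
  }
  return { ok: false, matched: null }
}
```

Log the matched label, never the secret value. Once "old" stops appearing in the logs, dropping it from the list is the final deploy of the rotation.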
Layer 8 — Don't trust the payload to escape into other systems
The signature proves the request is from the provider. It doesn't prove the payload is safe to pass into a SQL query, a shell command, an HTTP fetch, or a templating engine. The classic webhook-derived vulnerabilities all chain like this:
SQL injection. Provider sends an event with a metadata field set by the user (e.g. customer's chosen username). Your handler concatenates the metadata into a SQL query. The provider doesn't sanitize; you do, or you have a vulnerability.
SSRF via callback URLs. Some webhook payloads include URLs (next-page tokens, image references, attachment URLs). If your handler fetches them without validation, an attacker who can influence the provider's payload (e.g. by setting metadata in their own account) can make your server fetch internal URLs.
Path traversal in stored files. Filename fields in payloads can include ../ sequences. If you write to disk with a path derived from the payload, you've written outside your intended directory.
HTML injection in admin dashboards. Webhook payload contents that get displayed in your admin UI without escaping become stored XSS vectors.
The defenses are the same as for any user-input handling: parameterized queries, URL allowlists, path normalization with prefix checks, output encoding. The signature lets you trust the source; the contents are still user-influenced data.
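Of those defenses, "path normalization with prefix checks" is the one I most often see done subtly wrong, so here is a minimal sketch. The function name and the example directory are hypothetical; the idea is to resolve the payload-derived name against the base directory and then confirm the result is still inside it.

```javascript
import path from 'node:path'

// Resolve an untrusted, payload-derived file name inside baseDir,
// throwing if the result would land outside that directory.
function resolveInside(baseDir, untrustedName) {
  const base = path.resolve(baseDir)
  const target = path.resolve(base, untrustedName)
  // path.relative yields leading '..' segments when target escapes base
  const rel = path.relative(base, target)
  const escapes =
    rel === '' ||                      // resolves to the base directory itself
    rel === '..' ||
    rel.startsWith('..' + path.sep) ||
    path.isAbsolute(rel)               // e.g. a different drive on Windows
  if (escapes) throw new Error('path escapes the base directory')
  return target
}
```

Comparing via path.relative rather than a plain string prefix avoids the classic mistake where /srv/uploads-evil passes a startsWith('/srv/uploads') check.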
Layer 9 — Log without leaking
Webhook payloads contain customer data, sometimes payment metadata, sometimes secrets. Naive "log everything" approaches generate compliance and breach exposure. The right pattern: log structured fields plus a hash of the body, log the body itself only at debug level with bounded retention.
import crypto from 'crypto'

const bodyHash = crypto.createHash('sha256').update(req.body).digest('hex')

log.info({
  provider: 'stripe',
  eventId: event.id,
  eventType: event.type,
  bodyHash, // hash, not body, at info level
  hasSignature: !!req.headers['stripe-signature'],
  verificationStatus: 'ok',
  timeToAckMs: Date.now() - startTime,
}, 'webhook_received')

// Body itself only at debug, bounded retention
log.debug({ bodyHash, body: req.body.toString('utf8') }, 'webhook_body')
Never log the signing secret, even partial. Never log the full Stripe-Signature header — log presence (boolean) only. The header isn't the secret, but it gives an attacker enough information to know which payloads succeeded so they can target replay attempts more effectively.
Layer 10 — Alert on the security signals that matter
You only know the layers above are working if you measure them. The alerts that catch real attacks:
- Signature verification failure rate spike. A sudden jump from baseline (typically <0.1%) to several percent means either an attacker probing or a deploy that broke the verifier. Either way, page someone.
- Signature verification failures from unexpected source IPs. Cross-reference with your IP allowlist; failures from outside the allowlist are noise (rate-limited away), failures from inside it are likely a deployment break.
- Rate limit hits. Legitimate provider traffic hitting the rate limit means tune up; sustained rate-limit-hitting from the same IP is an attack pattern.
- DLQ inserts. Already covered in best practices, but worth repeating in security context: a flood of DLQ inserts can be a sign that an attacker is sending events that crash your handler in a specific way.
- Schema-mismatch warnings. If your payload-validation layer logs a warn for unparseable events, watch for spikes — could be a provider schema change you missed, could be probe attempts.
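For the first alert in that list, most teams will express the rule in their metrics system rather than in application code. Purely to show the logic, here is a sketch of a sliding-window failure-rate check; the class name, the 5-minute window, the 2% threshold, and the minimum sample count are all assumptions to tune against your own baseline.

```javascript
// Hypothetical sliding-window monitor for signature verification outcomes.
class FailureRateMonitor {
  constructor({ windowMs = 300_000, threshold = 0.02, minSamples = 50 } = {}) {
    this.windowMs = windowMs
    this.threshold = threshold
    this.minSamples = minSamples
    this.samples = [] // { at: epoch ms, ok: boolean }, oldest first
  }

  record(ok, now = Date.now()) {
    this.samples.push({ at: now, ok })
    // Drop samples that have aged out of the window
    while (this.samples.length && this.samples[0].at <= now - this.windowMs) {
      this.samples.shift()
    }
  }

  status(now = Date.now()) {
    const recent = this.samples.filter(s => s.at > now - this.windowMs)
    const failures = recent.filter(s => !s.ok).length
    const rate = recent.length ? failures / recent.length : 0
    // minSamples avoids paging on one failure out of three requests
    const alert = recent.length >= this.minSamples && rate > this.threshold
    return { total: recent.length, failures, rate, alert }
  }
}
```

The minimum sample count matters as much as the threshold: on a quiet endpoint, a single bad request can be a large percentage of the window without meaning anything.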
The webhook security defense-in-depth checklist
- ☑ HMAC signature verification on every request, with constant-time comparison.
- ☑ Raw bytes passed to the verifier; never a re-serialized object.
- ☑ Timestamp tolerance check enforced (where the provider supports it).
- ☑ HTTPS-only; valid certificate; auto-renewing CA; expiry monitoring.
- ☑ IP allowlist where the provider publishes ranges.
- ☑ Rate limiting at the edge by source IP, with limits tuned to real peak plus headroom.
- ☑ Idempotency table keyed on (provider, event_id) with atomic insert.
- ☑ Schema validation on parsed payloads; defensive parsing with skip-and-log on mismatch.
- ☑ Per-environment, per-provider secrets; grace-period rotation.
- ☑ Sanitize payload contents before passing to downstream systems (SQL, fetch, file paths, HTML).
- ☑ Structured logs with body hash; body itself only at debug with bounded retention.
- ☑ Alerts on verification-failure-rate spikes, rate-limit hits, DLQ inserts, schema mismatches.
Frequently asked questions
If I verify signatures, do I still need rate limiting?
Yes. Signature verification proves the request is authentic but it has a cost — the verifier still runs on every request, including the ones that fail. An attacker spraying invalid signatures at your endpoint can DoS you without ever passing verification. Rate limiting at the edge rejects them before the verifier runs. The two defenses are complementary, not redundant.
How do I rotate a webhook secret without dropping events?
Use the grace-period rotation pattern: generate a new secret, deploy your handler to accept either old or new (try new first, fall back to old), watch your logs as traffic shifts to the new secret, then expire the old one in the provider dashboard. Stripe and most major providers support this natively. Without the grace period, you drop in-flight events at the moment of rotation.
Should I use a separate webhook endpoint for each provider, or one combined endpoint?
Separate endpoints, almost always. Each provider has its own signing secret, its own signature header format, its own retry behavior, and often its own IP range. A combined endpoint forces you to switch on the request to figure out which provider it's from before you can verify, which is fragile. Separate routes are clean.
Where should I store webhook signing secrets?
In a secrets manager (AWS Secrets Manager, HashiCorp Vault, Doppler, 1Password Connect, Google Secret Manager) loaded into environment variables at runtime. Not in your code repository, not in a .env.example with real values, not in CI logs. Public commits are scanned by GitHub and the major cloud providers within minutes — if a secret hits a public repo, treat it as compromised the same hour, even if you delete the commit. Force-push doesn't help; the commit is in the GitHub event API and clones already exist.
What's the most overlooked webhook security mistake?
Trusting payload contents to escape into downstream systems. Teams that diligently verify signatures still pass payload data into SQL queries, shell commands, and HTML templates without sanitization. The signature proves the source; the contents are still attacker-influenced (the attacker is often a legitimate user of the provider who controls fields in the payload). Treat payload data the same as any user input from your own forms.
How do I prevent SSRF when my webhook handler fetches URLs from the payload?
The pattern that works: an explicit URL allowlist (only domains you trust) plus a DNS-resolution check before the actual fetch (resolve the hostname, refuse if it resolves to a private/loopback/link-local IP). The naive defense, pattern-matching URLs against a regex of disallowed schemes, fails against DNS rebinding and against URLs that resolve to internal IPs at fetch time. Use a vetted SSRF-protection library, or write a fetcher that does the resolve-and-check itself. Connect-time IP pinning (resolve, check, then connect to the resolved IP rather than re-resolving) closes the rebinding window. This is the same pattern we ship in WebhookWhisper's outbound forwarder for the same reason.
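A minimal sketch of the allowlist plus resolve-and-check steps, under stated assumptions: the hostname allowlist is a placeholder, the function names are hypothetical, and the list of internal ranges is illustrative rather than exhaustive. It is not WebhookWhisper's implementation. It returns the checked address so the caller can pin the connection to it; the pinning itself is left out.

```javascript
import net from 'node:net'
import { lookup } from 'node:dns/promises'

// Illustrative, non-exhaustive set of internal ranges
const internalRanges = new net.BlockList()
for (const [addr, prefix] of [
  ['0.0.0.0', 8], ['10.0.0.0', 8], ['100.64.0.0', 10], ['127.0.0.0', 8],
  ['169.254.0.0', 16], ['172.16.0.0', 12], ['192.168.0.0', 16],
]) internalRanges.addSubnet(addr, prefix, 'ipv4')
for (const [addr, prefix] of [['::1', 128], ['fc00::', 7], ['fe80::', 10]]) {
  internalRanges.addSubnet(addr, prefix, 'ipv6')
}

function isInternalAddress(ip) {
  const family = net.isIP(ip) // 4, 6, or 0 when not an IP literal
  if (family === 0) return true // fail closed on anything unparseable
  return internalRanges.check(ip, family === 6 ? 'ipv6' : 'ipv4')
}

// Placeholder allowlist; list only hosts you actually expect in payloads
const ALLOWED_HOSTS = new Set(['files.example.com'])

async function resolveForFetch(rawUrl) {
  const url = new URL(rawUrl)
  if (url.protocol !== 'https:') throw new Error('only https URLs are fetched')
  if (!ALLOWED_HOSTS.has(url.hostname)) throw new Error('host not on allowlist')
  const addresses = await lookup(url.hostname, { all: true })
  if (addresses.length === 0 || addresses.some(a => isInternalAddress(a.address))) {
    throw new Error('host resolves to an internal address')
  }
  // Connect to this address rather than re-resolving the hostname
  return { url, address: addresses[0].address, family: addresses[0].family }
}
```

Rejecting the URL if any resolved address is internal, rather than just the first one, is deliberate: a hostname that returns a mix of public and internal records is exactly what a rebinding setup looks like.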
Closing
The compressed defense-in-depth stack: HMAC verification with constant-time compare, HTTPS, IP allowlisting where available, rate limiting at the edge, idempotency tables, schema validation, per-environment secrets with grace-period rotation, and downstream sanitization. None of these are exotic; all of them together make a webhook endpoint a hard target rather than a soft one.
If you want the inspector + forwarding + structured-log layer in five minutes, that's WebhookWhisper — paste an endpoint URL into your provider, point forwarding at your handler, and every event is captured with full headers, body hash, and verification status. The signature playground at /webhook-signature-playground is useful for synthesizing test signatures during security testing.
For the related operational guides: the full production checklist, Stripe signature verification in depth, and retry behavior across providers.