Webhook API: reliable delivery end-to-end
Webhooks are simple HTTP requests — until retries, timeouts, signature verification, and duplicate side effects show up in production.
This guide walks through the delivery lifecycle and the minimal patterns that keep webhooks safe under load.
TL;DR
- Treat webhooks as at-least-once delivery: expect retries and duplicates.
- Respond fast (ideally < 1s; always under the provider timeout) with a 2xx (typically 200 or 202).
- Verify authenticity over the raw request body (signature + timestamp) before parsing/mutating.
- Persist the raw payload + headers/received_at; do side effects asynchronously in a worker.
- Make processing idempotent using provider event IDs (or a stable hash) with a dedupe window.
- Log request IDs, event IDs, and latency breakdown (parse/verify/enqueue/worker).
- Build a replay path: re-run processing from stored payloads after a fix.
Next: retries & idempotency and security best practices.
Anti-patterns
- Doing slow work in the HTTP handler (network calls, heavy DB queries, fan-out).
- Verifying the signature over parsed and re-serialized JSON instead of the raw bytes (breaks Stripe/GitHub-style HMAC schemes).
- Assuming 200 means “processed” — it only means “accepted by your endpoint”.
If you need concrete examples, see payment webhooks and CI/CD triggers.
Core concepts
A webhook API is “provider pushes HTTP requests.” The hard parts are delivery semantics and safety under retries.
Delivery attempt
Providers send an event, wait for a response, and retry on timeout, 5xx, and sometimes 429. Assume duplicates.
Signature
Most providers sign the raw request body. Verify before parsing and use constant-time compare to avoid timing leaks.
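A minimal sketch of that check in Python, assuming a generic scheme in which the provider sends a hex-encoded HMAC-SHA256 of the raw body. The function name `verify_signature` and the hex encoding are assumptions for illustration; each real provider defines its own header name and signature format, so check its documentation.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, received_sig_hex: str, secret: bytes) -> bool:
    """Return True if received_sig_hex equals HMAC-SHA256(secret, raw_body) in hex."""
    # Compute the MAC over the exact bytes received, before any JSON parsing.
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), received_sig_hex.encode())
```

Pass the request body exactly as it arrived on the wire; a body that was parsed and re-serialized may differ in whitespace or key order and will fail the check.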
Idempotency
Your processing must tolerate duplicates. Use provider event IDs + a dedupe store so side effects run once.
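One way to sketch such a dedupe store is a table with a unique key on the event ID plus an expiry window. The `DedupeStore` class, the `seen_events` table, and the seven-day default below are hypothetical choices, not part of any provider's API; SQLite stands in for whatever database or cache you already run.

```python
import sqlite3
import time
from typing import Optional


class DedupeStore:
    """Remembers event IDs for a limited window (illustrative, SQLite-backed)."""

    def __init__(self, path: str = ":memory:", ttl_seconds: int = 7 * 24 * 3600):
        self.ttl_seconds = ttl_seconds
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS seen_events ("
            "event_id TEXT PRIMARY KEY, seen_at REAL NOT NULL)"
        )

    def claim(self, event_id: str, now: Optional[float] = None) -> bool:
        """Return True only the first time event_id is seen inside the window."""
        now = time.time() if now is None else now
        with self.db:  # commit on success, roll back on error
            # Drop entries that have aged out of the dedupe window.
            self.db.execute(
                "DELETE FROM seen_events WHERE seen_at < ?",
                (now - self.ttl_seconds,),
            )
            # The primary key makes a second insert of the same ID a no-op.
            cur = self.db.execute(
                "INSERT OR IGNORE INTO seen_events (event_id, seen_at) VALUES (?, ?)",
                (event_id, now),
            )
            return cur.rowcount == 1
```

A worker would call `claim(event_id)` and skip the side effect when it returns False. If the side effect can fail after a successful claim, consider recording the claim in the same transaction as the side effect so a retry is not silently skipped.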
Simple flow
Provider (Stripe / GitHub / Shopify): POST event payload
-> Endpoint (webhook receiver): verify + accept fast
-> Queue / Worker (async processing): idempotent side effects
If you already have an inline handler, see migrating to queue-based processing.
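The receiver step of that flow can be sketched without any web framework. Everything here is an assumption for illustration: `handle_delivery` is a hypothetical function, the in-memory `inbox` list stands in for durable storage, and `queue.Queue` stands in for a real message queue. The point is the order of operations: verify, persist, enqueue, respond.

```python
import queue
import time
from typing import Callable, Mapping


def handle_delivery(
    raw_body: bytes,
    headers: Mapping[str, str],
    verify: Callable[[bytes, Mapping[str, str]], bool],
    inbox: list,
    work_queue: "queue.Queue[int]",
) -> int:
    """Verify, persist, enqueue, and return an HTTP status code. No side effects."""
    if not verify(raw_body, headers):
        # Reject before parsing or storing anything; 401 is one common choice.
        return 401
    # Persist the raw bytes and headers so the delivery can be replayed later.
    inbox.append(
        {"raw": raw_body, "headers": dict(headers), "received_at": time.time()}
    )
    # Hand the stored record's index to a worker; business logic runs there.
    work_queue.put(len(inbox) - 1)
    return 202
```

Because the handler does no downstream calls, its latency is bounded by verification and one write, which keeps it well under typical provider timeouts.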
Production checklist
Copy/paste this into your PR or runbook. It covers the “boring” details that prevent dropped events.
- [ ] Webhook endpoint responds with a 2xx (200 or 202) quickly (no inline side effects)
- [ ] Body size limits + safe content-type handling (reject unexpected types)
- [ ] Signature verification uses the raw body + constant-time compare
- [ ] Timestamp validation to mitigate replay (max age + clock skew tolerance)
- [ ] Persist raw payload, headers, and received_at for replay/debugging
- [ ] Idempotency: dedupe on provider event ID (or stable hash) with an expiry window
- [ ] Distinguish retryable vs permanent failures; do not retry on 4xx validation errors
- [ ] Worker has bounded concurrency + backoff + jitter
- [ ] Alert on sustained non-2xx responses and growing backlog
- [ ] Replays are safe: processing is deterministic and side effects are idempotent
- [ ] Every delivery has correlated logs (provider request ID, event ID, internal trace ID)
Reference implementation
Minimal worker code that consumes webhook payloads from a queue and uses explicit Ack/Nack control.
For end-to-end reliability patterns, compare with retries & backoff.
Node
REST pull + Ack/Nack
// Pull from a Hooque consumer queue and Ack/Nack explicitly
// Node 18+ (fetch built-in); run as an ES module so top-level await works
const QUEUE_NEXT_URL =
  process.env.HOOQUE_QUEUE_NEXT_URL ??
  "https://app.hooque.io/queues/cons_webhook_events/next";
const TOKEN = process.env.HOOQUE_TOKEN ?? "hq_tok_replace_me";
const headers = { Authorization: `Bearer ${TOKEN}` };

async function processPayload(payload) {
  // Use a provider event ID if available for idempotency.
  // Example: Stripe uses payload.id (evt_...), GitHub uses headers like X-GitHub-Delivery.
  const eventId = payload?.id ?? null;
  if (eventId) {
    // TODO: enforce idempotency (unique constraint / SETNX / dedupe table)
  }
  // TODO: apply your business logic here
  return { ok: true };
}

while (true) {
  const resp = await fetch(QUEUE_NEXT_URL, { headers });
  if (resp.status === 204) break; // queue is empty
  if (!resp.ok) throw new Error(`Hooque next() failed: ${resp.status}`);

  const payload = await resp.json();
  const meta = JSON.parse(resp.headers.get("X-Hooque-Meta") ?? "{}");

  try {
    await processPayload(payload);
    await fetch(meta.ackUrl, { method: "POST", headers });
  } catch (err) {
    // Nack so the message is redelivered; include the reason for debugging
    await fetch(meta.nackUrl, {
      method: "POST",
      headers: { ...headers, "Content-Type": "application/json" },
      body: JSON.stringify({ reason: String(err) }),
    });
  }
}
Python
REST pull + Ack/Nack
# Pull from a Hooque consumer queue and Ack/Nack explicitly
import json
import os

import requests

QUEUE_NEXT_URL = os.getenv(
    "HOOQUE_QUEUE_NEXT_URL",
    "https://app.hooque.io/queues/cons_webhook_events/next",
)
TOKEN = os.getenv("HOOQUE_TOKEN", "hq_tok_replace_me")
headers = {"Authorization": f"Bearer {TOKEN}"}


def process_payload(payload: dict) -> None:
    event_id = payload.get("id")  # use provider event ID where possible
    if event_id:
        # TODO: enforce idempotency (DB unique constraint / Redis SETNX / dedupe store)
        pass
    # TODO: apply your business logic here
    return None


while True:
    resp = requests.get(QUEUE_NEXT_URL, headers=headers, timeout=30)
    if resp.status_code == 204:
        break
    if resp.status_code >= 400:
        raise RuntimeError(f"Hooque next() failed: {resp.status_code} {resp.text}")

    payload = resp.json()
    meta = json.loads(resp.headers.get("X-Hooque-Meta", "{}"))

    try:
        process_payload(payload)
        requests.post(meta["ackUrl"], headers=headers, timeout=30)
    except Exception as err:
        requests.post(
            meta["nackUrl"],
            headers={**headers, "Content-Type": "application/json"},
            json={"reason": str(err)},
            timeout=30,
        )
Common failure modes
When webhooks break, the symptoms are often misleading. Start with reachability, then authenticity, then latency.
Missing events
Likely causes
- Provider is not sending (disabled, wrong environment, filtered events).
- Wrong URL/path/method or blocked inbound traffic.
- Non-2xx responses causing retries and, eventually, the provider disabling the endpoint.
Next checks
- Confirm provider delivery logs and response codes.
- Check TLS/DNS, redirects, and firewall/WAF.
- Inspect recent 4xx/5xx and timeouts.
Signature failures
Likely causes
- You are verifying over parsed JSON, not the raw body.
- Clock skew breaks timestamp validation.
- Using the wrong secret (prod vs test) or rotated secret.
Next checks
- Log the exact raw bytes used for verification (safely).
- Allow small clock skew and validate max age.
- Support {current, previous} secrets during rotation.
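The last two checks can be combined in one verifier. This is a sketch under stated assumptions: the signed string is taken to be `<timestamp>.<raw body>` with a hex HMAC-SHA256, and `verify_with_rotation` plus the 300-second default are hypothetical. Your provider's actual signing format and recommended tolerance may differ.

```python
import hashlib
import hmac
import time
from typing import Iterable, Optional


def verify_with_rotation(
    raw_body: bytes,
    timestamp: int,
    received_sig_hex: str,
    secrets: Iterable[bytes],
    max_age_seconds: int = 300,
    now: Optional[float] = None,
) -> bool:
    """Check freshness, then try each active secret (current first, then previous)."""
    now = time.time() if now is None else now
    # abs() tolerates skew in both directions: stale and future-dated both fail.
    if abs(now - timestamp) > max_age_seconds:
        return False
    # Assumed scheme: HMAC-SHA256 over "<timestamp>.<raw body>".
    signed = str(timestamp).encode() + b"." + raw_body
    for secret in secrets:
        expected = hmac.new(secret, signed, hashlib.sha256).hexdigest()
        if hmac.compare_digest(expected.encode(), received_sig_hex.encode()):
            return True
    return False
```

During a rotation, pass both secrets, for example `[current, previous]`, and drop the previous one once the provider has stopped signing with it.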
Duplicates / double side effects
Likely causes
- Provider retries due to timeout/5xx/429.
- Manual replays/test events reuse payloads.
- Processing is not idempotent (no dedupe key).
Next checks
- Add dedupe store keyed by event ID with TTL.
- Make external calls idempotent where possible.
- Move work to a queue/worker and Ack fast.
How Hooque helps
Map the checklist to primitives you can rely on: hosted ingest, durable queues, and explicit delivery lifecycle controls.
- Hosted webhook endpoints that persist payloads instantly (decouple ingest from processing).
- Provider-specific signature verification at ingest, before messages enter the queue.
- Consumers pull (REST) or stream (SSE) with explicit Ack / Nack / Reject lifecycle control.
- Replays and inspection from a dashboard when something fails in production.
- Per-webhook and per-consumer metrics to power alerting and SLOs.
If you are evaluating build-vs-buy, start with pricing and then compare reliability patterns in use cases.
FAQ
Common implementation questions that show up when you move from “it works locally” to production traffic.
What is a webhook API?
A webhook API is a push-based integration: the provider sends your system HTTP requests (events). Your job is to authenticate the request, acknowledge it quickly, and process it safely (idempotently, with retries and observability). With Hooque, webhook ingest is hosted and payloads are persisted into a durable queue so your workers can process independently.
What status code should a webhook endpoint return?
Return a 2xx status code as soon as you have safely accepted the event (often 200 or 202). Returning 2xx does not mean you finished processing; it means the provider can stop retrying that delivery attempt. With Hooque, the hosted endpoint can return 202 immediately after persisting, then your worker handles processing asynchronously.
Why am I getting duplicate webhook events?
Duplicates usually come from provider retries (timeouts, network errors, 5xx responses) or replays/manual retries. Webhooks are typically at-least-once; you need receiver-side idempotency to make duplicates harmless. With Hooque, you get a queue interface plus per-delivery metadata, and your worker can Ack/Nack/Reject explicitly while enforcing idempotency.
How do I verify a webhook signature correctly?
Verify the signature over the raw request body (before JSON parsing) using a constant-time comparison. If the provider includes a timestamp, validate it within a small max-age window to reduce replay risk. With Hooque, provider-specific signature verification can happen at ingest before messages enter the queue.
How do I safely replay webhooks after a bug fix?
Persist the raw payload and relevant headers, then replay from storage through the same processing code. Ensure processing is idempotent (dedupe keys, unique constraints, idempotent external calls) so replays do not duplicate side effects. With Hooque, you can inspect and replay queued/delivered messages while keeping explicit delivery outcomes (Ack/Nack/Reject).
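A replay loop over stored deliveries might look like the sketch below. The record shape (`{"raw": ...}`), the `replay` function, and the in-memory `already_done` set are assumptions for illustration; in practice the set would be the same dedupe store your live worker uses, and event IDs are assumed to live at `payload["id"]`.

```python
import json
from typing import Callable, Iterable, Set


def replay(
    stored: Iterable[dict],
    process: Callable[[dict], None],
    already_done: Set[str],
) -> int:
    """Re-run process() over stored deliveries, skipping event IDs already applied."""
    applied = 0
    for record in stored:
        payload = json.loads(record["raw"])  # parse the raw bytes saved at ingest
        event_id = payload.get("id")
        if event_id in already_done:
            continue  # side effect already happened; a replay must not repeat it
        process(payload)  # same code path as live processing
        if event_id is not None:
            already_done.add(event_id)
        applied += 1
    return applied
```

Running the replay through the same `process` function as live traffic means the fix is exercised exactly as it will be in production.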
Do I need a queue for webhooks?
If processing can exceed provider timeouts or involves downstream dependencies, use a queue/worker to decouple ingestion from processing. This prevents dropped events and reduces operational risk during spikes and outages. With Hooque, the queue is built-in: a hosted endpoint persists immediately and your consumers pull or stream at worker speed.
Start processing webhooks reliably
Capture every webhook, persist instantly, and consume via REST or SSE with full Ack/Nack/Reject control.