Guide

Webhook not working? A production debugging playbook

Webhook bugs feel urgent because they are: missing events and duplicates usually mean broken business logic.
Use this playbook to triage systematically: provider logs → reachability → auth → latency → idempotency.


TL;DR

Start with the provider delivery log: is it sending, and what response codes does it see?
Confirm endpoint reachability: TLS, DNS, redirects, firewall/WAF, and correct URL/method.
If signatures fail, verify over raw body bytes and validate timestamp (clock skew matters).
If events are missing, check filters, environments (test vs prod), and provider retry/disable settings.
If duplicates happen, assume retries and add idempotency (dedupe on event ID/delivery ID).
Capture raw payloads safely and replay deterministically after fixes.

With Hooque, most of the above is handled for you; jump to “How Hooque helps”.

If auth is the problem, read webhook security.

Anti-patterns

Debugging only from your app logs without checking provider delivery logs.
Logging secrets or full PII payloads during incidents.
Fixing symptoms (timeouts) without addressing the root cause (inline processing).

If you’re missing events during incidents, see monitoring & alerting.


Core concepts

Debugging webhooks is mostly “follow the delivery.” You need both provider and receiver visibility.

Provider truth

Provider delivery logs tell you whether the provider is attempting delivery and what it observed (code/latency).

Receiver truth

Receiver logs tell you whether the request arrived, whether auth passed, and where time was spent (parse/verify/enqueue).

Replay safety

If you can store raw payloads and replay safely, you can recover from most incidents without losing data.

Simple flow

Provider: delivery log (attempts + codes)
Receiver: logs + traces (auth + latency)
Replay: raw payloads (re-run safely)

The fastest path to root cause is: provider log → HTTP status/latency → receiver auth/latency → worker outcomes.
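On the receiver side, the auth/latency breakdown can be captured as one structured record per delivery. The following is a minimal sketch, not a prescribed implementation: `handle_delivery`, the `verify` and `enqueue` callables, and the field names (`trace_id`, `provider_request_id`, `verify_ms`, and so on) are all hypothetical names chosen for illustration.

```python
import json
import time
import uuid
from contextlib import contextmanager


@contextmanager
def stage(timings, name):
    """Record wall-clock milliseconds spent in one stage of the handler."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = round((time.perf_counter() - start) * 1000, 3)


def handle_delivery(raw_body, provider_request_id, verify, enqueue):
    """Run verify -> parse -> enqueue and return one log record (hypothetical shape)."""
    timings = {}
    record = {
        "trace_id": uuid.uuid4().hex,                 # internal correlation ID
        "provider_request_id": provider_request_id,   # ID from the provider's delivery log
        "outcome": "accepted",
    }
    try:
        with stage(timings, "verify_ms"):
            # Verification runs on the raw bytes, before any JSON parsing.
            if not verify(raw_body):
                record["outcome"] = "auth_failed"
                return record
        with stage(timings, "parse_ms"):
            event = json.loads(raw_body)
        with stage(timings, "enqueue_ms"):
            enqueue(event)
    except Exception as exc:
        record["outcome"] = f"error:{type(exc).__name__}"
    finally:
        record["timings"] = timings
    return record
```

Emitting this record as a single log line lets you match a provider delivery attempt to the receiver's view of the same request and see which stage consumed the time.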

Triage checklist

A repeatable, production-friendly sequence that reduces guesswork under pressure.

- [ ] Provider: is the webhook enabled? correct environment? correct event types?
- [ ] Provider delivery log: request URL/method correct? response code? latency? retries?
- [ ] Reachability: TLS certificate valid? DNS correct? no redirects? firewall/WAF blocks?
- [ ] Handler: respond with a 2xx (for example 200 or 202) quickly (no slow work in request path)
- [ ] Signature verification:
  - [ ] verify over raw body bytes (before JSON parse)
  - [ ] constant-time compare
  - [ ] timestamp max-age + clock skew
- [ ] Duplicates: add dedupe store keyed by event ID/delivery ID (+ TTL/window)
- [ ] Payload parsing: content-type expected? body size limits? gzip/encoding handled?
- [ ] Observability: correlate provider request ID + internal trace ID + processing outcome
- [ ] Replay: store raw payload + headers; replay through the same processing code
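The signature items in the checklist can be sketched as follows. This assumes a hypothetical scheme (hex HMAC-SHA256 over the timestamp, a dot, and the raw body); real providers each define their own signed message, header names, and encoding, so treat `sign`, `verify`, and both constants as illustrative assumptions and check your provider's documentation.

```python
import hashlib
import hmac
import time

MAX_AGE_SECONDS = 300   # assumed max-age window; tune per provider guidance
MAX_SKEW_SECONDS = 30   # assumed tolerance for timestamps slightly in the future


def sign(secret: bytes, timestamp: int, raw_body: bytes) -> str:
    """Hypothetical scheme: hex HMAC-SHA256 over b"<timestamp>." + raw body bytes."""
    message = str(timestamp).encode() + b"." + raw_body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp: int, raw_body: bytes, signature: str, now=None) -> bool:
    """Check the timestamp window, then compare signatures in constant time."""
    now = time.time() if now is None else now
    age = now - timestamp
    if age > MAX_AGE_SECONDS or age < -MAX_SKEW_SECONDS:
        return False
    expected = sign(secret, timestamp, raw_body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Note that `raw_body` must be the bytes exactly as received. Re-serializing parsed JSON changes whitespace or key order and produces a different digest, which is a common cause of signature failures that only appear in production.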

Reference implementation

When you need to inspect what is happening, pull one message and print the metadata and payload.

Node

Debug pull + inspect meta

// Minimal “debug pull”: fetch one message, print meta + payload, then choose Ack/Nack/Reject.
const QUEUE_NEXT_URL =
  process.env.HOOQUE_QUEUE_NEXT_URL ??
  "https://app.hooque.io/queues/cons_webhook_events/next";
const TOKEN = process.env.HOOQUE_TOKEN ?? "hq_tok_replace_me";
const headers = { Authorization: `Bearer ${TOKEN}` };

const resp = await fetch(QUEUE_NEXT_URL, { headers });
if (resp.status === 204) {
  console.log("queue empty");
  process.exit(0);
}
if (!resp.ok) throw new Error(`Hooque next() failed: ${resp.status}`);

const payload = await resp.json();
const meta = JSON.parse(resp.headers.get("X-Hooque-Meta") ?? "{}");

console.log("meta:", meta);
console.log("payload:", payload);

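// Choose an outcome:
// - Ack when you are confident it processed successfully
// - Nack when the failure is transient (retry later)
// - Reject when the payload is permanently bad
// This sketch always Acks; fail loudly if the metadata carries no ackUrl.
if (!meta.ackUrl) throw new Error("X-Hooque-Meta has no ackUrl");
const ack = await fetch(meta.ackUrl, { method: "POST", headers });
if (!ack.ok) throw new Error(`Ack failed: ${ack.status}`);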

Python

Debug pull + inspect meta

# Minimal “debug pull”: fetch one message, print meta + payload, then choose Ack/Nack/Reject.
import json
import os
import requests

QUEUE_NEXT_URL = os.getenv(
    "HOOQUE_QUEUE_NEXT_URL",
    "https://app.hooque.io/queues/cons_webhook_events/next",
)
TOKEN = os.getenv("HOOQUE_TOKEN", "hq_tok_replace_me")
headers = {"Authorization": f"Bearer {TOKEN}"}

resp = requests.get(QUEUE_NEXT_URL, headers=headers, timeout=30)
if resp.status_code == 204:
    print("queue empty")
    raise SystemExit(0)
if resp.status_code >= 400:
    raise RuntimeError(f"Hooque next() failed: {resp.status_code} {resp.text}")

payload = resp.json()
meta = json.loads(resp.headers.get("X-Hooque-Meta", "{}"))

print("meta:", meta)
print("payload:", payload)

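# Choose an outcome:
# - Ack when you are confident it processed successfully
# - Nack when the failure is transient (retry later)
# - Reject when the payload is permanently bad
# This sketch always Acks; fail loudly if the metadata carries no ackUrl.
ack_url = meta.get("ackUrl")
if not ack_url:
    raise RuntimeError("X-Hooque-Meta has no ackUrl")
requests.post(ack_url, headers=headers, timeout=30).raise_for_status()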

For local iteration workflows, see local webhook development.

Common failure modes

Use symptoms to narrow the search space quickly, then follow the checklist to confirm.

Provider shows timeouts

Likely causes

Handler does slow work inline.
Network or platform latency (WAF inspection, cold starts).
Downstream calls block the response.

Next checks

Return a 2xx (for example 202) immediately after acceptance.
Move work into a queue/worker.
Instrument latency breakdown and add timeouts.
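The "accept fast, process later" pattern from these checks might look like the sketch below. It is framework-free and uses an in-process `queue.Queue` purely for illustration; an in-memory queue is not durable, so a production setup would put a persistent queue in its place. The names `accept`, `worker`, and `slow_business_logic` are hypothetical.

```python
import queue
import threading

work_queue = queue.Queue(maxsize=1000)  # bounded so overload is visible, not silent
processed = []


def accept(raw_body: bytes) -> int:
    """Request path: enqueue only, then return a status code immediately."""
    try:
        work_queue.put_nowait(raw_body)
    except queue.Full:
        return 503  # shed load; provider retries are expected to redeliver
    return 202


def slow_business_logic(raw_body: bytes) -> None:
    """Placeholder for downstream calls that must not block the HTTP response."""
    processed.append(raw_body)


def worker() -> None:
    """Drain the queue outside the request path; None is the shutdown signal."""
    while True:
        item = work_queue.get()
        try:
            if item is None:
                return
            slow_business_logic(item)
        finally:
            work_queue.task_done()
```

A web handler would call `accept(request_body)` after signature verification and return its status code, while one or more `worker` threads (or separate processes) handle the slow work.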

401/403 signature failures

Likely causes

Wrong secret or wrong environment.
Verifying against parsed JSON instead of the raw bytes.
Timestamp validation failing due to skew.

Next checks

Verify secret versions and accept both old and new secrets while a rotation is in progress.
Capture raw bytes and compare locally.
Allow small skew and set a max age window.
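When comparing captured bytes locally, it helps to learn which secret (if any) produces a match, since that distinguishes a wrong-environment secret from a mid-rotation one. A minimal sketch, assuming the signature is a plain hex HMAC-SHA256 of the raw body (an assumption; your provider's scheme may differ) and using the hypothetical helper `matches_any_secret`:

```python
import hashlib
import hmac


def matches_any_secret(secrets: dict, raw_body: bytes, signature_hex: str):
    """Return the label of the first secret whose HMAC matches, else None."""
    for label, secret in secrets.items():
        expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
        if hmac.compare_digest(expected.encode(), signature_hex.encode()):
            return label
    return None
```

For example, calling it with `{"prod-current": ..., "prod-previous": ..., "test": ...}` returns `"test"` when the provider is signing with the test-mode secret while your production receiver expects the live one.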

Duplicates and out-of-order state

Likely causes

Provider retries.
No idempotency/dedupe store.
Worker concurrency violates ordering assumptions.

Next checks

Add dedupe keys + TTL and unique constraints.
Fetch authoritative state via API when needed.
Sequence processing per object/tenant.
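A dedupe store with a TTL can be sketched as below. This in-memory class is only a stand-in to show the logic; across multiple workers you would need a shared store with atomic insert-if-absent semantics (for example a database unique constraint). `DedupeStore` and `first_time` are hypothetical names.

```python
import time


class DedupeStore:
    """In-memory stand-in for a shared store with unique keys and expiry."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._seen = {}  # event_id -> expiry timestamp

    def first_time(self, event_id: str, now=None) -> bool:
        """Return True only the first time an event ID is seen within the TTL."""
        now = time.time() if now is None else now
        # Drop expired keys so the store does not grow without bound.
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        if event_id in self._seen:
            return False
        self._seen[event_id] = now + self.ttl
        return True
```

A worker would call `first_time(event_id)` before applying side effects and skip the event when it returns `False`. The TTL should be at least as long as the provider's retry window, otherwise late retries are treated as new events.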

For retry semantics, see retries & backoff.

How Hooque helps

Debugging is easier when you have a durable event history, a replay path, and explicit delivery controls.

Hosted ingest + durable persistence so payloads aren’t lost during outages.
Provider-specific signature verification at ingest reduces auth debugging surface.
Queue consumers with Ack/Nack/Reject so you can quarantine bad payloads.
Inspection and replay tools so you can reproduce and validate fixes.
Metrics per webhook/consumer to correlate incidents with spikes and failures.

If you’re comparing solutions, check pricing and see patterns in use cases.

FAQ

Short answers for high-pressure debugging situations.

How do I know if the provider is sending webhooks?

Check the provider’s webhook delivery log. It should show attempts, response codes, and latency. If it is not sending, the issue is usually configuration (disabled endpoint, wrong environment, wrong event types). With Hooque, you also get an ingest history and queue metrics to confirm whether events arrived and what happened next.

Why do I see 401/403 on webhooks?

Authentication failed: missing signature headers, wrong secret (test vs prod), timestamp validation failing, or signature verification performed over parsed body instead of raw bytes. With Hooque, provider-specific signature verification happens at ingest, and failures can be inspected without touching your worker.

Why do webhooks work locally but not in production?

Common causes are TLS/DNS issues, firewalls/WAF blocks, redirects, missing raw-body verification in the deployed stack, or environment mismatch (secrets, URLs, event subscriptions). With Hooque, inbound exposure is centralized in a hosted endpoint so your app only needs outbound access to consume events.

How do I debug signature verification failures?

Capture the exact raw request bytes, signature headers, and timestamp. Re-run verification locally with the same inputs. Ensure you are verifying before parsing and using constant-time comparison. With Hooque, signature verification is handled at ingest (provider-specific) and only verified messages enter your consumer queue.

How do I debug duplicates and out-of-order events?

Assume retries. Add idempotency with dedupe keys, track attempt counts, and make sure your processing tolerates out-of-order updates (fetch authoritative state via API when needed). With Hooque, you can Nack to retry, Reject permanent failures, and use metadata and inspection to trace why duplicates happened.

How do I safely replay a failed webhook?

Replay from persisted raw payloads through the same processing code. Only do this if side effects are idempotent and you can correlate outcomes (logs/trace IDs) to validate correctness. With Hooque, inspection and replay are built-in so you can re-run after a fix without losing payloads.
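If you run your own capture, a replay path could look like this sketch. It stores each delivery as one JSON line with the body base64-encoded so the exact bytes survive, then feeds them back through the same processing function. `capture`, `replay`, and the record fields are hypothetical; the list used as a log stands in for an access-controlled file or table, since captured payloads and headers can contain sensitive data.

```python
import base64
import json


def capture(log: list, raw_body: bytes, headers: dict) -> None:
    """Append one delivery as a JSON line; base64 keeps the body byte-exact."""
    log.append(json.dumps({
        "headers": headers,
        "body_b64": base64.b64encode(raw_body).decode("ascii"),
    }))


def replay(log: list, process) -> list:
    """Re-run every captured delivery through the same processing function."""
    results = []
    for line in log:
        entry = json.loads(line)
        raw_body = base64.b64decode(entry["body_b64"])
        results.append(process(raw_body, entry["headers"]))
    return results
```

Because `process` is the same function the live path calls, a replay exercises the fixed code rather than a separate script, which is only safe when that function's side effects are idempotent.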

Start processing webhooks reliably

Capture events durably and debug with inspection, replay, and explicit delivery controls.

