Going-live checklist

What to verify before you flip your integration from sandbox to production traffic.

A pre-launch checklist for hosted-checkout + webhook integrations. None of these are bureaucratic — each one corresponds to a real failure mode we’ve seen.

  • Production API key is minted from the production merchant, not the dev one. The plaintext API key starts with tpk_; confirm that the merchantId shown in the portal matches your production merchant.
  • Signing secret stored in your production secret manager (Vault, AWS Secrets Manager, Doppler — not .env checked into git).
  • Dev-only credentials are removed from your production config. A leftover dev API key with prod data attached is a footgun.
  • Signing happens server-side, never in the browser. Search your front-end bundle for tss_ to confirm.
  • exp is short (≤30 min) to limit the window for nonce grinding.
  • nonce is ≥16 bytes from crypto.randomBytes, not Math.random() or a timestamp.
  • ret is an HTTPS URL you actually control.
  • ref is your own customer id, not the user’s email or anything else attacker-supplied.
  • The “Subscribe” button is not click-spammable — debounce it client-side so a fast clicker doesn’t mint two deep links and create two subscriptions.
  • The user can return from a partially-completed flow. If they reached the capture step and then dropped off, you should be able to start a fresh checkout with the same ref.
  • status=ok does not mark the customer as subscribed by itself — it only triggers an interstitial that checks your own database (populated by the webhook).
  • status=cancelled shows a “try again” path, not a dead end.
  • An unknown or missing status falls through to a “we’ll be in touch” page rather than returning a 500.
  • Endpoint is HTTPS with a valid certificate. Self-signed certs work in dev but Topiic will reject them in production.
  • Signature verification is in place and rejects with 401 on mismatch. Test with a deliberately-corrupted Topiic-Signature.
  • The HMAC is computed over the raw body bytes, not a re-serialised parse. Getting this wrong is the single most common production bug.
  • Topiic-Event-Id is deduplicated in the same transaction as the side-effect.
  • Handler returns 2xx within 10 seconds even when the side-effect is slow — queue the slow part if needed.
  • Unknown event types are accepted with 200, not rejected with 400. New event types may be added without warning.
  • You have a backstop for missed webhooks: for example, a daily job that lists refs with no webhook recorded after 24 hours and either prompts the user to retry or flags them for support.
  • You can replay a delivery from the portal and your handler is genuinely idempotent (test it by replaying a recent event and verifying nothing duplicates downstream).
  • Each webhook receipt is logged with the event id, type, signature-verification outcome, and processing latency.
  • Outbound calls to Topiic are logged with the response status and any traceId in the body.
  • An alert fires on:
    • Webhook handler 5xx rate > 1%
    • Webhook handler p95 latency > 5s (you’re approaching the 10s timeout)
    • Signature-verification failures > 0 (something is misconfigured or being probed)
    • 5xx responses from Topiic’s API > 1% of calls
  • You know who to contact at Topiic if production deliveries stop landing. (Email/Slack/page handle on file.)
  • You have the merchant id, API key id, and webhook URL written down somewhere your on-call can find at 3am.
  • You’ve documented “what to do if a customer says they paid but isn’t activated”: check the portal’s checkout session for that ref, replay the webhook if needed, and escalate to engineering if that doesn’t resolve it.
  • Watch the first ten real customers go through. Eyeball each one in the portal end-to-end.
  • Confirm a webhook arrives for the first real subscription before the first customer asks why they’re not activated.
  • Run a deliberate failure (e.g. test card that declines) and confirm checkout.failed lands and your UX surfaces a retry path.
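
For the exp and nonce items above, a minimal Node.js sketch might look like the following. The helper names, the hex encoding of the nonce, and the assumption that exp is a Unix timestamp in seconds are ours for illustration; check them against the format your integration actually signs.

```javascript
const { randomBytes } = require('node:crypto');

// Hypothetical helper: returns a nonce from the OS CSPRNG.
// Hex encoding is an assumption, not a documented Topiic requirement.
function newNonce(byteLength = 16) {
  if (!Number.isInteger(byteLength) || byteLength < 16) {
    throw new RangeError('nonce must be at least 16 bytes');
  }
  return randomBytes(byteLength).toString('hex');
}

// Hypothetical helper: assumes exp is a Unix timestamp in seconds.
// Refuses lifetimes longer than the 30-minute ceiling from the checklist.
function expiryFromNow(minutes, nowMs = Date.now()) {
  if (!(minutes > 0 && minutes <= 30)) {
    throw new RangeError('exp must be between 0 and 30 minutes from now');
  }
  return Math.floor(nowMs / 1000) + minutes * 60;
}
```

Putting the limits inside the helpers means a later config change cannot quietly weaken them.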
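
The status items above come down to one rule: the redirect is a hint, and your database is the source of truth. Here is a rough sketch, assuming status arrives as a query parameter on your return page and that you already know the ref from your own session. The function name, the view names, and the isSubscribed lookup are hypothetical.

```javascript
// Hypothetical return-page router. `isSubscribed` stands in for a lookup
// against your own database, which the webhook handler populates.
async function resolveReturnView(status, ref, isSubscribed) {
  switch (status) {
    case 'ok':
      // status=ok alone never activates anyone; only your own records do.
      return (await isSubscribed(ref)) ? 'active' : 'pending';
    case 'cancelled':
      // Offer a retry path instead of a dead end.
      return 'try-again';
    default:
      // Unknown or missing status: degrade gracefully rather than throwing.
      return 'we-will-be-in-touch';
  }
}
```

The 'pending' view is the interstitial: it can poll or refresh until the webhook has landed.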
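
The webhook items above (verify against raw bytes, 401 on mismatch, dedupe on Topiic-Event-Id, 200 for unknown event types) can be sketched roughly as below. Several details are assumptions rather than documented Topiic behaviour: that Topiic-Signature carries a hex-encoded HMAC-SHA256 of the raw body, that the payload is JSON with a type field, and the store.runOnce interface, which stands in for "record the event id and apply the side-effect in one transaction".

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Assumption: the signature is a hex-encoded HMAC-SHA256 over the raw body.
function verifySignature(rawBody, signatureHeader, secret) {
  if (typeof signatureHeader !== 'string') return false;
  const expected = createHmac('sha256', secret).update(rawBody).digest();
  const received = Buffer.from(signatureHeader, 'hex');
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// Hypothetical handler core, independent of any web framework.
// `rawBody` must be the unparsed request bytes (a Buffer).
// `handlers` is a Map from event type to a side-effect function.
// `store.runOnce(id, fn)` must skip fn when the id was already recorded.
async function handleWebhook({ rawBody, headers }, secret, store, handlers) {
  if (!verifySignature(rawBody, headers['topiic-signature'], secret)) {
    return { status: 401 };
  }
  // Parse only after the signature has been verified.
  const event = JSON.parse(rawBody.toString('utf8'));
  const handler = handlers.get(event.type);
  if (!handler) return { status: 200 }; // accept unknown event types
  await store.runOnce(headers['topiic-event-id'], () => handler(event));
  return { status: 200 };
}
```

In Express, getting the raw bytes usually means mounting express.raw({ type: 'application/json' }) on the webhook route so that req.body is a Buffer rather than a parsed object. If the side-effect is slow, the handler passed to runOnce can enqueue a job instead of doing the work inline.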