Runbooks.
The short answers to recurring "how do I…" and "why is this red?" moments between commit and production — composite indexes, a poisoned build cache, adding a secret, a webhook that won't fire, and reading prod logs. These are pointers to get you unstuck fast; the code and each AGENTS.md carry the full context.
A query needs a composite index
- Ship the query. At runtime it fails with failed-precondition.
- The error contains a direct create-index link; open it and create the index in the Firebase console.
- Note the index in your PR's deploy step. Don't re-add it to firestore.indexes.json: that file is intentionally empty because indexes are console-managed.
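When the failed-precondition error is buried in a long function or emulator log, the create-index link can be pulled out with a small filter. This is a minimal sketch: it assumes the link is a console.firebase.google.com URL (the form Firestore's "query requires an index" message normally uses) and that the output was saved to a file. The function name extract_index_links is hypothetical.

```shell
# Print each distinct create-index link found in a saved log file.
# Assumption: the link is a console.firebase.google.com URL that ends
# at the next whitespace or double quote.
extract_index_links() {
  grep -Eo 'https://console\.firebase\.google\.com/[^[:space:]"]+' "$1" | sort -u
}

# Usage: extract_index_links run.log
```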
CI is red on code I didn't touch
- Because the main job uses --continue, a failure elsewhere still surfaces in your run. Check whether the failing file is in git diff --name-only origin/main...HEAD; if not, it's not yours to fix.
- Proto drift → run pnpm protoc:build and commit the regenerated tree.
- A stale branch behind main can fail required checks on untouched code (Turbo cache miss + reviewdog). Merge origin/main to clear it.
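The "is it mine?" check can be scripted so it gives a yes/no answer for one path. A minimal sketch: the helper name touched_by_branch is hypothetical, and it reads the changed-file list from stdin so it works with the git diff command above or with a saved list.

```shell
# Exit 0 only if the given path appears as an exact line in the
# changed-file list read from stdin; exit non-zero otherwise.
touched_by_branch() {
  grep -Fxq -- "$1"
}

# Usage:
#   git diff --name-only origin/main...HEAD | touched_by_branch path/to/failing_file \
#     && echo "yours to fix" || echo "not touched by this branch"
```

The match is on the full repo-relative path (-x), so a bare filename does not match a file in a subdirectory.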
Busting the Go build cache
The Go build cache lives in gs://perkup-nix-cache, keyed by (sha256(backend/app/go.mod)[:16], UTC-date) with a 30-day TTL. If a poisoned cache entry is causing bad builds, delete the day's object:
gcloud storage rm gs://perkup-nix-cache/<key>
The next build repopulates it from scratch.
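To find the day's object, it can help to compute the two key components locally and look for them in the bucket listing. This is a sketch under stated assumptions: it assumes the hash is the lowercase hex SHA-256 digest and the date is formatted YYYY-MM-DD. How the two parts are joined into the object name is not specified here, so confirm against the listing before deleting anything. The function names are hypothetical.

```shell
# Print the first 16 hex characters of a file's SHA-256 digest.
# Assumption: the cache key uses the lowercase hex encoding of the digest.
go_mod_hash_prefix() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -c1-16
  else
    shasum -a 256 "$1" | cut -c1-16   # fallback where sha256sum is absent (e.g. macOS)
  fi
}

# Print today's date in UTC.
# Assumption: the key's date component is formatted YYYY-MM-DD.
utc_date() {
  date -u +%Y-%m-%d
}

# Usage: list the bucket and look for the matching entry before deleting it.
#   gcloud storage ls gs://perkup-nix-cache/ | grep "$(go_mod_hash_prefix backend/app/go.mod)"
```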
Adding a Firebase secret
- Declare defineSecret('MY_SECRET') in functions/src/consts/secrets.ts.
- Add it to bin/setup-firebase-secrets.sh.
- Add an op:// reference in .env.local.default.
- Flag it in the PR: KEVIN provisions the real value in Google Secret Manager.
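Before opening the PR, a quick check that the secret name appears in all three files can catch a skipped step. A minimal sketch: the helper name check_secret_wired is hypothetical, and it only checks that the name is mentioned in each file, not that the entry is well formed.

```shell
# Report which of the three checklist files do not mention the secret yet.
# $1 = secret name, $2 = repo root (defaults to the current directory).
# Returns 0 when all three files mention the name, 1 otherwise.
check_secret_wired() {
  name="$1"
  root="${2:-.}"
  status=0
  for f in functions/src/consts/secrets.ts bin/setup-firebase-secrets.sh .env.local.default; do
    if ! grep -Fq -- "$name" "$root/$f" 2>/dev/null; then
      echo "missing: $name not found in $f"
      status=1
    fi
  done
  return "$status"
}

# Usage (from the repo root): check_secret_wired MY_SECRET
```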
Webhook isn't firing locally
- functions/.secret.local must hold the real secret, not the dummy placeholder (an "Invalid credentials" error means the dummy is still in place).
- Inspect the inbound request at the ngrok dashboard (localhost:4040).
- Signature mismatch → the verification is computed over the raw request body; see Integrations for the shape and the code for the exact scheme.
- OAuth callback failing → the redirect URI must match the provider config exactly.
- Slack token decrypt failing → check SLACK_ENCRYPTION_SECRET.
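The signature-mismatch case is easiest to see by example. The sketch below assumes an HMAC-SHA256 hex signature. That scheme is common among webhook providers, but it is only an assumption here; the real scheme is whatever the integration code implements. The function name sign_raw_body is hypothetical.

```shell
# Print the hex HMAC-SHA256 of a file's exact bytes under the given secret.
# $1 = secret, $2 = file holding the raw request body.
# Assumption: HMAC-SHA256 with hex output; check the integration code for the real scheme.
sign_raw_body() {
  openssl dgst -sha256 -hmac "$1" < "$2" | awk '{print $NF}'
}

# Two byte-different encodings of the same JSON produce different signatures,
# which is why verification has to run on the raw body, not a re-serialized one:
#   printf '%s' '{"a":1}'  > raw.json
#   printf '%s' '{"a": 1}' > reparsed.json
#   sign_raw_body "$SECRET" raw.json
#   sign_raw_body "$SECRET" reparsed.json
```

If a locally computed signature over the body captured in the ngrok dashboard does not match the header the provider sent, the likely culprits are a wrong secret or a body that was parsed and re-encoded before verification.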
Reading prod logs
Services run on Cloud Run in projects perkup-app (production) and perkup-app-test (staging). Key services: v2services, v2lazy, eventrouter, frontend-proxy, amazon-punchout, slack-notifier. Start broad, then narrow:
gcloud logging read "severity>=ERROR" --limit=50 --format=json --freshness=1h --project=perkup-app
Aggregated errors land in GCP Cloud Error Reporting (with a Slack alert on deploy failure); frontend errors are in LogRocket. Use the trace field to stitch a single request across services.
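Narrowing usually means adding a service and, once a suspicious entry turns up, its trace. A minimal sketch of a filter builder: it assumes the standard Cloud Run log resource (resource.type "cloud_run_revision" with a service_name label) and the usual projects/PROJECT/traces/TRACE_ID form of the trace field. The function name error_filter is hypothetical.

```shell
# Build a Cloud Logging filter for errors from one Cloud Run service,
# optionally restricted to a single trace.
# $1 = service name, $2 = project ID, $3 = trace ID (optional).
error_filter() {
  filter="resource.type=\"cloud_run_revision\" AND resource.labels.service_name=\"$1\" AND severity>=ERROR"
  if [ -n "${3:-}" ]; then
    filter="$filter AND trace=\"projects/$2/traces/$3\""
  fi
  printf '%s\n' "$filter"
}

# Usage:
#   gcloud logging read "$(error_filter v2services perkup-app)" \
#     --limit=50 --format=json --freshness=1h --project=perkup-app
```

To follow one request across services, drop the service and severity clauses and filter on the trace value alone.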
Source: docs/integrations/AGENTS.md · CLAUDE_CICD.md · functions/AGENTS.md · perkup-app/CLAUDE.md. Compiled 2026-06-07.