Skip to Content
RunbooksRunbook: Self-Service Onboarding

Runbook: Self-Service Onboarding

Owner: Platform Team Backup owner: On-call engineer Last validated: 2026-05-15 (Phase 0 turnkey audit) Validation method: Smoke test (tests/integration/test_signup_smoke.py) + manual signup Severity trigger: SEV2 (blocked signups directly impact PLG funnel) Customer impact: New users cannot create accounts or activate; existing users unaffected Required access: MongoDB (read), Redis (read), VPS shell (write) Related services: curateme-backend-b2b, dashboard


Self-service onboarding is the entry point of the PLG funnel. A new user posts to POST /api/v1/platform/onboard, the request passes anti-spam layers, creates a user + org + governance policy + API key, and returns active credentials. Any breakage here silently kills signup conversion. This runbook covers the architecture, the anti-spam layers, the auto-activate feature flag, and how to diagnose / fix common failures.


Architecture overview

The signup endpoint lives at services/backend/src/api/routes/platform/onboard.py. Three relevant routes:

RoutePurpose
POST /api/v1/platform/onboardSelf-service signup — creates user + org + membership + governance policy + API key
POST /api/v1/platform/onboard/approve/{user_id}Admin activates a pending user (used when SIGNUP_AUTO_ACTIVATE=False)
GET /api/v1/platform/onboard/verify-email?token={token}Marks email verified; activates account if auto-activate is on

The behavior depends on the SIGNUP_AUTO_ACTIVATE feature flag (see below).


The SIGNUP_AUTO_ACTIVATE flag

This is the most critical config in the funnel. Defined at services/backend/src/config/feature_flags.py:444, default value at feature_flags.py:784:

FeatureFlag.SIGNUP_AUTO_ACTIVATE: True, # Auto-activate — ON for self-service free tier (no credit card)

Default: True. New signups immediately get status="active" + an issued API key + a governance policy.

If set to False (override via env or org_feature_flag_overrides): new signups land in status="pending" and must be activated manually by an admin via POST /api/v1/platform/onboard/approve/{user_id}.

Regression guard: unit test tests/unit/test_feature_flags.py:99 (test_signup_auto_activate_enabled) asserts the default is True. The integration smoke test at tests/integration/test_signup_smoke.py exercises the end-to-end golden path.

Do NOT flip this off in production unless you’re intentionally rolling out a waitlist. The dashboard does not currently surface “pending” accounts well — pending users will land on /welcome and see a confusing state.


Anti-spam layers (checked in order, cheapest first)

The endpoint stacks 5 anti-spam layers before any DB write. Each can produce a fake-success (return 200 to confuse the bot) or a 400/422 error:

LayerCheckFailure modeLogged event
1. HoneypotHidden website field; bots auto-fillFake success returned (status=pending, fake IDs)onboard_rejected_honeypot
2. Submit-too-fastForm t timestamp; under 2s = botFake success returnedonboard_rejected_too_fast
3. IP rate limit5 signups/IP/hour429 Too Many Requestssignup_rate_limit_exceeded
4. Cloudflare TurnstileProof-of-humanity token400 with “Verification failed”turnstile_verification_failed
5. Disposable emailPydantic validator422 Unprocessable Entity (rejected at schema layer)(no signup attempt logged)

Local dev: if TURNSTILE_SECRET_KEY is empty, layer 4 short-circuits to success — Turnstile is skipped. This is intentional for dev but means tests must explicitly handle Turnstile mocking.


Common failures and diagnosis

Symptom: every signup returns 400 “Verification failed”

Cause: Turnstile is misconfigured.

# On VPS: echo $TURNSTILE_SECRET_KEY # must be set in production echo $TURNSTILE_SITE_KEY # must match dashboard env # Check Cloudflare dashboard for Turnstile widget status — disabled widget returns failure

Fix: rotate keys in Cloudflare → update .env on VPS → restart curateme-backend-b2b.

Symptom: every signup succeeds with status="pending" but never activates

Cause: SIGNUP_AUTO_ACTIVATE is OFF.

# On VPS: poetry run python -c " from src.config.feature_flags import FeatureFlag, is_feature_enabled print('SIGNUP_AUTO_ACTIVATE:', is_feature_enabled(FeatureFlag.SIGNUP_AUTO_ACTIVATE)) "

Fix: unset the override env var (FF_SIGNUP_AUTO_ACTIVATE=true or remove the entry from org_feature_flag_overrides Mongo collection), restart service. Verify the unit test still passes (pytest tests/unit/test_feature_flags.py::TestFeatureDefaults::test_signup_auto_activate_enabled).

Symptom: bot traffic is bypassing layers 1-2

Cause: Honeypot field accidentally removed from the form, or t timestamp not being sent.

# On the dashboard: curl -s https://dashboard.curate-me.ai/signup | grep -E 'name="website"|name="t"' # Should find both inputs

Fix: check apps/dashboard/app/signup/page.tsx — both fields must be present (website hidden, t populated with Date.now() at mount).

Symptom: duplicate users created on retry

Cause: Honeypot or too-fast layer returned a fake success with random IDs; user clicked submit again.

The endpoint should be idempotent: returning a real error on duplicate email is handled at line 432-440 of onboard.py. But fake-success paths bypass this check.

Fix: check the users collection for duplicates by email; deduplicate manually. Long-term fix tracked in #2095.

Symptom: signup succeeds but no API key in response

Cause: _generate_api_key() at onboard.py:157 succeeded but the Mongo write failed silently.

# Check the org has an api_key: mongosh "$MONGO_URI" --eval 'db.api_keys.find({org_id: "org_XXX"}).pretty()'

Fix: if missing, run the recreate script (see docs/runbooks/api-key-recovery.md — not yet written; raise an issue) OR have the user re-onboard.


Manually activate a pending user

If SIGNUP_AUTO_ACTIVATE=False was set (intentionally or by accident) and accounts are stuck:

# Find pending users: mongosh "$MONGO_URI" --eval 'db.users.find({status: "pending"}, {_id: 1, email: 1, created_at: 1}).pretty()' # Activate via admin API (requires platform-admin token): curl -X POST "https://api.curate-me.ai/api/v1/platform/onboard/approve/usr_XXX" \ -H "Authorization: Bearer $PLATFORM_ADMIN_TOKEN"

The approval endpoint at onboard.py:558+ activates user + org + membership, generates the API key, and creates the governance policy.


Smoke test

The end-to-end smoke test at services/backend/tests/integration/test_signup_smoke.py:

  • POSTs to /api/v1/platform/onboard with a valid free-tier payload
  • Bypasses Turnstile (sets TURNSTILE_SECRET_KEY="" via autouse fixture)
  • Asserts response 200, status="active", api_key.startswith("cm_sk_")
  • Mocks heavy deps (auth service, DB, Stripe, governance policy, entitlements)

Run locally:

cd services/backend poetry run pytest tests/integration/test_signup_smoke.py -v

In CI: runs on every PR that touches services/backend/src/api/routes/platform/onboard.py (per .github/workflows/verify-claims.yml related triggers).


  • #2095 — original signup-flow audit + acceptance criteria
  • #2113 — lifecycle email (welcome, day-1, day-3, dormant) integrates with this flow once shipped
  • #2114 — free trial mechanics — when the user clicks “Start trial”, the same onboarding endpoint fires; this runbook covers the underlying mechanism
  • #2122 — agent identity (M365) is downstream consumer of signup events

Out of scope

This runbook does not cover:

  • Org provisioning (handled by b2b_auth_service.create_org)
  • API key rotation (separate runbook needed — file as follow-up)
  • Stripe customer creation (handled by metered_billing_service.create_customer)
  • Email verification flow specifics (see verify-email route handler)