
Runbook: PII Scan Blocking Requests

This runbook covers diagnosing and resolving false positive PII detections that block legitimate requests through the Curate-Me AI Gateway.


Symptoms

  • 403 responses with error code GW_PII_001 or pii_scan
  • Legitimate content (code snippets, medical data, test data) being blocked
  • Users reporting that requests work without the gateway but fail through it

Typical error response:

{ "error": { "message": "PII detected in request: credit_card, email, ssn", "type": "permission_error", "code": "pii_scan" } }

The error response includes a hint_data field with details about which patterns matched:

{ "error": { "message": "PII detected in request: credit_card", "type": "permission_error", "code": "pii_scan" }, "hint_data": { "findings": [ { "entity_type": "credit_card", "score": 0.95, "start": 142, "end": 161, "matched_text": "4111-XXXX-XXXX-1111" } ] } }

Step 1: Identify what was detected

Read the hint_data.findings array in the error response. Each finding includes:

  • entity_type: the PII category (e.g., credit_card, email, ssn, api_key)
  • score: confidence score (0.0 to 1.0)
  • start / end: character positions in the request body
  • matched_text: redacted version of the matched content

If you do not have the original error response, check the gateway audit log in the dashboard:

Dashboard > Gateway > Usage Log — filter by status 403 and error code pii_scan.
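For quick triage from the command line, the findings array can be summarized with jq. This is a sketch: the sample file below stands in for an error response you have saved from a blocked request, and mirrors the format shown above.

```shell
# Sample blocked-request response, saved from a failing call
cat > /tmp/pii_error.json <<'EOF'
{
  "error": {"message": "PII detected in request: credit_card", "type": "permission_error", "code": "pii_scan"},
  "hint_data": {"findings": [
    {"entity_type": "credit_card", "score": 0.95, "start": 142, "end": 161, "matched_text": "4111-XXXX-XXXX-1111"}
  ]}
}
EOF

# One line per finding: entity type, score, and character range
jq -r '.hint_data.findings[] | "\(.entity_type)\tscore=\(.score)\tchars \(.start)-\(.end)"' /tmp/pii_error.json
```

The character range points you at the exact span of the request body that matched, which is usually enough to decide whether the detection is real.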


Step 2: Determine if it is a false positive

Common false positive scenarios

  • Code snippets containing regex patterns (triggers: ssn, credit_card) — regex test data like \d{3}-\d{2}-\d{4} matches SSN patterns
  • Medical / healthcare prompts (triggers: ssn, email) — patient IDs and MRN numbers look like SSNs; physician emails appear in context
  • Financial analysis prompts (triggers: credit_card, iban) — example card numbers in documentation or test fixtures
  • API documentation in prompts (triggers: api_key) — example keys like sk-example123... match API key patterns
  • Email templates being drafted (triggers: email) — legitimate email addresses in the prompt content
  • International phone numbers (triggers: phone_number) — numbers in code or data that look like phone numbers

Step 3: Apply the appropriate fix

Fix A: Disable specific PII entity types for the org

If the organization legitimately handles a specific type of data (e.g., a healthcare app that processes medical records), disable detection for those entity types:

curl -X PATCH https://api.curate-me.ai/api/v1/admin/gateway/policy/$ORG_ID \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"pii_entity_allowlist": ["email", "phone_number"]}'

The pii_entity_allowlist specifies entity types that should be ignored. All other types are still scanned.

Available entity types:

  • email: Email addresses
  • phone_number: Phone numbers (US and international)
  • ssn: US Social Security Numbers
  • credit_card: Credit card numbers (Luhn-validated)
  • iban: International Bank Account Numbers
  • api_key: API keys and secrets (OpenAI, Anthropic, AWS, Stripe, GitHub)
  • password: Password patterns (password=..., passwd:...)
  • bearer_token: Bearer tokens in content
  • eu_passport: EU passport numbers
  • eu_vat: EU VAT numbers
  • uk_nino: UK National Insurance Numbers
  • de_id: German ID numbers
  • icd10: ICD-10 medical codes
  • medication: Medication dosage patterns
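Before applying an allowlist, it can help to check it against a real blocked response: any finding type not covered by the proposed allowlist would still block. A sketch with jq, using illustrative sample findings:

```shell
# Findings from a blocked request (illustrative sample)
cat > /tmp/pii_error.json <<'EOF'
{"hint_data": {"findings": [
  {"entity_type": "ssn", "score": 0.7},
  {"entity_type": "credit_card", "score": 0.9}
]}}
EOF

# Entity types you plan to allowlist
allowlist='["ssn", "credit_card"]'

# Subtract the allowlist from the observed finding types
jq -r --argjson allow "$allowlist" \
  '([.hint_data.findings[].entity_type] | unique) - $allow
   | if length == 0 then "allowlist covers all findings"
     else "still blocked on: " + join(", ") end' /tmp/pii_error.json
```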

Fix B: Switch PII action to log-only mode

If the organization wants PII detection for visibility but not enforcement, switch the action from BLOCK to ALLOW:

curl -X PATCH https://api.curate-me.ai/api/v1/admin/gateway/policy/$ORG_ID \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"pii_action": "ALLOW"}'

In ALLOW mode, PII findings are logged to the audit trail but requests are not blocked. The response includes an X-CM-PII-Warning header listing the detected entities.
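To confirm log-only mode is behaving as expected, capture the response headers and look for the warning header. A sketch: the saved file below stands in for output captured with `curl -sD /tmp/headers.txt ...`, and the header value is illustrative.

```shell
# Headers as saved from a real request with `curl -sD /tmp/headers.txt ...`
cat > /tmp/headers.txt <<'EOF'
HTTP/2 200
content-type: application/json
x-cm-pii-warning: credit_card, email
EOF

# A 200 status plus this header means findings were logged but not blocked
grep -i '^x-cm-pii-warning' /tmp/headers.txt
```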

Fix C: Adjust the PII severity threshold

By default, the scanner blocks on any finding with a confidence score above 0.5. Raise the threshold to reduce false positives:

curl -X PATCH https://api.curate-me.ai/api/v1/admin/gateway/policy/$ORG_ID \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"pii_score_threshold": 0.85}'

This means only high-confidence detections (score >= 0.85) will trigger blocking. Lower-confidence matches are logged but allowed through.
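When picking a new threshold, look at the scores of the false-positive findings and set the threshold just above the highest one, so that genuinely high-confidence matches still block. A sketch using jq on a saved error response (scores illustrative):

```shell
# Findings from the false-positive request (illustrative scores)
cat > /tmp/pii_error.json <<'EOF'
{"hint_data": {"findings": [
  {"entity_type": "ssn", "score": 0.62},
  {"entity_type": "credit_card", "score": 0.71}
]}}
EOF

# Highest score among the false-positive findings; choose a threshold above it
jq '[.hint_data.findings[].score] | max' /tmp/pii_error.json
```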

Fix D: Disable PII scanning entirely for the org

For organizations that have their own PII handling and do not need gateway-level scanning:

curl -X PATCH https://api.curate-me.ai/api/v1/admin/gateway/policy/$ORG_ID \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"pii_scan_enabled": false}'

Step 4: Verify the fix

After adjusting the PII policy, replay the request that was previously blocked:

curl -v https://api.curate-me.ai/v1/openai/chat/completions \
  -H "X-CM-API-Key: $API_KEY" \
  -H "Authorization: Bearer $OPENAI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "The same content that was previously blocked"}
    ]
  }' \
  2>&1 | grep -E "HTTP|x-cm-pii"

Expected outcomes:

  • BLOCK with entity allowlist: 200 OK; the previously blocked entity type is now ignored
  • ALLOW (log-only): 200 OK with an X-CM-PII-Warning header listing findings
  • Higher threshold: 200 OK if the finding's score is below the new threshold
  • Scanning disabled: 200 OK; no PII checking performed

Prevention

Test with representative data early

Before going to production, send sample prompts through the gateway that represent your actual workload. This catches false positives before they affect real traffic:

# Test a batch of representative prompts
for prompt_file in test-prompts/*.json; do
  status=$(curl -s -o /dev/null -w "%{http_code}" \
    https://api.curate-me.ai/v1/openai/chat/completions \
    -H "X-CM-API-Key: $API_KEY" \
    -H "Authorization: Bearer $OPENAI_KEY" \
    -H "Content-Type: application/json" \
    -d @"$prompt_file")
  echo "$prompt_file: $status"
done

Configure the PII entity allowlist proactively

If you know your application handles specific data types (emails in a CRM, card numbers in a payment processor), add them to the allowlist before going live.
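Combining the options above, a proactive policy body for, say, a payment processor might look like the following (field names are the ones documented in this runbook; the values are illustrative):

```
{
  "pii_entity_allowlist": ["credit_card", "email"],
  "pii_score_threshold": 0.85,
  "pii_action": "BLOCK"
}
```

Sending this as the PATCH body configures the allowlist and threshold in a single call before any real traffic hits the gateway.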

Use log-only mode during onboarding

Start with pii_action: "ALLOW" during the first week of integration. Review the PII findings in the audit log, then switch to BLOCK after confirming no false positives.


When PII detection is correct (not a false positive)

If the PII detection is legitimate — real secrets, real PII in prompts that should not be sent to an LLM provider:

  1. Do not disable the scanner. The detection is working as intended.
  2. Sanitize the input before sending it to the gateway:
    # Replace real data with placeholders
    "Customer email: john@example.com" --> "Customer email: [EMAIL_REDACTED]"
    "SSN: 123-45-6789" --> "SSN: [SSN_REDACTED]"
  3. Enable the Presidio NER engine (via the DLP_GUARDRAILS feature flag) for more accurate detection with fewer false positives than regex alone.
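The sanitization step above can be scripted with sed before the prompt ever reaches the gateway. A minimal sketch: the two patterns below cover only the email and SSN examples shown and are not production-grade PII detection.

```shell
# Prompt containing fake PII to be redacted client-side
prompt='Customer email: john@example.com SSN: 123-45-6789'

# Redact email addresses, then SSN-shaped numbers
sanitized=$(printf '%s' "$prompt" \
  | sed -E 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/[EMAIL_REDACTED]/g' \
  | sed -E 's/[0-9]{3}-[0-9]{2}-[0-9]{4}/[SSN_REDACTED]/g')

echo "$sanitized"
# -> Customer email: [EMAIL_REDACTED] SSN: [SSN_REDACTED]
```

Running the redacted prompt through the gateway keeps the scanner enabled while avoiding blocks on data the model never needed to see.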

Escalation

If PII false positives cannot be resolved with the above configuration options:

  1. Collect the full error response including hint_data
  2. Note the exact content that triggered the false positive (redact if sensitive)
  3. Report the false positive pattern so the PII scanner regex can be refined
  4. Contact the platform team with the org ID, entity type, and a redacted example of the false positive