
Runbook: PII Scan Blocking Requests

This runbook covers diagnosing and resolving false positive PII detections that block legitimate requests through the Curate-Me AI Gateway.


Symptoms

  • 403 responses with error code GW_PII_001 or pii_scan
  • Legitimate content (code snippets, medical data, test data) being blocked
  • Users reporting that requests work without the gateway but fail through it

Typical error response:

{ "error": { "message": "PII detected in request: credit_card, email, ssn", "type": "permission_error", "code": "pii_scan" } }

The error response includes a hint_data field with details about which patterns matched:

{ "error": { "message": "PII detected in request: credit_card", "type": "permission_error", "code": "pii_scan" }, "hint_data": { "findings": [ { "entity_type": "credit_card", "score": 0.95, "start": 142, "end": 161, "matched_text": "4111-XXXX-XXXX-1111" } ] } }

Step 1: Identify what was detected

Read the hint_data.findings array in the error response. Each finding includes:

  • entity_type: the PII category (e.g., credit_card, email, ssn, api_key)
  • score: confidence score (0.0 to 1.0)
  • start / end: character positions in the request body
  • matched_text: redacted version of the matched content

If you do not have the original error response, check the gateway audit log in the dashboard:

Dashboard > Gateway > Usage Log — filter by status 403 and error code pii_scan.
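For quick triage from the command line, the findings array can be summarized with jq. This is a sketch: the sample file below stands in for an error response you have saved from a blocked request, and mirrors the format shown above.

```shell
# Sample blocked-request response, saved from a failing call
cat > /tmp/pii_error.json <<'EOF'
{
  "error": {"message": "PII detected in request: credit_card", "type": "permission_error", "code": "pii_scan"},
  "hint_data": {"findings": [
    {"entity_type": "credit_card", "score": 0.95, "start": 142, "end": 161, "matched_text": "4111-XXXX-XXXX-1111"}
  ]}
}
EOF

# One line per finding: entity type, score, and character range
jq -r '.hint_data.findings[] | "\(.entity_type)\tscore=\(.score)\tchars \(.start)-\(.end)"' /tmp/pii_error.json
```

The character range points you at the exact span of the request body that matched, which is usually enough to decide whether the detection is real.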


Step 2: Determine if it is a false positive

Common false positive scenarios

  • Code snippets containing regex patterns (triggers: ssn, credit_card) — regex test data like \d{3}-\d{2}-\d{4} matches SSN patterns
  • Medical / healthcare prompts (triggers: ssn, email) — patient IDs and MRN numbers look like SSNs; physician emails appear in context
  • Financial analysis prompts (triggers: credit_card, iban) — example card numbers in documentation or test fixtures
  • API documentation in prompts (triggers: api_key) — example keys like sk-example123... match API key patterns
  • Email templates being drafted (triggers: email) — legitimate email addresses in the prompt content
  • International phone numbers (triggers: phone_number) — numbers in code or data that look like phone numbers

Step 3: Apply the appropriate fix

Fix A: Disable specific PII entity types for the org

If the organization legitimately handles a specific type of data (e.g., a healthcare app that processes medical records), disable detection for those entity types:

curl -X PATCH https://api.curate-me.ai/api/v1/admin/gateway/policy/$ORG_ID \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"pii_entity_allowlist": ["email", "phone_number"]}'

The pii_entity_allowlist specifies entity types that should be ignored. All other types are still scanned.

Available entity types:

  • email: Email addresses
  • phone_number: Phone numbers (US and international)
  • ssn: US Social Security Numbers
  • credit_card: Credit card numbers (Luhn-validated)
  • iban: International Bank Account Numbers
  • api_key: API keys and secrets (OpenAI, Anthropic, AWS, Stripe, GitHub)
  • password: Password patterns (password=..., passwd:...)
  • bearer_token: Bearer tokens in content
  • eu_passport: EU passport numbers
  • eu_vat: EU VAT numbers
  • uk_nino: UK National Insurance Numbers
  • de_id: German ID numbers
  • icd10: ICD-10 medical codes
  • medication: Medication dosage patterns
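Before applying an allowlist, it can help to check it against a real blocked response: any finding type not covered by the proposed allowlist would still block. A sketch with jq, using illustrative sample findings:

```shell
# Findings from a blocked request (illustrative sample)
cat > /tmp/pii_error.json <<'EOF'
{"hint_data": {"findings": [
  {"entity_type": "ssn", "score": 0.7},
  {"entity_type": "credit_card", "score": 0.9}
]}}
EOF

# Entity types you plan to allowlist
allowlist='["ssn", "credit_card"]'

# Subtract the allowlist from the observed finding types
jq -r --argjson allow "$allowlist" \
  '([.hint_data.findings[].entity_type] | unique) - $allow
   | if length == 0 then "allowlist covers all findings"
     else "still blocked on: " + join(", ") end' /tmp/pii_error.json
```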

Fix B: Switch PII action to log-only mode

If the organization wants PII detection for visibility but not enforcement, switch the action from BLOCK to ALLOW:

curl -X PATCH https://api.curate-me.ai/api/v1/admin/gateway/policy/$ORG_ID \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"pii_action": "ALLOW"}'

In ALLOW mode, PII findings are logged to the audit trail but requests are not blocked. The response includes an X-CM-PII-Warning header listing the detected entities.
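To confirm log-only mode is behaving as expected, capture the response headers and look for the warning header. A sketch: the saved file below stands in for output captured with `curl -sD /tmp/headers.txt ...`, and the header value is illustrative.

```shell
# Headers as saved from a real request with `curl -sD /tmp/headers.txt ...`
cat > /tmp/headers.txt <<'EOF'
HTTP/2 200
content-type: application/json
x-cm-pii-warning: credit_card, email
EOF

# A 200 status plus this header means findings were logged but not blocked
grep -i '^x-cm-pii-warning' /tmp/headers.txt
```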

Fix C: Adjust the PII severity threshold

By default, the scanner blocks on any finding with a confidence score above 0.5. Raise the threshold to reduce false positives:

curl -X PATCH https://api.curate-me.ai/api/v1/admin/gateway/policy/$ORG_ID \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"pii_score_threshold": 0.85}'

This means only high-confidence detections (score >= 0.85) will trigger blocking. Lower-confidence matches are logged but allowed through.
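When picking a new threshold, look at the scores of the false-positive findings and set the threshold just above the highest one, so that genuinely high-confidence matches still block. A sketch using jq on a saved error response (scores illustrative):

```shell
# Findings from the false-positive request (illustrative scores)
cat > /tmp/pii_error.json <<'EOF'
{"hint_data": {"findings": [
  {"entity_type": "ssn", "score": 0.62},
  {"entity_type": "credit_card", "score": 0.71}
]}}
EOF

# Highest score among the false-positive findings; choose a threshold above it
jq '[.hint_data.findings[].score] | max' /tmp/pii_error.json
```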

Fix D: Disable PII scanning entirely for the org

For organizations that have their own PII handling and do not need gateway-level scanning:

curl -X PATCH https://api.curate-me.ai/api/v1/admin/gateway/policy/$ORG_ID \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"pii_scan_enabled": false}'

Step 4: Verify the fix

After adjusting the PII policy, replay the request that was previously blocked:

curl -v https://api.curate-me.ai/v1/openai/chat/completions \
  -H "X-CM-API-Key: $API_KEY" \
  -H "Authorization: Bearer $OPENAI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "The same content that was previously blocked"}
    ]
  }' \
  2>&1 | grep -E "HTTP|x-cm-pii"

Expected outcomes:

  • BLOCK with entity allowlist: 200 OK; the previously blocked entity type is now ignored
  • ALLOW (log-only): 200 OK with an X-CM-PII-Warning header listing findings
  • Higher threshold: 200 OK if the finding's score is below the new threshold
  • Scanning disabled: 200 OK; no PII checking performed

Prevention

Test with representative data early

Before going to production, send sample prompts through the gateway that represent your actual workload. This catches false positives before they affect real traffic:

# Test a batch of representative prompts
for prompt_file in test-prompts/*.json; do
  status=$(curl -s -o /dev/null -w "%{http_code}" \
    https://api.curate-me.ai/v1/openai/chat/completions \
    -H "X-CM-API-Key: $API_KEY" \
    -H "Authorization: Bearer $OPENAI_KEY" \
    -H "Content-Type: application/json" \
    -d @"$prompt_file")
  echo "$prompt_file: $status"
done

Configure the PII entity allowlist proactively

If you know your application handles specific data types (emails in a CRM, card numbers in a payment processor), add them to the allowlist before going live.
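Combining the options above, a proactive policy body for, say, a payment processor might look like the following (field names are the ones documented in this runbook; the values are illustrative):

```
{
  "pii_entity_allowlist": ["credit_card", "email"],
  "pii_score_threshold": 0.85,
  "pii_action": "BLOCK"
}
```

Sending this as the PATCH body configures the allowlist and threshold in a single call before any real traffic hits the gateway.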

Use log-only mode during onboarding

Start with pii_action: "ALLOW" during the first week of integration. Review the PII findings in the audit log, then switch to BLOCK after confirming no false positives.


When PII detection is correct (not a false positive)

If the PII detection is legitimate — real secrets, real PII in prompts that should not be sent to an LLM provider:

  1. Do not disable the scanner. The detection is working as intended.
  2. Sanitize the input before sending it to the gateway:
    # Replace real data with placeholders
    "Customer email: john@example.com" --> "Customer email: [EMAIL_REDACTED]"
    "SSN: 123-45-6789" --> "SSN: [SSN_REDACTED]"
  3. Enable the Presidio NER engine (via the DLP_GUARDRAILS feature flag) for more accurate detection with fewer false positives than regex alone.
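The sanitization step above can be scripted with sed before the prompt ever reaches the gateway. A minimal sketch: the two patterns below cover only the email and SSN examples shown and are not production-grade PII detection.

```shell
# Prompt containing fake PII to be redacted client-side
prompt='Customer email: john@example.com SSN: 123-45-6789'

# Redact email addresses, then SSN-shaped numbers
sanitized=$(printf '%s' "$prompt" \
  | sed -E 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/[EMAIL_REDACTED]/g' \
  | sed -E 's/[0-9]{3}-[0-9]{2}-[0-9]{4}/[SSN_REDACTED]/g')

echo "$sanitized"
# -> Customer email: [EMAIL_REDACTED] SSN: [SSN_REDACTED]
```

Running the redacted prompt through the gateway keeps the scanner enabled while avoiding blocks on data the model never needed to see.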

Escalation

If PII false positives cannot be resolved with the above configuration options:

  1. Collect the full error response including hint_data
  2. Note the exact content that triggered the false positive (redact if sensitive)
  3. Report the false positive pattern so the PII scanner regex can be refined
  4. Contact the platform team with the org ID, entity type, and a redacted example of the false positive