Runbook: PII Scan Blocking Requests
This runbook covers diagnosing and resolving false positive PII detections that block legitimate requests through the Curate-Me AI Gateway.
Symptoms
- 403 responses with error code `GW_PII_001` or `pii_scan`
- Legitimate content (code snippets, medical data, test data) being blocked
- Users reporting that requests work without the gateway but fail through it
Typical error response:
```json
{
  "error": {
    "message": "PII detected in request: credit_card, email, ssn",
    "type": "permission_error",
    "code": "pii_scan"
  }
}
```

The error response includes a `hint_data` field with details about which patterns matched:
```json
{
  "error": {
    "message": "PII detected in request: credit_card",
    "type": "permission_error",
    "code": "pii_scan"
  },
  "hint_data": {
    "findings": [
      {
        "entity_type": "credit_card",
        "score": 0.95,
        "start": 142,
        "end": 161,
        "matched_text": "4111-XXXX-XXXX-1111"
      }
    ]
  }
}
```

Step 1: Identify what was detected
Read the `hint_data.findings` array in the error response. Each finding includes:
| Field | Description |
|---|---|
| `entity_type` | The PII category (e.g., `credit_card`, `email`, `ssn`, `api_key`) |
| `score` | Confidence score (0.0 to 1.0) |
| `start` / `end` | Character positions in the request body |
| `matched_text` | Redacted version of the matched content |
If you do not have the original error response, check the gateway audit log in the dashboard:
Dashboard > Gateway > Usage Log — filter by status 403 and error code pii_scan.
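To triage quickly, the findings can be pulled out of the error body programmatically. A minimal Python sketch, using the response shape shown above:

```python
import json

# Example 403 body in the shape documented above
error_body = """
{
  "error": {"message": "PII detected in request: credit_card",
            "type": "permission_error", "code": "pii_scan"},
  "hint_data": {
    "findings": [
      {"entity_type": "credit_card", "score": 0.95,
       "start": 142, "end": 161, "matched_text": "4111-XXXX-XXXX-1111"}
    ]
  }
}
"""

def summarize_findings(body: str) -> list[tuple[str, float]]:
    """Return (entity_type, score) pairs from a pii_scan error response."""
    findings = json.loads(body).get("hint_data", {}).get("findings", [])
    return [(f["entity_type"], f["score"]) for f in findings]

print(summarize_findings(error_body))  # → [('credit_card', 0.95)]
```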
Step 2: Determine if it is a false positive
Common false positive scenarios
| Scenario | PII Type Triggered | Why It Happens |
|---|---|---|
| Code snippets containing regex patterns | ssn, credit_card | Regex test data like \d{3}-\d{2}-\d{4} matches SSN patterns |
| Medical / healthcare prompts | ssn, email | Patient IDs, MRN numbers look like SSNs; physician emails in context |
| Financial analysis prompts | credit_card, iban | Example card numbers in documentation or test fixtures |
| API documentation in prompts | api_key | Example keys like sk-example123... match API key patterns |
| Email templates being drafted | email | Legitimate email addresses in the prompt content |
| International phone numbers | phone_number | Numbers in code or data that look like phone numbers |
Step 3: Apply the appropriate fix
Fix A: Disable specific PII entity types for the org
If the organization legitimately handles a specific type of data (e.g., a healthcare app that processes medical records), disable detection for those entity types:
```bash
curl -X PATCH https://api.curate-me.ai/api/v1/admin/gateway/policy/$ORG_ID \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "pii_entity_allowlist": ["email", "phone_number"]
  }'
```

The `pii_entity_allowlist` specifies entity types that should be ignored. All other types are still scanned.
Available entity types:
| Entity Type | Description |
|---|---|
| `email` | Email addresses |
| `phone_number` | Phone numbers (US and international) |
| `ssn` | US Social Security Numbers |
| `credit_card` | Credit card numbers (Luhn-validated) |
| `iban` | International Bank Account Numbers |
| `api_key` | API keys and secrets (OpenAI, Anthropic, AWS, Stripe, GitHub) |
| `password` | Password patterns (`password=...`, `passwd:...`) |
| `bearer_token` | Bearer tokens in content |
| `eu_passport` | EU passport numbers |
| `eu_vat` | EU VAT numbers |
| `uk_nino` | UK National Insurance Numbers |
| `de_id` | German ID numbers |
| `icd10` | ICD-10 medical codes |
| `medication` | Medication dosage patterns |
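Before adding entries to the allowlist, it helps to see which entity types actually recur in blocked requests. A minimal sketch that aggregates findings from the audit log (the sample findings below are hypothetical; pull real ones from the dashboard export):

```python
from collections import Counter

# Hypothetical sample: entity types from recent pii_scan 403s in the audit log
blocked_findings = [
    {"entity_type": "ssn", "score": 0.60},
    {"entity_type": "email", "score": 0.90},
    {"entity_type": "email", "score": 0.80},
    {"entity_type": "credit_card", "score": 0.95},
]

def propose_allowlist(findings, min_hits=2):
    """Suggest entity types that repeatedly trigger blocks and may be
    candidates for pii_entity_allowlist (review each before applying)."""
    counts = Counter(f["entity_type"] for f in findings)
    return sorted(t for t, n in counts.items() if n >= min_hits)

print(propose_allowlist(blocked_findings))  # → ['email']
```

Treat the output as a review queue, not an automatic configuration change: a frequently triggered entity type may also indicate real PII in traffic.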
Fix B: Switch PII action to log-only mode
If the organization wants PII detection for visibility but not enforcement, switch the action from BLOCK to ALLOW:
```bash
curl -X PATCH https://api.curate-me.ai/api/v1/admin/gateway/policy/$ORG_ID \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"pii_action": "ALLOW"}'
```

In ALLOW mode, PII findings are logged to the audit trail but requests are not blocked. The response includes an `X-CM-PII-Warning` header listing detected entities.
Fix C: Adjust the PII severity threshold
By default, the scanner blocks on any finding with a confidence score above 0.5. Raise the threshold to reduce false positives:
```bash
curl -X PATCH https://api.curate-me.ai/api/v1/admin/gateway/policy/$ORG_ID \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"pii_score_threshold": 0.85}'
```

This means only high-confidence detections (score >= 0.85) will trigger blocking. Lower-confidence matches are logged but allowed through.
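The effect of a threshold change can be sanity-checked offline against the findings from a blocked request. A minimal sketch, assuming the gateway blocks on score >= threshold as described above (the sample scores are illustrative):

```python
def would_block(findings, threshold=0.5):
    """Return the findings that would still trigger a block at the
    given pii_score_threshold (score >= threshold blocks)."""
    return [f for f in findings if f["score"] >= threshold]

findings = [
    {"entity_type": "ssn", "score": 0.62},         # weak regex-only match
    {"entity_type": "credit_card", "score": 0.95}, # Luhn-validated match
]

# Default threshold (0.5) blocks on both findings
print([f["entity_type"] for f in would_block(findings, 0.5)])
# → ['ssn', 'credit_card']

# Raised threshold (0.85) lets the weak ssn match through
print([f["entity_type"] for f in would_block(findings, 0.85)])
# → ['credit_card']
```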
Fix D: Disable PII scanning entirely for the org
For organizations that have their own PII handling and do not need gateway-level scanning:
```bash
curl -X PATCH https://api.curate-me.ai/api/v1/admin/gateway/policy/$ORG_ID \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"pii_scan_enabled": false}'
```

Step 4: Verify the fix
After adjusting the PII policy, replay the request that was previously blocked:
```bash
curl -v https://api.curate-me.ai/v1/openai/chat/completions \
  -H "X-CM-API-Key: $API_KEY" \
  -H "Authorization: Bearer $OPENAI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "The same content that was previously blocked"}
    ]
  }' \
  2>&1 | grep -E "HTTP|x-cm-pii"
```

Expected outcomes:
| PII Action | Expected Behavior |
|---|---|
| BLOCK with entity allowlist | 200 OK — previously blocked entity type is now ignored |
| ALLOW (log-only) | 200 OK with `X-CM-PII-Warning` header listing findings |
| Higher threshold | 200 OK if the finding’s score is below the new threshold |
| Scanning disabled | 200 OK — no PII checking performed |
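When verifying log-only mode, the warning header can be parsed into a list of entity types. A minimal sketch; it assumes the `X-CM-PII-Warning` value is a comma-separated list of entity names, so verify against the actual header format your gateway emits:

```python
def parse_pii_warning(header_value: str) -> list[str]:
    """Split an X-CM-PII-Warning header into entity types.
    Assumes a comma-separated list (e.g. "email, ssn")."""
    return [part.strip() for part in header_value.split(",") if part.strip()]

print(parse_pii_warning("email, ssn"))  # → ['email', 'ssn']
```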
Prevention
Test with representative data early
Before going to production, send sample prompts through the gateway that represent your actual workload. This catches false positives before they affect real traffic:
```bash
# Test a batch of representative prompts
for prompt_file in test-prompts/*.json; do
  status=$(curl -s -o /dev/null -w "%{http_code}" \
    https://api.curate-me.ai/v1/openai/chat/completions \
    -H "X-CM-API-Key: $API_KEY" \
    -H "Authorization: Bearer $OPENAI_KEY" \
    -H "Content-Type: application/json" \
    -d @"$prompt_file")
  echo "$prompt_file: $status"
done
```

Configure the PII entity allowlist proactively
If you know your application handles specific data types (emails in a CRM, card numbers in a payment processor), add them to the allowlist before going live.
Use log-only mode during onboarding
Start with pii_action: "ALLOW" during the first week of integration. Review the PII findings in the audit log, then switch to BLOCK after confirming no false positives.
When PII detection is correct (not a false positive)
If the PII detection is legitimate — real secrets, real PII in prompts that should not be sent to an LLM provider:
- Do not disable the scanner. The detection is working as intended.
- Sanitize the input before sending it to the gateway:
```
# Replace real data with placeholders
"Customer email: john@example.com" --> "Customer email: [EMAIL_REDACTED]"
"SSN: 123-45-6789" --> "SSN: [SSN_REDACTED]"
```
- Enable the Presidio NER engine (via the `DLP_GUARDRAILS` feature flag) for more accurate detection with fewer false positives than regex alone.
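The sanitization step above can be sketched as a small pre-processing pass. These two regexes are illustrative only; production redaction should use a proper PII library (e.g. Presidio) rather than hand-rolled patterns:

```python
import re

# Illustrative patterns mirroring the placeholder examples above
REDACTIONS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), "[EMAIL_REDACTED]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN_REDACTED]"),
]

def sanitize(text: str) -> str:
    """Replace emails and SSN-shaped strings with placeholders
    before the prompt is sent through the gateway."""
    for pattern, placeholder in REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text

print(sanitize("Customer email: john@example.com, SSN: 123-45-6789"))
# → Customer email: [EMAIL_REDACTED], SSN: [SSN_REDACTED]
```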
Escalation
If PII false positives cannot be resolved with the above configuration options:
- Collect the full error response, including `hint_data`
- Note the exact content that triggered the false positive (redact if sensitive)
- Report the false positive pattern so the PII scanner regex can be refined
- Contact the platform team with the org ID, entity type, and a redacted example of the false positive