Cost Accumulation Falls Behind
Symptoms
- Dashboard shows lower daily spend than expected
- Budget warnings trigger late or not at all
- Metered billing reports don’t match gateway logs
Likely Causes
- Redis connection issues: cost accumulation uses INCRBYFLOAT, which requires Redis
- Non-atomic fallback active: if Redis INCRBYFLOAT is unavailable, the fallback read-then-write path can lose increments under concurrency
- MongoDB write failures: usage records failed to persist, triggering the dead-letter queue
- High request volume: cost recording is async but can back up under extreme load
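The lost-increment cause is easier to recognize with a concrete interleaving. The sketch below is illustrative only: it uses a plain dict as a stand-in for the counter store (not a real Redis client), and the names store, interleaved_fallback, and atomic_path are hypothetical. It shows how two concurrent requests on a read-then-write path can drop one cost, while an atomic increment in the style of INCRBYFLOAT keeps both.

```python
# Stand-in for the counter store. This is a plain dict, not a Redis client;
# all names here are hypothetical and exist only for illustration.
store = {"daily_cost": 0.0}


def read(key):
    return store[key]


def write(key, value):
    store[key] = value


def atomic_incr(key, amount):
    # Models an atomic increment such as INCRBYFLOAT: the read and the
    # write happen as one indivisible step, so no update can be overwritten.
    store[key] += amount
    return store[key]


def interleaved_fallback(cost_a, cost_b):
    """Read-then-write where both requests read before either one writes."""
    store["daily_cost"] = 0.0
    seen_by_a = read("daily_cost")            # request A reads 0.0
    seen_by_b = read("daily_cost")            # request B also reads 0.0
    write("daily_cost", seen_by_a + cost_a)   # A writes its total
    write("daily_cost", seen_by_b + cost_b)   # B overwrites A's update
    return store["daily_cost"]


def atomic_path(cost_a, cost_b):
    """The same two requests using the atomic increment."""
    store["daily_cost"] = 0.0
    atomic_incr("daily_cost", cost_a)
    atomic_incr("daily_cost", cost_b)
    return store["daily_cost"]


print(interleaved_fallback(0.25, 0.5))  # 0.5, request A's 0.25 is lost
print(atomic_path(0.25, 0.5))           # 0.75, both costs counted
```

Because the lost update always makes the counter too low, this cause matches the symptom of the dashboard showing less spend than expected.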
Triage Steps
1. Check Redis connectivity
./scripts/analytics health
# Look for: redis_connected: true, redis_latency_ms < 10
2. Check for non-atomic fallback warnings
# Search backend logs for the fallback warning
./scripts/errors by-source gateway | grep "non_atomic_fallback"
3. Check dead-letter queue
# Check if failed usage records are accumulating
redis-cli LLEN gateway:dlq:usage_records
4. Compare Redis vs MongoDB totals
./scripts/analytics costs today
# Compare redis_daily_total vs mongodb_daily_total
Resolution
Redis connection restored
The counters will self-heal as new requests increment correctly. For the gap period, recalculate from MongoDB:
./scripts/analytics costs reconcile
Dead-letter queue processing
Failed records can be replayed:
./scripts/analytics costs replay-dlq
Escalation
If cost drift exceeds 10% of daily spend, escalate to on-call.
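As a rough aid for applying the 10% rule, the drift can be computed from the two totals reported in triage step 4. This is a minimal sketch under stated assumptions: the runbook does not define which total counts as "daily spend", so the MongoDB total is assumed to be the reference here, and the function names drift_percent and should_escalate are hypothetical.

```python
def drift_percent(redis_daily_total, mongodb_daily_total):
    """Absolute drift as a percentage of the MongoDB total.

    Assumption: the MongoDB total is treated as the reference value for
    "daily spend"; the runbook itself does not specify this.
    """
    if mongodb_daily_total == 0:
        # No reference spend: report zero drift only if both totals are zero.
        return 0.0 if redis_daily_total == 0 else float("inf")
    difference = abs(redis_daily_total - mongodb_daily_total)
    return difference * 100 / mongodb_daily_total


def should_escalate(redis_daily_total, mongodb_daily_total, threshold_percent=10.0):
    """True when drift strictly exceeds the threshold (10% by default)."""
    return drift_percent(redis_daily_total, mongodb_daily_total) > threshold_percent


# Example: Redis shows 85.00 while MongoDB shows 100.00.
print(drift_percent(85.0, 100.0))    # 15.0
print(should_escalate(85.0, 100.0))  # True
```

The comparison is strict, so a drift of exactly 10% does not trigger escalation under this reading of "exceeds".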