Troubleshooting Your Machines

Common issues you may encounter when setting up and running connected machines, and how to fix them.

Registration Issues

Machine stuck in REGISTERING state

Symptom: Machine appears in the dashboard but never transitions to ONLINE.

Causes and fixes:

  1. Network connectivity — Verify the machine can reach the gateway:

    docker exec curateme-agent curl -s https://api.curate-me.ai/health

    If this fails, check your firewall rules allow outbound HTTPS (port 443).

  2. Expired token — Registration tokens expire after 1 hour. Generate a new one from the dashboard and restart the container with the new token.

  3. DNS resolution — Ensure DNS works inside the container:

    docker exec curateme-agent nslookup api.curate-me.ai
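The network and DNS checks above can be combined into one helper that reports the first failing step. This is a minimal sketch, not part of any shipped tooling: the function name `check_registration` and the `CONTAINER` and `GATEWAY_HOST` variables are hypothetical, and it assumes `nslookup` and `curl` are available inside the container, as the commands above imply. DNS is tested first because a DNS failure would also make the HTTPS check fail.

```shell
# Hypothetical helper: run the DNS and HTTPS checks in order and
# report the first one that fails. Names here are illustrative only.
CONTAINER="${CONTAINER:-curateme-agent}"
GATEWAY_HOST="${GATEWAY_HOST:-api.curate-me.ai}"

check_registration() {
  # Step 1: DNS resolution inside the container.
  if ! docker exec "$CONTAINER" nslookup "$GATEWAY_HOST" >/dev/null 2>&1; then
    echo "dns: cannot resolve $GATEWAY_HOST inside $CONTAINER"
    return 1
  fi
  # Step 2: outbound HTTPS to the gateway health endpoint.
  # -f makes curl exit non-zero on HTTP error responses.
  if ! docker exec "$CONTAINER" curl -sf "https://$GATEWAY_HOST/health" >/dev/null 2>&1; then
    echo "network: cannot reach https://$GATEWAY_HOST/health (check outbound port 443)"
    return 2
  fi
  # Both checks passed; the remaining listed cause is an expired token.
  echo "ok: DNS and HTTPS work; if still REGISTERING, generate a new token"
  return 0
}
```

Call `check_registration` from a shell on the Docker host. An exit status of 1 points at DNS, 2 at outbound connectivity, and 0 suggests the token is the more likely cause.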

“Token already used” error

Each registration token is single-use. If the container crashed after using the token but before completing registration, you need to:

  1. Remove the old container: docker rm -f curateme-agent
  2. Generate a new token from the dashboard
  3. Start a new container with the fresh token

“Token expired” error

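Tokens are valid for 1 hour unless a longer TTL is requested. Generate a new one and try again. To request a longer TTL, pass ttl_seconds when generating the token through the API (7200 seconds is 2 hours):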

curl -X POST https://api.curate-me.ai/gateway/admin/byovm/register-token \
  -H "X-CM-API-Key: cm_sk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"ttl_seconds": 7200}'
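Because ttl_seconds is expressed in seconds, it can help to derive the request body from a value in hours. The sketch below assumes a hypothetical helper named `ttl_payload`. It only builds the JSON body, and any upper limit the gateway enforces on the TTL is not covered here.

```shell
# Hypothetical helper: build the register-token request body from a TTL
# given in whole hours (1 hour = 3600 seconds).
ttl_payload() {
  hours="$1"
  printf '{"ttl_seconds": %d}\n' "$((hours * 3600))"
}

# Usage (uncomment to send; the API key is a placeholder):
# curl -X POST https://api.curate-me.ai/gateway/admin/byovm/register-token \
#   -H "X-CM-API-Key: cm_sk_your_key_here" \
#   -H "Content-Type: application/json" \
#   -d "$(ttl_payload 2)"
```

With an argument of 2, the helper prints the same body used in the request above.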

Connection Issues

Machine shows STALE or DEAD

Symptom: Machine was ONLINE but is now STALE (5+ min no heartbeat) or DEAD (15+ min).

Fixes:

  1. Check that the container is running:

    docker ps | grep curateme-agent

  2. Check the container logs:

    docker logs curateme-agent --tail 50

  3. Restart the container:

    docker restart curateme-agent

  4. Check network — Temporary network issues cause missed heartbeats. The agent auto-recovers when connectivity returns.
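The heartbeat thresholds from the symptom description (STALE at 5 minutes, DEAD at 15 minutes) can be written as a small classifier, which is handy when reading heartbeat timestamps from logs. This is an illustrative sketch only: `heartbeat_state` is a hypothetical name, and the gateway's own state machine may apply further rules that are not documented here.

```shell
# Hypothetical helper: map "seconds since last heartbeat" to the machine
# state described above (STALE at 5+ minutes, DEAD at 15+ minutes).
heartbeat_state() {
  age="$1"   # seconds since the last heartbeat
  if [ "$age" -ge 900 ]; then
    echo DEAD
  elif [ "$age" -ge 300 ]; then
    echo STALE
  else
    echo ONLINE
  fi
}
```

For example, `heartbeat_state 420` prints STALE, because 7 minutes falls between the two thresholds.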

WebSocket connection drops

Symptom: Real-time features stop working, dashboard shows stale data.

Cause: Corporate proxies or firewalls may terminate long-lived WebSocket connections.

Fix: The agent falls back to HTTP polling automatically. For better real-time performance, ensure WebSocket connections to wss://api.curate-me.ai are allowed through your network.

Resource Issues

High CPU/memory from agent container

The OpenClaw container is lightweight (~200 MB idle). High resource usage usually means an agent is actively processing a job.

To check:

docker stats curateme-agent --no-stream

To limit resources:

docker run -d \
  --name curateme-agent \
  --cpus=2 \
  --memory=4g \
  -e CM_REGISTRATION_TOKEN="..." \
  -e CM_GATEWAY_URL="https://api.curate-me.ai" \
  ghcr.io/curate-me-ai/openclaw-base:latest

Disk space running low

The agent stores session data and logs locally. Clean up old data:

# Remove stopped containers
docker container prune

# Remove unused images
docker image prune

# Check container disk usage
docker system df

Job Execution Issues

Jobs stuck in “queued” state

Cause: No ONLINE machines available to pick up the job.

Fixes:

  1. Verify at least one machine shows ONLINE in the dashboard
  2. Check the agent is polling: docker logs curateme-agent | grep "poll"
  3. Ensure the job’s required_labels match a machine’s capability labels
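To compare a job's required_labels with a machine's labels by hand, a subset check is one plausible reading of "match". The sketch below assumes that every required label must appear among the machine's labels. The source does not spell out the exact rule, so treat this as an assumption. The function name `labels_match` and the sample labels are hypothetical.

```shell
# Hypothetical helper: succeed only if every required label is present in
# the machine's label list. Both arguments are space-separated lists.
# Assumes subset semantics, which the gateway may or may not use.
labels_match() {
  required="$1"
  machine="$2"
  for label in $required; do
    case " $machine " in
      *" $label "*) ;;                        # present, keep checking
      *) echo "missing: $label"; return 1 ;;  # first absent label
    esac
  done
  return 0
}
```

For instance, `labels_match "git browser" "git node"` prints `missing: browser` and returns a non-zero status, which would explain a job that stays queued under this assumption.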

Jobs failing repeatedly

Cause: Command execution errors on the machine.

To debug:

  1. Check the job’s last_error field in the dashboard (Sessions tab)
  2. Check agent logs: docker logs curateme-agent --tail 100
  3. Verify the required tools are available in your container image:
    • openclaw-base: Shell, git, Node.js
    • openclaw-web: Browser (Playwright)
    • openclaw-locked: No external tools

Job timeout

Jobs that remain in the dispatched state for over 5 minutes are automatically requeued. If jobs consistently time out:

  1. Increase timeout in the dispatch request
  2. Check machine health — a slow heartbeat means the machine may be overloaded
  3. Scale the fleet — add more machines to distribute load

Governance Issues

“Rate limit exceeded” errors

Your machine’s LLM requests are being throttled by the governance chain.

Fix: Increase the rate limit in the dashboard under Runners > Your Machines > Machine > Policies, or through the API:

curl -X PUT https://api.curate-me.ai/gateway/admin/byovm/agents/{agent_id}/policies \
  -H "X-CM-API-Key: cm_sk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"rate_limit_rpm": 300}'

“Daily budget exceeded” errors

The machine has hit the daily spending cap.

Fix: Increase the daily budget, or wait for the reset at midnight UTC. To raise the budget through the API:

curl -X PUT https://api.curate-me.ai/gateway/admin/byovm/agents/{agent_id}/policies \
  -H "X-CM-API-Key: cm_sk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"daily_budget_usd": 50.0}'
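If waiting for the reset is an option, the time remaining until midnight UTC follows directly from the Unix epoch, since epoch days begin at 00:00 UTC. A minimal sketch, assuming a hypothetical helper named `seconds_until_utc_reset` and that the reset happens exactly at midnight UTC as stated above:

```shell
# Hypothetical helper: seconds remaining until the next midnight UTC.
# Takes an optional epoch timestamp; defaults to the current time.
seconds_until_utc_reset() {
  now="${1:-$(date +%s)}"
  # A day is 86400 seconds; "now % 86400" is the time elapsed since 00:00 UTC.
  echo "$((86400 - now % 86400))"
}
```

Called with 1700000000 (22:13:20 UTC), it prints 6400, which is 1 hour 46 minutes 40 seconds. Called exactly at midnight, it returns a full day (86400), the time to the next reset.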

PII detected in requests

The governance chain is blocking requests that contain potential PII or secrets.

To investigate: Check the gateway logs (Gateway > Logs) for the specific PII patterns detected. Common triggers:

  • API keys or tokens in prompts
  • Email addresses
  • Phone numbers
  • Social security numbers

To allow: If these are intentional (e.g., processing customer data), adjust PII scanning rules in Gateway > Policies.

Container-Specific Issues

macOS Docker Desktop performance

Docker Desktop on macOS runs containers inside a Linux VM, which adds overhead.

Tips:

  • Allocate at least 4 GB RAM to Docker Desktop
  • Use --platform linux/amd64 if on Apple Silicon for best compatibility
  • Consider running the agent natively if Docker overhead is an issue

Windows container issues

Windows containers require Docker Desktop with Windows containers enabled.

Common fixes:

  • Switch Docker Desktop to Windows containers mode
  • Use ghcr.io/curate-me-ai/openclaw-windows:latest image
  • Ensure Hyper-V is enabled in Windows features

Getting Help

If none of these solutions resolve your issue:

  1. Check the Runners API Reference for endpoint details
  2. Review Runners Security for policy configuration
  3. Visit the Dashboard to check machine status and logs
  4. Contact support at support@curate-me.ai with your machine ID and error details