Machine Offline

Symptom

A machine that was previously Online now shows Offline (or Stale) in Runners → Your Machines. Active sessions on that machine fail to launch with agent_unreachable.

Likely causes

| Cause | What you’d see in agent logs | Fix |
| --- | --- | --- |
| Network blip | Heartbeat HTTP errors (connection_reset, dial_timeout) | Usually self-heals on next heartbeat (30s). If it lasts > 5 min, check egress. |
| Container exited | No agent logs at all (docker ps doesn’t show cm-runner) | Restart the container; see Step 1 under Fix. |
| Host crashed / rebooted | Host SSH refuses connections too | Boot the host, then docker start cm-runner. |
| Docker daemon hung | Agent process up but heartbeats fail with EOF from socket | Restart Docker (systemctl restart docker), then the agent. |
| Agent OOM-killed | OOMKilled in docker inspect cm-runner | Bump host RAM or remove other heavy containers. |
| Gateway-side state drift | Agent thinks it’s healthy; dashboard says offline | See State drift below. |
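
Several of these causes can be told apart from container state alone. The sketch below is one way to do that, assuming the container is named cm-runner as elsewhere on this page. `classify_runner_state` is a hypothetical helper, not part of the cm-runner CLI; it takes the two values that `docker inspect` reports as `.State.Status` and `.State.OOMKilled`.

```shell
#!/bin/sh
# Hypothetical helper (not part of cm-runner): map Docker container state
# to the most likely cause from the list of likely causes.
#   $1 = .State.Status    (e.g. running, exited, paused)
#   $2 = .State.OOMKilled (true or false)
classify_runner_state() {
  status="$1"
  oom_killed="$2"
  if [ -z "$status" ]; then
    # docker inspect printed nothing, so the container does not exist.
    echo "no cm-runner container found on this host"
  elif [ "$oom_killed" = "true" ]; then
    echo "agent OOM-killed: bump host RAM or remove other heavy containers"
  else
    case "$status" in
      running)     echo "container is up: check agent logs and egress" ;;
      exited|dead) echo "container exited: restart it with docker start cm-runner" ;;
      *)           echo "container status is $status: inspect it before restarting" ;;
    esac
  fi
}

# Usage (requires Docker on the host; $state is left unquoted on purpose
# so that it splits into the two arguments):
#   state=$(docker inspect --format '{{.State.Status}} {{.State.OOMKilled}}' cm-runner 2>/dev/null)
#   classify_runner_state $state
```

This only covers the rows that leave a trace in container state. A network blip, a hung Docker daemon, and state drift still need the log checks described in the rest of this page.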

Fix

Step 1 — Confirm the agent process

docker ps -a --filter name=cm-runner --format "table {{.Names}}\t{{.Status}}\t{{.RunningFor}}"

If Status is Exited, restart it:

docker start cm-runner
docker logs cm-runner --tail 50 -f

If Status is Up but the dashboard still says offline, check egress:

docker exec cm-runner curl -s -o /dev/null -w "%{http_code}\n" \
  https://api.curate-me.ai/health

A non-200 means the gateway is unreachable from inside the container — inspect host firewall / DNS / TLS.
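
When scripting this check, it can help to separate "nothing answered" from "something answered with an error". The sketch below assumes only standard curl behavior: `%{http_code}` prints 000 when no HTTP response was received at all. `explain_health_code` is a hypothetical helper name, and its wording for codes other than 200 and 000 is a guess at what such a response usually indicates.

```shell
#!/bin/sh
# Hypothetical helper: interpret the status code printed by
# curl -w "%{http_code}" for the gateway health endpoint.
explain_health_code() {
  case "$1" in
    200) echo "gateway reachable" ;;
    # curl reports 000 when it never received an HTTP response
    # (connection refused, DNS failure, TLS failure, timeout).
    000) echo "no HTTP response: check host firewall, DNS, and TLS" ;;
    # Any other code means an HTTP server replied, possibly a proxy.
    *)   echo "unexpected HTTP $1: something answered, but not with a healthy status" ;;
  esac
}

# Usage (requires Docker and a running cm-runner container):
#   code=$(docker exec cm-runner curl -s -o /dev/null -w "%{http_code}" \
#     --max-time 10 https://api.curate-me.ai/health)
#   explain_health_code "$code"
```

The `--max-time 10` flag in the usage lines is an addition so a silently dropped connection does not hang the check; adjust or remove it as needed.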

Step 2 — Force a fresh heartbeat

docker exec cm-runner cm-runner agent --send-heartbeat-now

The dashboard should flip back to Online within ~5s.

Step 3 — Last resort: re-register

If the agent is healthy but the control plane no longer knows about it (e.g. the agent’s persistent state was wiped), generate a fresh registration token and re-register:

docker rm -f cm-runner
docker volume rm cm-runner-data   # clears the saved agent_id
# ... then run the install command from /quickstart/connect-your-machine

State drift

If the agent log shows heartbeat_sent every 30s but the dashboard still says offline, the gateway’s stale-detection job is misclassifying the agent. File a support ticket with:

  • Agent ID (visible in docker logs cm-runner | grep agent_id)
  • Org ID
  • Timestamp of the last heartbeat the agent thinks it sent
  • Last 50 lines of docker logs cm-runner
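
Most of the items above can be gathered in one pass. The following is a minimal sketch, not an official support tool: `last_heartbeat_ts` and `collect_ticket_info` are hypothetical helper names, and the only assumptions about the agent's log format are the `heartbeat_sent` and `agent_id` strings mentioned on this page. The timestamp comes from Docker's own `--timestamps` prefix rather than from the agent's log line.

```shell
#!/bin/sh
# Hypothetical helper: reads `docker logs --timestamps` output on stdin and
# prints the Docker-added timestamp of the last line containing heartbeat_sent.
last_heartbeat_ts() {
  grep 'heartbeat_sent' | tail -n 1 | cut -d ' ' -f 1
}

# Hypothetical helper: prints the ticket items to stdout.
# The Org ID is not in the container logs and must be added by hand.
collect_ticket_info() {
  echo "== agent_id lines =="
  docker logs cm-runner 2>&1 | grep agent_id | tail -n 5
  echo "== last heartbeat_sent (Docker timestamp) =="
  docker logs --timestamps cm-runner 2>&1 | last_heartbeat_ts
  echo "== last 50 log lines =="
  docker logs --tail 50 cm-runner 2>&1
}

# Usage (requires Docker and the cm-runner container):
#   collect_ticket_info > cm-runner-ticket.txt
```

The `2>&1` redirects are there because `docker logs` replays the container's stderr stream on stderr, which a plain pipe would skip.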

Where to find logs

# Agent side
docker logs cm-runner --tail 200 -f | grep -E "heartbeat|registered|disconnect"

# Server side (support team)
./scripts/errors by-source gateway | grep byovm_agent