Fleet Deployment

Once you’ve connected your first machine, you can scale to a fleet with centralized management, batch operations, and load-balanced job dispatch.

Connecting Multiple Machines

Each machine needs its own registration token. Generate tokens in bulk:

Via API

# Generate tokens for 5 machines
for i in $(seq 1 5); do
  curl -s -X POST https://api.curate-me.ai/gateway/admin/byovm/register-token \
    -H "X-CM-API-Key: cm_sk_your_key_here" \
    -H "Content-Type: application/json" | jq -r '.token'
done
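Because each token is single-use, it can help to record which token is meant for which machine. The sketch below wraps the request above in a function and writes one name/token pair per line. The `machine-NN` naming scheme, the `tokens.txt` file name, and the function names are illustrative assumptions, and it presumes the response carries a `token` field, as the `jq` filter above implies.

```shell
# fetch_token: request one registration token (same call as above).
fetch_token() {
  curl -s -X POST https://api.curate-me.ai/gateway/admin/byovm/register-token \
    -H "X-CM-API-Key: cm_sk_your_key_here" \
    -H "Content-Type: application/json" | jq -r '.token'
}

# write_tokens COUNT OUTFILE
# Writes "machine-NN TOKEN" lines to OUTFILE (hypothetical naming scheme).
# Returns non-zero if a request yields no token; jq prints "null" when the
# field is missing from the response.
write_tokens() {
  count=$1
  outfile=$2
  : > "$outfile"
  i=1
  while [ "$i" -le "$count" ]; do
    token=$(fetch_token)
    if [ -z "$token" ] || [ "$token" = "null" ]; then
      echo "token request $i failed" >&2
      return 1
    fi
    printf 'machine-%02d %s\n' "$i" "$token" >> "$outfile"
    i=$((i + 1))
  done
}

# Usage: write_tokens 5 tokens.txt
```

Treat the resulting file as a secret and delete it once every machine has registered.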

Via Dashboard

Navigate to Runners > Your Machines and click Connect Machine for each device. Each token is unique and single-use.

1-Click Hetzner Deployment

For teams that want managed infrastructure without self-hosting, the dashboard includes 1-click Hetzner Cloud deployment.

Via Dashboard

  1. In Runners > Your Machines, click Deploy New VM
  2. Select a server type:

| Type | vCPU | RAM | Disk | Price |
|------|------|-----|------|-------|
| CX22 | 2 | 4 GB | 40 GB | ~$5/mo |
| CX32 | 4 | 8 GB | 80 GB | ~$9/mo |
| CX42 | 8 | 16 GB | 160 GB | ~$17/mo |
| CX52 | 16 | 32 GB | 320 GB | ~$33/mo |

  3. Select a region: Nuremberg, Falkenstein, Helsinki, Ashburn, Hillsboro, or Singapore
  4. Click Deploy. The platform provisions the VM, installs Docker, and registers the agent automatically.

Progress indicators show: Deploying → Booting → Installing → Ready

Via API

curl -X POST https://api.curate-me.ai/gateway/admin/byovm/cloud-deploy \
  -H "X-CM-API-Key: cm_sk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "server_type": "cx32",
    "region": "ashburn",
    "hostname": "prod-runner-01"
  }'

Fleet Dashboard

The Your Machines page shows a unified fleet view with:

Summary Bar

  • Total Machines — Count of all connected machines
  • Status Breakdown — ONLINE, BUSY, OFFLINE counts
  • Aggregate Resources — Total CPU cores, RAM, disk across fleet
  • Average Utilization — CPU and memory usage averages
  • Fleet Health — Healthy (all online), Degraded (some offline), Critical (majority offline)
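The Fleet Health labels can be read as a simple rule over machine counts. The following is a minimal sketch of that rule, not the platform's actual logic: the function name is hypothetical, BUSY machines are assumed to count as online, and "majority offline" is assumed to mean strictly more than half.

```shell
# fleet_health TOTAL OFFLINE
# Prints Healthy, Degraded, or Critical. Thresholds are an assumption
# inferred from the labels above, not a documented formula.
fleet_health() {
  total=$1
  offline=$2
  if [ "$offline" -eq 0 ]; then
    echo "Healthy"      # all machines online
  elif [ $((offline * 2)) -gt "$total" ]; then
    echo "Critical"     # more than half offline
  else
    echo "Degraded"     # some, but not most, offline
  fi
}

fleet_health 10 0   # prints Healthy
fleet_health 10 3   # prints Degraded
fleet_health 10 6   # prints Critical
```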

Machine Grid

Each machine shows:

  • Hostname with status indicator (green pulse = ONLINE, yellow = BUSY, gray = OFFLINE)
  • OS type and cloud provider
  • Resource usage (CPU, RAM, disk)
  • Last heartbeat timestamp
  • Expandable detail panel with tabs:
    • Resources — CPU and RAM usage charts (1-hour history)
    • Sessions — Active and recent sessions
    • Policies — Applied governance policies
    • Audit Log — Action history with timestamps

Sorting and Filtering

  • Sort by: Hostname, state, uptime, CPU usage, last heartbeat
  • Filter by: Status (ONLINE, OFFLINE, BUSY, REGISTERING), cloud provider, OS type

Batch Operations

Manage your entire fleet with batch operations:

Via Dashboard

Select multiple machines (or click Select All), then choose an operation:

| Operation | Description |
|-----------|-------------|
| Restart All | Restart selected machines |
| Update All | Pull latest container image and restart |
| Apply Config | Push governance policy changes to all |

Via API

# Dispatch job to all ONLINE machines
curl -X POST https://api.curate-me.ai/gateway/admin/byovm/dispatch \
  -H "X-CM-API-Key: cm_sk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "all_online",
    "command": ["session.create"],
    "template_id": "default"
  }'

Job Dispatch and Load Balancing

When dispatching jobs to a fleet, the control plane selects the best machine based on:

  1. Availability — Only ONLINE machines receive jobs (not BUSY, OFFLINE, or STALE)
  2. Resource utilization — Prefers machines with lower CPU/memory usage
  3. Capability labels — Match job requirements to machine capabilities
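The three criteria above can be pictured as a filter followed by a minimum search. The sketch below is only an illustration of that idea and does not claim to reproduce the control plane's algorithm: the record format (`hostname status cpu_percent labels`), the function name, and the use of CPU usage alone as the tie-breaker are all assumptions.

```shell
# pick_machine REQUIRED_LABELS
# Reads "hostname status cpu_percent labels" records on stdin and prints the
# hostname of the ONLINE machine with the lowest CPU usage that carries every
# required label (comma-separated). Prints nothing if no machine qualifies.
pick_machine() {
  awk -v required="$1" '
    BEGIN { n = split(required, need, ",") }
    $2 != "ONLINE" { next }                 # skip BUSY, OFFLINE, STALE
    {
      m = split($4, have, ",")
      for (i = 1; i <= n; i++) {            # every required label must match
        found = 0
        for (j = 1; j <= m; j++) if (have[j] == need[i]) found = 1
        if (!found) next
      }
      if (best == "" || $3 + 0 < best_cpu) { best = $1; best_cpu = $3 + 0 }
    }
    END { if (best != "") print best }
  '
}

# Hypothetical fleet snapshot used only for this example.
fleet='gpu-server-01 ONLINE 72 gpu,pytorch,cuda
gpu-server-02 ONLINE 35 gpu,pytorch
cpu-server-01 ONLINE 10 docker
gpu-server-03 BUSY 5 gpu,pytorch'

printf '%s\n' "$fleet" | pick_machine "gpu,pytorch"   # prints gpu-server-02
```

Here `gpu-server-03` has the lowest CPU usage but is skipped because it is BUSY, and `cpu-server-01` is skipped because it lacks the required labels.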

Capability Labels

Tag your machines with capability labels during registration:

docker run -d \
  --name curateme-agent \
  -e CM_REGISTRATION_TOKEN="your_reg_token" \
  -e CM_GATEWAY_URL="https://api.curate-me.ai" \
  -e CM_AGENT_HOSTNAME="gpu-server-01" \
  -e CM_CAPABILITY_LABELS="gpu,pytorch,cuda" \
  ghcr.io/curate-me-ai/openclaw-base:latest

Then dispatch jobs targeting specific capabilities:

curl -X POST https://api.curate-me.ai/gateway/admin/byovm/dispatch \
  -H "X-CM-API-Key: cm_sk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "command": ["session.exec", "--", "python", "train.py"],
    "required_labels": ["gpu", "pytorch"]
  }'

Job Lifecycle

queued ──> dispatched ──> running ──> completed
               |             |
               v             v
           timed out       failed
            (5 min)    (with error msg)

| Status | Description |
|--------|-------------|
| queued | Job created, waiting for a machine to pick it up |
| dispatched | Machine picked up the job |
| running | Machine is executing the command |
| completed | Job finished successfully |
| failed | Job failed (error message in last_error) |

Jobs support automatic retries (default: 3 attempts) with idempotency keys for safe retry behavior.
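The point of an idempotency key is that every attempt of the same job carries the same key, so a retry can be recognized as a repeat rather than a new job. The sketch below shows that pattern from a client's point of view. It is an assumption-laden illustration: the function name, the key format, and the `IDEMPOTENCY_KEY` environment variable are hypothetical and are not taken from the platform's API.

```shell
# run_with_retries MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND up to MAX_ATTEMPTS times, exposing the same key to every
# attempt through IDEMPOTENCY_KEY (a hypothetical variable name).
run_with_retries() {
  max=$1
  shift
  key="job-$$-$(date +%s)"        # generated once, reused for all attempts
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    if IDEMPOTENCY_KEY="$key" "$@"; then
      return 0
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Demo: a stand-in job that fails twice, then succeeds on the third attempt.
calls=0
flaky_job() {
  calls=$((calls + 1))
  echo "attempt $calls with key $IDEMPOTENCY_KEY"
  [ "$calls" -ge 3 ]
}

run_with_retries 3 flaky_job && echo "job completed after $calls attempts"
```

The limit of 3 mirrors the default stated above. In a real client the stand-in job would be replaced by the dispatch request, with the key sent in whatever field the API expects.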

WebSocket Real-Time Communication

For interactive sessions, machines maintain a WebSocket connection for bidirectional messaging:

# Machine-side (handled by the agent automatically)
wscat -c "wss://api.curate-me.ai/gateway/admin/byovm/agents/byovm_abc123/ws" \
  -H "X-CM-Agent-Token: agent_token_here"

The dashboard uses WebSocket for:

  • Real-time status updates (no polling delay)
  • Live session output streaming
  • Interactive agent chat

Monitoring

Health Checks

The platform automatically monitors fleet health:

  • Heartbeat monitoring — Machines missing heartbeats for 5 minutes transition to STALE
  • Dead machine detection — Stale machines with no heartbeat for 15 minutes transition to DEAD
  • Auto-recovery — Machines automatically return to ONLINE when heartbeats resume
  • Stale dispatch cleanup — Jobs stuck in dispatched state for 5+ minutes are requeued
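The heartbeat thresholds above amount to a small classification over the time since the last heartbeat. The following sketch assumes both thresholds are measured from the last heartbeat received (the text could also be read as 15 minutes after becoming STALE), and it reports a healthy machine simply as ONLINE, ignoring the BUSY state. The function name is hypothetical.

```shell
# heartbeat_state SECONDS_SINCE_LAST_HEARTBEAT
# Prints ONLINE, STALE, or DEAD using the 5- and 15-minute thresholds.
heartbeat_state() {
  age=$1
  if [ "$age" -ge 900 ]; then      # 15 minutes without a heartbeat
    echo "DEAD"
  elif [ "$age" -ge 300 ]; then    # 5 minutes without a heartbeat
    echo "STALE"
  else
    echo "ONLINE"
  fi
}

heartbeat_state 42     # prints ONLINE
heartbeat_state 300    # prints STALE
heartbeat_state 1200   # prints DEAD
```

Auto-recovery falls out of the same rule: once a heartbeat arrives, the age resets to zero and the machine classifies as ONLINE again.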

Cost Tracking

All LLM costs from your connected machines are tracked in the dashboard:

  • Per-machine cost breakdown
  • Per-session cost tracking
  • Daily budget enforcement (jobs rejected when budget exceeded)
  • Historical cost trends
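Daily budget enforcement can be thought of as a check made before each dispatch. The sketch below is a hypothetical illustration: amounts are in whole cents to avoid floating-point arithmetic, the function name is invented, and it assumes a job is rejected once the day's spend has reached the budget.

```shell
# budget_allows SPENT_CENTS BUDGET_CENTS
# Succeeds (exit status 0) while today's spend is below the daily budget.
budget_allows() {
  [ "$1" -lt "$2" ]
}

if budget_allows 1875 2000; then echo "dispatch"; else echo "reject"; fi   # prints dispatch
if budget_allows 2000 2000; then echo "dispatch"; else echo "reject"; fi   # prints reject
```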

Next Steps