Fleet Deployment

Once you’ve connected your first machine, you can scale out to a fleet with centralized management, batch operations, and load-balanced job dispatch.

Connecting Multiple Machines

Each machine needs its own registration token. Generate tokens in bulk:

Via API


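# Generate one single-use registration token per machine (5 machines here)
for i in $(seq 1 5); do
  # -f makes curl exit non-zero on HTTP errors instead of piping an error body to jq
  curl -fsS -X POST https://api.curate-me.ai/gateway/admin/byovm/register-token \
    -H "X-CM-API-Key: cm_sk_your_key_here" \
    -H "Content-Type: application/json" | jq -r '.token'
done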

Via Dashboard

Navigate to Runners > Your Machines and click Connect Machine for each device. Each token is unique and single-use.

1-Click Hetzner Deployment

For teams that want managed infrastructure without self-hosting, the dashboard includes 1-click Hetzner Cloud deployment.

Via Dashboard

In Runners > Your Machines, click Deploy New VM
Select a server type:

Type	vCPU	RAM	Disk	Price
CX22	2	4 GB	40 GB	~$5/mo
CX32	4	8 GB	80 GB	~$9/mo
CX42	8	16 GB	160 GB	~$17/mo
CX52	16	32 GB	320 GB	~$33/mo

Select region: Nuremberg, Falkenstein, Helsinki, Ashburn, Hillsboro, or Singapore
Click Deploy — the platform provisions the VM, installs Docker, and registers the agent automatically

Progress indicators show: Deploying → Booting → Installing → Ready

Via API


curl -X POST https://api.curate-me.ai/gateway/admin/byovm/cloud-deploy \
  -H "X-CM-API-Key: cm_sk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "server_type": "cx32",
    "region": "ashburn",
    "hostname": "prod-runner-01"
  }'

Fleet Dashboard

The Your Machines page shows a unified fleet view with:

Summary Bar

Total Machines — Count of all connected machines
Status Breakdown — ONLINE, BUSY, OFFLINE counts
Aggregate Resources — Total CPU cores, RAM, disk across fleet
Average Utilization — CPU and memory usage averages
Fleet Health — Healthy (all online), Degraded (some offline), Critical (majority offline)
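
The Fleet Health labels above can be read as a simple threshold rule. The following is a minimal sketch of one possible interpretation, not the platform's actual logic: the function name `fleet_health`, the `Unknown` label for an empty fleet, counting BUSY machines as reachable, and the strict-majority cutoff for Critical are all assumptions.

```python
def fleet_health(statuses):
    """Classify fleet health from a list of machine status strings.

    Assumptions (not confirmed by the platform): BUSY counts as reachable,
    and "majority offline" means strictly more than half of the fleet.
    """
    if not statuses:
        return "Unknown"  # hypothetical label for an empty fleet
    offline = sum(1 for s in statuses if s == "OFFLINE")
    if offline == 0:
        return "Healthy"
    if offline * 2 > len(statuses):
        return "Critical"
    return "Degraded"


print(fleet_health(["ONLINE", "BUSY", "ONLINE"]))      # Healthy
print(fleet_health(["ONLINE", "OFFLINE", "ONLINE"]))   # Degraded
print(fleet_health(["OFFLINE", "OFFLINE", "ONLINE"]))  # Critical
```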

Machine Grid

Each machine shows:

Hostname with status indicator (green pulse = ONLINE, yellow = BUSY, gray = OFFLINE)
OS type and cloud provider
Resource usage (CPU, RAM, disk)
Last heartbeat timestamp
Expandable detail panel with tabs:
- Resources — CPU and RAM usage charts (1-hour history)
- Sessions — Active and recent sessions
- Policies — Applied governance policies
- Audit Log — Action history with timestamps

Sorting and Filtering

Sort by: Hostname, state, uptime, CPU usage, last heartbeat
Filter by: Status (ONLINE, OFFLINE, BUSY, REGISTERING), cloud provider, OS type

Batch Operations

Manage your entire fleet with batch operations:

Via Dashboard

Select multiple machines (or click Select All), then choose an operation:

Operation	Description
Restart All	Restart selected machines
Update All	Pull latest container image and restart
Apply Config	Push governance policy changes to all selected machines

Via API


# Dispatch job to all ONLINE machines
curl -X POST https://api.curate-me.ai/gateway/admin/byovm/dispatch \
  -H "X-CM-API-Key: cm_sk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "all_online",
    "command": ["session.create"],
    "template_id": "default"
  }'

Job Dispatch and Load Balancing

When you dispatch a job to a fleet, the control plane selects a machine based on:

Availability — Only ONLINE machines receive jobs (not BUSY, OFFLINE, or STALE)
Resource utilization — Prefers machines with lower CPU/memory usage
Capability labels — Match job requirements to machine capabilities
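
To make the three criteria concrete, here is a minimal sketch of how such a selection could work. It is illustrative only: the `Machine` class, the `pick_machine` function, and the load score (the mean of CPU and memory utilization) are assumptions, since the control plane's real weighting is not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    hostname: str
    status: str          # e.g. "ONLINE", "BUSY", "OFFLINE", "STALE"
    cpu_pct: float       # current CPU utilization, 0-100
    mem_pct: float       # current memory utilization, 0-100
    labels: set = field(default_factory=set)


def pick_machine(machines, required_labels=()):
    """Return the least-loaded ONLINE machine that has every required label.

    Returns None when no machine qualifies. The load score used for ranking
    is an assumption made for this sketch.
    """
    required = set(required_labels)
    candidates = [
        m for m in machines
        if m.status == "ONLINE" and required <= m.labels
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda m: (m.cpu_pct + m.mem_pct) / 2)


fleet = [
    Machine("gpu-server-01", "ONLINE", 70.0, 60.0, {"gpu", "pytorch", "cuda"}),
    Machine("gpu-server-02", "ONLINE", 20.0, 30.0, {"gpu", "pytorch"}),
    Machine("cpu-server-01", "ONLINE", 5.0, 10.0, set()),
    Machine("gpu-server-03", "BUSY", 1.0, 1.0, {"gpu", "pytorch"}),
]
chosen = pick_machine(fleet, ["gpu", "pytorch"])
print(chosen.hostname)  # gpu-server-02
```

In this example `gpu-server-03` is idle on paper but is skipped because it is BUSY, and `cpu-server-01` is skipped because it lacks the required labels.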

Capability Labels

Tag your machines with capability labels during registration:


docker run -d \
  --name curateme-agent \
  -e CM_REGISTRATION_TOKEN="your_reg_token" \
  -e CM_GATEWAY_URL="https://api.curate-me.ai" \
  -e CM_AGENT_HOSTNAME="gpu-server-01" \
  -e CM_CAPABILITY_LABELS="gpu,pytorch,cuda" \
  ghcr.io/curate-me-ai/openclaw-base:latest

Then dispatch jobs targeting specific capabilities:


curl -X POST https://api.curate-me.ai/gateway/admin/byovm/dispatch \
  -H "X-CM-API-Key: cm_sk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "command": ["session.exec", "--", "python", "train.py"],
    "required_labels": ["gpu", "pytorch"]
  }'

Job Lifecycle


queued ──> dispatched ──> running ──> completed
                |                       |
                v                       v
             timed out              failed
             (5 min)            (with error msg)

Status	Description
`queued`	Job created, waiting for a machine to pick it up
`dispatched`	Machine picked up the job
`running`	Machine is executing the command
`completed`	Job finished successfully
`failed`	Job failed (error message in `last_error`)
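
The lifecycle can also be expressed as a small transition table, which is handy when validating status updates in client code. This is a sketch under stated assumptions: the `TRANSITIONS` map and `advance` helper are hypothetical names, the `dispatched` to `queued` edge models a timed-out dispatch being requeued, and retries of failed jobs are not modeled.

```python
# Allowed status transitions, read off the lifecycle diagram and table.
# The "dispatched" -> "queued" edge (requeue after a dispatch timeout) is an
# assumption of this sketch; "completed" and "failed" are treated as terminal.
TRANSITIONS = {
    "queued": {"dispatched"},
    "dispatched": {"running", "queued"},
    "running": {"completed", "failed"},
    "completed": set(),
    "failed": set(),
}


def advance(current, new):
    """Return the new status, or raise ValueError if the move is not allowed."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new


status = "queued"
for step in ("dispatched", "running", "completed"):
    status = advance(status, step)
print(status)  # completed
```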

Jobs support automatic retries (default: 3 attempts) with idempotency keys for safe retry behavior.
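
The sketch below illustrates why an idempotency key makes retries safe: every attempt carries the same key, so an executor that remembers completed keys never performs the same work twice. The `Executor` class and `run_with_retries` function are hypothetical and in-memory only; they are not the platform's implementation.

```python
import uuid

MAX_ATTEMPTS = 3  # matches the documented default


class Executor:
    """Hypothetical in-memory executor that deduplicates by idempotency key."""

    def __init__(self):
        self.results = {}  # idempotency key -> stored result
        self.runs = 0      # how many times work was actually started

    def execute(self, key, work):
        if key in self.results:   # replayed request: return the stored result
            return self.results[key]
        self.runs += 1
        result = work()           # may raise; nothing is stored on failure
        self.results[key] = result
        return result


def run_with_retries(executor, work, max_attempts=MAX_ATTEMPTS):
    """Retry `work` under a single idempotency key, up to max_attempts times."""
    key = str(uuid.uuid4())       # one key shared by every attempt
    last_error = None
    for _ in range(max_attempts):
        try:
            return executor.execute(key, work)
        except Exception as exc:  # broad catch is acceptable in a sketch
            last_error = exc
    raise RuntimeError(f"job failed after {max_attempts} attempts") from last_error


attempts = {"n": 0}


def flaky():
    """Fail twice with a transient error, then succeed."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


executor = Executor()
print(run_with_retries(executor, flaky))  # ok
```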

WebSocket Real-Time Communication

For interactive sessions, machines maintain a WebSocket connection for bidirectional messaging:


# Machine-side (handled by the agent automatically)
wscat -c "wss://api.curate-me.ai/gateway/admin/byovm/agents/byovm_abc123/ws" \
  -H "X-CM-Agent-Token: agent_token_here"

The dashboard uses WebSocket for:

Real-time status updates (no polling delay)
Live session output streaming
Interactive agent chat

Monitoring

Health Checks

The platform automatically monitors fleet health:

Heartbeat monitoring — Machines missing heartbeats for 5 minutes transition to STALE
Dead machine detection — Stale machines with no heartbeat for 15 minutes transition to DEAD
Auto-recovery — Machines automatically return to ONLINE when heartbeats resume
Stale dispatch cleanup — Jobs stuck in dispatched state for 5+ minutes are requeued
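
The heartbeat thresholds above amount to a classification by time since the last heartbeat. Here is a minimal sketch, assuming both thresholds are measured from the last heartbeat received and that a machine inside the 5-minute window is reported as ONLINE; the `liveness` function and constant names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=5)
DEAD_AFTER = timedelta(minutes=15)


def liveness(last_heartbeat, now):
    """Map the time since the last heartbeat to a liveness state.

    Because the state is derived from the latest heartbeat only, a machine
    whose heartbeats resume is classified as ONLINE again on the next check.
    """
    silence = now - last_heartbeat
    if silence >= DEAD_AFTER:
        return "DEAD"
    if silence >= STALE_AFTER:
        return "STALE"
    return "ONLINE"


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
for minutes in (1, 6, 20):
    print(minutes, liveness(now - timedelta(minutes=minutes), now))
```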

Cost Tracking

All LLM costs from your connected machines are tracked in the dashboard:

Per-machine cost breakdown
Per-session cost tracking
Daily budget enforcement (jobs rejected when budget exceeded)
Historical cost trends
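
As an illustration of daily budget enforcement, the sketch below tracks spend and refuses new jobs once the limit is reached. The `DailyBudget` class is hypothetical, the daily reset is not modeled, and whether the platform rejects at or only above the limit is an assumption.

```python
from decimal import Decimal


class DailyBudget:
    """Hypothetical per-day spend tracker that rejects jobs at the limit."""

    def __init__(self, limit_usd):
        # Decimal avoids floating-point drift when summing many small costs.
        self.limit = Decimal(str(limit_usd))
        self.spent = Decimal("0")

    def record(self, cost_usd):
        """Add the LLM cost of a finished request to today's total."""
        self.spent += Decimal(str(cost_usd))

    def allows_new_job(self):
        """A new job is accepted only while spend is below the daily limit."""
        return self.spent < self.limit


budget = DailyBudget(10)
budget.record("9.50")
print(budget.allows_new_job())  # True
budget.record("0.75")
print(budget.allows_new_job())  # False
```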

Next Steps

Troubleshooting — Common issues and fixes
Your Machines Overview — Architecture and concepts
Runners Security — Security model
API Reference — Complete endpoint documentation