Fleet Deployment
Once you’ve connected your first machine, you can scale to a fleet with centralized management, batch operations, and load-balanced job dispatch.
Connecting Multiple Machines
Each machine needs its own registration token. Generate tokens in bulk:
Via API
# Generate tokens for 5 machines
for i in $(seq 1 5); do
  curl -s -X POST https://api.curate-me.ai/gateway/admin/byovm/register-token \
    -H "X-CM-API-Key: cm_sk_your_key_here" \
    -H "Content-Type: application/json" | jq -r '.token'
done
Via Dashboard
Navigate to Runners > Your Machines and click Connect Machine for each device. Each token is unique and single-use.
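Because each token is single-use, it can help to pair every token with a hostname at generation time. The following is a minimal sketch, not a platform tool: `get_token` wraps the same curl call used in the API example, while the `CM_API_KEY` variable, the `write_env_files` helper, and the one-file-per-host layout are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: write one env file per machine so each host gets its own token.
# CM_API_KEY, write_env_files, and the <hostname>.env layout are assumptions
# for this example; only the curl call itself comes from this guide.
get_token() {
    curl -s -X POST https://api.curate-me.ai/gateway/admin/byovm/register-token \
        -H "X-CM-API-Key: ${CM_API_KEY}" \
        -H "Content-Type: application/json" | jq -r '.token'
}

# Usage: write_env_files OUT_DIR HOSTNAME [HOSTNAME...]
write_env_files() {
    out_dir=$1
    shift
    mkdir -p "$out_dir"
    for host in "$@"; do
        token=$(get_token) || return 1
        # jq prints "null" when the response has no .token field
        if [ -z "$token" ] || [ "$token" = "null" ]; then
            echo "no token returned for $host" >&2
            return 1
        fi
        printf 'CM_REGISTRATION_TOKEN=%s\nCM_AGENT_HOSTNAME=%s\n' \
            "$token" "$host" > "$out_dir/$host.env"
    done
}
```

Calling `write_env_files ./tokens runner-01 runner-02` would produce `./tokens/runner-01.env` and `./tokens/runner-02.env`, which could then be passed to the agent container with Docker's `--env-file` option.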
1-Click Hetzner Deployment
For teams that want managed infrastructure without self-hosting, the dashboard includes 1-click Hetzner Cloud deployment.
Via Dashboard
- In Runners > Your Machines, click Deploy New VM
- Select a server type:
| Type | vCPU | RAM | Disk | Price |
|---|---|---|---|---|
| CX22 | 2 | 4 GB | 40 GB | ~$5/mo |
| CX32 | 4 | 8 GB | 80 GB | ~$9/mo |
| CX42 | 8 | 16 GB | 160 GB | ~$17/mo |
| CX52 | 16 | 32 GB | 320 GB | ~$33/mo |
- Select region: Nuremberg, Falkenstein, Helsinki, Ashburn, Hillsboro, or Singapore
- Click Deploy — the platform provisions the VM, installs Docker, and registers the agent automatically
Progress indicators show: Deploying → Booting → Installing → Ready
Via API
curl -X POST https://api.curate-me.ai/gateway/admin/byovm/cloud-deploy \
  -H "X-CM-API-Key: cm_sk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "server_type": "cx32",
    "region": "ashburn",
    "hostname": "prod-runner-01"
  }'
Fleet Dashboard
The Your Machines page shows a unified fleet view with:
Summary Bar
- Total Machines — Count of all connected machines
- Status Breakdown — ONLINE, BUSY, OFFLINE counts
- Aggregate Resources — Total CPU cores, RAM, disk across fleet
- Average Utilization — CPU and memory usage averages
- Fleet Health — Healthy (all online), Degraded (some offline), Critical (majority offline)
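The Fleet Health levels above can be sketched as a small shell function. This is an illustration only: the `fleet_health` name is hypothetical, "majority offline" is assumed to mean strictly more than half of the fleet, and BUSY machines are assumed to count as online.

```shell
#!/bin/sh
# Hypothetical helper: classify fleet health from machine counts.
# Assumption: "majority offline" means strictly more than half the fleet.
# Usage: fleet_health TOTAL_MACHINES OFFLINE_MACHINES
fleet_health() {
    total=$1
    offline=$2
    if [ "$offline" -eq 0 ]; then
        echo "Healthy"
    elif [ $((offline * 2)) -gt "$total" ]; then
        echo "Critical"
    else
        echo "Degraded"
    fi
}

fleet_health 10 0   # Healthy
fleet_health 10 3   # Degraded
fleet_health 10 6   # Critical
```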
Machine Grid
Each machine shows:
- Hostname with status indicator (green pulse = ONLINE, yellow = BUSY, gray = OFFLINE)
- OS type and cloud provider
- Resource usage (CPU, RAM, disk)
- Last heartbeat timestamp
- Expandable detail panel with tabs:
  - Resources: CPU and RAM usage charts (1-hour history)
  - Sessions: Active and recent sessions
  - Policies: Applied governance policies
  - Audit Log: Action history with timestamps
Sorting and Filtering
- Sort by: Hostname, state, uptime, CPU usage, last heartbeat
- Filter by: Status (ONLINE, OFFLINE, BUSY, REGISTERING), cloud provider, OS type
Batch Operations
Manage your entire fleet with batch operations:
Via Dashboard
Select multiple machines (or click Select All), then choose an operation:
| Operation | Description |
|---|---|
| Restart All | Restart selected machines |
| Update All | Pull latest container image and restart |
| Apply Config | Push governance policy changes to all |
Via API
# Dispatch job to all ONLINE machines
curl -X POST https://api.curate-me.ai/gateway/admin/byovm/dispatch \
  -H "X-CM-API-Key: cm_sk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "all_online",
    "command": ["session.create"],
    "template_id": "default"
  }'
Job Dispatch and Load Balancing
When dispatching jobs to a fleet, the control plane selects the best machine based on:
- Availability — Only ONLINE machines receive jobs (not BUSY, OFFLINE, or STALE)
- Resource utilization — Prefers machines with lower CPU/memory usage
- Capability labels — Match job requirements to machine capabilities
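The three criteria above can be approximated locally to reason about which machine a job is likely to land on. The sketch below is not the control plane's implementation: the `pick_machine` name and the whitespace-separated input format are assumptions, and only CPU usage is compared (memory is ignored for brevity).

```shell
#!/bin/sh
# Hypothetical sketch of the selection rules; not the control plane's code.
# Input on stdin, one machine per line: hostname status cpu_percent labels
# (labels are comma-separated; use "-" for none).
# Argument: comma-separated required labels (may be empty).
pick_machine() {
    awk -v required="$1" '
        $2 != "ONLINE" { next }    # availability: skip BUSY, OFFLINE, STALE
        {
            n = split(required, need, ",")
            for (i = 1; i <= n; i++) {
                # capability labels: every required label must be present
                if (need[i] != "" && index("," $4 ",", "," need[i] ",") == 0) next
            }
            # resource utilization: keep the machine with the lowest CPU usage
            if (best == "" || $3 + 0 < best_cpu) {
                best = $1
                best_cpu = $3 + 0
            }
        }
        END { if (best != "") print best; else exit 1 }
    '
}

printf '%s\n' \
    'gpu-server-01 ONLINE 70 gpu,pytorch,cuda' \
    'gpu-server-02 ONLINE 20 gpu,pytorch' \
    'gpu-server-03 BUSY 5 gpu,pytorch' \
    'cpu-server-01 ONLINE 10 -' |
    pick_machine "gpu,pytorch"   # prints gpu-server-02
```

In this sample, `gpu-server-03` has the lowest CPU usage but is BUSY, and `cpu-server-01` lacks the required labels, so the least-loaded eligible machine is `gpu-server-02`.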
Capability Labels
Tag your machines with capability labels during registration:
docker run -d \
  --name curateme-agent \
  -e CM_REGISTRATION_TOKEN="your_reg_token" \
  -e CM_GATEWAY_URL="https://api.curate-me.ai" \
  -e CM_AGENT_HOSTNAME="gpu-server-01" \
  -e CM_CAPABILITY_LABELS="gpu,pytorch,cuda" \
  ghcr.io/curate-me-ai/openclaw-base:latest
Then dispatch jobs targeting specific capabilities:
curl -X POST https://api.curate-me.ai/gateway/admin/byovm/dispatch \
  -H "X-CM-API-Key: cm_sk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "command": ["session.exec", "--", "python", "train.py"],
    "required_labels": ["gpu", "pytorch"]
  }'
Job Lifecycle
queued ──> dispatched ──> running ──> completed
               |            |
               v            v
           timed out      failed
            (5 min)  (with error msg)
| Status | Description |
|---|---|
| queued | Job created, waiting for a machine to pick it up |
| dispatched | Machine picked up the job |
| running | Machine is executing the command |
| completed | Job finished successfully |
| failed | Job failed (error message in last_error) |
Jobs support automatic retries (default: 3 attempts) with idempotency keys for safe retry behavior.
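The reason idempotency keys make retries safe is that every attempt carries the same key, so a job that actually succeeded on an earlier attempt is not executed twice. The sketch below illustrates that pattern on the client side only. The `retry_with_key` helper and the `IDEMPOTENCY_KEY` environment variable are hypothetical; how the platform accepts the key is not covered in this guide.

```shell
#!/bin/sh
# Hypothetical retry wrapper. The platform's own retry logic is not shown in
# this guide; this only illustrates reusing one idempotency key across attempts.
# Usage: retry_with_key MAX_ATTEMPTS KEY COMMAND [ARGS...]
# The command receives the key in the IDEMPOTENCY_KEY environment variable.
retry_with_key() {
    max=$1
    key=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        # Same key on every attempt, so a repeated request can be deduplicated
        if IDEMPOTENCY_KEY="$key" "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Demo with a stand-in command; replace it with a real dispatch call.
retry_with_key 3 "job-$(date +%s)" \
    sh -c 'echo "dispatching with key $IDEMPOTENCY_KEY"'
```

The key is generated once, outside the loop. Generating a fresh key per attempt would defeat the purpose, since each retry would look like a new job.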
WebSocket Real-Time Communication
For interactive sessions, machines maintain a WebSocket connection for bidirectional messaging:
# Machine-side (handled by the agent automatically)
wscat -c "wss://api.curate-me.ai/gateway/admin/byovm/agents/byovm_abc123/ws" \
  -H "X-CM-Agent-Token: agent_token_here"
The dashboard uses WebSocket for:
- Real-time status updates (no polling delay)
- Live session output streaming
- Interactive agent chat
Monitoring
Health Checks
The platform automatically monitors fleet health:
- Heartbeat monitoring: Machines missing heartbeats for 5 minutes transition to STALE
- Dead machine detection: Stale machines with no heartbeat for 15 minutes transition to DEAD
- Auto-recovery: Machines automatically return to ONLINE when heartbeats resume
- Stale dispatch cleanup: Jobs stuck in dispatched state for 5+ minutes are requeued
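The heartbeat thresholds above amount to a mapping from heartbeat age to machine state, sketched below. The `heartbeat_state` name is hypothetical, and the sketch assumes the 15-minute DEAD threshold is measured from the last heartbeat rather than from the moment the machine became STALE. BUSY and OFFLINE are left out for simplicity.

```shell
#!/bin/sh
# Hypothetical sketch of the heartbeat thresholds described above.
# Argument: seconds since the machine's last heartbeat.
# Assumption: both thresholds are measured from the last heartbeat.
heartbeat_state() {
    age=$1
    if [ "$age" -ge 900 ]; then      # 15 minutes without a heartbeat
        echo "DEAD"
    elif [ "$age" -ge 300 ]; then    # 5 minutes without a heartbeat
        echo "STALE"
    else
        echo "ONLINE"
    fi
}

heartbeat_state 30     # ONLINE
heartbeat_state 400    # STALE
heartbeat_state 1200   # DEAD
```

Auto-recovery falls out of the same mapping: once a heartbeat arrives, the age drops back toward zero and the state evaluates to ONLINE again.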
Cost Tracking
All LLM costs from your connected machines are tracked in the dashboard:
- Per-machine cost breakdown
- Per-session cost tracking
- Daily budget enforcement (jobs rejected when budget exceeded)
- Historical cost trends
Next Steps
- Troubleshooting — Common issues and fixes
- Your Machines Overview — Architecture and concepts
- Runners Security — Security model
- API Reference — Complete endpoint documentation