Roboflow is a distributed data transformation pipeline for converting robotics bag/MCAP files to trainable datasets (LeRobot format).
- Horizontal Scaling: Distributed processing with TiKV coordination
- Schema-Driven Translation: CDR (ROS1/ROS2), Protobuf, JSON message formats
- Zero-Copy Allocation: Arena-based memory efficiency (~22% overhead reduction)
- Cloud Storage: Native S3 and Alibaba OSS support for distributed workloads
- High Throughput: Parallel chunk processing up to ~1800 MB/s
- LeRobot Export: Convert to LeRobot dataset format for robotics learning
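The arena-based allocation mentioned above can be illustrated with a toy bump arena. This is a simplified sketch under assumed semantics, not Roboflow's actual allocator:

```rust
// Toy bump arena: message payloads are copied into one growable buffer
// and freed together, avoiding one heap allocation per message.
// Illustrative only; Roboflow's real arena may differ.
struct Arena {
    buf: Vec<u8>,
}

impl Arena {
    fn with_capacity(cap: usize) -> Self {
        Arena { buf: Vec::with_capacity(cap) }
    }

    /// Copies `bytes` into the arena and returns its offset range, so
    /// callers can later borrow the slice without a fresh allocation.
    fn alloc(&mut self, bytes: &[u8]) -> std::ops::Range<usize> {
        let start = self.buf.len();
        self.buf.extend_from_slice(bytes);
        start..self.buf.len()
    }

    fn get(&self, range: std::ops::Range<usize>) -> &[u8] {
        &self.buf[range]
    }
}

fn main() {
    let mut arena = Arena::with_capacity(1024);
    let a = arena.alloc(b"header");
    let b = arena.alloc(b"payload");
    assert_eq!(arena.get(a), b"header");
    assert_eq!(arena.get(b), b"payload");
}
```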
Roboflow uses a Kubernetes-inspired distributed control plane for fault-tolerant batch processing.
┌─────────────────────────────────────────────────────────────────────┐
│ Control Plane │
├─────────────────────────────────────────────────────────────────────┤
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Scanner │ │ Reaper │ │ Finalizer │ │
│ │ Controller │ │ Controller │ │ Controller │ │
│ │ │ │ │ │ │ │
│ │ • Discover │ │ • Detect │ │ • Monitor │ │
│ │ files │ │ stale pods │ │ batches │ │
│ │ • Create │ │ • Reclaim │ │ • Trigger │ │
│ │ jobs │ │ orphaned │ │ merge │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ └─────────────────┼─────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ TiKV │ │
│ │ (etcd-like │ │
│ │ state) │ │
│ └─────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ Data Plane │
├─────────────────────────────────────────────────────────────────────┤
│ Worker (pod-abc) Worker (pod-def) Worker (pod-xyz) │
│ • Claim jobs • Claim jobs • Claim jobs │
│ • Send heartbeat • Send heartbeat • Send heartbeat │
│ • Process data • Process data • Process data │
│ • Save checkpoint • Save checkpoint • Save checkpoint │
└─────────────────────────────────────────────────────────────────────┘
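The heartbeat/reaper interaction in the diagrams above can be sketched as a staleness check over last-seen timestamps. This is a single-process illustration; the constant `HEARTBEAT_TTL` and the `Reaper` type are assumptions, not the crate's actual API:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Assumed TTL for illustration; the real reaper's threshold may differ.
const HEARTBEAT_TTL: Duration = Duration::from_secs(30);

/// Tracks last-heartbeat times per pod and finds stale ("zombie") pods,
/// mimicking what the Reaper controller does against TiKV state.
struct Reaper {
    last_seen: HashMap<String, Instant>,
}

impl Reaper {
    fn new() -> Self {
        Reaper { last_seen: HashMap::new() }
    }

    fn heartbeat(&mut self, pod_id: &str, now: Instant) {
        self.last_seen.insert(pod_id.to_string(), now);
    }

    /// Pods whose heartbeat is older than the TTL; their jobs would
    /// be reclaimed and requeued.
    fn stale_pods(&self, now: Instant) -> Vec<String> {
        self.last_seen
            .iter()
            .filter(|(_, &t)| now.duration_since(t) > HEARTBEAT_TTL)
            .map(|(id, _)| id.clone())
            .collect()
    }
}

fn main() {
    let start = Instant::now();
    let mut reaper = Reaper::new();
    reaper.heartbeat("pod-abc", start);
    reaper.heartbeat("pod-def", start + Duration::from_secs(60));

    // 70s in: pod-abc was last seen 70s ago (stale); pod-def 10s ago.
    let stale = reaper.stale_pods(start + Duration::from_secs(70));
    assert_eq!(stale, vec!["pod-abc".to_string()]);
}
```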
| Kubernetes Concept | Roboflow Equivalent |
|---|---|
| Pod | Worker with pod_id |
| etcd | TiKV distributed store |
| kubelet heartbeat | HeartbeatManager |
| node-controller | ZombieReaper |
| Finalizers | Finalizer controller |
| Job/CronJob | JobRecord, BatchSpec |
| State machine | BatchPhase (Pending → Discovering → Running → Merging → Complete/Failed) |
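The BatchPhase lifecycle from the table could be modeled roughly as follows. This is a sketch of the documented state machine, not the crate's actual types:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BatchPhase {
    Pending,
    Discovering,
    Running,
    Merging,
    Complete,
    Failed,
}

impl BatchPhase {
    /// Forward transitions implied by the documented lifecycle; any
    /// non-terminal phase may also move to Failed. Sketch only: the
    /// real crate may permit more transitions (e.g. retries).
    fn can_transition_to(self, next: BatchPhase) -> bool {
        use BatchPhase::*;
        matches!(
            (self, next),
            (Pending, Discovering)
                | (Discovering, Running)
                | (Running, Merging)
                | (Merging, Complete)
        ) || (next == Failed && self != Complete && self != Failed)
    }
}

fn main() {
    assert!(BatchPhase::Pending.can_transition_to(BatchPhase::Discovering));
    assert!(BatchPhase::Running.can_transition_to(BatchPhase::Failed));
    assert!(!BatchPhase::Pending.can_transition_to(BatchPhase::Complete));
    assert!(!BatchPhase::Complete.can_transition_to(BatchPhase::Failed));
}
```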
| Crate | Purpose |
|---|---|
| roboflow-core | Error types, registry, values |
| roboflow-storage | S3, OSS, local storage (always available) |
| roboflow-executor | Stage-based task executor for distributed pipelines |
| roboflow-media | Image and video encoding/decoding for robotics datasets |
| roboflow-dataset | KPS, LeRobot, streaming converters, data sources |
| roboflow-pipeline | Pipeline execution and stages for dataset processing |
| roboflow-distributed | TiKV client, catalog, circuit breaker, worker coordination |
```bash
# Set environment variables
export TIKV_PD_ENDPOINTS="127.0.0.1:2379"
export AWS_ACCESS_KEY_ID="your-key"
export AWS_SECRET_ACCESS_KEY="your-secret"

# Run unified service (scanner + worker + finalizer + reaper)
roboflow run

# Worker only - processes work units
roboflow run --role worker

# Finalizer only - merges completed batches
roboflow run --role finalizer

# With custom pod ID
roboflow run --pod-id my-pod-1

# Submit a job and inspect jobs/batches
roboflow submit s3://bucket/input.bag --output s3://bucket/output/
roboflow jobs list
roboflow jobs get <job-id>
roboflow batch list
roboflow batch get <batch-id>
```

Build from source:

```bash
git clone https://github.com/archebase/roboflow.git
cd roboflow
cargo build --release
```

Requirements:

- Rust 1.80+
- TiKV 4.0+ (for distributed coordination)
- ffmpeg (for video encoding in LeRobot datasets)
```toml
[dataset]
name = "my_dataset"
fps = 30
robot_type = "stretch"

[[mapping]]
topic = "/camera/image_raw"
name = "observation.images.camera_0"
encoding = "ros1msg"

[[mapping]]
topic = "/joint_states"
name = "observation.joint_state"
encoding = "cdr"
```

| Variable | Description | Default |
|---|---|---|
| TIKV_PD_ENDPOINTS | TiKV PD endpoints | 127.0.0.1:2379 |
| AWS_ACCESS_KEY_ID | AWS access key | - |
| AWS_SECRET_ACCESS_KEY | AWS secret key | - |
| AWS_REGION | AWS region | - |
| OSS_ACCESS_KEY_ID | Alibaba OSS key | - |
| OSS_ACCESS_KEY_SECRET | Alibaba OSS secret | - |
| OSS_ENDPOINT | Alibaba OSS endpoint | - |
| WORKER_POLL_INTERVAL_SECS | Job poll interval | 5 |
| WORKER_MAX_CONCURRENT_JOBS | Max concurrent jobs | 1 |
| SCANNER_SCAN_INTERVAL_SECS | Scan interval | 60 |
| FINALIZER_POLL_INTERVAL_SECS | Finalizer poll interval | 30 |
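Interval variables like WORKER_POLL_INTERVAL_SECS can be read with plain `std::env`, falling back to the documented defaults. A minimal sketch; the helper name `env_secs` is illustrative, not Roboflow's actual code:

```rust
use std::env;
use std::time::Duration;

/// Reads an integer env var as a duration in seconds, falling back to
/// a default when the var is unset or unparseable (illustrative helper).
fn env_secs(key: &str, default: u64) -> Duration {
    let secs = env::var(key)
        .ok()
        .and_then(|v| v.parse::<u64>().ok())
        .unwrap_or(default);
    Duration::from_secs(secs)
}

fn main() {
    // Unset in this demo, so the documented defaults apply.
    let poll = env_secs("WORKER_POLL_INTERVAL_SECS", 5);
    let scan = env_secs("SCANNER_SCAN_INTERVAL_SECS", 60);
    assert_eq!(poll, Duration::from_secs(5));
    assert_eq!(scan, Duration::from_secs(60));
}
```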
```bash
cargo build
cargo test
cargo fmt
cargo clippy --all-targets -- -D warnings
```

Start required services with Docker Compose:

```bash
docker compose up -d   # Start all services (MinIO, TiKV, PD)
docker compose down    # Stop all services
```

Services:
| Service | Purpose | Ports |
|---|---|---|
| MinIO | S3-compatible object storage | 9000 (API), 9001 (Console) |
| TiKV | Distributed KV storage | 20160 |
| PD | TiKV placement driver | 2379, 2380 |
See CONTRIBUTING.md for development setup and guidelines.
This project is licensed under the MulanPSL v2 - see the LICENSE file for details.