Community · Data Flywheel

Help make this model better for everyone.

DreamZero-SO101 was trained on 715 episodes contributed by the SO-101 community. Every new episode improves the model's generalisation — especially zero-shot performance on novel objects, camera rigs, and task phrasings. If you own an SO-101, you can contribute.

Why contribute

Your data makes a measurable difference

📈

Better zero-shot generalisation

Our zero-shot RMSE is 11.9°, versus 2.3° on held-out episodes from the training datasets: roughly a 5× gap. More diverse scenes, objects, and camera positions are what close this gap.
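For readers who want to compare their own rollouts against these numbers, here is a minimal sketch of a joint-angle RMSE in degrees. The function name `joint_rmse_deg` and the toy trajectory are hypothetical, and averaging over every frame and joint is an assumption: this page does not say how the reported RMSE is aggregated.

```python
import math


def joint_rmse_deg(predicted, target):
    """RMSE in degrees over all frames and joints of two [T, 6] trajectories.

    Pooling every joint and frame into one mean is an assumption; the
    reported figures may be aggregated differently.
    """
    if len(predicted) != len(target):
        raise ValueError("trajectories must have the same number of frames")
    squared_errors = [
        (p - t) ** 2
        for pred_frame, target_frame in zip(predicted, target)
        for p, t in zip(pred_frame, target_frame)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy two-frame trajectory, invented purely for illustration.
target = [
    [0.0, 10.0, 20.0, 30.0, 40.0, 50.0],
    [1.0, 11.0, 21.0, 31.0, 41.0, 51.0],
]
predicted = [[angle + 2.0 for angle in frame] for frame in target]  # constant 2° offset

print(joint_rmse_deg(predicted, target))  # 2.0

# Ratio between the zero-shot and held-out figures quoted on this page.
gap = 11.9 / 2.3
print(round(gap, 1))  # 5.2
```

A constant 2° offset on every joint gives an RMSE of exactly 2°, which is a quick sanity check before running the function on real episodes.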

🔄

Flywheel effect

More data → better model → more SO-101 adopters → more contributors. We are at the beginning of this curve. Thirty episodes from you matter more now, at 715 total, than they will once the training set passes 10,000.

🏆

Credit and recognition

Every contributor is credited in the model card, README, and release notes. Your HuggingFace handle appears in the dataset table published with every future checkpoint.

How it works

Five steps from recording to release

Step 01

Record

Use LeRobot v0.4+ with your SO-101. Record at least 30 episodes of any manipulation task.

Step 02

Format

Push to the HuggingFace Hub in the LeRobot dataset schema, for example with LeRobotDataset.push_to_hub().

Step 03

Submit

Open a PR on GitHub adding your dataset to the manifest, or email us the HF URL.

Step 04

Retrain

Vizuara downloads, converts, and retrains from the latest checkpoint. ~127h on 2× H100.

Step 05

Release

New checkpoint released to Vizuara/dreamzero-so101-lora. You're in the credits.

Data format

What your dataset needs

We accept any dataset in LeRobot dataset format v2 or later (this is the dataset format version, not the LeRobot library version). The following fields are required for inclusion in the training pipeline. Optional fields improve quality.

Field | Required? | Spec
observation.images.* | required | At least 1 RGB camera stream at ≥ 15 FPS. Resolution ≥ 240p. Front camera preferred.
action | required | 6-DOF joint positions in degrees: [shoulder_pan, shoulder_lift, elbow_flex, wrist_flex, wrist_roll, gripper]. Shape: [T, 6].
task_description / language_instruction | required | A plain-English description of the task, e.g. "Pick up the red block and place it in the bin".
fps in meta | required | Frame rate. We resample to 30 FPS during conversion.
observation.images.top | optional | Top-down camera. Strongly recommended, since the current model was trained with 3 cameras.
observation.images.gripper | optional | Wrist/gripper camera. Improves fine-motor predictions.
observation.state | optional | Proprioceptive joint state at each frame. Helps with mid-episode chunk evaluations.

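The required fields above can be checked before submitting. The following is a minimal sketch, not Vizuara's conversion pipeline: `check_episode` and `resample_indices` are hypothetical helpers, the episode is represented as a plain dict rather than LeRobot's on-disk layout, and holding the most recent source frame is only one possible way to resample to 30 FPS.

```python
REQUIRED_JOINTS = [
    "shoulder_pan", "shoulder_lift", "elbow_flex",
    "wrist_flex", "wrist_roll", "gripper",
]
TARGET_FPS = 30  # resampling target stated in the table


def check_episode(episode):
    """Return a list of problems; an empty list means the required fields look fine.

    `episode` is a hypothetical in-memory dict keyed by field name.
    """
    problems = []
    cameras = [key for key in episode if key.startswith("observation.images.")]
    if not cameras:
        problems.append("no observation.images.* stream")
    action = episode.get("action")
    if not action or any(len(row) != len(REQUIRED_JOINTS) for row in action):
        problems.append("action must have shape [T, 6]")
    task = episode.get("task_description") or episode.get("language_instruction")
    if not isinstance(task, str) or not task.strip():
        problems.append("missing task description")
    fps = episode.get("fps")
    if not isinstance(fps, (int, float)) or fps < 15:
        problems.append("fps must be present and >= 15")
    return problems


def resample_indices(num_frames, source_fps, target_fps=TARGET_FPS):
    """Source-frame index for each output frame, holding the latest source frame."""
    if num_frames == 0:
        return []
    num_out = max(1, round(num_frames * target_fps / source_fps))
    return [
        min(num_frames - 1, int(i * source_fps / target_fps))
        for i in range(num_out)
    ]


good = {
    "observation.images.front": "front.mp4",  # placeholder value
    "action": [[0.0] * 6 for _ in range(3)],
    "task_description": "Pick up the red block and place it in the bin",
    "fps": 30,
}
print(check_episode(good))      # []
print(resample_indices(4, 15))  # [0, 0, 1, 1, 2, 2, 3, 3]
print(resample_indices(4, 60))  # [0, 2]
```

Upsampling from 15 FPS repeats each frame twice, while downsampling from 60 FPS keeps every second frame; a real pipeline might interpolate actions instead of repeating them.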
Recording recipe

LeRobot quickstart

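# Record with LeRobot v0.4+ (assumes SO-101 is connected and calibrated).
# CLI entry points and flag names have changed between LeRobot releases;
# confirm them with `lerobot-record --help` for your installed version.
# Ports and camera settings below are placeholders for your own setup.
lerobot-record \
    --robot.type=so101_follower \
    --robot.port=/dev/ttyACM0 \
    --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
    --teleop.type=so101_leader \
    --teleop.port=/dev/ttyACM1 \
    --dataset.repo_id=your-handle/so101-my-task-50ep \
    --dataset.fps=30 \
    --dataset.num_episodes=50 \
    --dataset.single_task="Pick up the red block and place it in the bin"

# Push to HuggingFace Hub. Recording uploads at the end by default
# (--dataset.push_to_hub=true); to re-push a local copy manually:
python -c "from lerobot.datasets.lerobot_dataset import LeRobotDataset; LeRobotDataset('your-handle/so101-my-task-50ep').push_to_hub()"

# Then open a PR on GitHub, adding your dataset to scripts/enumerate_so101.py:
COMMUNITY_DATASETS = [
    "whosricky/so101-megamix-v1",
    "lipsop/so101-block-in-bin-100ep",
    # ... existing entries ...
    "your-handle/so101-my-task-50ep",  # ← add your line here
]
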
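Before opening the PR, it can help to sanity-check the new manifest entry locally. This is a minimal sketch under stated assumptions: `add_dataset` and `REPO_ID_PATTERN` are hypothetical and not part of scripts/enumerate_so101.py, and the pattern is a loose approximation of a `namespace/name` repo id rather than the Hub's official validation rule.

```python
import re

# Loose approximation of a Hub repo id ("namespace/name"); an assumption,
# not the Hub's official rule.
REPO_ID_PATTERN = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*/[A-Za-z0-9][A-Za-z0-9._-]*$"
)

# The two entries shown in the manifest excerpt above.
COMMUNITY_DATASETS = [
    "whosricky/so101-megamix-v1",
    "lipsop/so101-block-in-bin-100ep",
]


def add_dataset(manifest, repo_id):
    """Return a new manifest with repo_id appended, rejecting malformed or duplicate ids."""
    if not REPO_ID_PATTERN.match(repo_id):
        raise ValueError(f"not a 'namespace/name' repo id: {repo_id!r}")
    if repo_id in manifest:
        raise ValueError(f"already in manifest: {repo_id!r}")
    return manifest + [repo_id]


updated = add_dataset(COMMUNITY_DATASETS, "your-handle/so101-my-task-50ep")
print(updated[-1])  # your-handle/so101-my-task-50ep
```

Returning a new list instead of mutating the manifest keeps the check side-effect free, so it can run in a pre-commit hook or a CI step without touching the file.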
Roadmap

Where we're taking this

Current · checkpoint 72K

LoRA baseline

  • 715 episodes, 108M LoRA
  • Policy mode: 1.6–2.3° held-out RMSE
  • Zero-shot: 11.9° mean RMSE
  • DreamGen: task visually completed in imagination

Next · community contributions

Expanded LoRA (target 2K+ episodes)

  • Community data drive — your data here
  • Target: zero-shot RMSE under 5°
  • More task categories (pour, sort, assemble)
  • Checkpoint released once 500 new episodes are in

Phase 2 · full fine-tune

14B full FT (target 10K+ episodes)

  • Full backbone fine-tune (4× H100, ~28h)
  • Task-completion token to fix post-episode drift
  • Multi-camera ablation study
  • Release: Vizuara/dreamzero-so101-14b

Phase 3 · embodiment expansion

Multi-arm support

  • Koch v1.1 arm support
  • Moss bimanual arm support
  • HuggingFace Hub dataset partnership
  • Live inference endpoint

Get in touch

HuggingFace partnership & custom requests

We're looking for a HuggingFace Hub dataset partnership to streamline contributor onboarding — automatic GEAR conversion, contributor leaderboard, and co-branded model releases. If you work at HuggingFace or know someone who does, please get in touch.

For organisations with larger SO-101 deployments (10+ arms, enterprise data), we offer custom fine-tuning runs and evaluation support.

Contribute data or partner with Vizuara

Tell us your HuggingFace dataset URL, the task category, and the number of episodes.

Email team@vizuara.ai

Community datasets

Current training set — checkpoint 72K

Dataset | Contributor | Episodes | Tasks | Cameras
whosricky/so101-megamix-v1 | whosricky | 400 | 8 | 3
lipsop/so101-block-in-bin-100ep | lipsop | 100 | 1 | 2
youliangtan/so101-table-cleanup | youliangtan | 80 | 4 | 2
G3ND3K/so101_picking_up_green_lego_big | G3ND3K | 60 | 1 | 2
lerobot/svla_so101_pickplace | LeRobot team | 50 | 1 | 2
observabot/so101_cloth_folding1 | observabot | 25 | 1 | 3
YOUR DATASET COULD BE HERE — open a PR
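
The episode counts in the table add up to the 715 episodes quoted for checkpoint 72K. A small sketch of that arithmetic follows, with each dataset's share of the training set; the variable names are illustrative and the numbers are copied from the table.

```python
# (dataset, episodes) pairs copied from the community datasets table.
TRAINING_SET = [
    ("whosricky/so101-megamix-v1", 400),
    ("lipsop/so101-block-in-bin-100ep", 100),
    ("youliangtan/so101-table-cleanup", 80),
    ("G3ND3K/so101_picking_up_green_lego_big", 60),
    ("lerobot/svla_so101_pickplace", 50),
    ("observabot/so101_cloth_folding1", 25),
]

total_episodes = sum(episodes for _, episodes in TRAINING_SET)
print(total_episodes)  # 715

# Percentage of the training set contributed by each dataset.
shares = {
    name: round(100 * episodes / total_episodes, 1)
    for name, episodes in TRAINING_SET
}
for name, share in shares.items():
    print(f"{name}: {share}%")
```

More than half of the current episodes come from a single dataset, which is one reason additional scenes and camera setups from new contributors are valuable.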