Reinforcement Learning & Robotics

Teaching a Humanoid Robot to Play Table Tennis

Robomotion trains a Unitree G1 humanoid to mimic table tennis motion-capture trajectories using GPU-accelerated reinforcement learning — then exports the policy to run on real hardware.

See It in Action

Three runs of the G1 humanoid tracking reference trajectories in simulation, trained entirely on GPU.

Unitree G1 humanoid robot in simulation

Motion Imitation from Capture Data

The system starts with raw motion-capture recordings of human table tennis strokes. A quality filter removes noisy or physically implausible clips before they ever touch training, keeping only smooth, in-limit trajectories.

The robot then learns to reproduce these motions in a physics simulator, guided by a reward signal that penalizes deviation from the reference pose at every timestep.
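One common form of such a reward is an exponentiated negative pose error: 1.0 when the robot matches the reference exactly, decaying smoothly as it drifts. A minimal sketch (the function name and the weight `alpha` are illustrative assumptions, not the project's actual values):

```python
import numpy as np

def tracking_reward(qpos, qpos_ref, alpha=5.0):
    """Exponentiated negative squared pose error: 1.0 at a perfect
    match, decaying toward 0 as the robot drifts from the reference."""
    err = np.sum((qpos - qpos_ref) ** 2)
    return float(np.exp(-alpha * err))

# Perfect tracking scores 1.0; any deviation is penalized smoothly.
r_perfect = tracking_reward(np.zeros(3), np.zeros(3))
r_off = tracking_reward(np.array([0.1, 0.0, 0.0]), np.zeros(3))
```

The exponential form keeps the reward bounded and dense, which tends to stabilize PPO compared with an unbounded negative-error penalty.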

  1. Filter: Score mocap clips on jerk, joint limits, and root height; discard outliers.
  2. Simulate: Run thousands of parallel physics environments on GPU with MuJoCo MJX + JAX.
  3. Train: Optimize a neural-network policy with PPO to minimize tracking error.
  4. Export: Trace the learned policy into a portable ONNX file for hardware deployment.
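The first step, clip filtering, can be sketched as follows. Thresholds, argument names, and the timestep are illustrative assumptions, not the project's actual settings:

```python
import numpy as np

def clip_passes(qpos_traj, joint_low, joint_high, root_z,
                max_jerk=500.0, min_root_z=0.5, dt=0.02):
    """Score one mocap clip; return True if it is usable for training.

    qpos_traj: (T, n_joints) joint positions over time
    root_z:    (T,) root (pelvis) height over time
    """
    # Jerk estimated as the third finite difference of position / dt^3.
    jerk = np.diff(qpos_traj, n=3, axis=0) / dt**3
    if np.abs(jerk).max() > max_jerk:
        return False            # too noisy / not smooth
    if np.any(qpos_traj < joint_low) or np.any(qpos_traj > joint_high):
        return False            # violates joint limits
    if root_z.min() < min_root_z:
        return False            # implausible root height
    return True
```

A smooth, in-limit trajectory passes; adding high-frequency noise (typical of mocap glitches) blows up the jerk estimate and the clip is discarded.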
Multiple angles of the G1 robot performing table tennis motions

Generalization Across Environments

Training with domain randomization exposes the robot to a range of physics conditions — varied joint friction, link masses, center-of-mass offsets, and motor armature — so the policy generalizes beyond the exact simulator it was trained in.

The result is a controller that handles the inherent gap between simulation and the real world without hand-tuned adaptation.
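Per-episode randomization of this kind can be sketched as follows. Parameter names and ranges here are illustrative, not the project's actual configuration:

```python
import numpy as np

def randomize_physics(nominal, rng):
    """Sample a perturbed copy of the nominal physics parameters
    for one training episode."""
    return {
        "link_mass":      nominal["link_mass"] * rng.uniform(0.8, 1.2),
        "joint_friction": nominal["joint_friction"] * rng.uniform(0.5, 2.0),
        "com_offset":     nominal["com_offset"] + rng.uniform(-0.01, 0.01, size=3),
        "armature":       nominal["armature"] * rng.uniform(0.8, 1.2),
    }

nominal = {
    "link_mass": 1.5,           # kg
    "joint_friction": 0.1,
    "com_offset": np.zeros(3),  # metres
    "armature": 0.01,
}
rng = np.random.default_rng(0)
episode_params = randomize_physics(nominal, rng)  # fresh draw each episode
```

Because the policy never sees the exact same physics twice, it cannot overfit to one simulator instance, which is what makes the transfer to real hardware robust.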

Read the code →

Technical Features

A full end-to-end stack from raw motion data to a deployable robot policy.

GPU-Parallelized Simulation

MuJoCo MJX runs thousands of environments simultaneously on GPU. JAX's vmap vectorizes the physics step across the entire batch with no Python loops.
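The batching pattern looks like this. The dynamics below are a toy stand-in for illustration (the project steps real MJX physics via `mjx.step`), and the environment/joint counts are illustrative:

```python
import jax
import jax.numpy as jnp

def step(state, action):
    """Toy stand-in for one physics step: damped integrator dynamics."""
    pos, vel = state
    vel = 0.99 * vel + 0.01 * action
    pos = pos + 0.02 * vel
    return pos, vel

# vmap maps step over a leading batch axis: one call advances every
# environment in parallel, with no Python loop over environments.
batched_step = jax.vmap(step)

n_envs, n_dof = 4096, 29
states = (jnp.zeros((n_envs, n_dof)), jnp.zeros((n_envs, n_dof)))
actions = jnp.ones((n_envs, n_dof))
pos, vel = batched_step(states, actions)
```

Wrapping the batched step in `jax.jit` then compiles the whole rollout into a single fused GPU kernel launch per step.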

🧠 Motion Imitation via PPO

Proximal Policy Optimization trains a neural network to track reference joint trajectories frame-by-frame, learning coordinated whole-body control from mocap data.
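The heart of PPO is its clipped surrogate objective, which keeps each policy update close to the policy that collected the data. A minimal NumPy sketch of the loss (not the project's training code):

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    """PPO clipped surrogate objective (returned as a loss to minimize).

    The probability ratio pi_new/pi_old is clipped to [1-eps, 1+eps] so
    a single batch cannot push the policy arbitrarily far.
    """
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))
```

Here the advantages would come from the tracking reward, so minimizing this loss increases the probability of actions that reduced pose error.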

🎲 Domain Randomization

Physics parameters — link masses, joint friction, center-of-mass offsets, motor armature — are randomized each episode to bridge the sim-to-real gap.

🔍 Trajectory Quality Filtering

A pre-training scorer evaluates every mocap clip on smoothness (jerk), joint limit compliance, and root height plausibility. Low-quality clips are excluded before training begins.

📦 ONNX Policy Export

The trained JAX policy is traced and converted to a portable .onnx file. Inference runs via ONNX Runtime — no JAX or GPU required on the target device.

🛑 Early Stopping

A configurable monitor tracks reward improvement and halts training when progress stalls — saving GPU hours on runs that have already converged or diverged.
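A minimal monitor in this spirit (class and parameter names are illustrative, not the project's config keys):

```python
class EarlyStopping:
    """Halt training when mean episode reward stops improving.

    patience:  evaluations to wait without improvement before stopping
    min_delta: smallest reward change that counts as an improvement
    """
    def __init__(self, patience=10, min_delta=1e-3):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.stale = 0

    def update(self, reward):
        """Record one evaluation; return True if training should stop."""
        if reward > self.best + self.min_delta:
            self.best = reward
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience
```

Calling `update` after each evaluation interval either resets the counter on a new best reward or, once `patience` stale evaluations accumulate, signals the training loop to exit.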