Cheetah¶
A planar bipedal cheetah-style runner. The agent applies torques through 6 actuators to sustain forward locomotion over flat ground.
CheetahRun¶

| Property | Value |
|---|---|
| Canonical ID | mjx/cheetah_run-v0 |
| Action space | Box(-1.0, 1.0, (6,), float32) |
| Observation space | Box(-inf, inf, (17,), float32) |
| Episode length | 1000 |
| Config | {"ctrl_dt": 0.01, "sim_dt": 0.01, "naconmax": 100_000, "njmax": 100} |
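
As a rough sanity check on the timing values in the table, the sketch below works out how many physics steps run per control step and how much simulated time one episode covers. It assumes the common convention that the number of substeps per control step is ctrl_dt / sim_dt; the variable names are illustrative and not taken from the library.

```python
# Values copied from the property table above.
config = {"ctrl_dt": 0.01, "sim_dt": 0.01}
episode_length = 1000  # control steps per episode

# Assumption: the simulator advances ctrl_dt / sim_dt physics steps per action.
n_substeps = round(config["ctrl_dt"] / config["sim_dt"])

# Simulated time covered by one full episode, in seconds.
episode_seconds = episode_length * config["ctrl_dt"]

print(n_substeps, episode_seconds)
```

Under that assumption, each action corresponds to a single physics step and a full episode spans 10 simulated seconds.
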
Description¶
The half-cheetah body runs rightward across flat ground. Six actuators across the torso and legs drive the gait. The body is planar, so the cheetah can't tip sideways, but it can collapse forward or backward. That collapse is the main failure mode for naive policies that push for maximum acceleration at the expense of balance.
Rewards¶
Uses a dense reward with a tolerance indicator over forward speed.
tolerance is DM Control's smooth indicator. Here it's used with a linear sigmoid, so the reward returns:

- 1.0 once speed >= RUN_SPEED.
- A linear ramp from 0.0 (at speed 0) to 1.0 (at RUN_SPEED).
- 0.0 if speed dips negative (running backwards).
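
The three cases above can be sketched in plain Python. This is a minimal stand-in for the behaviour described here, not the library's implementation: the function name run_reward is hypothetical, and RUN_SPEED = 10.0 is an assumption (it matches the value used by DM Control's cheetah task, but check the upstream file for the value actually in use).

```python
RUN_SPEED = 10.0  # assumption: target forward speed; verify against upstream


def run_reward(speed, run_speed=RUN_SPEED):
    """Linear-ramp tolerance: 0 at or below speed 0, 1 at or above run_speed."""
    if speed >= run_speed:
        return 1.0
    # Distance below the target, measured in units of the margin (= run_speed).
    shortfall = (run_speed - speed) / run_speed
    # The linear sigmoid falls to 0 once the shortfall reaches one full margin,
    # which also covers negative speeds (running backwards).
    return max(0.0, 1.0 - shortfall)


for speed in (-3.0, 0.0, 5.0, 10.0, 15.0):
    print(speed, run_reward(speed))
```

Because the per-step reward is capped at 1.0, the return over a 1000-step episode cannot exceed 1000.
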
Starting state¶
The state is laid out as joint positions followed by joint velocities. The body starts at rest with a small randomisation.
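
One way to picture how a positions-then-velocities layout can add up to the 17 observation entries is sketched below. It assumes the DM Control half-cheetah convention: 9 generalised coordinates (three for the planar root, six for the leg joints), with the absolute x position dropped from the observation. The labels are illustrative and may not match the names used upstream.

```python
# Assumed generalised coordinates of the planar half-cheetah (illustrative labels).
ROOT = ["root_x", "root_z", "root_pitch"]
LEG_JOINTS = ["bthigh", "bshin", "bfoot", "fthigh", "fshin", "ffoot"]

qpos_names = ROOT + LEG_JOINTS                       # 9 positions
qvel_names = [name + "_vel" for name in qpos_names]  # 9 velocities

# Assumption: the absolute x position is omitted so the observation does not
# depend on how far the cheetah has already travelled.
observation_names = qpos_names[1:] + qvel_names

print(len(observation_names))
```
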
Termination¶
Episode ends when step >= max_steps (default 1000). No early termination on falling.
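
The practical consequence of a step-count-only termination rule can be illustrated with a small stand-in loop. Everything here is hypothetical scaffolding (run_episode, step_fn, and the fell_over flag are not part of the environment's API); it only shows that a fall does not shorten the episode.

```python
def run_episode(step_fn, max_steps=1000):
    """Run until step >= max_steps; a fall never ends the episode early."""
    step = 0
    total_reward = 0.0
    done = False
    while not done:
        reward, fell_over = step_fn(step)
        total_reward += reward
        step += 1
        # fell_over is deliberately ignored: only the step count terminates.
        done = step >= max_steps
    return step, total_reward


def always_fallen(step):
    # Stand-in for a collapsed cheetah: no forward speed, so zero reward.
    return 0.0, True


steps, episode_return = run_episode(always_fallen)
print(steps, episode_return)
```

A policy that collapses early therefore still consumes the full 1000 steps, collecting little or no reward for the remainder of the episode.
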
Usage¶
Reference¶
Upstream: mujoco_playground/_src/dm_control_suite/cheetah.py.