Hopper¶
A planar one-legged hopper. Four actuators along the leg drive either locomotion or stationary balance, depending on the variant. Hop and Stand share the same body and dynamics; only the reward function changes.
HopperHop¶

| Property | Value |
|---|---|
| Canonical ID | mjx/hopper_hop-v0 |
| Action space | Box(-1.0, 1.0, (4,), float32) |
| Observation space | Box(-inf, inf, (15,), float32) |
| Episode length | 1000 |
| Config | {"ctrl_dt": 0.02, "sim_dt": 0.005, "naconmax": 50_000, "njmax": 50} |
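The timing entries in the table imply some simple arithmetic, sketched below. It assumes `ctrl_dt` is the control period and `sim_dt` the physics timestep, both in seconds, which is the usual reading of these names but is not stated in the table itself.

```python
# Values copied from the property table; the interpretation is an assumption.
config = {"ctrl_dt": 0.02, "sim_dt": 0.005}
episode_length = 1000

# Physics steps run for each control step (assumed relationship).
substeps = round(config["ctrl_dt"] / config["sim_dt"])

# Simulated time covered by one full episode, in seconds.
episode_seconds = episode_length * config["ctrl_dt"]

# Rate at which the policy is queried, in Hz.
control_hz = 1.0 / config["ctrl_dt"]

print(substeps, episode_seconds, control_hz)
```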
Description¶
The hopper must produce sustained forward locomotion across flat ground while keeping its torso above a minimum standing height. A collapsed slide is fast but scores poorly on the height term; a tall but stationary stance scores poorly on the speed term. Because the two terms are multiplied, hopping is the only gait that scores well on both at once, which is the whole challenge.
Rewards¶
The reward is dense: it multiplies a standing-height tolerance by a forward-speed tolerance.
The two terms are multiplied so neither alone is enough:
- `standing`: `1.0` while torso height sits in `(STAND_HEIGHT, 2)`, decaying smoothly outside that band.
- `hopping`: linear ramp from `0.5` (at half target speed) to `1.0` (at `HOP_SPEED` or above). Soft floor at `0.5` so the gradient survives even slow hopping.
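The product of the two terms can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the environment's code: the values of `STAND_HEIGHT` and `HOP_SPEED` are borrowed from dm_control's hopper task, the linear fall-off and its `margin` in `standing_term` are hypothetical, and below half the target speed the ramp is assumed to continue linearly to 0 at a standstill (if the environment clamps at 0.5 instead, raise the lower clip bound accordingly).

```python
STAND_HEIGHT = 0.6  # assumed value, borrowed from dm_control's hopper task
HOP_SPEED = 2.0     # assumed value, borrowed from dm_control's hopper task


def standing_term(height, margin=0.1):
    # Distance outside the (STAND_HEIGHT, 2) band; zero while inside it.
    outside = max(STAND_HEIGHT - height, 0.0) + max(height - 2.0, 0.0)
    # Illustrative linear fall-off over `margin` (hypothetical shape and width).
    return min(max(1.0 - outside / margin, 0.0), 1.0)


def hopping_term(speed):
    # 0.5 at HOP_SPEED / 2 and 1.0 at HOP_SPEED or above; assumed to keep
    # falling linearly to 0.0 at a standstill.
    return min(max(speed / HOP_SPEED, 0.0), 1.0)


def hop_reward(height, speed):
    # Multiplying means either term near zero drags the whole reward down.
    return standing_term(height) * hopping_term(speed)


print(hop_reward(1.0, 2.5))  # tall and fast
print(hop_reward(0.2, 2.5))  # collapsed slide
print(hop_reward(1.0, 0.0))  # tall but stationary
```

Under these assumptions only the first call earns full reward; the collapsed slide and the stationary stance both score zero.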
Starting state¶
Joint positions followed by joint velocities, with the leg initialised in a default rest configuration.
Termination¶
Episode ends when step >= max_steps (default 1000). No early termination on falling.
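The time-limit-only rule can be sketched as below. The names `is_done` and `count_steps` and the `fallen_at` argument are hypothetical, introduced only to show that a fall has no effect on episode length.

```python
MAX_STEPS = 1000  # default episode length from the property table


def is_done(step, max_steps=MAX_STEPS):
    # The only termination condition: the step counter reaching the limit.
    return step >= max_steps


def count_steps(fallen_at=None, max_steps=MAX_STEPS):
    # Hypothetical rollout loop; `fallen_at` marks the step of a fall, if any.
    step = 0
    while not is_done(step, max_steps):
        fallen = fallen_at is not None and step >= fallen_at
        # `fallen` is deliberately unused: a fall never ends the episode.
        step += 1
    return step


print(count_steps())              # no fall
print(count_steps(fallen_at=10))  # early fall, same episode length
```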
Usage¶
Create the environment through its canonical ID, mjx/hopper_hop-v0.
Reference¶
Upstream: mujoco_playground/_src/dm_control_suite/hopper.py.
HopperStand¶

| Property | Value |
|---|---|
| Canonical ID | mjx/hopper_stand-v0 |
| Action space | Box(-1.0, 1.0, (4,), float32) |
| Observation space | Box(-inf, inf, (15,), float32) |
| Episode length | 1000 |
| Config | {"ctrl_dt": 0.02, "sim_dt": 0.005, "naconmax": 50_000, "njmax": 50} |
Description¶
The same one-legged hopper, but now stationary. The agent must balance the body upright above a minimum standing height with as little control effort as possible. Steady balance is preferred over jittery "standing" — a calm posture that holds the leg roughly still scores better than one that twitches constantly to stay upright.
Rewards¶
The reward is dense: it multiplies a standing-height tolerance by a small-control penalty.
The two terms encode separate soft constraints:
- `standing`: `1.0` while torso height sits in `(STAND_HEIGHT, 2)`, decaying smoothly outside that band.
- `small_control`: quadratic action penalty rescaled into `[0.8, 1.0]`, so it lightly modulates the primary `standing` term rather than dominating it.
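A minimal sketch of the rescaling, in plain Python. The per-actuator form `1 - a**2` and the `(x + 4) / 5` rescale are modelled on how dm_control's hopper task writes this term; they are assumptions here, and the environment's own code may differ in detail. The `standing` value is passed in rather than recomputed.

```python
def small_control_term(action):
    # Quadratic penalty per actuator: 1.0 at zero torque, 0.0 at |a| >= 1.
    per_actuator = [max(1.0 - a * a, 0.0) for a in action]
    mean_score = sum(per_actuator) / len(per_actuator)
    # Rescale [0, 1] into [0.8, 1.0] so control effort only modulates the reward.
    return (mean_score + 4.0) / 5.0


def stand_reward(standing, action):
    # `standing` is the height tolerance in [0, 1], computed elsewhere.
    return standing * small_control_term(action)


print(small_control_term([0.0, 0.0, 0.0, 0.0]))   # no control effort
print(small_control_term([1.0, -1.0, 1.0, -1.0])) # saturated actuators
print(stand_reward(1.0, [0.5, 0.5, 0.5, 0.5]))    # upright, moderate effort
```

Even with every actuator saturated the term stays at 0.8, so a policy that stands but twitches loses at most a fifth of the reward.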
Starting state¶
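Joint positions followed by joint velocities, as in HopperHop; the body and dynamics are shared between the two variants.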
Termination¶
Episode ends when step >= max_steps (default 1000). No early termination on falling.
Usage¶
Create the environment through its canonical ID, mjx/hopper_stand-v0.
Reference¶
Upstream: mujoco_playground/_src/dm_control_suite/hopper.py.