NVIDIA supports humanoid robotics
The market for humanoid robots is expected to reach $38 billion over the next two decades. To meet this demand, NVIDIA is releasing a collection of robot foundation models, data pipelines and simulation frameworks to accelerate next-generation humanoid robots.
The collection, which was announced by NVIDIA Founder and CEO Jensen Huang at CES today, features the NVIDIA Isaac GR00T Blueprint for synthetic motion generation, which helps developers generate exponentially large synthetic motion datasets to train humanoid robots using imitation learning.
Imitation learning allows humanoids to acquire new skills by observing and mimicking expert human demonstrations. Collecting extensive, high-quality demonstration datasets in the real world can be tedious, time-consuming and often prohibitively expensive. The Isaac GR00T Blueprint for synthetic motion generation allows developers to generate exponentially large synthetic datasets from just a small number of human demonstrations.
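To make the idea of imitation learning concrete, here is a minimal sketch of its simplest form, behaviour cloning, in which a policy is fitted directly to recorded expert state-action pairs. It is a toy illustration rather than anything from the GR00T tooling: the one-dimensional linear policy, the `fit_linear_policy` helper and the `expert_demos` data are all assumptions made for the example.

```python
from statistics import mean

def fit_linear_policy(demos):
    """Behaviour cloning in one dimension: least-squares fit of action = w * state + b."""
    states = [s for s, _ in demos]
    actions = [a for _, a in demos]
    s_bar, a_bar = mean(states), mean(actions)
    var = sum((s - s_bar) ** 2 for s in states)
    if var == 0:
        raise ValueError("demonstrations must cover more than one state")
    w = sum((s - s_bar) * (a - a_bar) for s, a in demos) / var
    b = a_bar - w * s_bar
    return lambda state: w * state + b

# Hypothetical expert demonstrations: (state, action) pairs in which the
# expert always moves towards a target at 1.0 with a gain of 0.5.
expert_demos = [(s, 0.5 * (1.0 - s)) for s in (0.0, 0.4, 0.8, 1.2, 2.0)]

# The cloned policy reproduces the expert's behaviour, including at states
# that never appeared in the demonstrations.
policy = fit_linear_policy(expert_demos)
```

A real humanoid policy would be a neural network over high-dimensional observations, but the principle is the same: the more demonstrations the fit can draw on, the better it generalises, which is why large synthetic datasets matter.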
Using the GR00T-Teleop workflow, users can tap into the Apple Vision Pro to capture human actions in a digital twin. These actions are then mimicked by a robot in simulation and recorded for use as ground truth.
The GR00T-Mimic workflow then multiplies the captured human demonstration into a larger synthetic motion dataset. Finally, the GR00T-Gen workflow, built on the NVIDIA Omniverse and NVIDIA Cosmos platforms, exponentially expands this dataset through domain randomisation and 3D upscaling.
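The two expansion steps can be pictured with a short sketch: first one recorded demonstration is multiplied into many slightly perturbed copies, then each copy is paired with randomly sampled scene conditions, which is the essence of domain randomisation. This is only an illustration under stated assumptions and does not use the GR00T-Mimic or GR00T-Gen interfaces; the function names, the Gaussian noise model and the scene parameters are all hypothetical.

```python
import random

def multiply_demonstration(trajectory, copies, noise=0.01, seed=0):
    """Create perturbed copies of one demonstrated trajectory (a list of joint positions)."""
    rng = random.Random(seed)
    return [[q + rng.gauss(0.0, noise) for q in trajectory] for _ in range(copies)]

def randomise_domain(trajectories, seed=0):
    """Pair every trajectory with randomly sampled scene parameters."""
    rng = random.Random(seed)
    dataset = []
    for traj in trajectories:
        # Hypothetical scene parameters; a real simulator exposes many more.
        scene = {
            "light_intensity": rng.uniform(0.5, 1.5),
            "floor_friction": rng.uniform(0.4, 1.0),
            "camera_yaw_deg": rng.uniform(-10.0, 10.0),
        }
        dataset.append({"trajectory": traj, "scene": scene})
    return dataset

# One hypothetical human demonstration becomes 100 varied training samples.
human_demo = [0.0, 0.1, 0.25, 0.4]
synthetic = randomise_domain(multiply_demonstration(human_demo, copies=100))
```

Training on samples that vary in lighting, friction and viewpoint encourages a policy to rely on what stays constant across scenes, which helps it transfer from simulation to the real world.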
The dataset can then be used as an input to the robot policy, which teaches robots how to move and interact with their environment effectively and safely in NVIDIA Isaac Lab, an open-source and modular framework for robot learning.
Also announced at CES was Cosmos, NVIDIA's platform featuring a family of open, pretrained world foundation models purpose-built for generating physics-aware videos and world states for physical AI development. It includes autoregressive and diffusion models in a variety of sizes and input data formats. The models were trained on 18 quadrillion tokens, including 2 million hours of autonomous driving, robotics, drone footage and synthetic data.
Additionally, Cosmos can reduce the simulation-to-real gap by upscaling images from 3D to real. Combining Cosmos with Omniverse, a developer platform of application programming interfaces and microservices for building 3D applications and services, is essential because Omniverse's highly controllable, physically accurate simulations provide crucial safeguards that help minimise the hallucinations commonly associated with world models.
Together, NVIDIA Isaac GR00T, Omniverse and Cosmos are significantly advancing physical AI and humanoid innovation.