NVIDIA unveils Cosmos models for physical AI developers

8th January 2025
Sheryl Miles

NVIDIA has unveiled its Cosmos world foundation models (WFMs). These neural networks can predict and generate physics-aware videos, simulating future states of virtual environments with remarkable accuracy.

Announced at CES, the Cosmos platform offers developers new tools to accelerate the creation of robotics and autonomous vehicles (AVs).

WFMs are as foundational to physical AI as large language models are to text-based AI. By using diverse input data – such as text, images, video, and motion – Cosmos models simulate virtual worlds with precise spatial relationships and realistic object interactions. This capability enables developers to design AI systems that can predict and respond to real-world dynamics, laying the groundwork for next-generation robotics and AV technologies.

NVIDIA is making the first wave of Cosmos models available under an open model license, which allows for both research and commercial use. The models come with a suite of state-of-the-art tools, including tokenizers, guardrails, and pipelines for data processing and curation. These resources are designed to streamline development and make cutting-edge physical AI tools accessible to researchers, startups, and enterprises of all sizes.

Unleashing the power of Cosmos models

The Cosmos models are divided into three tiers to meet different needs:

  • Nano models are optimised for real-time, low-latency inference at the edge, making them ideal for robotics and IoT systems where immediate decision-making is essential. These models operate efficiently on devices with limited processing power, enabling on-device AI applications without relying on cloud resources.
  • Super models strike a balance between speed and quality, offering high performance for standard robotics and AV tasks. They serve as a versatile foundation for developers working on scalable AI applications across industries, providing reliable, consistent output.
  • Ultra models deliver maximum fidelity and precision, making them suitable for advanced use cases such as high-stakes autonomous driving scenarios. These models enable developers to create bespoke AI systems that require detailed simulations and intricate understanding of physical interactions.

Each model type is designed to handle specific tasks, from generating high-quality synthetic video data to predicting the next sequence of video frames in real time. Developers can also customise these models using NVIDIA’s NeMo framework, which allows for fine-tuning with proprietary datasets.

Cosmos models are particularly effective when paired with NVIDIA Omniverse, a platform for creating 3D simulations. By integrating Cosmos with Omniverse, developers can generate controllable, physics-based synthetic data to train robotic and AV perception systems more efficiently.

Extensive training and advanced capabilities

Cosmos world foundation models are built on a massive dataset of over 9,000 trillion tokens derived from 20 million hours of real-world interactions across domains like robotics, industrial operations, and driving. This extensive training ensures the models can accurately simulate complex environments and interactions.
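As a rough sense of scale, the two figures quoted above imply an average token density per hour of footage. The short Python sketch below only divides the article's own numbers; the variable names are illustrative, and the result is an average across the whole dataset, not a stated property of NVIDIA's tokenizers.

```python
# Figures quoted in the article.
TOTAL_TOKENS = 9_000 * 10**12   # 9,000 trillion tokens
TOTAL_HOURS = 20 * 10**6        # 20 million hours of data

# Average density implied by those two figures (an illustration only).
tokens_per_hour = TOTAL_TOKENS / TOTAL_HOURS
tokens_per_second = tokens_per_hour / 3600

print(f"{tokens_per_hour:,.0f} tokens per hour")      # 450,000,000 tokens per hour
print(f"{tokens_per_second:,.0f} tokens per second")  # 125,000 tokens per second
```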

The platform includes diffusion and autoregressive transformer models. Diffusion models are designed for generating synthetic video data with high quality, enabling developers to bootstrap training datasets. Autoregressive models predict the next sequence of video frames based on inputs, providing real-time forecasting for physical AI systems.
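The autoregressive idea can be illustrated with a toy loop in which each predicted frame is appended to the history and fed back in to predict the next one. This is a minimal sketch, not Cosmos code: the Frame type, the rollout function and the linear_extrapolation predictor are hypothetical stand-ins for a trained transformer.

```python
from typing import Callable, List

Frame = List[float]  # hypothetical stand-in for a video frame (a tiny 1-D "image")


def linear_extrapolation(history: List[Frame]) -> Frame:
    """Toy predictor: assume each pixel keeps changing at its latest rate."""
    prev, last = history[-2], history[-1]
    return [2 * b - a for a, b in zip(prev, last)]


def rollout(
    context: List[Frame],
    predict: Callable[[List[Frame]], Frame],
    steps: int,
) -> List[Frame]:
    """Autoregressive loop: every predicted frame becomes input for the next step."""
    frames = list(context)
    for _ in range(steps):
        frames.append(predict(frames))
    return frames[len(context):]  # return only the newly generated frames


context = [[0.0, 10.0], [1.0, 9.0]]
future = rollout(context, linear_extrapolation, steps=3)
print(future)  # [[2.0, 8.0], [3.0, 7.0], [4.0, 6.0]]
```

In a real system the predictor would be a learned model and the frames would be tokenised video, but the feedback loop has the same shape.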

Additional models in the Cosmos suite include a 12-billion-parameter upsampling model for refining text prompts, a 7-billion-parameter video decoder optimised for augmented reality applications, and guardrail models to ensure responsible use of AI-generated content. These guardrails mitigate harmful inputs and screen generated videos for safety, enabling developers to create secure and reliable AI systems.

Applications in robotics and autonomous vehicles

Companies are already leveraging Cosmos models to advance robotics and AV technologies. Waabi, a leader in generative AI for autonomous vehicles, is exploring Cosmos to enhance its simulator, Waabi World. This platform uses AI to create realistic driving scenarios, allowing AV developers to test safety measures in a controlled virtual environment. By incorporating Cosmos, Waabi can accelerate the curation of video datasets and improve the realism of its simulations.

In the robotics sector, Hillbot, an embodied AI startup, is using Cosmos to generate terabytes of synthetic 3D environments. These environments provide cost-effective, controlled spaces for robot training, enabling faster skill acquisition and improved performance in tasks ranging from industrial operations to domestic assistance.

Both use cases highlight how Cosmos can reduce development costs, improve safety, and accelerate the deployment of physical AI systems. Developers can also use Cosmos as a multiverse simulation engine in conjunction with NVIDIA Omniverse, allowing AI models to evaluate countless possible outcomes before determining the optimal path for a given task.
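The idea of evaluating many possible outcomes before choosing a path can be sketched as a simple search: simulate each candidate action sequence, score the result, and keep the best. The Python example below is an illustration under stated assumptions; simulate is a hypothetical one-dimensional stand-in for a world model, and exhaustive enumeration is only practical for toy problems.

```python
from itertools import product
from typing import Sequence, Tuple


def simulate(position: int, actions: Sequence[int]) -> int:
    """Hypothetical stand-in for a world model: apply each move in turn."""
    for step in actions:
        position += step
    return position


def best_plan(start: int, goal: int, horizon: int) -> Tuple[int, ...]:
    """Enumerate every action sequence, simulate it, keep the one ending closest to goal."""
    moves = (-1, 0, 1)  # step left, stay, step right
    candidates = product(moves, repeat=horizon)
    return min(candidates, key=lambda plan: abs(simulate(start, plan) - goal))


plan = best_plan(start=0, goal=2, horizon=3)
print(plan, "ends at", simulate(0, plan))
```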

Revolutionising data processing and training

One of Cosmos’ key strengths is its ability to process massive amounts of data quickly and efficiently. Using NVIDIA’s DGX Cloud, developers can process 20 million hours of video data in just 40 days on Hopper GPUs or as little as 14 days on Blackwell GPUs. In contrast, traditional CPU-based systems would take over three years to handle the same volume of data.
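The relative speed-ups implied by these figures are easy to work out. The sketch below assumes a 365-day year and treats "over three years" as exactly three, so the GPU speed-ups it prints are lower bounds; the names used are illustrative.

```python
HOURS_OF_VIDEO = 20_000_000

# Processing times quoted in the article, in days.
# "Over three years" is treated as exactly 3 x 365 days (an assumption).
DAYS = {
    "CPU-only": 3 * 365,
    "Hopper": 40,
    "Blackwell": 14,
}

speedup_vs_cpu = {name: DAYS["CPU-only"] / days for name, days in DAYS.items()}
hours_per_day = {name: HOURS_OF_VIDEO / days for name, days in DAYS.items()}

for name in DAYS:
    print(
        f"{name:>9}: {hours_per_day[name]:>12,.0f} video hours per day, "
        f"at least {speedup_vs_cpu[name]:.1f}x the CPU-only rate"
    )
```

On these assumptions Hopper works out at roughly 27 times and Blackwell at roughly 78 times the CPU-only rate.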

The platform also includes powerful video and image tokenizers, which compress data for training transformer models. These tokenizers achieve 8x more compression and 12x faster processing speeds than current state-of-the-art methods, reducing computational costs without compromising quality.

This combination of efficient data processing and advanced model training ensures developers can build sophisticated physical AI systems in a fraction of the time previously required.

Responsible AI development and customisation

NVIDIA designed Cosmos with a strong focus on responsible AI principles, including privacy, safety, and transparency. The platform’s guardrails prevent harmful inputs during data preprocessing and ensure generated outputs meet safety standards. Developers can also customise these guardrails to address the specific needs of their applications.
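The two-stage pattern described here, with checks before generation and again on the output, can be sketched generically. The Python example below is not NVIDIA's guardrail API: guarded_generate, no_blocked_terms and fake_generator are hypothetical names, and a simple keyword filter stands in for a trained safety model.

```python
from typing import Callable, List, Optional

Check = Callable[[str], bool]  # returns True when the text is acceptable


def no_blocked_terms(blocked: List[str]) -> Check:
    """Build a customisable check that rejects text containing any blocked term."""
    lowered = [term.lower() for term in blocked]
    return lambda text: not any(term in text.lower() for term in lowered)


def guarded_generate(
    prompt: str,
    generate: Callable[[str], str],
    pre_checks: List[Check],
    post_checks: List[Check],
) -> Optional[str]:
    """Run input checks, generate, then run output checks; None means rejected."""
    if not all(check(prompt) for check in pre_checks):
        return None  # harmful input stopped before generation
    output = generate(prompt)
    if not all(check(output) for check in post_checks):
        return None  # generated content failed screening
    return output


def fake_generator(prompt: str) -> str:
    """Hypothetical stand-in for a video model; returns a text description."""
    return f"video description for: {prompt}"


checks = [no_blocked_terms(["weapon"])]
print(guarded_generate("robot arm stacking boxes", fake_generator, checks, checks))
print(guarded_generate("robot holding a weapon", fake_generator, checks, checks))
```

Passing the checks in as lists is one way to reflect the customisation mentioned above: an application can add or swap checks without changing the pipeline.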

Cosmos models include an integrated watermarking system that identifies AI-generated content, promoting accountability and reducing misuse. This aligns with NVIDIA’s commitment to creating trustworthy AI systems.

Customisation is further enhanced by the NeMo framework, which allows developers to fine-tune Cosmos models with their own datasets. This flexibility enables the creation of tailored AI applications for specialised use cases across industries.

Open access and developer support

Cosmos models are available under NVIDIA’s Open Model License, which permits free access for research and commercial purposes. Developers can preview the models on NVIDIA’s API catalog and download them from the NVIDIA NGC catalog or Hugging Face. Model cards and a detailed research paper, ‘Cosmos World Foundation Model Platform for Physical AI’, provide additional technical insights.

NVIDIA also offers extensive support for developers through its AI Enterprise software platform, which simplifies the deployment of Cosmos models. This includes resources for data processing, model training, and application integration, ensuring developers have the tools they need to succeed.

Showcased at CES

Cosmos was a key highlight of NVIDIA’s CES showcase, where live demonstrations illustrated the platform’s capabilities. Attendees saw how Cosmos integrates with NVIDIA Omniverse to accelerate AI development in robotics and AVs. Jensen Huang’s keynote further emphasised the platform’s potential to enable the next generation of physical AI technologies.
