Home > Products > Hugging Face inference-as-a-service capabilities with NVIDIA NIM

Artificial Intelligence

Hugging Face inference-as-a-service capabilities with NVIDIA NIM

29th July 2024

NVIDIA

Harry Fowle

0 0

Hugging Face will now offer developers new inference-as-a-service capabilities which are powered by NVIDIA NIM.

The Hugging Face platform is gaining easy access to NVIDIA-accelerated inference on some of the most popular AI models.

New inference-as-a-service capabilities will enable developers to rapidly deploy leading large language models such as the Llama 3 family and Mistral AI models with optimisation from NVIDIA NIM microservices running on NVIDIA DGX Cloud.

Announced at the SIGGRAPH conference, the service will help developers quickly prototype with open-source AI models hosted on the Hugging Face Hub and deploy them in production. Hugging Face Enterprise Hub users can tap serverless inference for increased flexibility, minimal infrastructure overhead, and optimised performance with NVIDIA NIM.

The inference service complements Train on DGX Cloud, an AI training service already available on Hugging Face.

Developers facing a growing number of open-source models can benefit from a hub where they can easily compare options. These training and inference tools give Hugging Face developers new ways to experiment with, test and deploy cutting-edge models on NVIDIA- accelerated infrastructure. They’re made easily accessible using the “Train” and “Deploy” drop-down menus on Hugging Face model cards, letting users get started with just a few clicks.

Beyond a token gesture — NVIDIA NIM brings big benefits

NVIDIA NIM is a collection of AI microservices — including NVIDIA AI foundation models and open-source community models — optimised for inference using industry-standard application programming interfaces, or APIs.

NIM offers users higher efficiency in processing tokens — the units of data used and generated by a language model. The optimised microservices also improve the efficiency of the underlying NVIDIA DGX Cloud infrastructure, which can increase the speed of critical AI applications.

This means developers see faster, more robust results from an AI model accessed as a NIM compared with other versions of the model. The 70-billion-parameter version of Llama 3, for example, delivers up to 5x higher throughput when accessed as a NIM compared with off-the-shelf deployment on NVIDIA H100 Tensor Core GPU-powered systems.

Near-instant access to DGX Cloud provides accessible AI acceleration

The NVIDIA DGX Cloud platform is purpose-built for generative AI, offering developers easy access to reliable accelerated computing infrastructure that can help them bring production-ready applications to market faster.

The platform provides scalable GPU resources that support every step of AI development, from prototype to production, without requiring developers to make long-term AI infrastructure commitments.

Hugging Face inference-as-a-service on NVIDIA DGX Cloud powered by NIM microservices offers easy access to compute resources that are optimised for AI deployment, enabling users to experiment with the latest AI models in an enterprise-grade environment.

Product Spotlight

TBF10SL-4PS-B

ITT Interconnect Solutions

Circular Connector Standard 5/2 Female Sockets/Male Pins Panel Mount CA/5015 Co...

SKU:
Stock:	50
Cost:	$45.59

Buy Now Learn More

CAA572C0G3A663J640LJ

TDK Corporation

Speciality Ceramic Capacitors Inline MEGA Cap,2220,C0G,1000V,66nF,+/-5%,6.4mm AE...

SKU:
Stock:	1037
Cost:	$9.16

Buy Now Learn More

RA1113112R

E-Switch, Inc.

E-Switch / RS PRO RA1113112R Rocker Switch, SPST, OFF-ON, 10A, 125V AC, QC 0.187...

SKU:	EG5619-ND
Stock:	5586
Cost:	$0.55

Buy Now Learn More

R30-3002002

Harwin

20.00mm M3 Metric M/F Threaded Hex Brass Spacer/Pillar Hardware - Spacer (Stand...

SKU:	952-1510-ND
Stock:	6545
Cost:	$0.76

Buy Now Learn More

NRF54L15-QFAA-R

Nordic Semiconductor

RF System on a Chip - SoC Ultra-low power Bluetooth Multiprotocol 5.4 SoC System...

SKU:	4823-NRF54L15-QFAA-RTR-ND
Stock:	0
Cost:	$2.39

Buy Now Learn More

STDRIVEG611Q

STMicroelectronics

Gate Drivers High voltage and high-speed half-bridge gate driver for GaN power s...

SKU:	497-STDRIVEG611QTR-ND
Stock:	0
Cost:	$2.63

Buy Now Learn More

TBF10SL-4PS-B

ITT Interconnect Solutions

Circular Connector Standard 5/2 Female Sockets/Male Pins Panel Mount CA/5015 Co...

SKU:
Stock:	50
Cost:	$45.59

Buy Now Learn More

CAA572C0G3A663J640LJ

TDK Corporation

Speciality Ceramic Capacitors Inline MEGA Cap,2220,C0G,1000V,66nF,+/-5%,6.4mm AE...

SKU:
Stock:	1037
Cost:	$9.16

Buy Now Learn More

RA1113112R

E-Switch, Inc.

E-Switch / RS PRO RA1113112R Rocker Switch, SPST, OFF-ON, 10A, 125V AC, QC 0.187...

SKU:	EG5619-ND
Stock:	5586
Cost:	$0.55

Buy Now Learn More

R30-3002002

Harwin

20.00mm M3 Metric M/F Threaded Hex Brass Spacer/Pillar Hardware - Spacer (Stand...

SKU:	952-1510-ND
Stock:	6545
Cost:	$0.76

Buy Now Learn More

NRF54L15-QFAA-R

Nordic Semiconductor

RF System on a Chip - SoC Ultra-low power Bluetooth Multiprotocol 5.4 SoC System...

SKU:	4823-NRF54L15-QFAA-RTR-ND
Stock:	0
Cost:	$2.39

Buy Now Learn More

STDRIVEG611Q

STMicroelectronics

Gate Drivers High voltage and high-speed half-bridge gate driver for GaN power s...

SKU:	497-STDRIVEG611QTR-ND
Stock:	0
Cost:	$2.63

Buy Now Learn More