Arrow Electronics & NeuReality advanced AI inferencing
Arrow Electronics supported the development of NeuReality’s NR1 NAPU, the world’s first 7nm Network Addressable Processing Unit, housed in the complete NR1-S AI Inference Appliance. The combination delivers competitive advantages in cost and power savings versus traditional CPU-centric architectures.
The NR1-S, when paired with AI accelerators in an AI inference server, reduces data centre costs by up to 90% and increases energy efficiency by up to 15 times while delivering linear scalability without performance drop-offs or lags as additional AI accelerators are added, according to NeuReality.
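To see why a shared host CPU breaks linear scaling, consider a simplified throughput model. All rates below are illustrative assumptions, not NeuReality benchmarks: in a CPU-centric design, every accelerator competes for the same fixed host pre-processing budget, so aggregate throughput plateaus, while a per-accelerator NAPU path lets throughput grow with the accelerator count.

```python
# Simplified scaling model (assumed rates, for illustration only).
ACCEL_RATE = 10_000   # inferences/s per accelerator (assumed)
CPU_BUDGET = 12_000   # total requests/s one host CPU can pre-process (assumed)

def cpu_centric_throughput(n_accels: int) -> float:
    # All accelerators share one host CPU's fixed pre-processing budget.
    return min(n_accels * ACCEL_RATE, CPU_BUDGET)

def napu_throughput(n_accels: int) -> float:
    # Each accelerator is fed by its own network-attached NAPU path.
    return n_accels * ACCEL_RATE

for n in (1, 2, 4, 8):
    print(f"{n} accelerators: CPU-centric {cpu_centric_throughput(n):>6,.0f}/s, "
          f"NAPU {napu_throughput(n):>7,.0f}/s")
```

Under these assumed numbers, the CPU-centric system stops gaining throughput beyond the second accelerator, while the NAPU-fed system scales linearly with each one added.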
Bringing extensive embedded design skills to the project, Arrow’s in-house experts provided firmware and hardware design guidance and developed and validated power management firmware. Arrow also handled debugging of the microcontroller (MCU) and platform power flows to support the successful bring-up of the NAPU, the NR1-S and the integrated NeuReality software, all performed in record time.
The Arrow team also helped select the most suitable MCU to serve as the interface between the system components of the PCIe card and the server.
The NR1 NAPU is a custom server-on-a-chip that unlocks the full performance of each dedicated AI accelerator, raising utilisation from approximately 30% today to 100% and thereby boosting total output and reducing silicon waste. The NAPU not only offloads services including network termination, quality of service, and AI data pre- and post-processing but also improves data flow for the high volume and variety of AI pipelines.
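The utilisation claim can be pictured with a back-of-the-envelope calculation: an accelerator only runs as fast as the path that feeds it, so if host-CPU pre-processing delivers requests at a fraction of the accelerator's ingest rate, the accelerator idles for the remainder. The figures below are assumed for illustration, not measured NeuReality data.

```python
# Illustrative utilisation arithmetic (assumed numbers, not benchmarks).
ACCEL_RATE = 10_000     # inferences/s one accelerator could sustain (assumed)
CPU_FEED_RATE = 3_000   # requests/s a shared host CPU can pre-process (assumed)
NAPU_FEED_RATE = 12_000 # requests/s a per-accelerator NAPU can deliver (assumed)

def utilisation(feed_rate: float, accel_rate: float) -> float:
    """Fraction of the accelerator's capacity actually used."""
    return min(feed_rate, accel_rate) / accel_rate

print(f"CPU-fed accelerator:  {utilisation(CPU_FEED_RATE, ACCEL_RATE):.0%}")   # ~30%
print(f"NAPU-fed accelerator: {utilisation(NAPU_FEED_RATE, ACCEL_RATE):.0%}")  # 100%
```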
The NeuReality system architecture eliminates the performance bottleneck caused by the traditional CPU-centric system architecture that today’s AI inference systems and hardware manufacturers rely on. As a result, the NR1-S increases the cost savings and energy efficiency of running high-volume, high-variety AI data pipelines, a top financial concern in deploying today’s power-hungry conventional and generative AI applications.
“Our NAPU addresses the major bottlenecks that restrict performance in today’s AI accelerators, such as power management and transferring data from the network into the AI accelerator, typically a GPU, FPGA or ASIC,” said Eli Bar-Lev, director of hardware at NeuReality. “Arrow’s support with the hardware and firmware for power management and thermal engineering allowed us to focus resources on a complete silicon-to-software AI inference solution which will reduce the AI market barriers for governments and businesses around the world.”
“This exciting project can potentially make cloud and on-premises enterprise AI inferencing more affordable and faster, thereby increasing access to valuable services in healthcare and medical imaging, banking and insurance, and AI-driven customer call centres and virtual assistants,” said Vitali Damasevich, director of engineering for Eastern Europe and the engineering solutions centre EMEA at Arrow.