Tachyum unveils Prodigy universal processor prototype
Tachyum has announced the availability of its Prodigy Universal Processor prototype, built using field-programmable gate array (FPGA) emulation boards.
The hardware prototype is currently completing in-house testing before being made available to early adopters.
Tachyum’s Prodigy emulation system has been developed in-house to offer the best possible accuracy and performance, so that its engineers can verify the final Prodigy design and customers can begin benchmarking and porting their own application software to native Prodigy code.
The hardware emulator consists of multiple FPGA and IO boards connected by cables in a rack. The processor-core FPGA board measures 14.5 inches by 16 inches (368.3mm x 406.4mm). The 24-layer board is 0.110 inches (2.8mm) thick and carries 5,948 components across both sides of the printed circuit board (PCB).
A single board with four FPGAs emulates eight full processor cores, including vector and matrix fixed- and floating-point processing units. FPGA prototypes, with multiple boards connected by cables, allow full-chip, half-chip and smaller emulation systems that are ideal for native Prodigy software development, hardware compatibility testing and performance benchmarking.
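The board-level figures above lend themselves to a quick sizing calculation. The following is a minimal sketch in Python, assuming only the two numbers quoted in the article (four FPGAs per board, eight emulated cores per board). The function names and the example core counts are hypothetical and chosen purely for illustration; the article does not state how many cores a full-chip or half-chip configuration contains.

```python
import math

# Figures taken from the article
CORES_PER_BOARD = 8   # one board emulates eight full processor cores
FPGAS_PER_BOARD = 4   # each board carries four FPGAs


def boards_needed(target_cores: int) -> int:
    """Return the number of boards needed to emulate target_cores, rounded up."""
    if target_cores < 0:
        raise ValueError("target_cores must be non-negative")
    return math.ceil(target_cores / CORES_PER_BOARD)


def fpgas_needed(target_cores: int) -> int:
    """Return the total FPGA count across all boards for target_cores."""
    return boards_needed(target_cores) * FPGAS_PER_BOARD


if __name__ == "__main__":
    # Hypothetical core counts, not taken from the article
    for cores in (8, 32, 64):
        print(f"{cores} cores: {boards_needed(cores)} board(s), "
              f"{fpgas_needed(cores)} FPGA(s)")
```

Because a partly used board still has to be installed, the count rounds up: nine cores would need two boards under these assumptions.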
Customers will be able to use Prodigy’s fully functional FPGA emulation for product evaluation and performance measurement, as well as for software development, debugging and compatibility testing.
A fully functional chip emulation (in FPGA-based hardware) is typically the last step before chip tape-out, when the final design is submitted for fabrication in silicon.
The Prodigy FPGA emulation system will help customers smooth the adoption curve for Prodigy in their existing or new data centre and/or HPC systems that demand high performance, high utilisation and low power.
Fully assembled FPGA boards are undergoing testing in Tachyum offices in Santa Clara, Calif. Internal testing includes:
- Continuity and isolation tests
- Electrical tests (power, etc.)
- Emulation functionalities
- Integration with DDR-IO board
- Hardware and software integration
- Operating system boot
- Application tests
- Performance benchmarking
Some Prodigy FPGA systems will have connectors for test chips for DDR5, PCIe 5.0, chip-to-chip 112GB interconnect, PHYs and PLLs to eliminate risk.
“Finally, being at the point where industry experts understand beyond a reasonable doubt that Tachyum is able to complete its design in coming months and bring product to market this year is of great satisfaction for our customers, company, our employees, supporters, and investors. The focus for the next quarter is to do DFT [design for test] for manufacturing test, verification, finding and fixing bugs typical for this final stage of design,” said Dr. Radoslav Danilak, Tachyum founder and CEO.
He continued, “With each step completed and each milestone received, we get closer and closer to delivering a system that many thought would be years away from reality, if not completely impossible. Having a physical FPGA prototype in house means that we are knocking on the door of data centres and letting them know that it won’t be long before they can experience for themselves this year the significant improvements in performance, energy consumption, server utilisation and space requirements they need to deliver next-generation solutions to the benefit of mankind.”
Tachyum's Prodigy can run HPC applications, convolutional AI, explainable AI, general AI, bio AI, and spiking neural networks, plus normal data centre workloads, on a single homogeneous processor platform, using existing standard programming models.
Without Prodigy, hyperscale data centres must use a combination of CPU, GPU and TPU hardware for these different workloads, creating inefficiency, expense, and the complexity of separate supply and maintenance infrastructures.
Using specific hardware dedicated to each type of workload (e.g., data centre, AI, HPC) results in underutilisation of hardware resources and makes programming, support and maintenance more challenging. Prodigy’s ability to seamlessly switch among these various workloads dramatically changes the competitive landscape and the economics of data centres.
In hyperscale data centres, Prodigy significantly improves computational performance, energy consumption, hardware (server) utilisation and space requirements compared with the processor chips currently provisioned.
As the world’s first universal processor, it also runs legacy x86, ARM and RISC-V binaries in addition to its native Prodigy code.
With a single, highly efficient processor architecture, Prodigy delivers industry-leading performance across data centre, AI, and HPC workloads, outperforming the fastest Xeon processors while consuming 10X lower power (core vs. core), and outperforming NVIDIA’s fastest GPU in HPC as well as in AI training and inference. A mere 125 HPC Prodigy racks can deliver 32 tensor EXAFLOPS.
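The rack-level figure can be broken down into a per-rack number with simple arithmetic. The sketch below is illustrative only: it assumes the 32 tensor EXAFLOPS are spread evenly across the 125 racks, which the article does not state, and the function name is hypothetical.

```python
# Figures taken from the article
RACKS = 125
TOTAL_TENSOR_EXAFLOPS = 32

PETAFLOPS_PER_EXAFLOP = 1000  # SI prefixes: 1 exa = 1000 peta


def per_rack_petaflops(total_exaflops: float, racks: int) -> float:
    """Return tensor PETAFLOPS per rack, assuming an even split across racks."""
    if racks <= 0:
        raise ValueError("racks must be positive")
    return total_exaflops * PETAFLOPS_PER_EXAFLOP / racks


if __name__ == "__main__":
    result = per_rack_petaflops(TOTAL_TENSOR_EXAFLOPS, RACKS)
    print(f"{result} tensor PETAFLOPS per rack")  # prints 256.0
```

Under that even-split assumption, each rack would account for 256 tensor PETAFLOPS.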
Tachyum says that Prodigy’s 3X lower cost per MIPS and its 10X lower core power translate to a 4X lower data centre Total Cost of Ownership (TCO), delivering billions of dollars in annual savings to hyperscalers.
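The article does not explain how the 3X and 10X figures combine into a 4X TCO reduction. One way to see that the numbers can be mutually consistent is a deliberately simple two-part cost model, sketched below. Everything in the model is an assumption made for illustration, not Tachyum's method: TCO is split into a power-related share and everything else, the "cost per MIPS" factor is applied to the non-power share, and the core-power factor is applied to the power share. The function and parameter names are hypothetical.

```python
def tco_reduction(power_share: float,
                  cost_factor: float = 3.0,
                  power_factor: float = 10.0) -> float:
    """Overall TCO reduction factor under an assumed two-part cost model.

    power_share  -- fraction of today's TCO attributed to power (0..1), assumed
    cost_factor  -- reduction applied to the non-power share (3X in the article)
    power_factor -- reduction applied to the power share (10X in the article)
    """
    if not 0.0 <= power_share <= 1.0:
        raise ValueError("power_share must be between 0 and 1")
    new_tco = (1.0 - power_share) / cost_factor + power_share / power_factor
    return 1.0 / new_tco


def power_share_for(target_reduction: float,
                    cost_factor: float = 3.0,
                    power_factor: float = 10.0) -> float:
    """Solve (1 - p)/cost_factor + p/power_factor = 1/target_reduction for p."""
    return ((1.0 / cost_factor - 1.0 / target_reduction)
            / (1.0 / cost_factor - 1.0 / power_factor))


if __name__ == "__main__":
    share = power_share_for(4.0)
    print(f"power share giving a 4X reduction: {share:.3f}")
    print(f"check: {tco_reduction(share):.2f}X")
```

In this toy model, a 4X overall reduction corresponds to power making up roughly 36% of today's TCO. With a smaller power share the combined reduction moves towards 3X, and with a larger one towards 10X.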
Picture shows Chi To, Director of Solutions Engineering, Tachyum, with Prodigy Universal Processor FPGA Prototype