IBM scientists unveil analogue AI chip for deep learning inference
IBM Research has been exploring ways to rethink how AI is computed, with a particular focus on the energy cost of constantly moving data between memory and processors.
Analog in-memory computing, also known as analog AI, offers a promising way to tackle this challenge by borrowing key characteristics from the way neural networks operate in biological brains.
In our brains, as in those of many other animals, the strength of synapses (the ‘weights’, in this context) governs communication between neurons. Analog AI systems emulate this mechanism by storing synaptic weights in nanoscale resistive memory devices, such as phase-change memory (PCM). These devices hold the weights as conductance values and, by exploiting basic circuit principles, perform multiply-accumulate (MAC) operations, the dominant computation in deep neural networks (DNNs), directly in memory, greatly reducing the need to shuttle data between memory and processors.
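The in-memory MAC described above can be illustrated with a small numerical sketch: if inputs are applied as voltages along the rows of a crossbar and weights are stored as conductances, Ohm's law and Kirchhoff's current law yield the weighted sums as column currents. A minimal NumPy simulation, with made-up device values and a deliberately idealised model (no noise or drift):

```python
import numpy as np

# Hypothetical 4x3 crossbar: each weight stored as a conductance (siemens).
# Real PCM devices are noisy and drift over time; this sketch ignores both.
conductances = np.array([
    [1.0e-6, 2.0e-6, 0.5e-6],
    [0.5e-6, 1.5e-6, 1.0e-6],
    [2.0e-6, 0.1e-6, 0.3e-6],
    [1.2e-6, 0.8e-6, 2.5e-6],
])

# Input activations encoded as row voltages (volts).
voltages = np.array([0.2, 0.1, 0.3, 0.05])

# Ohm's law gives each cell current I = V * G; Kirchhoff's current law
# sums the currents down each column, so the MAC happens "for free".
column_currents = voltages @ conductances

# The same result computed as an explicit multiply-accumulate loop.
expected = np.zeros(3)
for i, v in enumerate(voltages):
    for j in range(3):
        expected[j] += v * conductances[i, j]

assert np.allclose(column_currents, expected)
```

The point of the sketch is that the matrix-vector product is performed by the physics of the array itself, rather than by fetching each weight from memory.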
Turning the concept of analog AI into a practical reality requires overcoming two primary hurdles. First, the memory arrays must compute with an accuracy comparable to established digital systems. Second, they must interface seamlessly with the other digital compute blocks and the digital communication infrastructure on the analog AI chip.
In a recent publication in Nature Electronics, IBM Research has taken a notable stride towards tackling these obstacles. The company has unveiled a mixed-signal analog AI chip designed to execute a variety of deep neural network (DNN) inference tasks. It is the first such chip shown to handle computer vision AI tasks as accurately as its digital equivalents, while being significantly more energy efficient.
Fabricated at IBM's Albany NanoTech Complex, the chip consists of 64 analog in-memory compute cores, often referred to as ‘tiles’. Each tile contains a 256-by-256 crossbar array of synaptic unit cells. To bridge the analog and digital domains, the tiles incorporate compact, efficient, time-based analog-to-digital converters. Each tile also integrates lightweight digital processing units that apply simple nonlinear neuronal activation functions and scaling operations.
Each tile can perform the computations associated with one layer of a DNN model, with the synaptic weights represented as analog conductance values stored in the PCM devices. A global digital processing unit sits at the centre of the chip and handles the more complex operations required by particular neural network architectures. Digital communication paths at the interconnections between the tiles and the central global digital processing unit enable seamless data exchange.
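Putting the pieces together, a tile's role — an analog MAC over stored conductances, conversion back to digital, then a lightweight activation — and the one-layer-per-tile mapping can be summarised in a short behavioural sketch. The array size matches the 256-by-256 tile described above, but the quantisation scheme and choice of ReLU activation are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

class AnalogTile:
    """Behavioural sketch of one in-memory compute tile. Weights live
    in the tile as a 256x256 conductance array; compute() performs the
    layer's MAC, an idealised 8-bit quantisation standing in for the
    time-based ADCs, and a ReLU standing in for the tile's lightweight
    digital activation unit. Details are illustrative assumptions."""

    SIZE = 256

    def __init__(self, weights):
        assert weights.shape == (self.SIZE, self.SIZE)
        self.conductances = weights  # stored in place: weights never move

    def compute(self, activations):
        analog = activations @ self.conductances            # crossbar MAC
        scale = max(np.abs(analog).max() / 127, 1e-12)      # per-tile scaling
        digital = np.clip(np.round(analog / scale), -128, 127)  # 8-bit ADC
        return np.maximum(digital * scale, 0.0)             # ReLU activation

# Map a two-layer network onto two tiles: one layer per tile, with the
# digital fabric carrying activations from tile to tile.
tiles = [AnalogTile(rng.standard_normal((256, 256)) * 0.05) for _ in range(2)]
x = rng.standard_normal(256)
for tile in tiles:
    x = tile.compute(x)
```

Because each layer's weights stay resident in a tile, inference becomes a pipeline of tile computations connected by digital links, rather than repeated round trips to a separate weight memory.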
Using this chip, IBM conducted an extensive study of the computational precision of analog in-memory computing, achieving an accuracy of 92.81% on the CIFAR-10 image dataset, which IBM believes is the highest yet reported for a chip of this kind. The paper also demonstrates the seamless integration of analog in-memory computing with multiple digital processing units and a digital communication fabric. The measured throughput per unit area for 8-bit input-output matrix multiplications reached 400 GOPS/mm², more than 15 times that of previous multi-core in-memory computing chips based on resistive memory, while maintaining comparable energy efficiency.
The 64-tile chip combines area- and energy-efficient analog-to-digital converters (ADCs), multiply-accumulate compute with exceptional linearity, and capable digital compute blocks. Together with the massively parallel data transport previously showcased in a 34-tile chip presented at the IEEE VLSI Symposium in 2021, IBM has now demonstrated many of the fundamental building blocks needed to realise an architectural vision for a fast, energy-efficient analog AI inference accelerator chip.
Building on these insights, IBM has designed an accelerator architecture of this kind, unveiled earlier this year in IEEE Transactions on VLSI Systems. The design integrates large numbers of analog in-memory computing tiles with a mix of special-purpose digital compute cores, interconnected via a massively parallel 2D mesh. Combined with the hardware-aware training techniques IBM has refined in recent years, the company anticipates that these accelerators will deliver neural network accuracies comparable to software-based implementations across a wide range of models in the coming years.