Innovation at the edge: smarter cities and factories

2nd November 2021

Kiera Sowery

0 0

High-performance AI inference is enabling smarter cities and automated smart factories, but these applications need to be extremely reliable.

When deploying a system at the edge, power consumption, footprint, and cost are all limiting factors. Increasing processing demands, within the limitations of edge processing, mean that providing the required performance level is challenging. While CPUs have experienced improvements at the edge, the gains have slowed in recent years. Unaccelerated CPUs struggle to provide the performance needed for the next-gen of AI-enabled edge applications, especially when considering the tight latency requirements.

A domain-specific architecture (DSA) is key when implementing advanced AI applications at the edge. DSAs also provide determinism and low latency.

A suitable DSA will be designed specifically to process the required data efficiently – both the AI inference, and the non-AI parts of the application, essentially the whole application. This is important considering AI inference requires non-AI pre- and post-processing, all of which have higher performance requirements. Fundamentally, whole application acceleration is required to implement efficient AI-enabled applications at the edge (and elsewhere).

Like any fixed silicon solution, application-specific standard products (ASSP) that have been developed for AI edge applications still have limitations. The main challenge is that AI innovation is incredibly rapid, leaving AI models obsolete much quicker than non-AI technologies. Fixed silicon devices that implement AI can quickly become obsolete due to the emergence of newer, more-efficient AI models. It can take several years to tape-out a fixed silicon device by which time the state-of-the-art in AI models will have advanced. Security and functional safety requirements are also becoming more important for edge applications, often resulting in potentially expensive field updates.

The promise of adaptive computing

Adaptive computing – encompassing hardware that can be optimised for specific applications such as field programmable gate arrays (FPGAs) – is a powerful solution for AI enabled edge applications.

New adaptive hardware has also been introduced, including adaptive system-on-chips (SoC) which contain FPGA fabric, coupled with one or more embedded CPU subsystems. However, adaptive computing is so much more than “just hardware”. It incorporates a comprehensive set of design and runtime software that, when combined, delivers an adaptive platform from which flexible, yet efficient systems can be built.

Adaptive computing allows DSAs to be implemented without the design time and upfront cost needed when dealing with custom silicon devices, ASICs for example. This means flexible and optimised solutions can be deployed rapidly for any given domain, including AI-enabled edge applications. Adaptive SoCs are ideal for such domain-specific processing because they combine the flexibility of a comprehensive, embedded CPU subsystem with the optimal data processing of adaptive hardware.

Adaptive system-on-modules

System-on-Modules (SOMs) provide a complete, production-ready computing platform, and save significant development time and cost vs. chip down development. SOMs can plug into a larger edge application, providing both the flexibility of a custom implementation with the ease-of-use and reduced time to market of an off-the-shelf solution. These benefits make SOMs an ideal platform for edge AI applications. However, to achieve the performance required by modern AI-enabled applications, acceleration is needed.

Some applications require custom hardware components to interface with an adaptive SoC, meaning chip-down design is needed. However, an increasing number of AI-enabled edge applications need similar hardware components and interfaces, even for vastly different end applications. As industries have moved towards standardised interface and communications protocols, the same set of components are suitable for a variety of applications, despite having vastly different processing needs.

An adaptive SOM for an AI-enabled edge application incorporates an adaptive SoC with industry-standard interfaces and components, allowing developers with limited or even no hardware experience to benefit from adaptive computing technology. An adaptive SoC can implement both the AI and non-AI processing, hence the whole application.

Additionally, an adaptive SoC on an adaptive SOM enables a high degree of customisation. It is designed to be integrated into larger systems and uses a predefined form factor. Adaptive SOMs make it possible to take full advantage of adaptive computing without having to do chip-down design. An adaptive SOM is just part of the solution. The software is also key.

Companies that adopt adaptive SOMs benefit from a unique combination of performance, flexibility, and rapid development time. They can enjoy the benefits of adaptive computing without the need to build their own circuit boards, something that has only recently been possible at the edge with the introduction of Xilinx’s Kria portfolio of adaptive SOMs.

The Kria K26 SOM is built on top of the Zynq UltraScale+ MPSoC architecture, which features a quad-core Arm Cortex-A53 processor, more than 250 thousand logic cells, and a H.264/265 video codec. The SOM also features 4GB of DDR4 memory and 69 3.3V I/Os & 116 1.8V I/Os, which allow it to adapt to virtually any sensor or interface. With 1.4 tera-ops of AI compute, the Kria K26 SOM enables developers to create vision AI applications offering more than 3X higher performance at lower latency and power compared to GPU-based SOMs, critical for smart vision applications like security, traffic and city cameras, retail analytics, machine vision, and vision guided robotics. By standardising the core parts of the system, developers have more time to focus on building in features that differentiate their technology from the competition.

Unlike other edge AI products that allow software updates but are limited by fixed accelerators, Kria SOMs offer two degrees of flexibility - software and hardware over time. Users can adapt the I/O interfaces, vision processing, and AI accelerators to support some or all of the following: MIPI, LVDS, and SLVS-EC interfaces; higher quality, specialised high-dynamic range imaging algorithms for day or night; 8-bit deep learning processing units, or in the future, 4-bit or even binary neural network approaches. The intersection of multi-modal sensor fusion with real-time AI processing is now extremely accessible, starting with the Xilinx KV260 Vision AI Starter Kit and deploying into production with the Kria K26 SOM.

Benefits for hardware and software developers

Adaptive SOMs benefit both hardware and software developers. For hardware developers, adaptive SOMs offer an off-the shelf, production-ready solution, saving significant development cost and time. These devices also allow hardware teams to change the design late in the process, vs. a SOM based on fixed silicon technology.

For AI and software developers, adaptive computing is more accessible than ever. Xilinx has made sure to invest heavily in tool flows to ensure this is the case. The introduction of the Kria SOM portfolio takes this accessibility to the next level by coupling the hardware and software platform with production-ready vision accelerated applications. These turnkey applications eliminate all the FPGA hardware design work and only require software developers to integrate their custom AI models, application code, and optionally modify the vision pipeline – using familiar design environments, such as TensorFlow, Pytorch or Caffe frameworks, as well as C, C++, OpenCL, and Python programming languages—enabled by the Vitis unified software development platform and libraries.

With this new accelerated-application paradigm for software-based design, Xilinx also announced an embedded app store for edge applications, offering customers a wide selection of apps for Kria SOMs from Xilinx and its ecosystem partners. Xilinx offerings are open source accelerated applications, provided at no-charge, and range from smart camera and face detection to natural language processing with smart vision.