Critical AI video analytics
Xilinx has launched an AI video analytics platform with an ecosystem of partner solutions built to accelerate the most complex and latency-sensitive AI video inferencing applications. Pejman Roshan, Vice President of Marketing, Data Center Group, explains more.
There are many AI inferencing applications that are relatively simple, or that can tolerate either high or variable latency. However, the really critical video AI applications, those that protect human life, health, property, etc., tend to be quite complex and require deterministic latency. And the more complex the model or application becomes, the harder this is to deliver.
In many cases, the answer has been to throw hardware at the problem, and lots of it. However, this leads to an unpalatable overall TCO, as both OPEX (space, power) and CAPEX (hardware costs) skyrocket. The Xilinx Smart World platform, powered by the Video Machine-learning Streaming Server (VMSS), delivers whole application acceleration and can support multiple neural networks on a single Alveo accelerator card at deterministic sub-100ms pipeline latency.
“We’re targeting the most demanding applications - either in terms of latency requirements, model complexity requirements, or both,” said Roshan. “The Xilinx Smart World platform is built on the Alveo accelerator cards, which have the capacity for massively parallel processing. And we can use that parallelism to handle multiple models on a single card. When you combine that with the deterministic sub-100ms pipeline latency, the net result of the efficiencies is the lowest TCO in the industry.”
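To make the idea of deterministic latency concrete, the following is a minimal Python sketch, not Xilinx code. It assumes you already have per-frame pipeline latencies in milliseconds from your own measurements. The function name `latency_report` and the sample values are hypothetical and chosen only for illustration. It shows why a low average is not enough: a pipeline counts as within budget here only if its worst frame stays under the limit.

```python
import math
import statistics


def latency_report(samples_ms, budget_ms=100.0):
    """Summarise per-frame pipeline latencies against a fixed budget.

    samples_ms: measured end-to-end latencies in milliseconds.
    budget_ms:  the latency ceiling (100 ms mirrors the sub-100ms figure
                quoted in the article; the check itself is an assumption).
    """
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    # Nearest-rank 99th percentile: smallest value covering 99% of samples.
    rank = max(1, math.ceil(0.99 * len(ordered)))
    return {
        "median_ms": statistics.median(ordered),
        "p99_ms": ordered[rank - 1],
        "worst_ms": ordered[-1],
        "jitter_ms": ordered[-1] - ordered[0],  # spread between best and worst
        "within_budget": ordered[-1] < budget_ms,
    }


# Hypothetical samples: one steady pipeline, one that is fast but spiky.
steady = [40, 42, 41, 43, 42, 41]
bursty = [20, 22, 21, 19, 180, 20]

steady_report = latency_report(steady)
bursty_report = latency_report(bursty)
```

In this made-up data the bursty pipeline has the lower median, yet it fails the budget because of a single slow frame, which is the behaviour a latency-critical application cannot tolerate.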
Xilinx has taken a two-pronged approach to the market with the Smart World platform. The company has delivered a toolkit and development framework that enables the wider developer community to build turnkey applications with the required performance. In addition, Xilinx has built a rapidly expanding ecosystem of partners who are already delivering those types of applications in the smart space: cities, retail, buildings, and healthcare.
Above: Xilinx Smart Video Analytics
“Those markets are big, meaty markets,” continued Roshan. As an example, losses from theft cost retail stores almost $100bn a year; approximately 50% of that is in North America and around 25% in EMEA. In the US alone, work-related injuries account for 105 million lost working days a year.
In critical care globally, nursing is a huge cost centre, and even in developed countries, ICU mortality is still edging upward toward 20%. These are all areas where efficient, high-performance AI inferencing can deliver substantial improvements, and where its use can expand.
Roshan continued: “When we look at the overall TCO of Xilinx Smart World, compared against Nvidia, we see a nearly 30% reduction in overall CAPEX and OPEX. And the root of this is that we’re able to handle the same applications with about half of the hardware. So in this comparison (below) we compare four Nvidia cards versus two Xilinx Alveo accelerator cards that are doing the same job.”
Above: Xilinx Smart World End-to-End Latency
Roshan explained that hardware efficiencies of this nature have a trickle-down effect, lowering maintenance, power and cooling costs, etc., which in this case equates to a multi-million dollar difference in overall TCO.
He added: “The other major advantage is the really drastic, stark difference in the end-to-end latency that we’re able to deliver. We recognise that today Nvidia is the market leader in this space. But we’ve got stark advantages here and we’re committed to making sure that they’re understood in the market.
Above: Xilinx Smart World TCO advantage
“This result is from our ability to accelerate, not just the inferencing piece, but the entire pipeline. And the combination of the massive saving in CAPEX and OPEX, with this deterministic low latency performance, we believe makes us the clear choice for critical applications like the ones previously discussed.”
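The TCO argument above (fewer cards for the same job, with knock-on savings in power and cooling) can be sketched as a rough model. This is an illustrative Python sketch under stated assumptions, not Xilinx's TCO methodology. The function names, the cooling overhead factor, and every price and wattage below are hypothetical placeholders rather than figures from the article or from any vendor.

```python
HOURS_PER_YEAR = 365 * 24


def total_cost_of_ownership(cards, card_price, card_watts, years,
                            price_per_kwh, cooling_overhead=0.5):
    """Rough TCO: hardware purchase plus energy, scaled up for cooling.

    cooling_overhead is an assumed multiplier: 0.5 means every unit of
    energy drawn by the cards costs a further 50% to cool.
    """
    capex = cards * card_price
    energy_kwh = cards * card_watts / 1000 * years * HOURS_PER_YEAR
    opex = energy_kwh * price_per_kwh * (1 + cooling_overhead)
    return capex + opex


def relative_saving(baseline_cost, alternative_cost):
    """Fraction of the baseline cost saved by the alternative."""
    return 1 - alternative_cost / baseline_cost


# Placeholder inputs only: four cards versus two cards doing the same job.
baseline = total_cost_of_ownership(cards=4, card_price=1000, card_watts=100,
                                   years=3, price_per_kwh=0.10)
alternative = total_cost_of_ownership(cards=2, card_price=1500, card_watts=100,
                                      years=3, price_per_kwh=0.10)
saving = relative_saving(baseline, alternative)
```

Halving the card count does not halve the bill if the per-card price differs, which is why the saving has to be computed from real quotes and measured power draw rather than read off the card count.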
Framework
The underlying framework for Smart World is the Video Machine-learning Streaming Server, also known as VMSS. VMSS accelerates and optimises the entire pipeline. Using VMSS, Xilinx has been able to trim the time required for pre-processing and initial image processing - shaving time off the pipeline on ingress.
Resources can be flexibly used either for initial processing or for inferencing. So, depending on the specific situation, those available resources can flow back and forth so you can squeeze out the last ounce of overall performance.
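The balancing act described above can be illustrated with a toy model. The Python sketch below is not how VMSS schedules work, and the article does not describe its scheduler. It simply assumes a fixed pool of identical compute units, a two-stage pipeline whose throughput is set by its slower stage, and hypothetical per-frame costs for each stage. The name `best_split` is invented for this example.

```python
def best_split(total_units, preprocess_cost, inference_cost):
    """Pick the split of compute units that maximises pipeline throughput.

    Each stage's throughput is modelled as units / cost-per-frame, and the
    pipeline runs at the pace of its slower stage. Returns a tuple of
    (preprocess_units, inference_units, throughput).
    """
    if total_units < 2:
        raise ValueError("need at least one unit per stage")
    best = None
    for pre_units in range(1, total_units):
        inf_units = total_units - pre_units
        throughput = min(pre_units / preprocess_cost,
                         inf_units / inference_cost)
        if best is None or throughput > best[2]:
            best = (pre_units, inf_units, throughput)
    return best


# Inference-heavy workload: most units should move to inference.
heavy_inference = best_split(8, preprocess_cost=1.0, inference_cost=3.0)

# Pre-processing-heavy workload: the balance shifts the other way.
heavy_preprocess = best_split(8, preprocess_cost=3.0, inference_cost=1.0)
```

With these placeholder costs, the same eight units are divided two-to-six in one case and six-to-two in the other, which is the kind of back-and-forth flow of resources the paragraph describes.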
A key piece of the offering for Xilinx is the ability to run multiple neural networks in parallel on the same hardware. Plus, there are additional advantages in terms of cost effectiveness and ease of deployment. There is also a suite of acceleration plugins that are available both from Xilinx and the company’s partners.
Roshan continued: “We have, with our partners, built a really powerful ecosystem, and that is continuing to grow all the time.” Example partners include Mipsology, which not only delivers some compelling acceleration plugins, but also makes it easy to migrate existing GPU-based applications into Smart World with practically zero recoding.
One of the big barriers to adopting an FPGA-based AI application is that moving over from GPUs typically requires a significant amount of work. Mipsology has made that process far easier, so customers can take advantage of the lower TCO and superior latency without a massive rebuild.
Another partner, dEEPAI, has moved AI training to the edge and, in doing so, has exposed a massive performance cost advantage over GPUs. Finally, Aupera is delivering a full suite of video AI solutions at the edge for smart cities, smart buildings and smart retail.