Accelerate Your On-device AI with the Qualcomm AI Engine

PAD Editorial
Published in ProAndroidDev · 3 min read · Sep 26, 2023


TL;DR — Image classification, object detection, and speech recognition are all examples of AI use cases that can benefit from on-device inference. Qualcomm and Snapdragon® platforms feature the Qualcomm AI Engine with the Qualcomm® Hexagon™ NPU. In conjunction with the Qualcomm Neural Processing SDK for AI, developers can add hardware-accelerated inference to their Android apps at the edge.

What would you do with several teraOPS of compute available for running your machine learning models on a mobile device?

That would really open up your options in robotics, edge computing, connected cameras and always-on applications, wouldn’t it? Hardware acceleration can speed up neural network execution by orders of magnitude in use cases like image classification, object detection, face detection, and speech recognition. It can pave the way for new, real-time use cases that otherwise wouldn’t be possible, like live video enhancements and fast body tracking for games.

Imagine the user experiences you could create if you got all that compute, with extremely low power consumption, and without bogging down your CPU and GPU.

On supported Qualcomm and Snapdragon® platforms, you can run inference directly on the Qualcomm® Artificial Intelligence (AI) Engine, dedicated hardware for accelerating on-device AI. Then use the Qualcomm Neural Processing SDK for AI to program the Qualcomm AI Engine and put those teraOPS to work on massive acceleration of your deep neural networks.
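To make that concrete, here is a minimal sketch of what requesting the accelerated runtime can look like from an Android app using the SDK's Java API. The class and method names (SNPE.NeuralNetworkBuilder, setRuntimeOrder, the NeuralNetwork.Runtime values) reflect my reading of the SDK's sample code and may differ between SDK releases, so treat this as an illustration to check against the documentation rather than a definitive reference.

```java
import android.app.Application;

import com.qualcomm.qti.snpe.NeuralNetwork;
import com.qualcomm.qti.snpe.SNPE;

import java.io.File;
import java.io.IOException;

final class AiEngineSetup {

    // Builds a network that prefers the accelerated Hexagon runtime and
    // falls back to GPU, then CPU, if acceleration is unavailable on the device.
    static NeuralNetwork buildAcceleratedNetwork(Application app, File dlcModel)
            throws IOException {
        return new SNPE.NeuralNetworkBuilder(app)
                .setRuntimeOrder(NeuralNetwork.Runtime.DSP,  // Hexagon / AI Engine
                                 NeuralNetwork.Runtime.GPU,  // first fallback
                                 NeuralNetwork.Runtime.CPU)  // last resort
                .setModel(dlcModel)  // model converted offline to the SDK's DLC format
                .build();
    }
}
```

The model itself is converted ahead of time (for example from a TensorFlow or ONNX export) into the SDK's DLC format and bundled with, or downloaded by, the app; the runtime order simply expresses which hardware you would like the engine to try first.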

Inference in the cloud or on the device?

A lot of developers are writing mobile and IoT apps with functions like image classification, object detection, and face detection. A few years ago, they had to perform both training and inference in the cloud. But as mobile processors have increased in power, developers have started to separate training from inference. After training machine learning models in the cloud, they’re moving the inference workloads down to the mobile device.

Why should you run your machine learning models on the device rather than in the cloud? For one thing, you avoid the latency of a round trip to the cloud and back. You can keep user data on the device, which is an advantage for privacy. And your app isn't at the mercy of the network connection. In short, on-device inference lets you take machine learning into new industries and give users richer mobile experiences.

But on-device processing demands high compute; otherwise, the device becomes the bottleneck. For example, a real-time camera application targeting 30 frames per second has a budget of roughly 33 milliseconds per frame, so if the device needs one-half or one-third of a second to run each camera frame through the AI model, the experience drops to just two or three frames per second.

AI processing power on the device

The Qualcomm AI Engine has the capacity for mind-blowing performance on machine learning models across Qualcomm and Snapdragon platforms. The engine provides high capacity for matrix multiplication on both the Qualcomm® Hexagon™ Vector eXtensions (HVX) and the Hexagon Tensor Accelerator (HTA). With enough on-device processing power to run hundreds of inferences per second on the Inception-v3 neural network, your app could classify or detect dozens of objects in just a few milliseconds and with high confidence.
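As a rough illustration of what those throughput numbers mean in application code, the sketch below feeds one preprocessed Inception-v3 input through the network built earlier and times a batch of runs. It reuses the same hedged Java API names as the previous snippet (FloatTensor, createFloatTensor, execute), plus an assumed Inception-v3 input layout of 1 x 299 x 299 x 3, so verify the details against your SDK version.

```java
import com.qualcomm.qti.snpe.FloatTensor;
import com.qualcomm.qti.snpe.NeuralNetwork;

import java.util.HashMap;
import java.util.Map;

final class ClassifierBenchmark {

    // Runs `iterations` inferences on a single preprocessed image and returns
    // the average latency per inference in milliseconds.
    static double averageLatencyMs(NeuralNetwork network,
                                   String inputName,
                                   float[] preprocessedPixels,
                                   int iterations) {
        // Allocate an input tensor matching the model's declared input shape
        // (for Inception-v3, typically 1 x 299 x 299 x 3 floats).
        final FloatTensor input = network.createFloatTensor(
                network.getInputTensorsShapes().get(inputName));
        input.write(preprocessedPixels, 0, preprocessedPixels.length);

        final Map<String, FloatTensor> inputs = new HashMap<>();
        inputs.put(inputName, input);

        final long start = System.nanoTime();
        Map<String, FloatTensor> outputs = null;
        for (int i = 0; i < iterations; i++) {
            outputs = network.execute(inputs);  // runs on the runtime chosen at build time
        }
        final double elapsedMs = (System.nanoTime() - start) / 1e6;

        // The output tensor holds one score per class; arg-max (or top-5)
        // over these scores gives the predicted labels.
        for (FloatTensor out : outputs.values()) {
            final float[] scores = new float[out.getSize()];
            out.read(scores, 0, scores.length);
        }

        return elapsedMs / iterations;
    }
}
```

If the average lands in the single-digit-millisecond range on the accelerated runtime, you have the kind of headroom described above for real-time, multi-object use cases.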

Your turn

Mobile apps, robotics, IoT, VR, AR, and connected car applications are all verticals where the combination of the Qualcomm Neural Processing SDK and the Qualcomm AI Engine can help power new user experiences.

The Qualcomm Neural Processing SDK for AI is a free download that you can use on supported platforms from Qualcomm Technologies, Inc. Give it a try!

Author: Enrico Ros
Enrico is an Artificial Intelligence Software Product Manager for Qualcomm Technologies, Inc. He is passionate about translating new technologies into everyday devices and has been responsible for the definition and launch of multiple digital products.

Snapdragon and Qualcomm branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries.
