YOLOv5 Deployment

Introduction

YOLOv5 is a single-stage object detection network. Its structure has four main parts: a modified CSPNet-based backbone, a high-resolution feature fusion module based on FPN (Feature Pyramid Network), a pooling module based on SPP (Spatial Pyramid Pooling), and three detection heads that detect objects of different sizes.
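
To make the three detection heads concrete, the sketch below computes the grid size and the number of candidate boxes each head would produce. It assumes the commonly used YOLOv5 defaults, which this page does not state: a 640×640 input, strides of 8, 16, and 32, and 3 anchors per grid cell.

```python
# Sketch: candidate boxes produced by each of the three detection heads.
# Assumed defaults (not stated in this document): 640x640 input,
# strides 8/16/32, and 3 anchors per grid cell.
INPUT_SIZE = 640
STRIDES = (8, 16, 32)
ANCHORS_PER_CELL = 3


def head_summary(input_size=INPUT_SIZE, strides=STRIDES, anchors=ANCHORS_PER_CELL):
    """Return one (stride, grid_side, num_boxes) tuple per detection head."""
    summary = []
    for stride in strides:
        grid = input_size // stride          # cells along one side of the feature map
        summary.append((stride, grid, grid * grid * anchors))
    return summary


if __name__ == "__main__":
    for stride, grid, boxes in head_summary():
        print(f"stride {stride:>2}: {grid}x{grid} grid, {boxes} candidate boxes")
    print("total candidate boxes:", sum(b for _, _, b in head_summary()))
```

Under these assumptions the smallest stride gives the finest grid, which is why that head is the one suited to small objects, while the stride-32 head covers large ones.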

This chapter demonstrates how to deploy, load, and perform inference with YOLOv5s on an edge device. Two deployment methods are provided:

  • AidLite Python API
  • AidLite C++ API

In this example, model inference runs on the device's NPU, and the code calls the relevant interfaces to receive user input and return results.

  • Device: IQ8275
  • System: Ubuntu 24.04
  • Source Model: YOLOv5s
  • Quantized Model Precision: INT8 quantization
  • Model Farm Reference: YOLOv5s-INT8

Supported Platforms

Platform  Operation Mode
IQ8275    Ubuntu 24.04

Prerequisites

  1. IQ8275 hardware

  2. Ubuntu 24.04 system

System Dependency Configuration

Configure the AidLux Package Source

bash
# Download the correct public key
sudo wget -O- https://archive.aidlux.com/ubuntu24/public.key | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/private-aidlux.gpg > /dev/null

# Edit the source list file
sudo vim /etc/apt/sources.list.d/private-aidlux.list

# Add the following line (the repository provided by AidLux) to the source file.
# It is file content, not a shell command:
deb [arch=arm64 signed-by=/etc/apt/trusted.gpg.d/private-aidlux.gpg] https://archive.aidlux.com/ubuntu24 noble main

# Update the package cache
sudo apt update

After the update completes, use the following command to list the AidLux SDK packages that the repository provides:

bash
sudo apt list | grep aid | grep unknown
bash
# Install the base packages first (they are not included with the system)
sudo apt install python3 python3-pip libopencv-dev python3-opencv net-tools
# Must be installed before aidlite
sudo apt install aidlux-aistack-base aidrtcm

# Install aidlite and dependencies
sudo apt install aid-lms aidlms-sdk aidlite-sdk cmake
sudo apt-get install libfmt-dev nlohmann-json3-dev
# Quote the pattern so the shell passes it to apt unexpanded
sudo apt install 'aidlite-*'

# DSP support
sudo apt-get install qcom-fastrpc1
sudo apt-get install qcom-fastrpc-dev

# Install aidgen-sdk
sudo apt install aidgen-qnn240-sdk

# Install mms service
sudo apt install aid-mms

# GPU support
sudo apt-add-repository -s ppa:ubuntu-qcom-iot/qcom-ppa
sudo apt install qcom-adreno-cl1
sudo ln -s /usr/lib/aarch64-linux-gnu/libOpenCL.so.1 /usr/lib/aarch64-linux-gnu/libOpenCL.so

After installation, check that the aidlite and aidgen directories have been added under /usr/local/share.
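
One way to script this check is sketched below. It only tests that the two directory names mentioned above exist under /usr/local/share; the helper name `missing_dirs` is made up for this example and is not part of any AidLux tool.

```python
import os

# Base directory and directory names are taken from the text above.
SHARE_DIR = "/usr/local/share"
EXPECTED = ("aidlite", "aidgen")


def missing_dirs(base=SHARE_DIR, names=EXPECTED):
    """Return the expected directory names that do not exist under base."""
    return [name for name in names if not os.path.isdir(os.path.join(base, name))]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing under " + SHARE_DIR + ": " + ", ".join(missing))
    else:
        print("aidlite and aidgen are present under " + SHARE_DIR)
```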

Device Authorization

Obtain the Device Serial Number

bash
cat /sys/devices/soc0/serial_number

Obtain the License File

Provide the serial number to APLUX technical staff, who will generate a device-specific license file. Place that file in the /etc/opt/aidlux/license/AidLuxLics directory.
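
The two steps above can be checked from a script, as in the following sketch. It reads the serial number from the sysfs path shown earlier and lists whatever files are present in the license directory. The license file name is not given on this page, so the sketch reports any regular file it finds and does not validate the license itself.

```python
import os

SERIAL_PATH = "/sys/devices/soc0/serial_number"      # path from the command above
LICENSE_DIR = "/etc/opt/aidlux/license/AidLuxLics"   # directory from the text above


def read_serial(path=SERIAL_PATH):
    """Return the device serial number with surrounding whitespace removed."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read().strip()


def license_files(directory=LICENSE_DIR):
    """Return sorted names of regular files in directory, or [] if it is absent."""
    if not os.path.isdir(directory):
        return []
    return sorted(
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )


if __name__ == "__main__":
    try:
        print("Serial number:", read_serial())
        files = license_files()
        print("License files:", ", ".join(files) if files else "none found")
    except OSError as err:
        # Raised when the sysfs node or the directory cannot be read.
        print("Check failed:", err)
```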

Activate the License

bash
sudo /opt/aidlux/cpf/aid-lms/manager.sh restart

Download YOLOv5s-INT8 Model Resources

bash
mms list yolov5s

#------------------------ YOLOv5s model listing ------------------------
Model        Precision  Chipset           Backend
-----        ---------  -------           -------
YOLOv5s      INT8       Qualcomm QCS6490  QNN2.31
YOLOv5s      INT8       Qualcomm QCS8550  QNN2.31
YOLOv5s      FP16       Qualcomm QCS8550  QNN2.31
YOLOv5s      W8A16      Qualcomm QCS6490  QNN2.31
YOLOv5s      W8A16      Qualcomm QCS8550  QNN2.31
YOLOv5s-seg  INT8       Qualcomm QCS8550  QNN2.16
YOLOv5s-seg  INT8       Qualcomm QCS6490  QNN2.31

# Download YOLOv5s-int8
mms get -m YOLOv5s -p int8 -c qcs8550 -b qnn2.31 -d /home/ubuntu/yolov5s
cd /home/ubuntu/yolov5s
# Extract
unzip YOLOv5s_qcs8550_w8a8.zip
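
When choosing the `-p`, `-c`, and `-b` arguments for `mms get`, it can help to filter the listing programmatically. The sketch below parses the table exactly as printed above and selects the rows for one model, precision, and chipset. It assumes columns are separated by two or more spaces, as in that sample; the real `mms` output format may differ between versions, and the helper names are made up for this example.

```python
import re

# Listing copied from the `mms list yolov5s` output shown above.
SAMPLE = """\
Model        Precision  Chipset           Backend
-----        ---------  -------           -------
YOLOv5s      INT8       Qualcomm QCS6490  QNN2.31
YOLOv5s      INT8       Qualcomm QCS8550  QNN2.31
YOLOv5s      FP16       Qualcomm QCS8550  QNN2.31
YOLOv5s      W8A16      Qualcomm QCS6490  QNN2.31
YOLOv5s      W8A16      Qualcomm QCS8550  QNN2.31
YOLOv5s-seg  INT8       Qualcomm QCS8550  QNN2.16
YOLOv5s-seg  INT8       Qualcomm QCS6490  QNN2.31
"""

COLUMNS = ("model", "precision", "chipset", "backend")


def parse_listing(text):
    """Turn the printed table into a list of dicts, skipping header and rule lines."""
    rows = []
    for line in text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) != len(COLUMNS):
            continue
        if fields[0] == "Model" or set(fields[0]) == {"-"}:
            continue
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def find(rows, model, precision, chipset):
    """Return rows matching model and precision exactly and containing chipset."""
    return [
        row for row in rows
        if row["model"].lower() == model.lower()
        and row["precision"].lower() == precision.lower()
        and chipset.lower() in row["chipset"].lower()
    ]


if __name__ == "__main__":
    for row in find(parse_listing(SAMPLE), "YOLOv5s", "INT8", "QCS8550"):
        print(row)
```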

💡Note

Developers can also search for and download models on the Model Farm website.

AidLite SDK Installation

For SDK installation, developers can also refer to the README.md in the model folder.

  • Ensure QNN backend version ≥ 2.31
  • Ensure aidlite-sdk and aidlite-qnnxxx versions are 2.3.x
bash
# AidLite & QNN version check
dpkg -l | grep aidlite
#------------------------ Sample output ------------------------
ii  aidlite-qnn236       2.3.0.230         arm64        aidlux aidlite qnn236 backend plugin
ii  aidlite-sdk          2.3.0.230         arm64        aidlux inference module sdk
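
The two version requirements above can be checked against the `dpkg -l` output with a short script. The sketch below is illustrative only: it assumes the digits in a package name such as `aidlite-qnn236` encode the QNN version as major 2, minor 36, which matches the naming in the sample output but is not stated explicitly on this page.

```python
import re

# Sample copied from the `dpkg -l | grep aidlite` output shown above.
SAMPLE = """\
ii  aidlite-qnn236       2.3.0.230         arm64        aidlux aidlite qnn236 backend plugin
ii  aidlite-sdk          2.3.0.230         arm64        aidlux inference module sdk
"""


def check_versions(dpkg_text, min_qnn=(2, 31), sdk_prefix="2.3."):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    seen_sdk = seen_qnn = False
    for line in dpkg_text.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0] != "ii":
            continue
        name = parts[1].split(":")[0]        # drop an optional ":arm64" suffix
        version = parts[2]
        qnn = re.fullmatch(r"aidlite-qnn(\d)(\d+)", name)
        if name != "aidlite-sdk" and not qnn:
            continue
        if not version.startswith(sdk_prefix):
            problems.append(f"{name} is {version}, expected {sdk_prefix}x")
        if name == "aidlite-sdk":
            seen_sdk = True
        if qnn:
            seen_qnn = True
            # Assumption: "236" means QNN 2.36 (first digit major, rest minor).
            qnn_version = (int(qnn.group(1)), int(qnn.group(2)))
            if qnn_version < min_qnn:
                problems.append(f"{name} provides QNN {qnn_version[0]}.{qnn_version[1]}, "
                                f"need >= {min_qnn[0]}.{min_qnn[1]}")
    if not seen_sdk:
        problems.append("aidlite-sdk is not installed")
    if not seen_qnn:
        problems.append("no aidlite-qnnXXX backend package is installed")
    return problems


if __name__ == "__main__":
    issues = check_versions(SAMPLE)
    print("OK" if not issues else "\n".join(issues))
```

On a device, the same function could be fed the real output, for example by piping `dpkg -l` into the script and reading standard input.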

QNN & AidLite version update:

bash
# Install AidLite SDK
sudo apt update
sudo apt install aidlite-sdk
sudo apt install aidlite-qnn236

# Check the AidLite C++ library version (queried through the Python binding)
python3 -c "import aidlite; print(aidlite.get_library_version())"

# Check the AidLite Python library version
python3 -c "import aidlite; print(aidlite.get_py_library_version())"
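
The two one-line checks can also be wrapped in a small script that reports a missing installation instead of raising an error. This is a minimal sketch that uses only the two `aidlite` functions shown above; the wrapper name `aidlite_versions` is made up for this example.

```python
def aidlite_versions():
    """Return both AidLite version strings, or None if the module is not importable."""
    try:
        import aidlite
    except ImportError:
        return None
    return {
        "library": aidlite.get_library_version(),    # C++ library version
        "python": aidlite.get_py_library_version(),  # Python binding version
    }


if __name__ == "__main__":
    versions = aidlite_versions()
    if versions is None:
        print("aidlite is not installed for this Python interpreter")
    else:
        print("C++ library:", versions["library"])
        print("Python library:", versions["python"])
```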

AidLite Python API Deployment

Run Python API Example

bash
cd /home/ubuntu/yolov5s/code

# --target_model: Path to the model file
# --imgs: Image input
# --invoke_nums: Number of inference iterations
python3 python/run_test.py --target_model /home/ubuntu/yolov5s/models/QCS8550/W8A8/cutoff_yolov5s_qcs8550_w8a8.qnn231.ctx.bin --imgs ./python/bus.jpg --invoke_nums 10

You can view the model inference latency (in ms) and detection results in the terminal:

plain
=======================================
QNN inference 10 times :
 --mean_invoke_time is 1.5697240829467773
 --max_invoke_time is 2.2764205932617188
 --min_invoke_time is 1.477956771850586
 --var_invoketime is 0.05570707855895307
=======================================
5 regions detected
1 [668, 385, 141, 500] 0.86635846 person
2 [219, 407, 125, 465] 0.86257815 person
3 [55, 393, 178, 515] 0.845101 person
4 [3, 207, 812, 601] 0.8401416 bus
5 [0, 551, 73, 325] 0.5058796 person
Image saved to ./python/yolov5s_result.jpg
=======================================
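
If the results need to be consumed by another program, the terminal output can be parsed into structured data. The sketch below works on the sample output shown above and assumes that line format stays the same. The four numbers in each box are kept as printed, because this page does not state whether they are corner coordinates or a position plus width and height.

```python
import re

# Sample copied from the terminal output shown above.
SAMPLE = """\
=======================================
QNN inference 10 times :
 --mean_invoke_time is 1.5697240829467773
 --max_invoke_time is 2.2764205932617188
 --min_invoke_time is 1.477956771850586
 --var_invoketime is 0.05570707855895307
=======================================
5 regions detected
1 [668, 385, 141, 500] 0.86635846 person
2 [219, 407, 125, 465] 0.86257815 person
3 [55, 393, 178, 515] 0.845101 person
4 [3, 207, 812, 601] 0.8401416 bus
5 [0, 551, 73, 325] 0.5058796 person
Image saved to ./python/yolov5s_result.jpg
=======================================
"""

TIMING_RE = re.compile(r"--(\w+) is ([0-9.]+)")
DETECTION_RE = re.compile(
    r"(\d+) \[(-?\d+), (-?\d+), (-?\d+), (-?\d+)\] ([0-9.]+) (.+)"
)


def parse_run_output(text):
    """Return (timings, detections) parsed from the example's terminal output."""
    timings, detections = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        timing = TIMING_RE.fullmatch(line)
        if timing:
            timings[timing.group(1)] = float(timing.group(2))   # milliseconds
            continue
        det = DETECTION_RE.fullmatch(line)
        if det:
            detections.append({
                "index": int(det.group(1)),
                "box": tuple(int(det.group(i)) for i in range(2, 6)),
                "score": float(det.group(6)),
                "label": det.group(7),
            })
    return timings, detections


if __name__ == "__main__":
    timings, detections = parse_run_output(SAMPLE)
    print("mean latency (ms):", timings["mean_invoke_time"])
    for det in detections:
        print(det["label"], det["score"], det["box"])
```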

AidLite C++ API Deployment

Run C++ API Example

bash
cd /home/ubuntu/yolov5s/code/cpp

mkdir build && cd build
cmake ..
make
./run_yolov5

The C++ code defaults to 10 inference iterations. You can view the model inference latency (in ms) and detection results in the terminal:

plain
current thread_idx[1] [9] get_output_tensor cost time : 0.911757
repeat [10] time , input[21.061591] --- invoke[16.763899] --- output[16.023339] --- sum[53.848829]ms
postprocess cost time : 0.158227 ms
Result id[0]-x1[209.905304]-y1[242.500031]-x2[284.438965]-y2[518.384033]
Verify result : idx[0] id[0] coverage_ratio[0.983873]
Result id[0]-x1[105.769806]-y1[228.641464]-x2[234.684357]-y2[546.474487]
Verify result : idx[0] id[0] coverage_ratio[0.129211]
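
The C++ log can be post-processed in the same way. The sketch below extracts the timing summary and the result boxes from the sample above. It assumes the bracketed `input`, `invoke`, and `output` values are totals over all repeats, which the `sum` field suggests because it equals their total, and divides by the repeat count to estimate a per-iteration figure.

```python
import re

# Sample copied from the terminal output shown above.
SAMPLE = """\
current thread_idx[1] [9] get_output_tensor cost time : 0.911757
repeat [10] time , input[21.061591] --- invoke[16.763899] --- output[16.023339] --- sum[53.848829]ms
postprocess cost time : 0.158227 ms
Result id[0]-x1[209.905304]-y1[242.500031]-x2[284.438965]-y2[518.384033]
Verify result : idx[0] id[0] coverage_ratio[0.983873]
Result id[0]-x1[105.769806]-y1[228.641464]-x2[234.684357]-y2[546.474487]
Verify result : idx[0] id[0] coverage_ratio[0.129211]
"""

NUM = r"(-?[0-9.]+)"
REPEAT_RE = re.compile(
    r"repeat \[(\d+)\] time , input\[" + NUM + r"\] --- invoke\[" + NUM
    + r"\] --- output\[" + NUM + r"\] --- sum\[" + NUM + r"\]ms"
)
RESULT_RE = re.compile(
    r"Result id\[(\d+)\]-x1\[" + NUM + r"\]-y1\[" + NUM
    + r"\]-x2\[" + NUM + r"\]-y2\[" + NUM + r"\]"
)


def parse_cpp_output(text):
    """Return (per_iteration_ms, boxes) parsed from the C++ example's log."""
    per_iteration, boxes = {}, []
    for line in text.splitlines():
        rep = REPEAT_RE.search(line)
        if rep:
            repeats = int(rep.group(1))
            if repeats > 0:
                names = ("input", "invoke", "output", "sum")
                # Assumption: logged values are totals across all repeats.
                per_iteration = {
                    name: float(rep.group(i)) / repeats
                    for i, name in enumerate(names, start=2)
                }
                per_iteration["repeats"] = repeats
            continue
        res = RESULT_RE.search(line)
        if res:
            boxes.append({
                "id": int(res.group(1)),
                "x1": float(res.group(2)), "y1": float(res.group(3)),
                "x2": float(res.group(4)), "y2": float(res.group(5)),
            })
    return per_iteration, boxes


if __name__ == "__main__":
    per_iteration, boxes = parse_cpp_output(SAMPLE)
    print("estimated invoke time per iteration (ms):", round(per_iteration["invoke"], 3))
    print("boxes:", boxes)
```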