AidGen SDK Developer Documentation
Introduction
AidGen is an inference framework designed specifically for generative Transformer models, built on top of AidLite. It aims to make full use of the hardware's computing units (CPU, GPU, NPU) to accelerate large-model inference on edge devices.
AidGen is an SDK-level development kit that provides atomic large-model inference interfaces, making it easy for developers to integrate large-model inference into their own applications.
AidGen supports multiple types of generative AI models:
- Large language models -> AidLLM inference
- Multimodal large models -> AidMLM inference
The structure is shown in the diagram below:

💡Note
All large models supported by Model Farm achieve inference acceleration on Qualcomm NPUs through AidGen.
Support Status
Model Type Support Status
| Modality | AidLLM | AidMLM |
|---|---|---|
| Text | ✅ | / |
| Image | / | 🚧 |
| Audio | / | 🚧 |
✅: Supported 🚧: Planned support
Operating System Support Status
| Language | Linux | Android |
|---|---|---|
| C++ | ✅ | / |
| Python | 🚧 | / |
| Java | / | 🚧 |
✅: Supported 🚧: Planned support
Large Language Model AidLLM SDK
Installation
sudo aid-pkg -i aidgen-sdk
Model File Acquisition
Model files and default configuration files can be downloaded directly from the Model Farm Large Model Section.
Example
Deploying Qwen2.5-0.5B-Instruct on Qualcomm QCS8550
Step 1: Install AidGen SDK
# Install AidGen SDK
sudo aid-pkg update
sudo aid-pkg -i aidgen-sdk
# Copy test code
cd /home/aidlux
cp -r /usr/local/share/aidgen/examples/cpp/aidllm .
Step 2: Upload & Unzip Model Resources
- Upload the downloaded model resources to the edge device.
- Unzip the model resources to the /home/aidlux/aidllm directory:
cd /home/aidlux/aidllm
unzip Qwen2.5-0.5B-Instruct_Qualcomm\ QCS8550_QNN2.29_W4A16.zip -d .
Step 3: Confirm Resource Files
The file distribution is as follows:
/home/aidlux/aidllm
├── CMakeLists.txt
├── test_prompt_abort.cpp
├── test_prompt_serial.cpp
├── aidgen_chat_template.txt
├── chat.txt
├── htp_backend_ext_config.json
├── qwen2.5-0.5b-instruct-htp.json
├── qwen2.5-0.5b-instruct-tokenizer.json
├── qwen2.5-0.5b-instruct_qnn229_qcs8550_4096_1_of_2.serialized.bin
├── qwen2.5-0.5b-instruct_qnn229_qcs8550_4096_2_of_2.serialized.bin
Step 4: Set Dialogue Template
💡Note
For the dialogue template, refer to the aidgen_chat_template.txt file in the model resource package.
Modify the test_prompt_serial.cpp file according to the large model's template:
if (prompt_template_type == "qwen2") {
    prompt_template = "<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n<|im_start|>user\n{0}<|im_end|>\n<|im_start|>assistant\n";
}
Step 5: Compile and Run
# Install dependencies
sudo apt update
sudo apt install libfmt-dev
# compile
mkdir build && cd build
cmake .. && make
# Run test_prompt_serial from the aidllm directory after successful compilation
cd ..
./build/test_prompt_serial qwen2.5-0.5b-instruct-htp.json
- Enter dialogue content in the terminal.
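The {0} in Step 4's template is a positional placeholder in fmt syntax, which is why libfmt is installed as a build dependency. As a rough standalone sketch of the substitution (illustrative variable names, assuming fmt >= 8 for fmt::runtime; this is not the shipped test code):
// Fill the dialogue template's {0} placeholder with one line of user input.
#include <fmt/format.h>
#include <iostream>
#include <string>

int main() {
    const std::string prompt_template =
        "<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. "
        "You are a helpful assistant.<|im_end|>\n"
        "<|im_start|>user\n{0}<|im_end|>\n<|im_start|>assistant\n";

    std::string user_input;
    std::getline(std::cin, user_input);  // dialogue content typed in the terminal

    // fmt::runtime marks the format string as runtime-supplied; {0} is
    // replaced by user_input to form the final prompt string.
    std::cout << fmt::format(fmt::runtime(prompt_template), user_input);
    return 0;
}
The assembled string is the prompt that is ultimately passed to the model.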

Multi-modal Vision Model AidMLM SDK
Model Support Status
| Model | Status |
|---|---|
| Qwen2.5-VL-3B-Instruct | ✅ |
| Qwen2.5-VL-7B-Instruct | ✅ |
| InternVL3-2B | 🚧 |
| Qwen3-VL-4B | 🚧 |
| Qwen3-VL-2B | 🚧 |
Installation
sudo aid-pkg update
sudo aid-pkg -i aidgen-sdk
Model File Acquisition
Models can be acquired and downloaded directly from the command line.
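# Install the aidllm tool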
sudo aid-pkg -i aidgense
# View supported models
aidllm remote-list api
# Pull a model
aidllm pull api [Url]
# List downloaded models
aidllm list api
# Remove a downloaded model
sudo aidllm rm api [Name]
Example Application
Deploying Qwen2.5-VL-3B-Instruct (392x392) on Qualcomm QCS8550
- Install AidGen SDK
# Install AidGen SDK
sudo aid-pkg update
sudo aid-pkg -i aidgen-sdk
# Copy test code
cd /home/aidlux
cp -r /usr/local/share/aidgen/examples/cpp/aidmlm ./
- Model Acquisition
# Install aidllm tool
sudo aid-pkg -i aidgense
# Download the model
aidllm pull api aplux/Qwen2.5-VL-3B-392x392-8550
# Move the model resources into the aidmlm directory
mv /opt/aidlux/app/aid-openai-api/res/models/Qwen2.5-VL-3B-392x392-8550/* /home/aidlux/aidmlm/
- Create Configuration File
cd /home/aidlux/aidmlm
vim config3b_392.json
Create the following JSON configuration file:
{
    "vision_model_path": "veg.serialized.bin.aidem",
    "pos_embed_cos_path": "position_ids_cos.raw",
    "pos_embed_sin_path": "position_ids_sin.raw",
    "vocab_embed_path": "embedding_weights_151936x2048.raw",
    "window_attention_mask_path": "window_attention_mask.raw",
    "full_attention_mask_path": "full_attention_mask.raw",
    "llm_path_list": [
        "qwen2p5-vl-3b-qnn231-qcs8550-cl2048_1_of_6.serialized.bin.aidem",
        "qwen2p5-vl-3b-qnn231-qcs8550-cl2048_2_of_6.serialized.bin.aidem",
        "qwen2p5-vl-3b-qnn231-qcs8550-cl2048_3_of_6.serialized.bin.aidem",
        "qwen2p5-vl-3b-qnn231-qcs8550-cl2048_4_of_6.serialized.bin.aidem",
        "qwen2p5-vl-3b-qnn231-qcs8550-cl2048_5_of_6.serialized.bin.aidem",
        "qwen2p5-vl-3b-qnn231-qcs8550-cl2048_6_of_6.serialized.bin.aidem"
    ]
}
The file distribution is as follows:
/home/aidlux/aidmlm
├── CMakeLists.txt
├── test_qwen25vl_abort.cpp
├── test_qwen25vl.cpp
├── demo.jpg
├── embedding_weights_151936x2048.raw
├── full_attention_mask.raw
├── position_ids_cos.raw
├── position_ids_sin.raw
├── qwen2p5-vl-3b-qnn231-qcs8550-cl2048_1_of_6.serialized.bin.aidem
├── qwen2p5-vl-3b-qnn231-qcs8550-cl2048_2_of_6.serialized.bin.aidem
├── qwen2p5-vl-3b-qnn231-qcs8550-cl2048_3_of_6.serialized.bin.aidem
├── qwen2p5-vl-3b-qnn231-qcs8550-cl2048_4_of_6.serialized.bin.aidem
├── qwen2p5-vl-3b-qnn231-qcs8550-cl2048_5_of_6.serialized.bin.aidem
├── qwen2p5-vl-3b-qnn231-qcs8550-cl2048_6_of_6.serialized.bin.aidem
├── veg.serialized.bin.aidem
├── window_attention_mask.raw
- Compile and Run
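Since nlohmann-json is installed as a build dependency in the next step, it can also be used to sanity-check the configuration before running. The following standalone sketch (not part of the shipped examples; the file name check_config.cpp is hypothetical) verifies that every file referenced by config3b_392.json exists in the current directory:
// Standalone sanity check: parse config3b_392.json and verify that every
// referenced model file exists in the current directory.
#include <nlohmann/json.hpp>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

int main() {
    std::ifstream f("config3b_392.json");
    nlohmann::json cfg = nlohmann::json::parse(f);

    std::vector<std::string> paths;
    for (const char* key : {"vision_model_path", "pos_embed_cos_path",
                            "pos_embed_sin_path", "vocab_embed_path",
                            "window_attention_mask_path",
                            "full_attention_mask_path"})
        paths.push_back(cfg.at(key).get<std::string>());
    for (const auto& part : cfg.at("llm_path_list"))  // the six split LLM weight files
        paths.push_back(part.get<std::string>());

    bool ok = true;
    for (const auto& p : paths)
        if (!std::filesystem::exists(p)) {
            std::cerr << "missing: " << p << '\n';
            ok = false;
        }
    std::cout << (ok ? "all model files present" : "some model files missing") << '\n';
    return ok ? 0 : 1;
}
Compile it with any C++17 compiler and run it from /home/aidlux/aidmlm to rule out missing or misnamed model files.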
sudo apt update
sudo apt-get install libfmt-dev nlohmann-json3-dev
mkdir build && cd build
cmake .. && make
mv test_qwen25vl /home/aidlux/aidmlm/
# Run test_qwen25vl after successful compilation
cd /home/aidlux/aidmlm/
./test_qwen25vl "qwen25vl3b392" "config3b_392.json" "demo.jpg" "Please describe the scene in the picture"
In the test_qwen25vl.cpp test code, model_type identifies the model variant and is passed as the first argument to the executable. The currently supported types are:
| Model | Type |
|---|---|
| Qwen2.5-VL-3B (392x392) | qwen25vl3b392 |
| Qwen2.5-VL-3B (672x672) | qwen25vl3b672 |
| Qwen2.5-VL-7B (392x392) | qwen25vl7b392 |
| Qwen2.5-VL-7B (672x672) | qwen25vl7b672 |
- The running result is shown below:
