docs: Update README and compilation guides for clarity and consistency, including path corrections and improved formatting. Add copyright notices to source files and adjust file permissions for several scripts and directories.

dian.yuan 2026-02-28 11:06:26 +08:00
parent f960c5030d
commit bd891a96dd
136 changed files with 14413 additions and 9399 deletions

1
.gitignore vendored Executable file

@ -0,0 +1 @@
dependency

View file

@ -101,10 +101,12 @@ export ANDROID_NDK_PATH=/path/to/android-ndk-r25c
To build **all examples at once**, use the top-level batch script:
```bash
cd examples
./build-android-all.sh # auto-detects amlnn-toolkit
./examples/build-android-all.sh # auto-detects amlnn-toolkit
# or explicitly:
AMLNN_HOME=/path/to/amlnn-toolkit ./build-android-all.sh
#clean build files
./examples/build-android-all.sh
```
The script automatically cleans the previous build, resolves the AMLNN SDK via the priority rules above, and prints a build summary at the end.
@ -122,35 +124,36 @@ export YOCTO_SDK_ROOT=/path/to/poky/sdk
The toolchain file is shared across all demos at `examples/cmake/yocto-toolchain.cmake`.
**Build a single demo:**
```bash
cd examples/yolox/cpp
# 64-bit (default)
./build-linux.sh -m yocto -s /path/to/poky/sdk
# 32-bit
./build-linux.sh -m yocto -b 32 -s /path/to/poky/32bit-sdk
```
**Build all demos at once:**
```bash
cd examples
# 64-bit
./build-linux-all.sh -m yocto -s /path/to/poky/sdk
./examples/build-linux-all.sh -m yocto -s /path/to/poky/sdk
# 32-bit
./build-linux-all.sh -m yocto -b 32 -s /path/to/poky/32bit-sdk
./examples/build-linux-all.sh -m yocto -b 32 -s /path/to/poky/32bit-sdk
# Clean yocto build artifacts
./clean-linux-all.sh -m yocto
./examples/clean-linux-all.sh -m yocto
```
> **Note:** The `LLMs` demo is automatically excluded from the batch build scripts.
**Build a single demo:**
```bash
# 64-bit (default)
./examples/yolox/cpp/build-linux.sh -m yocto -s /path/to/poky/sdk
# 32-bit
./examples/yolox/cpp/build-linux.sh -m yocto -b 32 -s /path/to/poky/32bit-sdk
```
# **Release Notes**
| Version | Description |

0
common/.gitkeep Normal file → Executable file

View file

@ -2,6 +2,22 @@
// Exposed Functions
// -------------------------------------------------------------------------
/*
* Copyright (C) 2026 Amlogic, Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "model_loader.h"
#include <cstring>
#include <iostream>

View file

@ -1,3 +1,19 @@
/*
* Copyright (C) 2026 Amlogic, Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef _AMLNN_MODEL_LOADER_H_
#define _AMLNN_MODEL_LOADER_H_

0
dependency/.gitkeep Normal file → Executable file

View file

@ -1,6 +1,6 @@
# 1. Android Platform
# 1Android Platform
**Android compilation** depends on the NDK toolchain. Currently, version r25c is recommended. Download link: https://github.com/android/ndk/wiki/Unsupported-Downloads
@ -34,7 +34,7 @@
# 2. Linux Platform
# 2Linux Platform
**Linux compilation** toolchain dependency: **gcc-arm-10.3-2021.07-x86_64-arm-none-linux-gnueabihf**, download link: https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-a/downloads/

0
examples/ECAPA-TDNN/cpp/.gitkeep Normal file → Executable file

0
examples/ECAPA-TDNN/model/.gitkeep Normal file → Executable file

0
examples/ECAPA-TDNN/py/.gitkeep Normal file → Executable file

View file

@ -94,7 +94,7 @@ To compile the CPP project using Android NDK, please follow these steps:
**Hardware Requirements**:
- SOC: A311D2
- DDR: 4GB
- DDR: = 4GB
**System Requirements**:
- OS: Ubuntu 22.04

View file

@ -1,3 +1,19 @@
#
# Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
LOCAL_PATH := $(call my-dir)
LLM_SDK_PATH := $(LOCAL_PATH)/../../../../amlnn-toolkit/nn_runtime/llmsdk
3RDPARTY_PATH := $(LOCAL_PATH)/../../../dependency
@ -17,4 +33,15 @@ LOCAL_LDLIBS := -llog -ldl -lm -fuse-ld=ld
LOCAL_MODULE := demo_llm_main
LOCAL_LICENSE_KINDS := SPDX-License-Identifier: Apache-2.0
LOCAL_LICENSE_CONDITIONS := notice
LOCAL_LICENSE_COMMENTS := Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
LOCAL_LICENSE_URL := https://www.apache.org/licenses/LICENSE-2.0
LOCAL_LICENSE_FILE := LICENSE
include $(BUILD_EXECUTABLE)

View file

@ -3,30 +3,6 @@ cmake_minimum_required(VERSION 3.5.1)
set(CMAKE_SYSTEM_NAME Linux)
project(AML_LLM_NNSDK)
# xinxin, when building the .so with Yocto using CMake, you can remove the sysroot settings
# from these CMakeLists.txt files and use the officially recommended approach instead:
# after sourcing the environment script, many environment variables will be set (check with `export`),
# and CMake will configure itself automatically based on them without needing explicit settings here.
# source /mnt/fileroot/xinxin.he/environment/new-yocto/64/environment-setup-armv8a-poky-linux
# export CXXFLAGS=$(echo "$CXXFLAGS" | sed 's/-g//g')
# export CFLAGS=$(echo "$CXXFLAGS" | sed 's/-g//g')
# cmake -DCMAKE_TOOLCHAIN_FILE=${OE_CMAKE_TOOLCHAIN_FILE} ..
# # Set Yocto cross-compilation environment
# set(SYSROOT_PATH /mnt/fileroot/xinxin.he/environment/new-yocto/64/sysroots/x86_64-pokysdk-linux)
# set(CMAKE_SYSROOT "${SYSROOT_PATH}")
# message(STATUS "Using sysroot path as ${SYSROOT_PATH}")
# include(CMakeForceCompiler)
# cmake_force_c_compiler("${SYSROOT_PATH}/usr/bin/aarch64-poky-linux/aarch64-poky-linux-gcc" GNU)
# cmake_force_cxx_compiler("${SYSROOT_PATH}/usr/bin/aarch64-poky-linux/aarch64-poky-linux-g++" GNU)
# # Set the sysroot for the actual target board
# set(MYSYSROOT "/mnt/fileroot/xinxin.he/environment/new-yocto/64/sysroots/armv8a-poky-linux")
# add_definitions("--sysroot=${MYSYSROOT}")
# set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} --sysroot=${MYSYSROOT}" CACHE INTERNAL "" FORCE)
# set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} --sysroot=${MYSYSROOT}" CACHE INTERNAL "" FORCE)
# set(CMAKE_FIND_ROOT_PATH "${MYSYSROOT}")
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)

Binary file not shown. Size: 67 KiB

View file

@ -1,5 +1,21 @@
# -*- coding: utf-8 -*-
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import argparse
import sys
from datetime import datetime

View file

@ -0,0 +1,104 @@
# blazepose_detect
## 1. Overview
BlazePose Detection was introduced by Google as part of the MediaPipe framework, providing fast and lightweight person detection optimized for real-time performance on mobile and edge devices. The detector identifies the human region of interest (ROI) in an image, ensuring stable and efficient pose tracking in subsequent stages.
## 2. Model Download
- **Open Source model**
- **Open Source projects:** https://github.com/google-ai-edge/mediapipe/tree/master
- **Download weights**
wget https://storage.googleapis.com/mediapipe-assets/pose_detection.tflite
## 3. Model Conversion
```
cd model
Usage: ./adla_convert.sh model_path adla_toolkit_path target_platform
example
./adla_convert.sh pose_detection.tflite /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
```
| Parameter | Description |
| ----------------- | ------------------------------------------------------------ |
| model_path | path to the source model (`pose_detection.tflite` in the example above) |
| adla_toolkit_path | path to the adla-toolkit directory |
| target_platform | Target platform: `PRODUCT_PID0XA003` for A311D2, `PRODUCT_PID0XA005` for S905X5 |
## 4. Demo Run
### CPP
#### 1. Compile
**Prerequisites:**
- Android NDK (r25c recommended)
- `ANDROID_NDK_PATH` environment variable set
**Build:**
```bash
# Build for arm64-v8a
cd examples/blazepose_detect/cpp
./build-android.sh -a arm64-v8a
```
The executable will be generated at `build/android/blazepose_detect_demo` (Note: executable name may vary, verify in build folder).
#### 2. Run
```bash
# Push executable to device
adb push build/android/blazepose_detect_demo /data/local/tmp/
adb push model/blazepose_detect_int8_A311D2.adla /data/local/tmp/
adb push test_image.jpg /data/local/tmp/
# Run on device
adb shell
cd /data/local/tmp
chmod +x blazepose_detect_demo
export LD_LIBRARY_PATH=/vendor/lib64   # use /vendor/lib on 32-bit systems
# Usage: ./blazepose_detect_demo <model_path> <image_path>
./blazepose_detect_demo blazepose_detect_int8_A311D2.adla test_image.jpg
```
**Note:** Replace `blazepose_detect_int8_A311D2.adla` with your actual model file path.
### Python
**Prerequisites:**
- Python 3.10
- Required packages: `numpy`, `opencv-python`, `amlnnlite`
**Install dependencies:**
```bash
pip install numpy opencv-python amlnnlite-1.0.0-cp310-cp310-linux_aarch64.whl
```
**Run on device:**
```bash
python blazepose_detect.py --model-path ./blazepose_detect_int8_A311D2.adla
```
The script will automatically process all image files (`.jpg`, `.jpeg`, `.png`, `.bmp`) in the current directory and save results to a `{model_name}_result` folder.
## 5. Results
The program will print the detection count and inference time. The result image with bounding boxes will be saved to the specified output path (`result.jpg` by default).
You can pull the result image back to view it:
```bash
adb pull /data/local/tmp/result.jpg .
```
![BlazePose detection result](result.jpg)

0
examples/blazepose_detect/cpp/.gitkeep Normal file → Executable file

View file

@ -0,0 +1,77 @@
#!/bin/bash
set -e
#
# Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
usage() {
echo "Usage: $0 [-a <target_abi>]"
echo " -a <target_abi> : Target ABI (default: arm64-v8a)"
echo " -h : Show this help message"
exit 1
}
# Default values
TARGET_ABI=arm64-v8a
# Parse arguments
while getopts 'a:h' opt; do
case "$opt" in
a)
TARGET_ABI=$OPTARG
;;
h)
usage
;;
*)
usage
;;
esac
done
if [ -z "${ANDROID_NDK_PATH}" ]; then
if [ -n "${ANDROID_NDK}" ]; then
ANDROID_NDK_PATH=${ANDROID_NDK}
elif [ -n "${ANDROID_NDK_HOME}" ]; then
ANDROID_NDK_PATH=${ANDROID_NDK_HOME}
else
echo "Error: ANDROID_NDK_PATH is not set."
echo "Please set ANDROID_NDK_PATH to your Android NDK directory."
exit 1
fi
fi
ROOT_PWD=$(cd "$(dirname "$0")" && pwd)
BUILD_DIR=${ROOT_PWD}/build/android
echo "Building for Android..."
echo "NDK_PATH: ${ANDROID_NDK_PATH}"
echo "TARGET_ABI: ${TARGET_ABI}"
echo "BUILD_DIR: ${BUILD_DIR}"
mkdir -p "${BUILD_DIR}"
cd "${BUILD_DIR}"
cmake ../../src \
-DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_PATH}/build/cmake/android.toolchain.cmake \
-DANDROID_ABI=${TARGET_ABI} \
-DANDROID_PLATFORM=android-24 \
-DCMAKE_BUILD_TYPE=Release \
-DOpenCV_DIR=${ROOT_PWD}/../../../dependency/opencv/opencv-android-sdk-build/sdk/native/jni/abi-${TARGET_ABI}
make -j4
echo "Build complete. Executable in ${BUILD_DIR}/blazepose_detect_demo"

View file

@ -0,0 +1,168 @@
#!/bin/bash
set -e
#
# Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
usage() {
echo "Usage: $0 [-m <mode>] [-a <target_arch>] [-b <arch_bits>] [-s <yocto_sdk_root>] [-t <toolchain_file>]"
echo " -m <mode> : Build mode: 'linux' or 'yocto' (default: linux)"
echo " -a <target> : Target arch for linux mode (default: aarch64)"
echo " -b <arch_bits> : Arch bits for yocto mode: 32 or 64 (default: 64)"
echo " -s <sdk_root> : Yocto SDK root path (overrides YOCTO_SDK_ROOT env var)"
echo " -t <toolchain> : CMake toolchain file (overrides TOOLCHAIN_FILE env var)"
echo " -h : Show this help message"
exit 1
}
# Default values
BUILD_MODE=linux
TARGET_ARCH=aarch64
ARCH_BITS=64
CLI_SDK_ROOT=""
CLI_TOOLCHAIN_FILE=""
# Parse arguments
while getopts 'm:a:b:s:t:h' opt; do
case "$opt" in
m)
BUILD_MODE=$OPTARG
;;
a)
TARGET_ARCH=$OPTARG
;;
b)
ARCH_BITS=$OPTARG
;;
s)
CLI_SDK_ROOT=$OPTARG
;;
t)
CLI_TOOLCHAIN_FILE=$OPTARG
;;
h)
usage
;;
*)
usage
;;
esac
done
ROOT_PWD=$(cd "$(dirname "$0")" && pwd)
# ===========================================================================
# Yocto build
# ===========================================================================
if [[ "${BUILD_MODE}" == "yocto" ]]; then
if [[ "${ARCH_BITS}" != "32" && "${ARCH_BITS}" != "64" ]]; then
echo "Unsupported ARCH_BITS \"${ARCH_BITS}\". Must be 32 or 64." >&2
exit 1
fi
# Configurable via environment variables (CLI args > env vars > defaults)
CMAKE_BIN="${CMAKE_BIN:-cmake}"
YOCTO_SDK_ROOT="${CLI_SDK_ROOT:-${YOCTO_SDK_ROOT:-/data/yuandian/tools/poky/4.0.20}}"
TOOLCHAIN_FILE="${CLI_TOOLCHAIN_FILE:-${TOOLCHAIN_FILE:-${ROOT_PWD}/../../cmake/yocto-toolchain.cmake}}"
# Export variables for CMake
export YOCTO_SDK_ROOT
export ARCH_BITS
BUILD_DIR="${ROOT_PWD}/build/yocto/${ARCH_BITS}"
echo "==> Building Yocto ${ARCH_BITS}-bit"
echo " toolchain : ${TOOLCHAIN_FILE}"
echo " SDK root : ${YOCTO_SDK_ROOT}"
echo " BUILD_DIR : ${BUILD_DIR}"
# Start from a clean build directory
rm -rf "${BUILD_DIR}"
mkdir -p "${BUILD_DIR}"
# Select OpenCV based on target architecture
if [[ "${ARCH_BITS}" == "32" ]]; then
OPENCV_DIR="${ROOT_PWD}/../../../dependency/opencv/opencv-linux-armhf/share/OpenCV"
else
OPENCV_DIR="${ROOT_PWD}/../../../dependency/opencv/opencv-linux-aarch64/share/OpenCV"
fi
"${CMAKE_BIN}" \
-S "${ROOT_PWD}/src" \
-B "${BUILD_DIR}" \
-DCMAKE_TOOLCHAIN_FILE="${TOOLCHAIN_FILE}" \
-DYOCTO_SDK_ROOT="${YOCTO_SDK_ROOT}" \
-DARCH_BITS="${ARCH_BITS}" \
-DCMAKE_BUILD_TYPE=Release \
-DOpenCV_DIR="${OPENCV_DIR}"
"${CMAKE_BIN}" --build "${BUILD_DIR}" --config Release
# Strip (best-effort)
HOST_SYSROOT="${YOCTO_SDK_ROOT}/sysroots/x86_64-pokysdk-linux"
if [[ "${ARCH_BITS}" == "32" ]]; then
CROSS_TRIPLE="arm-poky-linux-gnueabi"
else
CROSS_TRIPLE="aarch64-poky-linux"
fi
STRIP_TOOL="${HOST_SYSROOT}/usr/bin/${CROSS_TRIPLE}/${CROSS_TRIPLE}-strip"
if [[ -x "${STRIP_TOOL}" ]]; then
"${STRIP_TOOL}" --strip-unneeded "${BUILD_DIR}/blazepose_detect_demo"
else
echo "warning: strip tool not found; keeping debug info." >&2
fi
echo "Build complete. Executable in ${BUILD_DIR}/blazepose_detect_demo"
exit 0
fi
# ===========================================================================
# Standard Linux cross-compile build
# ===========================================================================
# Default to aarch64-linux-gnu if GCC_COMPILER is not set
GCC_COMPILER=${GCC_COMPILER:-aarch64-linux-gnu}
# Set compilers
export CC=${GCC_COMPILER}-gcc
export CXX=${GCC_COMPILER}-g++
# Validate compiler
if ! command -v "${CC}" &> /dev/null; then
echo "Error: Compiler ${CC} not found."
echo "Please set GCC_COMPILER environment variable to your cross-compiler path prefix."
echo "Example: export GCC_COMPILER=/path/to/toolchain/bin/aarch64-linux-gnu"
exit 1
fi
BUILD_DIR=${ROOT_PWD}/build/linux
echo "Building for Linux..."
echo "COMPILER: ${CC}"
echo "TARGET_ARCH: ${TARGET_ARCH}"
echo "BUILD_DIR: ${BUILD_DIR}"
mkdir -p "${BUILD_DIR}"
cd "${BUILD_DIR}"
cmake ../../src \
-DCMAKE_SYSTEM_NAME=Linux \
-DCMAKE_SYSTEM_PROCESSOR=${TARGET_ARCH} \
-DCMAKE_BUILD_TYPE=Release
make -j4
echo "Build complete. Executable in ${BUILD_DIR}/blazepose_detect_demo"

View file

@ -0,0 +1,36 @@
cmake_minimum_required(VERSION 3.10...3.27)
project(blazepose_detect_demo)
set(CMAKE_CXX_STANDARD 17)
list(APPEND CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/../../../../cmake")
find_package(AMLNN REQUIRED)
include_directories(${AMLNN_INCLUDE_DIR})
link_directories(${AMLNN_LIBRARY_DIR})
include_directories(${CMAKE_SOURCE_DIR}/../../../../common)
# Set 3rdparty path
set(3RDPARTY_DIR "${CMAKE_SOURCE_DIR}/../../../../dependency")
if(CMAKE_SYSTEM_NAME STREQUAL "Android")
# Android needs log
link_libraries(log)
endif()
# Find OpenCV
message(STATUS "OpenCV_DIR: ${OpenCV_DIR}")
find_package(OpenCV REQUIRED)
include_directories(${OpenCV_INCLUDE_DIRS})
add_executable(blazepose_detect_demo
main.cpp
postprocess.cpp
postprocess.h
${CMAKE_SOURCE_DIR}/../../../../common/model_loader.cpp
)
target_link_libraries(blazepose_detect_demo
${OpenCV_LIBS}
${AMLNN_LIBRARY}
)

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,151 @@
/*
* Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <cstdio>
#include <cstring>
#include <iostream>
#include <string>
#include <vector>
#include <chrono>
#include <tuple>
#include <iomanip>
#include <fstream>
#include <opencv2/opencv.hpp>
#include "postprocess.h"
#include "model_loader.h"
const std::string DEFAULT_OUTPUT_PATH = "./result.jpg";
const int MODEL_INPUT_WIDTH = 224;
const int MODEL_INPUT_HEIGHT = 224;
const float SCORE_THRESHOLD = 0.5f;
const float NMS_THRESHOLD = 0.3f;
int main(int argc, char **argv)
{
std::string model_path;
std::string image_path;
if (argc != 3)
{
printf("Usage: %s <model_path> <image_path>\n", argv[0]);
return -1;
}
// argc == 3 is guaranteed here, so both arguments are present
model_path = argv[1];
image_path = argv[2];
std::cout << "Blazepose Detect Demo" << std::endl;
std::cout << "Model: " << model_path << std::endl;
std::cout << "Image: " << image_path << std::endl;
std::cout << "Output: " << DEFAULT_OUTPUT_PATH << std::endl;
// 1. Load Image
cv::Mat img = cv::imread(image_path);
if (img.empty())
{
std::cerr << "Failed to load image from " << image_path << std::endl;
return -1;
}
// 2. Initialize Network
void *context = init_network(model_path.c_str());
if (!context)
{
std::cerr << "Failed to initialize network." << std::endl;
return -1;
}
// 3. Preprocess
auto start_time = std::chrono::high_resolution_clock::now();
auto [preprocessed, scale, pad] = preprocess(img, std::make_tuple(MODEL_INPUT_HEIGHT, MODEL_INPUT_WIDTH));
std::cout << "scale: " << scale << std::endl;
std::cout << "pad: ("
<< std::get<0>(pad) << ", "
<< std::get<1>(pad) << ")"
<< std::endl;
// Quantize to int8 (model expects quantized input)
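cv::Mat quantized_img = quantize_input(preprocessed, 0.007843137718737125, -1);
if (quantized_img.empty())
{
std::cerr << "Failed to quantize input." << std::endl;
uninit_network(context);
return -1;
}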
// 4. Set input and run inference
nn_input inData;
memset(&inData, 0, sizeof(nn_input));
inData.input_type = BINARY_RAW_DATA;
inData.input = quantized_img.data;
inData.input_index = 0;
inData.size = quantized_img.total() * quantized_img.elemSize();
if (aml_module_input_set(context, &inData) != 0)
{
std::cerr << "Failed to set input." << std::endl;
uninit_network(context);
return -1;
}
aml_output_config_t outconfig;
memset(&outconfig, 0, sizeof(aml_output_config_t));
outconfig.typeSize = sizeof(aml_output_config_t);
outconfig.format = AML_OUTDATA_FLOAT32;
nn_output *outdata = (nn_output *)aml_module_output_get(context, outconfig);
if (!outdata)
{
std::cerr << "Failed to run network." << std::endl;
uninit_network(context);
return -1;
}
// 5. Postprocess
float *ori_boxes = (float *)outdata->out[0].buf; // 2254 * 12
float *raw_scores = (float *)outdata->out[1].buf; // 2254 * 1
std::vector<BlazePoseDetection> detections = postprocess(
ori_boxes,
raw_scores,
std::make_tuple(preprocessed, scale, pad),
SCORE_THRESHOLD,
NMS_THRESHOLD);
auto end_time = std::chrono::high_resolution_clock::now();
std::chrono::duration<double, std::milli> inference_time = end_time - start_time;
std::cout << "Inference time: " << inference_time.count() << " ms" << std::endl;
std::cout << "Detections: " << detections.size() << std::endl;
// 6. Draw and Save
cv::Mat result_img = draw_detections(img, detections);
cv::imwrite(DEFAULT_OUTPUT_PATH, result_img);
std::cout << "Result saved to " << DEFAULT_OUTPUT_PATH << std::endl;
// image_path -> txt_path
std::string txt_path = image_path.substr(0, image_path.find_last_of('.'));
txt_path += ".txt";
std::ofstream ofs(txt_path);
if (ofs.is_open())
{
for (const auto &det : detections)
{
for (int i = 0; i < NUM_COORDS + 1; ++i)
ofs << det.coords[i] << (i < NUM_COORDS ? " " : "\n");
}
std::cout << "Detections saved to " << txt_path << std::endl;
}
else
{
std::cerr << "Failed to open " << txt_path << " for writing." << std::endl;
}
// 7. Cleanup
uninit_network(context);
return 0;
}
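
Note on the letterbox geometry used by `main.cpp`: `preprocess` returns an inverse scale and a padding pair, which `postprocess` then uses to map normalized model coordinates back to the original image. The following is a minimal, OpenCV-free sketch of that arithmetic, not the code in this commit. The `Letterbox` struct and the `compute_letterbox` and `to_original` helpers are hypothetical names introduced only for illustration, and the padding is kept in model pixels here rather than pre-scaled to original pixels as the commit does.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper type (not part of this commit): letterbox geometry.
struct Letterbox
{
    float inv_scale; // original pixels per model pixel (1 / resize scale)
    int pad_top;     // rows of padding above the resized image, in model pixels
    int pad_left;    // columns of padding left of the resized image, in model pixels
};

// Compute the aspect-preserving resize scale and the top/left padding
// needed to centre an orig_w x orig_h image inside a dst_w x dst_h canvas.
Letterbox compute_letterbox(int orig_w, int orig_h, int dst_w, int dst_h)
{
    float scale = std::min(static_cast<float>(dst_h) / orig_h,
                           static_cast<float>(dst_w) / orig_w);
    int new_w = static_cast<int>(std::round(orig_w * scale));
    int new_h = static_cast<int>(std::round(orig_h * scale));
    // Any odd leftover pixel goes to the bottom/right side.
    int pad_top = (dst_h - new_h) / 2;
    int pad_left = (dst_w - new_w) / 2;
    return {1.0f / scale, pad_top, pad_left};
}

// Map a normalized [0, 1] model-space coordinate back to original pixels:
// scale up to model pixels, remove the padding, then undo the resize.
float to_original(float norm, int dst_size, int pad, const Letterbox &lb)
{
    return (norm * dst_size - pad) * lb.inv_scale;
}
```

For a 448x224 image letterboxed into 224x224, this gives a resize scale of 0.5 and 56 rows of padding at the top, so a normalized y of 0.5 maps back to row 112 of the original image.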

View file

@ -0,0 +1,306 @@
/*
* Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "postprocess.h"
#include <cstdio>
#include <iostream>
#include <cmath>
#include <algorithm>
#include <unordered_map>
#define LOGI(...) \
do \
{ \
printf(__VA_ARGS__); \
printf("\n"); \
} while (0)
#define LOGE(...) \
do \
{ \
fprintf(stderr, __VA_ARGS__); \
fprintf(stderr, "\n"); \
} while (0)
// Class names shown on detection labels (1 class)
const char *SHOW_CLASSES[1] = {"pose"};
inline float sigmoid(float x)
{
return 1.0f / (1.0f + std::exp(-x));
}
void decode_boxes(const float *ori_boxes, std::vector<std::vector<float>> &boxes)
{
const float x_scale = 224.0f;
const float y_scale = 224.0f;
const float h_scale = 224.0f;
const float w_scale = 224.0f;
boxes.resize(NUM_ANCHORS, std::vector<float>(NUM_COORDS, 0.0f));
for (int i = 0; i < NUM_ANCHORS; ++i)
{
float x_center = ori_boxes[i * NUM_COORDS + 0] / x_scale * anchors[i * 4 + 2] + anchors[i * 4 + 0];
float y_center = ori_boxes[i * NUM_COORDS + 1] / y_scale * anchors[i * 4 + 3] + anchors[i * 4 + 1];
float w = ori_boxes[i * NUM_COORDS + 2] / w_scale * anchors[i * 4 + 2];
float h = ori_boxes[i * NUM_COORDS + 3] / h_scale * anchors[i * 4 + 3];
boxes[i][0] = y_center - h / 2.0f;
boxes[i][1] = x_center - w / 2.0f;
boxes[i][2] = y_center + h / 2.0f;
boxes[i][3] = x_center + w / 2.0f;
for (int k = 0; k < 4; ++k)
{
int offset = 4 + k * 2;
float keypoint_x = ori_boxes[i * NUM_COORDS + offset] / x_scale * anchors[i * 4 + 2] + anchors[i * 4 + 0];
float keypoint_y = ori_boxes[i * NUM_COORDS + offset + 1] / y_scale * anchors[i * 4 + 3] + anchors[i * 4 + 1];
boxes[i][offset] = keypoint_x;
boxes[i][offset + 1] = keypoint_y;
}
}
}
void convert_output_to_detections(const float *ori_boxes, const float *ori_scores, std::vector<BlazePoseDetection> &detections, float min_score_thresh = 0.3f)
{
std::vector<std::vector<float>> decoded_boxes;
decode_boxes(ori_boxes, decoded_boxes);
detections.clear();
for (int i = 0; i < NUM_ANCHORS; ++i)
{
float s = sigmoid(std::min(std::max(ori_scores[i], -100.0f), 100.0f));
if (s < min_score_thresh)
continue;
BlazePoseDetection det;
for (int j = 0; j < NUM_COORDS; ++j)
det.coords[j] = decoded_boxes[i][j];
det.coords[NUM_COORDS] = s;
detections.push_back(det);
}
}
static inline float iou(const float *a, const float *b)
{
float xA = std::max(a[1], b[1]);
float yA = std::max(a[0], b[0]);
float xB = std::min(a[3], b[3]);
float yB = std::min(a[2], b[2]);
float interW = std::max(0.0f, xB - xA);
float interH = std::max(0.0f, yB - yA);
float inter = interW * interH;
float areaA = (a[3] - a[1]) * (a[2] - a[0]);
float areaB = (b[3] - b[1]) * (b[2] - b[0]);
float unionAB = areaA + areaB - inter;
if (unionAB <= 0.0f)
return 0.0f;
return inter / unionAB;
}
void weighted_nms(
std::vector<BlazePoseDetection> &detections, std::vector<BlazePoseDetection> &output, float iou_threshold = 0.3f)
{
output.clear();
if (detections.empty())
return;
std::sort(detections.begin(), detections.end(),
[](const BlazePoseDetection &a, const BlazePoseDetection &b)
{
return a.coords[NUM_COORDS] > b.coords[NUM_COORDS];
});
std::vector<bool> removed(detections.size(), false);
for (size_t i = 0; i < detections.size(); ++i)
{
if (removed[i])
continue;
std::vector<size_t> overlap_indices;
overlap_indices.push_back(i);
for (size_t j = i + 1; j < detections.size(); ++j)
{
if (removed[j])
continue;
if (iou(detections[i].coords, detections[j].coords) > iou_threshold)
overlap_indices.push_back(j);
}
float total_score = 0.0f;
std::vector<float> weighted(NUM_COORDS, 0.0f);
for (size_t idx : overlap_indices)
{
float score = detections[idx].coords[NUM_COORDS];
total_score += score;
for (int k = 0; k < NUM_COORDS; ++k)
weighted[k] += detections[idx].coords[k] * score;
removed[idx] = true;
}
BlazePoseDetection wdet;
for (int k = 0; k < NUM_COORDS; ++k)
wdet.coords[k] = weighted[k] / total_score;
wdet.coords[NUM_COORDS] = total_score / overlap_indices.size();
output.push_back(wdet);
}
}
std::tuple<cv::Mat, float, std::tuple<int, int>> preprocess(cv::Mat img, std::tuple<int, int> new_shape)
{
cv::Mat img_rgb;
if (img.empty())
{
LOGE("Preprocess received empty image");
return {};
}
// Convert to RGB
if (img.channels() == 4)
cv::cvtColor(img, img_rgb, cv::COLOR_RGBA2RGB);
else if (img.channels() == 3)
cv::cvtColor(img, img_rgb, cv::COLOR_BGR2RGB);
else
img_rgb = img.clone();
int orig_h = img.rows;
int orig_w = img.cols;
float scale = std::min(static_cast<float>(std::get<0>(new_shape)) / orig_h,
static_cast<float>(std::get<1>(new_shape)) / orig_w);
int new_h = static_cast<int>(round(orig_h * scale));
int new_w = static_cast<int>(round(orig_w * scale));
cv::Mat img_resized;
cv::resize(img_rgb, img_resized, cv::Size(new_w, new_h), 0, 0, cv::INTER_LINEAR);
int pad_h = std::get<0>(new_shape) - new_h;
int pad_w = std::get<1>(new_shape) - new_w;
int pad_left = static_cast<int>(round(pad_w / 2.0 - 0.1));
int pad_right = static_cast<int>(round(pad_w / 2.0 + 0.1));
int pad_top = static_cast<int>(round(pad_h / 2.0 - 0.1));
int pad_bottom = static_cast<int>(round(pad_h / 2.0 + 0.1));
cv::Mat img_padded;
cv::copyMakeBorder(img_resized, img_padded, pad_top, pad_bottom, pad_left, pad_right, cv::BORDER_CONSTANT, cv::Scalar(0, 0, 0));
cv::Mat img_float;
img_padded.convertTo(img_float, CV_32F, 1.0 / 127.5, -1.0);
scale = 1.0f / scale;
int pad_orig_h = static_cast<int>(pad_top * scale);
int pad_orig_w = static_cast<int>(pad_left * scale);
return std::make_tuple(img_float, scale, std::make_tuple(pad_orig_h, pad_orig_w));
}
cv::Mat quantize_input(const cv::Mat &float_img, float scale, int8_t zero_point)
{
if (float_img.empty() || float_img.type() != CV_32FC3)
{
LOGE("quantize_input: Invalid input image (must be CV_32FC3)");
return cv::Mat();
}
cv::Mat quantized_img(float_img.rows, float_img.cols, CV_8SC3);
const float *src_ptr = (const float *)float_img.data;
int8_t *dst_ptr = (int8_t *)quantized_img.data;
int total_elements = float_img.total() * float_img.channels();
for (int i = 0; i < total_elements; ++i)
{
dst_ptr[i] = static_cast<int8_t>(std::round(src_ptr[i] / scale + zero_point));
}
return quantized_img;
}
void denorm_detections(std::vector<float> &detection, float scale, const float pad[2])
{
detection[0] = detection[0] * scale * 224.0f - pad[0];
detection[1] = detection[1] * scale * 224.0f - pad[1];
detection[2] = detection[2] * scale * 224.0f - pad[0];
detection[3] = detection[3] * scale * 224.0f - pad[1];
for (size_t k = 4; k + 1 < detection.size(); k += 2)
{
detection[k] = detection[k] * scale * 224.0f - pad[1];
detection[k + 1] = detection[k + 1] * scale * 224.0f - pad[0];
}
}
std::vector<BlazePoseDetection> postprocess(float *ori_boxes, float *ori_scores,
std::tuple<cv::Mat, float, std::tuple<int, int>> input_tuple,
float conf_threshold, float iou_threshold)
{
float scale = std::get<1>(input_tuple);
// preprocess() returns the padding as (pad_h, pad_w), i.e. (top, left)
int pad_top = std::get<0>(std::get<2>(input_tuple));
int pad_left = std::get<1>(std::get<2>(input_tuple));
float pad[2] = {static_cast<float>(pad_top), static_cast<float>(pad_left)};
std::vector<BlazePoseDetection> detections;
convert_output_to_detections(ori_boxes, ori_scores, detections, conf_threshold);
std::vector<BlazePoseDetection> filtered;
weighted_nms(detections, filtered, iou_threshold);
size_t pose_num = filtered.size();
for (size_t b = 0; b < pose_num; ++b)
{
std::vector<float> coords(filtered[b].coords, filtered[b].coords + NUM_COORDS + 1);
// mapping to original size
denorm_detections(coords, scale, pad);
for (size_t i = 0; i < NUM_COORDS + 1; ++i)
filtered[b].coords[i] = coords[i];
}
return filtered;
}
cv::Mat draw_detections(cv::Mat image, const std::vector<BlazePoseDetection> &detections)
{
cv::Mat drawn_image = image.clone();
int class_id = 0;
for (const auto &det : detections)
{
// Generate color based on class_id using HSV
float hue = fmod(class_id * 137.508f, 360.0f);
cv::Mat hsv(1, 1, CV_8UC3, cv::Scalar(hue / 2.0f, 204, 230));
cv::Mat rgb;
cv::cvtColor(hsv, rgb, cv::COLOR_HSV2BGR);
cv::Scalar color(rgb.at<cv::Vec3b>(0, 0)[0], rgb.at<cv::Vec3b>(0, 0)[1], rgb.at<cv::Vec3b>(0, 0)[2]);
// Draw bounding box
int x1 = static_cast<int>(det.coords[1]);
int y1 = static_cast<int>(det.coords[0]);
int x2 = static_cast<int>(det.coords[3]);
int y2 = static_cast<int>(det.coords[2]);
cv::rectangle(drawn_image, cv::Point(x1, y1), cv::Point(x2, y2), color, 2);
// Draw label
std::string label = std::string(SHOW_CLASSES[class_id]) + ": " + cv::format("%.2f", det.coords[12]);
int baseline = 0;
cv::Size text_size = cv::getTextSize(label, cv::FONT_HERSHEY_SIMPLEX, 0.6, 1, &baseline);
int label_x = x1;
int label_y = y1 - 5;
if (label_y < text_size.height)
label_y = y1 + text_size.height + 5;
// Draw label background
cv::rectangle(drawn_image,
cv::Point(label_x, label_y - text_size.height - baseline),
cv::Point(label_x + text_size.width, label_y + baseline),
color, cv::FILLED);
// Determine text color based on background brightness
int brightness = (color[0] + color[1] + color[2]) / 3;
cv::Scalar text_color = brightness < 128 ? cv::Scalar(255, 255, 255) : cv::Scalar(0, 0, 0);
cv::putText(drawn_image, label,
cv::Point(label_x, label_y),
cv::FONT_HERSHEY_SIMPLEX, 0.6, text_color, 1, cv::LINE_AA);
}
return drawn_image;
}

View file

@ -0,0 +1,52 @@
/*
* Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef _AMLNN_BLAZEPOSE_DETECT_POSTPROCESS_H_
#define _AMLNN_BLAZEPOSE_DETECT_POSTPROCESS_H_
#include <opencv2/opencv.hpp>
#include <vector>
#include <tuple>
#include <string>
#include "anchors.h"
#define NUM_COORDS 12
// BlazePoseDetection result structure
struct BlazePoseDetection
{
float coords[NUM_COORDS + 1]; // 12 coords + 1 score
};
// COCO class names (80 classes)
extern const char *COCO_CLASSES[80];
// Preprocess image with letterbox resizing
std::tuple<cv::Mat, float, std::tuple<int, int>> preprocess(cv::Mat img, std::tuple<int, int> new_shape);
// Quantize float32 image to int8 for model input
cv::Mat quantize_input(const cv::Mat &float_img, float scale = 0.007843137718737125, int8_t zero_point = -1);
// Postprocess blazepose_detect outputs: decode boxes, apply score threshold and weighted NMS
std::vector<BlazePoseDetection> postprocess(float *raw_boxes, float *raw_scores,
std::tuple<cv::Mat, float, std::tuple<int, int>> input_tuple,
float conf_threshold, float iou_threshold);
// Draw detections on image
cv::Mat draw_detections(cv::Mat image, const std::vector<BlazePoseDetection> &detections);
#endif // _AMLNN_BLAZEPOSE_DETECT_POSTPROCESS_H_

0
examples/blazepose_detect/model/.gitkeep Normal file → Executable file
View file

View file

@ -0,0 +1,40 @@
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Arguments:
# 1. $1: model path (.tflite)
# 2. $2: set ADLA_TOOL_PATH
# 3. $3: set target-platform
# for A311D2 target-platform is PRODUCT_PID0XA003
# for S905X5 target-platform is PRODUCT_PID0XA005
# Usage: ./adla_convert.sh pose_detection.tflite /XXX/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
model_path=$1
ADLA_TOOL_PATH=$2
target_platform=$3
echo "model_path:[$model_path]"
echo "ADLA_TOOL_PATH:[$ADLA_TOOL_PATH]"
echo "target-platform:[$target_platform]"
# Quote paths so that directories containing spaces still work
adla_convert="${ADLA_TOOL_PATH}/bin/adla_convert"
"$adla_convert" --model-type tflite \
--model "$model_path" \
--inputs input_1 --input-shapes "224,224,3" \
--quantize-dtype int8 \
--source-file dataset_coco.txt \
--channel-mean-value "127.5,127.5,127.5,127.5" \
--target-platform "$target_platform" \
--disable-per-channel false

View file

@ -0,0 +1,50 @@
../../../resource/coco_dataset/000000000139.jpg
../../../resource/coco_dataset/000000000285.jpg
../../../resource/coco_dataset/000000000632.jpg
../../../resource/coco_dataset/000000000724.jpg
../../../resource/coco_dataset/000000000776.jpg
../../../resource/coco_dataset/000000000785.jpg
../../../resource/coco_dataset/000000000802.jpg
../../../resource/coco_dataset/000000000872.jpg
../../../resource/coco_dataset/000000000885.jpg
../../../resource/coco_dataset/000000001000.jpg
../../../resource/coco_dataset/000000001268.jpg
../../../resource/coco_dataset/000000001296.jpg
../../../resource/coco_dataset/000000001353.jpg
../../../resource/coco_dataset/000000001425.jpg
../../../resource/coco_dataset/000000001490.jpg
../../../resource/coco_dataset/000000001503.jpg
../../../resource/coco_dataset/000000001532.jpg
../../../resource/coco_dataset/000000001584.jpg
../../../resource/coco_dataset/000000001675.jpg
../../../resource/coco_dataset/000000001761.jpg
../../../resource/coco_dataset/000000001818.jpg
../../../resource/coco_dataset/000000001993.jpg
../../../resource/coco_dataset/000000002006.jpg
../../../resource/coco_dataset/000000002149.jpg
../../../resource/coco_dataset/000000002153.jpg
../../../resource/coco_dataset/000000002157.jpg
../../../resource/coco_dataset/000000002261.jpg
../../../resource/coco_dataset/000000002299.jpg
../../../resource/coco_dataset/000000002431.jpg
../../../resource/coco_dataset/000000002473.jpg
../../../resource/coco_dataset/000000002532.jpg
../../../resource/coco_dataset/000000002587.jpg
../../../resource/coco_dataset/000000002592.jpg
../../../resource/coco_dataset/000000002685.jpg
../../../resource/coco_dataset/000000002923.jpg
../../../resource/coco_dataset/000000003156.jpg
../../../resource/coco_dataset/000000003255.jpg
../../../resource/coco_dataset/000000003501.jpg
../../../resource/coco_dataset/000000003553.jpg
../../../resource/coco_dataset/000000003661.jpg
../../../resource/coco_dataset/000000003845.jpg
../../../resource/coco_dataset/000000003934.jpg
../../../resource/coco_dataset/000000004134.jpg
../../../resource/coco_dataset/000000004395.jpg
../../../resource/coco_dataset/000000004495.jpg
../../../resource/coco_dataset/000000004765.jpg
../../../resource/coco_dataset/000000004795.jpg
../../../resource/coco_dataset/000000005001.jpg
../../../resource/coco_dataset/000000005037.jpg
../../../resource/coco_dataset/000000005060.jpg

Binary file not shown.

Size: 2.4 MiB

0
examples/blazepose_detect/py/.gitkeep Normal file → Executable file
View file

View file

@ -0,0 +1,266 @@
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import numpy as np
import os
import glob
import argparse
import cv2
from pathlib import Path
from amlnnlite.api import AMLNNLite
def letterbox(img, new_shape=(224, 224), color=(0, 0, 0)):
    shape = img.shape[:2]  # [height, width]
    scale = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    new_unpad = (int(round(shape[1] * scale)), int(round(shape[0] * scale)))
    pad_w = (new_shape[1] - new_unpad[0]) / 2
    pad_h = (new_shape[0] - new_unpad[1]) / 2
    if shape[::-1] != new_unpad:
        img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(pad_h - 0.1)), int(round(pad_h + 0.1))
    left, right = int(round(pad_w - 0.1)), int(round(pad_w + 0.1))
    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)
    # Return the inverse scale and the padding expressed in original-image pixels
    scale = 1. / scale
    ori_left = left * scale
    ori_top = top * scale
    return img, scale, (ori_left, ori_top)
def preprocess(img_path, new_shape=(224, 224), data_format='NCHW', s=0.003921568859368563, zp=-128):
    original_img = cv2.imread(str(img_path))
    if original_img is None:
        raise ValueError(f"can't read image: {img_path}")
    processed_img, scale, pad = letterbox(original_img, new_shape)
    rgb_img = cv2.cvtColor(processed_img, cv2.COLOR_BGR2RGB)
    normalized_img = rgb_img.astype(np.float32) / 127.5 - 1.
    if data_format == 'NCHW':
        # HWC -> CHW -> BCHW (ONNX default format)
        input_tensor = np.transpose(normalized_img, (2, 0, 1))
        input_tensor = np.expand_dims(input_tensor, axis=0)
    elif data_format == 'NHWC':
        # HWC -> BHWC (TFLITE default format)
        input_tensor = np.expand_dims(normalized_img, axis=0)
    else:
        raise ValueError(f"Unsupported data format: {data_format}. Only 'NCHW' and 'NHWC' are supported.")
    # Quantize to int8
    input_tensor = np.round(input_tensor / s + zp).astype(np.int8)
    return input_tensor, original_img, scale, pad
def postprocess(outputs, scale, pad, data_format='NCHW', anchor_path='anchors.npy', score_threshold=0.5, nms_threshold=0.3):
    raw_box = outputs[0]  # (1, 2254, 12)
    raw_score = outputs[1]  # (1, 2254, 1)
    anchors = np.load(anchor_path).astype("float32")
    # anchors: [N, 4] -> x, y, w, h
    anc_x, anc_y, anc_w, anc_h = anchors.T
    # raw_box shape: [..., K]
    all_boxes = np.zeros_like(raw_box)
    # box center & size
    x_center = raw_box[..., 0] / 224.0 * anc_w + anc_x
    y_center = raw_box[..., 1] / 224.0 * anc_h + anc_y
    w = raw_box[..., 2] / 224.0 * anc_w
    h = raw_box[..., 3] / 224.0 * anc_h
    # bbox: ymin, xmin, ymax, xmax
    all_boxes[..., 0] = y_center - 0.5 * h
    all_boxes[..., 1] = x_center - 0.5 * w
    all_boxes[..., 2] = y_center + 0.5 * h
    all_boxes[..., 3] = x_center + 0.5 * w
    # keypoints (4 points, each has x/y)
    for k in range(4):
        idx = 4 + k * 2
        all_boxes[..., idx] = raw_box[..., idx] / 224.0 * anc_w + anc_x
        all_boxes[..., idx + 1] = raw_box[..., idx + 1] / 224.0 * anc_h + anc_y
    # Clip logits so that exp() cannot overflow
    thresh = 100.0
    raw_score = raw_score.clip(-thresh, thresh)
    # Apply sigmoid activation to class scores
    all_scores = 1.0 / (1.0 + np.exp(-raw_score)).squeeze(axis=-1)
    print(f"all_scores {all_scores}")
    print(f"max(all_scores) {max(all_scores[0])}")
    # Drop the batch dimension
    final_boxes = np.concatenate(all_boxes, axis=0)
    final_scores = np.concatenate(all_scores, axis=0)
    # Filter by confidence threshold
    valid_mask = final_scores > score_threshold
    if not np.any(valid_mask):
        return []
    valid_boxes = final_boxes[valid_mask]
    valid_scores = final_scores[valid_mask]
    # Map coordinates back to original image
    pad_x, pad_y = pad  # (left, top) padding as returned by letterbox()
    s = scale * 224
    # Box layout is ymin, xmin, ymax, xmax: y values use the top pad, x values the left pad
    valid_boxes[:, [0, 2]] = valid_boxes[:, [0, 2]] * s - pad_y
    valid_boxes[:, [1, 3]] = valid_boxes[:, [1, 3]] * s - pad_x
    # Keypoints are stored as (x, y) pairs starting at index 4
    valid_boxes[:, 4::2] = valid_boxes[:, 4::2] * s - pad_x
    valid_boxes[:, 5::2] = valid_boxes[:, 5::2] * s - pad_y
    valid_boxes = np.maximum(valid_boxes, 0)
    # NMS: cv2.dnn.NMSBoxes expects one [x, y, w, h] box per detection
    ymin, xmin, ymax, xmax = valid_boxes[:, :4].T
    nms_boxes = np.stack([xmin, ymin, xmax - xmin, ymax - ymin], axis=1)
    nms_indices = cv2.dnn.NMSBoxes(
        nms_boxes.tolist(), valid_scores.tolist(), score_threshold, nms_threshold
    )
    nms_indices = np.asarray(nms_indices).flatten()
    detections = []
    for idx in nms_indices:
        y1, x1, y2, x2 = valid_boxes[idx, :4]
        confidence = valid_scores[idx]
        # x_center = (valid_boxes[:,1] + valid_boxes[:,3]) / 2
        # y_center = (valid_boxes[:,0] + valid_boxes[:,2]) / 2
        # scale = (valid_boxes[:,3] - valid_boxes[:,1]) # assumes square boxes
        detections.append({
            'bbox': [float(x1), float(y1), float(x2), float(y2)],
            'confidence': float(confidence),
            # Single-class detector; 'pose' is a placeholder label for display
            'class_id': 0,
            'class_name': 'pose'
        })
    return detections
def get_class_color(class_id):
    import colorsys
    # Spread hues with the golden angle so neighbouring class ids get distinct colors
    hue = (class_id * 137.508) % 360
    rgb = colorsys.hsv_to_rgb(hue/360.0, 0.8, 0.9)
    bgr = (int(rgb[2]*255), int(rgb[1]*255), int(rgb[0]*255))
    return bgr
def draw_detections(img, detections, save_path):
    result_img = img.copy()
    for det in detections:
        x1, y1, x2, y2 = [int(coord) for coord in det['bbox']]
        confidence = det['confidence']
        # Single-class detector: fall back to defaults when no class info is attached
        class_name = det.get('class_name', 'pose')
        class_id = det.get('class_id', 0)
        color = get_class_color(class_id)
        cv2.rectangle(result_img, (x1, y1), (x2, y2), color, 2)
        label = f"{class_name}: {confidence:.2f}"
        (label_w, label_h), _ = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.6, 1)
        cv2.rectangle(result_img, (x1, y1 - label_h - 10), (x1 + label_w, y1), color, -1)
        text_color = (255, 255, 255) if sum(color) < 400 else (0, 0, 0)
        cv2.putText(result_img, label, (x1, y1 - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.6, text_color, 1)
    cv2.imwrite(save_path, result_img)
    return result_img
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--model-path', default='./blazepose_detect_int8_A311D2.adla')
    parser.add_argument('--run-cycles', default=1, type=int)
    args = parser.parse_args()
    # Initialize AMLNNLite
    amlnn = AMLNNLite()
    amlnn.config(
        model_path=args.model_path,  # Model file path; supports ADLA and quantized TFLite models
        run_cycles=args.run_cycles
    )
    amlnn.init()
    # Find all image files in the current directory
    image_dir = "./"
    image_extensions = ["*.jpg", "*.jpeg", "*.png", "*.bmp"]
    image_files = []
    for ext in image_extensions:
        image_files.extend(glob.glob(os.path.join(image_dir, ext)))
        image_files.extend(glob.glob(os.path.join(image_dir, ext.upper())))
    if not image_files:
        print("No image files found in", image_dir)
        amlnn.uninit()
        return
    print(f"Found {len(image_files)} image files to process:")
    for img_file in image_files:
        print(f" - {os.path.basename(img_file)}")
    print()
    # Process each image
    for i, image_path in enumerate(image_files, 1):
        print("=" * 60)
        print(f"Processing image {i}/{len(image_files)}: {os.path.basename(image_path)}")
        print("=" * 60)
        try:
            # Preprocess input
            input_tensor, original_img, scale, pad = preprocess(image_path, new_shape=(224, 224), data_format='NHWC', s=0.007843137718737125, zp=-1)
            # Run inference
            outputs = amlnn.inference(inputs=[input_tensor])
            # Postprocess results
            detections = postprocess(outputs, scale, pad, data_format='NHWC', score_threshold=0.5, nms_threshold=0.3)
            # Print detection results
            if detections:
                print(f" Detected {len(detections)} objects:")
                for j, det in enumerate(detections, 1):
                    print(f" {j}. {det.get('class_name', 'pose')} ({det['confidence']:.2f})")
            else:
                print(" No objects detected")
            # Save result image
            model_name = Path(args.model_path).stem
            result_dir = f"{model_name}_result"
            os.makedirs(result_dir, exist_ok=True)
            img_name = Path(image_path).stem
            save_path = os.path.join(result_dir, f"{img_name}_result.jpg")
            draw_detections(original_img, detections, str(save_path))
            print(f" Result saved to: {save_path}")
        except Exception as e:
            print(f"Error processing {os.path.basename(image_path)}: {e}")
        print()
    # Optional visualization
    amlnn.visualize()
    # Release resources
    amlnn.uninit()
if __name__ == "__main__":
    main()

Binary file not shown.

Size: 452 KiB

View file

@ -0,0 +1,104 @@
# blazepose_landmark
## 1. Overview
BlazePose Landmark takes the ROI produced by the pose detector and predicts 33 body keypoints for full-body pose estimation. Its lightweight design makes it suitable for applications such as fitness tracking, motion analysis, augmented reality, and real-time human-computer interaction.
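
To make the "detected ROI" concrete, the following Python sketch mirrors the `detection_to_roi` helper in this demo's C++ `postprocess.cpp`. It assumes the detection layout used there: a flat list in which the box occupies indices 0 to 3 and keypoints follow as (x, y) pairs from index 4. The Python function name and argument defaults are illustrative only and are not part of any shipped API.

```python
import math


def detection_to_roi(detection, kp1=0, kp2=1, dscale=1.1, theta0=math.pi / 2):
    """Return (x_center, y_center, scale, theta) for one detection.

    kp1 is the keypoint used as the ROI center, kp2 the keypoint that
    defines the ROI radius and orientation (assumed layout, see above).
    """
    xc, yc = detection[4 + 2 * kp1], detection[4 + 2 * kp1 + 1]
    x1, y1 = detection[4 + 2 * kp2], detection[4 + 2 * kp2 + 1]
    # ROI side length: twice the keypoint distance, enlarged by dscale
    scale = 2.0 * math.hypot(xc - x1, yc - y1) * dscale
    # Rotation that brings the kp2 -> kp1 direction to the reference angle theta0
    theta = math.atan2(yc - y1, xc - x1) - theta0
    return xc, yc, scale, theta


if __name__ == "__main__":
    det = [0.0] * 13
    det[4:8] = [100.0, 100.0, 100.0, 50.0]  # kp1 at (100, 100), kp2 at (100, 50)
    print(detection_to_roi(det))
```

The resulting square ROI is then rotated by `theta`, cropped, and resized to the model input resolution before inference.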
## 2. Model Download
- **Open Source model**
- **Open Source projects:** https://github.com/google-ai-edge/mediapipe/tree/master
- **Download weights**
wget https://storage.googleapis.com/mediapipe-assets/pose_landmark_full.tflite
## 3. Model Conversion
```
cd model
# Usage: ./adla_convert.sh model_path adla_toolkit_path target_platform
# Example for S905X5:
./adla_convert.sh pose_landmark_full.tflite /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
# Example for A311D2:
./adla_convert.sh pose_landmark_full.tflite /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA003
```
| Parameter | Description |
| ----------------- | ------------------------------------------------------------ |
| model_path        | Path to the TFLite model (`.tflite`)                         |
| adla_toolkit_path | Path to the ADLA toolkit directory                           |
| target_platform   | Target platform ID. A311D2: `PRODUCT_PID0XA003`; S905X5: `PRODUCT_PID0XA005` |
## 4. Demo Run
### CPP
#### 1. Compile
**Prerequisites:**
- Android NDK (r25c recommended)
- `ANDROID_NDK_PATH` environment variable set
**Build:**
```bash
# Build for arm64-v8a
cd examples/blazepose_landmark/cpp
./build-android.sh -a arm64-v8a
```
The executable will be generated at `build/android/blazepose_landmark_demo` (Note: executable name may vary, verify in build folder).
#### 2. Run
```bash
# Push executable, model, and test data to device
adb push build/android/blazepose_landmark_demo /data/local/tmp/
adb push model/blazepose_landmark_full_int16_A311D2.adla /data/local/tmp/
adb push test_image.jpg /data/local/tmp/
# The demo reads detections from a text file named after the image (test_image.txt)
adb push test_image.txt /data/local/tmp/
# Run on device
adb shell
cd /data/local/tmp
chmod +x blazepose_landmark_demo
# Use /vendor/lib instead on 32-bit systems
export LD_LIBRARY_PATH=/vendor/lib64
# Usage: ./blazepose_landmark_demo <model_path> <image_path>
./blazepose_landmark_demo blazepose_landmark_full_int16_A311D2.adla test_image.jpg
```
**Note:** Replace `blazepose_landmark_full_int16_A311D2.adla` with your actual model file path.
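
The C++ demo in this commit does not run the detector itself: `main.cpp` loads detections from a text file that shares the image's base name (for example `test_image.txt` next to `test_image.jpg`), one detection per line as whitespace-separated floats (13 values per line according to the source comment). The Python sketch below shows one way to write and read such a file; the helper names are hypothetical and not part of the demo.

```python
from pathlib import Path


def detections_path(image_path):
    """Text file the demo is assumed to look for: same name, .txt suffix."""
    return Path(image_path).with_suffix(".txt")


def write_detections(image_path, detections):
    """Write one detection per line as whitespace-separated floats."""
    path = detections_path(image_path)
    lines = [" ".join(f"{value:.6f}" for value in det) for det in detections]
    path.write_text("\n".join(lines) + "\n")
    return path


def read_detections(image_path):
    """Parse the file the same way the C++ loop does, skipping empty lines."""
    rows = []
    for line in detections_path(image_path).read_text().splitlines():
        values = [float(token) for token in line.split()]
        if values:
            rows.append(values)
    return rows
```

A file produced this way can be pushed to the device alongside the image before running the demo.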
### Python
**Prerequisites:**
- Python 3.10
- Required packages: `numpy`, `opencv-python`, `amlnnlite`
**Install dependencies:**
```bash
pip install numpy opencv-python amlnnlite-1.0.0-cp310-cp310-linux_aarch64.whl
```
**Run on device:**
```bash
python blazepose_landmark.py --model-path ./blazepose_landmark_full_int16_A311D2.adla
```
The script will automatically process all image files (`.jpg`, `.jpeg`, `.png`, `.bmp`) in the current directory and save results to a `{model_name}_result` folder.
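
The image discovery and output naming described above can be sketched as follows. This assumes the landmark script follows the same pattern as the `blazepose_detect` Python script in this commit (glob over the current directory, results under `{model_name}_result/{image_name}_result.jpg`); the function names are illustrative.

```python
import glob
import os
from pathlib import Path

IMAGE_PATTERNS = ("*.jpg", "*.jpeg", "*.png", "*.bmp")


def find_images(image_dir="."):
    """Collect image files, matching lower- and upper-case extensions."""
    found = set()  # a set avoids duplicates on case-insensitive file systems
    for pattern in IMAGE_PATTERNS:
        found.update(glob.glob(os.path.join(image_dir, pattern)))
        found.update(glob.glob(os.path.join(image_dir, pattern.upper())))
    return sorted(found)


def result_path(model_path, image_path):
    """Build the output path: <model stem>_result/<image stem>_result.jpg."""
    result_dir = f"{Path(model_path).stem}_result"
    return os.path.join(result_dir, f"{Path(image_path).stem}_result.jpg")
```

Deriving the folder name from the model file keeps results from different models (for example A311D2 and S905X5 builds) in separate directories.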
## 5. Results
The program prints the number of landmarks and the inference time. The result image with the landmarks drawn on it is saved as `result.jpg` in the working directory.
You can pull the result image back to view it:
```bash
adb pull /data/local/tmp/result.jpg .
```
![alt text](result.jpg)

0
examples/blazepose_landmark/cpp/.gitkeep Normal file → Executable file
View file

View file

@ -0,0 +1,77 @@
#!/bin/bash
set -e
#
# Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
usage() {
echo "Usage: $0 [-a <target_abi>]"
echo " -a <target_abi> : Target ABI (default: arm64-v8a)"
echo " -h : Show this help message"
exit 1
}
# Default values
TARGET_ABI=arm64-v8a
# Parse arguments
while getopts 'a:h' opt; do
case "$opt" in
a)
TARGET_ABI=$OPTARG
;;
h)
usage
;;
*)
usage
;;
esac
done
if [ -z "${ANDROID_NDK_PATH}" ]; then
if [ -n "${ANDROID_NDK}" ]; then
ANDROID_NDK_PATH=${ANDROID_NDK}
elif [ -n "${ANDROID_NDK_HOME}" ]; then
ANDROID_NDK_PATH=${ANDROID_NDK_HOME}
else
echo "Error: ANDROID_NDK_PATH is not set."
echo "Please set ANDROID_NDK_PATH to your Android NDK directory."
exit 1
fi
fi
ROOT_PWD=$(cd "$(dirname $0)" && pwd)
BUILD_DIR=${ROOT_PWD}/build/android
echo "Building for Android..."
echo "NDK_PATH: ${ANDROID_NDK_PATH}"
echo "TARGET_ABI: ${TARGET_ABI}"
echo "BUILD_DIR: ${BUILD_DIR}"
mkdir -p ${BUILD_DIR}
cd ${BUILD_DIR}
cmake ../../src \
-DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_PATH}/build/cmake/android.toolchain.cmake \
-DANDROID_ABI=${TARGET_ABI} \
-DANDROID_PLATFORM=android-24 \
-DCMAKE_BUILD_TYPE=Release \
-DOpenCV_DIR=${ROOT_PWD}/../../../dependency/opencv/opencv-android-sdk-build/sdk/native/jni/abi-${TARGET_ABI}
make -j4
echo "Build complete. Executable in ${BUILD_DIR}/blazepose_landmark_demo"

View file

@ -0,0 +1,168 @@
#!/bin/bash
set -e
#
# Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
usage() {
echo "Usage: $0 [-m <mode>] [-a <target_arch>] [-b <arch_bits>] [-s <yocto_sdk_root>] [-t <toolchain_file>]"
echo " -m <mode> : Build mode: 'linux' or 'yocto' (default: linux)"
echo " -a <target> : Target arch for linux mode (default: aarch64)"
echo " -b <arch_bits> : Arch bits for yocto mode: 32 or 64 (default: 64)"
echo " -s <sdk_root> : Yocto SDK root path (overrides YOCTO_SDK_ROOT env var)"
echo " -t <toolchain> : CMake toolchain file (overrides TOOLCHAIN_FILE env var)"
echo " -h : Show this help message"
exit 1
}
# Default values
BUILD_MODE=linux
TARGET_ARCH=aarch64
ARCH_BITS=64
CLI_SDK_ROOT=""
CLI_TOOLCHAIN_FILE=""
# Parse arguments
while getopts 'm:a:b:s:t:h' opt; do
case "$opt" in
m)
BUILD_MODE=$OPTARG
;;
a)
TARGET_ARCH=$OPTARG
;;
b)
ARCH_BITS=$OPTARG
;;
s)
CLI_SDK_ROOT=$OPTARG
;;
t)
CLI_TOOLCHAIN_FILE=$OPTARG
;;
h)
usage
;;
*)
usage
;;
esac
done
ROOT_PWD=$(cd "$(dirname $0)" && pwd)
# ===========================================================================
# Yocto build
# ===========================================================================
if [[ "${BUILD_MODE}" == "yocto" ]]; then
if [[ "${ARCH_BITS}" != "32" && "${ARCH_BITS}" != "64" ]]; then
echo "Unsupported ARCH_BITS \"${ARCH_BITS}\". Must be 32 or 64." >&2
exit 1
fi
# Configurable via environment variables (CLI args > env vars > defaults)
CMAKE_BIN="${CMAKE_BIN:-cmake}"
YOCTO_SDK_ROOT="${CLI_SDK_ROOT:-${YOCTO_SDK_ROOT:-/data/yuandian/tools/poky/4.0.20}}"
TOOLCHAIN_FILE="${CLI_TOOLCHAIN_FILE:-${TOOLCHAIN_FILE:-${ROOT_PWD}/../../cmake/yocto-toolchain.cmake}}"
# Export variables for CMake
export YOCTO_SDK_ROOT
export ARCH_BITS
BUILD_DIR="${ROOT_PWD}/build/yocto/${ARCH_BITS}"
echo "==> Building Yocto ${ARCH_BITS}-bit"
echo " toolchain : ${TOOLCHAIN_FILE}"
echo " SDK root : ${YOCTO_SDK_ROOT}"
echo " BUILD_DIR : ${BUILD_DIR}"
# Start from a clean build directory
rm -rf "${BUILD_DIR}"
mkdir -p "${BUILD_DIR}"
# Select OpenCV based on target architecture
if [[ "${ARCH_BITS}" == "32" ]]; then
OPENCV_DIR="${ROOT_PWD}/../../../dependency/opencv/opencv-linux-armhf/share/OpenCV"
else
OPENCV_DIR="${ROOT_PWD}/../../../dependency/opencv/opencv-linux-aarch64/share/OpenCV"
fi
"${CMAKE_BIN}" \
-S "${ROOT_PWD}/src" \
-B "${BUILD_DIR}" \
-DCMAKE_TOOLCHAIN_FILE="${TOOLCHAIN_FILE}" \
-DYOCTO_SDK_ROOT="${YOCTO_SDK_ROOT}" \
-DARCH_BITS="${ARCH_BITS}" \
-DCMAKE_BUILD_TYPE=Release \
-DOpenCV_DIR="${OPENCV_DIR}"
"${CMAKE_BIN}" --build "${BUILD_DIR}" --config Release
# Strip (best-effort)
HOST_SYSROOT="${YOCTO_SDK_ROOT}/sysroots/x86_64-pokysdk-linux"
if [[ "${ARCH_BITS}" == "32" ]]; then
CROSS_TRIPLE="arm-poky-linux-gnueabi"
else
CROSS_TRIPLE="aarch64-poky-linux"
fi
STRIP_TOOL="${HOST_SYSROOT}/usr/bin/${CROSS_TRIPLE}/${CROSS_TRIPLE}-strip"
if [[ -x "${STRIP_TOOL}" ]]; then
"${STRIP_TOOL}" --strip-unneeded "${BUILD_DIR}/blazepose_landmark_demo"
else
echo "warning: strip tool not found; keeping debug info." >&2
fi
echo "Build complete. Executable in ${BUILD_DIR}/blazepose_landmark_demo"
exit 0
fi
# ===========================================================================
# Standard Linux cross-compile build
# ===========================================================================
# Default to aarch64-linux-gnu if GCC_COMPILER is not set
GCC_COMPILER=${GCC_COMPILER:-aarch64-linux-gnu}
# Set compilers
export CC=${GCC_COMPILER}-gcc
export CXX=${GCC_COMPILER}-g++
# Validate compiler
if ! command -v ${CC} &> /dev/null; then
echo "Error: Compiler ${CC} not found."
echo "Please set GCC_COMPILER environment variable to your cross-compiler path prefix."
echo "Example: export GCC_COMPILER=/path/to/toolchain/bin/aarch64-linux-gnu"
exit 1
fi
BUILD_DIR=${ROOT_PWD}/build/linux
echo "Building for Linux..."
echo "COMPILER: ${CC}"
echo "TARGET_ARCH: ${TARGET_ARCH}"
echo "BUILD_DIR: ${BUILD_DIR}"
mkdir -p ${BUILD_DIR}
cd ${BUILD_DIR}
cmake ../../src \
-DCMAKE_SYSTEM_NAME=Linux \
-DCMAKE_SYSTEM_PROCESSOR=${TARGET_ARCH} \
-DCMAKE_BUILD_TYPE=Release
make -j4
echo "Build complete. Executable in ${BUILD_DIR}/blazepose_landmark_demo"

View file

@ -0,0 +1,36 @@
cmake_minimum_required(VERSION 3.10...3.27)
project(blazepose_landmark_demo)
set(CMAKE_CXX_STANDARD 17)
list(APPEND CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/../../../../cmake")
find_package(AMLNN REQUIRED)
include_directories(${AMLNN_INCLUDE_DIR})
link_directories(${AMLNN_LIBRARY_DIR})
include_directories(${CMAKE_SOURCE_DIR}/../../../../common)
# Set 3rdparty path
set(3RDPARTY_DIR "${CMAKE_SOURCE_DIR}/../../../../dependency")
if(CMAKE_SYSTEM_NAME STREQUAL "Android")
# Android needs log
link_libraries(log)
endif()
# Find OpenCV
message(STATUS "OpenCV_DIR: ${OpenCV_DIR}")
find_package(OpenCV REQUIRED)
include_directories(${OpenCV_INCLUDE_DIRS})
add_executable(blazepose_landmark_demo
main.cpp
postprocess.cpp
postprocess.h
${CMAKE_SOURCE_DIR}/../../../../common/model_loader.cpp
)
target_link_libraries(blazepose_landmark_demo
${OpenCV_LIBS}
${AMLNN_LIBRARY}
)

View file

@ -0,0 +1,141 @@
/*
* Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <iostream>
#include <string>
#include <vector>
#include <chrono>
#include <tuple>
#include <iomanip>
#include <fstream>
#include <sstream> // std::istringstream
#include <cstdio>  // printf
#include <cstring> // memset
#include <opencv2/opencv.hpp>
#include "postprocess.h"
#include "model_loader.h"
const std::string DEFAULT_OUTPUT_PATH = "./result.jpg";
const int MODEL_INPUT_WIDTH = 256;
const int MODEL_INPUT_HEIGHT = 256;
const float SCORE_THRESHOLD = 0.5f;
int main(int argc, char **argv)
{
if (argc != 3)
{
printf("Usage: %s <model_path> <image_path>\n", argv[0]);
return -1;
}
std::string model_path = argv[1];
std::string image_path = argv[2];
std::cout << "Blazepose Landmark Demo" << std::endl;
std::cout << "Model: " << model_path << std::endl;
std::cout << "Image: " << image_path << std::endl;
std::cout << "Output: " << DEFAULT_OUTPUT_PATH << std::endl;
// 1. Load Image
cv::Mat img = cv::imread(image_path);
if (img.empty())
{
std::cerr << "Failed to load image from " << image_path << std::endl;
return -1;
}
// Load detections
// n * 13 detections
// image_path -> txt_path
std::vector<std::vector<float>> detections;
std::string txt_path = image_path.substr(0, image_path.find_last_of('.'));
txt_path += ".txt";
std::ifstream ifs(txt_path);
for (std::string line; std::getline(ifs, line);)
{
std::istringstream iss(line);
std::vector<float> det;
float val;
while (iss >> val)
det.push_back(val);
if (!det.empty())
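detections.push_back(det);
}
// preprocess() uses the first detection, so stop early if none were loaded
if (detections.empty())
{
std::cerr << "No detections found in " << txt_path << std::endl;
return -1;
}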
// 2. Initialize Network
void *context = init_network(model_path.c_str());
if (!context)
{
std::cerr << "Failed to initialize network." << std::endl;
return -1;
}
// 3. Preprocess
auto start_time = std::chrono::high_resolution_clock::now();
auto [preprocessed, affine] = preprocess(img, detections, std::make_tuple(MODEL_INPUT_HEIGHT, MODEL_INPUT_WIDTH));
// Quantize to int16 (model expects quantized input)
cv::Mat quantized_img = quantize_input(preprocessed, 0.000030518509447574615f);
// 4. Set input and run inference
nn_input inData;
memset(&inData, 0, sizeof(nn_input));
inData.input_type = BINARY_RAW_DATA;
inData.input = quantized_img.data;
inData.input_index = 0;
inData.size = quantized_img.total() * quantized_img.elemSize();
if (aml_module_input_set(context, &inData) != 0)
{
std::cerr << "Failed to set input." << std::endl;
uninit_network(context);
return -1;
}
aml_output_config_t outconfig;
memset(&outconfig, 0, sizeof(aml_output_config_t));
outconfig.typeSize = sizeof(aml_output_config_t);
outconfig.format = AML_OUTDATA_FLOAT32;
nn_output *outdata = (nn_output *)aml_module_output_get(context, outconfig);
if (!outdata)
{
std::cerr << "Failed to run network." << std::endl;
uninit_network(context);
return -1;
}
// 5. Postprocess
std::vector<BlazePoseLandmark> landmarks = postprocess(outdata, affine);
auto end_time = std::chrono::high_resolution_clock::now();
std::chrono::duration<double, std::milli> inference_time = end_time - start_time;
std::cout << "Inference time: " << inference_time.count() << " ms" << std::endl;
std::cout << "Landmarks: " << landmarks.size() << std::endl;
// 6. Draw and Save
cv::Mat result_img = draw_landmarks(img, landmarks, SCORE_THRESHOLD);
cv::imwrite(DEFAULT_OUTPUT_PATH, result_img);
std::cout << "Result saved to " << DEFAULT_OUTPUT_PATH << std::endl;
// 7. Cleanup
uninit_network(context);
return 0;
}

View file

@ -0,0 +1,337 @@
/*
* Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "postprocess.h"
#include <cstdio> // printf, fprintf used by LOGI/LOGE
#include <iostream>
#include <cmath>
#include <algorithm>
#include <unordered_map>
#define LOGI(...) \
do \
{ \
printf(__VA_ARGS__); \
printf("\n"); \
} while (0)
#define LOGE(...) \
do \
{ \
fprintf(stderr, __VA_ARGS__); \
fprintf(stderr, "\n"); \
} while (0)
// SHOW class names (1 class)
const char *SHOW_CLASSES[1] = {"lm"};
inline float sigmoid(float x)
{
return 1.0f / (1.0f + std::exp(-x));
}
struct ROI
{
float x_center;
float y_center;
float scale;
float theta;
};
ROI detection_to_roi(const std::vector<float> &detection, int kp1 = 0, int kp2 = 1)
{
float theta0 = 90.f * M_PI / 180.f;
float dscale = 1.1f; // 1.0 * 256 / 224; // 1.1f;
float dy = 0.f;
float x_center = detection[4 + 2 * kp1];
float y_center = detection[4 + 2 * kp1 + 1];
float x1 = detection[4 + 2 * kp2];
float y1 = detection[4 + 2 * kp2 + 1];
float roi_scale = std::sqrt((x_center - x1) * (x_center - x1) + (y_center - y1) * (y_center - y1)) * 2.f;
y_center += dy * roi_scale;
roi_scale *= dscale;
float theta = std::atan2(detection[4 + 2 * kp1 + 1] - detection[4 + 2 * kp2 + 1], detection[4 + 2 * kp1] - detection[4 + 2 * kp2]) - theta0;
return {x_center, y_center, roi_scale, theta};
}
cv::Mat extract_roi(cv::Mat &frame, const ROI &roi, int resolution, cv::Mat &affine)
{
cv::Point2f src_pts[3];
src_pts[0] = cv::Point2f(-roi.scale / 2.f, -roi.scale / 2.f); // will map to (0,0)
src_pts[1] = cv::Point2f(-roi.scale / 2.f, roi.scale / 2.f); // will map to (0,res-1)
src_pts[2] = cv::Point2f(roi.scale / 2.f, -roi.scale / 2.f); // will map to (res-1,0)
float cos_theta = std::cos(roi.theta);
float sin_theta = std::sin(roi.theta);
for (int i = 0; i < 3; i++)
{
float x = src_pts[i].x;
float y = src_pts[i].y;
src_pts[i].x = roi.x_center + x * cos_theta - y * sin_theta;
src_pts[i].y = roi.y_center + x * sin_theta + y * cos_theta;
}
cv::Point2f dst_pts[3] = {
cv::Point2f(0.f, 0.f),
cv::Point2f(0.f, resolution - 1.f),
cv::Point2f(resolution - 1.f, 0.f)};
cv::Mat roi_img;
cv::Mat M = cv::getAffineTransform(src_pts, dst_pts);
cv::invertAffineTransform(M, affine);
cv::warpAffine(frame, roi_img, M, cv::Size(resolution, resolution));
return roi_img;
}
std::tuple<cv::Mat, cv::Mat> preprocess(cv::Mat img, std::vector<std::vector<float>> &detections, std::tuple<int, int> new_shape)
{
cv::Mat img_rgb;
if (img.empty())
{
LOGE("Preprocess received empty image");
return {};
}
// Convert to RGB
if (img.channels() == 4)
cv::cvtColor(img, img_rgb, cv::COLOR_RGBA2RGB);
else if (img.channels() == 3)
cv::cvtColor(img, img_rgb, cv::COLOR_BGR2RGB);
else
img_rgb = img.clone();
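// Each detection needs 4 box values plus two (x, y) keypoints
if (detections.empty() || detections[0].size() < 8)
{
LOGE("Preprocess received no valid detection");
return {};
}
ROI roi = detection_to_roi(detections[0]); // use the first detection only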
cv::Mat affine;
cv::Mat roi_img = extract_roi(img_rgb, roi, IMAGE_SIZE, affine);
cv::Mat img_float;
roi_img.convertTo(img_float, CV_32F, 1.0 / 255.0);
return std::make_tuple(img_float, affine);
}
cv::Mat quantize_input(const cv::Mat &float_img, float scale, int16_t zero_point)
{
if (float_img.empty() || float_img.type() != CV_32FC3)
{
LOGE("quantize_input: Invalid input image (must be CV_32FC3)");
return cv::Mat();
}
cv::Mat quantized_img(float_img.rows, float_img.cols, CV_16SC3);
// Work on contiguous memory so the flat pointer walk below is valid
const cv::Mat src = float_img.isContinuous() ? float_img : float_img.clone();
const float *src_ptr = src.ptr<float>();
int16_t *dst_ptr = quantized_img.ptr<int16_t>();
const size_t total_elements = src.total() * src.channels();
for (size_t i = 0; i < total_elements; ++i)
{
// q = round(x / scale) + zero_point, saturated to the int16 range
int32_t q = static_cast<int32_t>(std::round(src_ptr[i] / scale)) + zero_point;
q = std::max<int32_t>(-32768, std::min<int32_t>(32767, q));
dst_ptr[i] = static_cast<int16_t>(q);
}
return quantized_img;
}
void blazepose_postprocess(const float *landmarks, float *normalized_landmarks)
{
if (!landmarks || !normalized_landmarks)
return;
for (int j = 0; j < NUM_LANDMARKS; j++)
{
float x = landmarks[j * LANDMARK_FEATURE_DIM + 0] / IMAGE_SIZE;
float y = landmarks[j * LANDMARK_FEATURE_DIM + 1] / IMAGE_SIZE;
float z = landmarks[j * LANDMARK_FEATURE_DIM + 2] / IMAGE_SIZE;
float visibility = landmarks[j * LANDMARK_FEATURE_DIM + 3];
float presence = landmarks[j * LANDMARK_FEATURE_DIM + 4];
float score = sigmoid(fminf(visibility, presence));
normalized_landmarks[j * LANDMARK_OUT_DIM + 0] = x;
normalized_landmarks[j * LANDMARK_OUT_DIM + 1] = y;
normalized_landmarks[j * LANDMARK_OUT_DIM + 2] = z;
normalized_landmarks[j * LANDMARK_OUT_DIM + 3] = score;
}
}
/**
* Denormalize landmarks: map normalized coordinates back to original image using affine
* @param landmarks Input/Output: [NUM_LANDMARKS * LANDMARK_OUT_DIM], first three dimensions are x, y, z
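* @param affine Input: [2 x 3] affine matrix (CV_64F, as produced by cv::invertAffineTransform)
*/
void blazepose_denorm_landmarks(float *landmarks, const cv::Mat &affine)
{
// The matrix is read through a double pointer below, so reject any other element type
if (!landmarks || affine.empty() || affine.rows != 2 || affine.cols != 3 || affine.type() != CV_64F)
{
return;
}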
const double *a = affine.ptr<double>();
double a00 = a[0], a01 = a[1], a02 = a[2];
double a10 = a[3], a11 = a[4], a12 = a[5];
for (int j = 0; j < NUM_LANDMARKS; j++)
{
float *p = landmarks + j * LANDMARK_OUT_DIM;
// scale to input resolution
float x = p[0] * IMAGE_SIZE;
float y = p[1] * IMAGE_SIZE;
float z = p[2] * IMAGE_SIZE;
// apply affine transform
float new_x = a00 * x + a01 * y + a02;
float new_y = a10 * x + a11 * y + a12;
p[0] = new_x;
p[1] = new_y;
p[2] = z;
}
}
std::vector<BlazePoseLandmark> postprocess(nn_output *outdata, const cv::Mat &affine)
{
std::vector<BlazePoseLandmark> pose_res;
if (!outdata)
{
LOGE("postprocess: output data is null");
return pose_res;
}
// keep all outputs, even if unused
float *world_landmarks = (float *)outdata->out[0].buf;
float *heatmap = (float *)outdata->out[1].buf;
float *flags = (float *)outdata->out[2].buf;
float *landmarks = (float *)outdata->out[4].buf;
(void)world_landmarks;
(void)heatmap;
(void)flags;
if (!landmarks)
{
LOGE("postprocess: landmark output buffer is null");
return pose_res;
}
// std::vector owns the scratch buffer, so it is released on every return path
std::vector<float> normalized_landmarks(NUM_LANDMARKS * LANDMARK_OUT_DIM, 0.f);
blazepose_postprocess(landmarks, normalized_landmarks.data());
// refine_landmark_from_heatmap(normalized_landmarks, 39, heatmap, 64, 64);
blazepose_denorm_landmarks(normalized_landmarks.data(), affine);
pose_res.reserve(1);
BlazePoseLandmark pose;
pose.landmarks.resize(NUM_LANDMARKS);
for (int i = 0; i < NUM_LANDMARKS; ++i)
{
int base = i * LANDMARK_OUT_DIM;
double x = normalized_landmarks[base + 0]; // x
double y = normalized_landmarks[base + 1]; // y
double z = normalized_landmarks[base + 2]; // z
double score = normalized_landmarks[base + 3]; // score
pose.landmarks[i] = {x, y, z, score};
}
pose_res.push_back(pose);
return pose_res;
}
static const std::vector<std::pair<int, int>> POSE_CONNECTIONS = {
// Face
{0, 1},
{1, 2},
{2, 3},
{3, 7},
{0, 4},
{4, 5},
{5, 6},
{6, 8},
// Mouth
{9, 10},
// Shoulders
{11, 12},
// Right arm
{11, 13},
{13, 15},
{15, 17},
{15, 19},
{15, 21},
{17, 19},
// Left arm
{12, 14},
{14, 16},
{16, 18},
{16, 20},
{16, 22},
{18, 20},
// Torso
{11, 23},
{12, 24},
{23, 24},
// Right leg
{23, 25},
{25, 27},
{27, 29},
{27, 31},
{29, 31},
// Left leg
{24, 26},
{26, 28},
{28, 30},
{28, 32},
{30, 32}};
cv::Mat draw_landmarks(cv::Mat image, const std::vector<BlazePoseLandmark> &landmarks, float score_threshold)
{
cv::Mat out = image.clone();
for (const auto &lm : landmarks)
{
const auto &lms = lm.landmarks;
for (size_t i = 0; i < lms.size(); ++i)
{
int x = static_cast<int>(lms[i][0]);
int y = static_cast<int>(lms[i][1]);
double v = lms[i][3];
if (v < score_threshold)
continue;
cv::circle(out, cv::Point(x, y), 3, cv::Scalar(0, 255, 0), -1);
}
for (const auto &conn : POSE_CONNECTIONS)
{
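// Use size_t indices to avoid a signed/unsigned comparison with lms.size()
const size_t i0 = static_cast<size_t>(conn.first);
const size_t i1 = static_cast<size_t>(conn.second);
if (i0 >= lms.size() || i1 >= lms.size())
continue;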
if (lms[i0][3] < score_threshold || lms[i1][3] < score_threshold)
continue;
cv::Point p0(static_cast<int>(lms[i0][0]), static_cast<int>(lms[i0][1]));
cv::Point p1(static_cast<int>(lms[i1][0]), static_cast<int>(lms[i1][1]));
cv::line(out, p0, p1, cv::Scalar(255, 0, 0), 2);
}
}
return out;
}
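
The ROI computation in `detection_to_roi` above can be mirrored in a few lines of Python for quick checks away from the device. This is a minimal sketch, assuming the same detection layout as the C++ code (four box values followed by `(x, y)` keypoint pairs); the sample detection values are made up for illustration.

```python
import math


def detection_to_roi(detection, kp1=0, kp2=1, dscale=1.1, theta0=math.pi / 2):
    """Return (x_center, y_center, scale, theta) for one detection.

    Keypoints start at index 4 and are stored as consecutive (x, y) pairs,
    matching the indexing used by the C++ helper.
    """
    xc, yc = detection[4 + 2 * kp1], detection[4 + 2 * kp1 + 1]
    x1, y1 = detection[4 + 2 * kp2], detection[4 + 2 * kp2 + 1]
    # ROI side length: twice the keypoint distance, enlarged by dscale
    scale = 2.0 * math.hypot(xc - x1, yc - y1) * dscale
    # Rotation of the kp2 -> kp1 direction relative to the reference angle theta0
    theta = math.atan2(yc - y1, xc - x1) - theta0
    return xc, yc, scale, theta


# Hypothetical detection: 4 box values, then keypoint 0 and keypoint 1
roi = detection_to_roi([0.0, 0.0, 0.0, 0.0, 100.0, 200.0, 100.0, 100.0])
print(roi)
```

With keypoint 0 placed directly below keypoint 1 in image coordinates, the angle cancels against `theta0`, so the crop is not rotated.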

View file

@ -0,0 +1,50 @@
/*
* Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef _AMLNN_BLAZEPOSE_LANDMARK_POSTPROCESS_H_
#define _AMLNN_BLAZEPOSE_LANDMARK_POSTPROCESS_H_
#include <opencv2/opencv.hpp>
#include <vector>
#include <tuple>
#include <string>
#include "model_loader.h"
#define NUM_LANDMARKS 33
#define LANDMARK_OUT_DIM 4
#define LANDMARK_FEATURE_DIM 5
#define IMAGE_SIZE 256
// BlazePoseLandmark result structure
struct BlazePoseLandmark
{
std::vector<std::vector<double>> landmarks; // [N][x,y,z,v]
};
// Crop the rotated pose ROI given by the first detection and scale pixels to [0, 1];
// returns the float ROI image and the affine matrix that maps ROI pixels back to the source image
std::tuple<cv::Mat, cv::Mat> preprocess(cv::Mat img, std::vector<std::vector<float>> &detections, std::tuple<int, int> new_shape);
// Quantize float32 image to int16 for model input
cv::Mat quantize_input(const cv::Mat &float_img, float scale = 0.000030518509447574615f, int16_t zero_point = 0);
// Postprocess blazepose_landmark outputs: normalize landmarks and map them back to the source image
std::vector<BlazePoseLandmark> postprocess(nn_output *outdata, const cv::Mat &affine);
// Draw landmarks and skeleton connections on image
cv::Mat draw_landmarks(cv::Mat image, const std::vector<BlazePoseLandmark> &landmarks, float score_threshold = 0.5);
#endif // _AMLNN_BLAZEPOSE_LANDMARK_POSTPROCESS_H_
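
The default `scale` declared for `quantize_input` is approximately 1/32767, so a float input in [0, 1] spans the positive int16 range. Below is a minimal pure-Python sketch of the same quantize-and-saturate step; the function name `quantize_int16` is illustrative and not part of the repository. One assumed difference to keep in mind: Python's `round` rounds exact halves to even, while `std::round` rounds them away from zero.

```python
def quantize_int16(values, scale=0.000030518509447574615, zero_point=0):
    """Quantize floats to int16: q = round(x / scale) + zero_point, saturated."""
    out = []
    for v in values:
        q = round(v / scale) + zero_point
        # Saturate instead of wrapping around on overflow
        out.append(max(-32768, min(32767, q)))
    return out


print(quantize_int16([0.0, 1.0, 2.0, -2.0]))
```

Inputs outside the representable range are clamped to the int16 limits rather than wrapping, which is the behaviour the C++ loop also relies on.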

0
examples/blazepose_landmark/model/.gitkeep Normal file → Executable file
View file

View file

@ -0,0 +1,40 @@
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# 1. $1: path to the TFLite model
# 2. $2: set ADLA_TOOL_PATH
# 3. $3: set target-platform
# for A311D2 target-platform is PRODUCT_PID0XA003
# for S905X5 target-platform is PRODUCT_PID0XA005
# Usage: ./adla_convert.sh pose_landmark_full.tflite /XXX/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
if [ "$#" -ne 3 ]; then
echo "Usage: $0 <model.tflite> <ADLA_TOOL_PATH> <target-platform>"
exit 1
fi
model_path=$1
ADLA_TOOL_PATH=$2
target_platform=$3
echo "model_path:[$model_path]"
echo "ADLA_TOOL_PATH:[$ADLA_TOOL_PATH]"
echo "target-platform:[$target_platform]"
# Quote paths so directories containing spaces are handled correctly
adla_convert="${ADLA_TOOL_PATH}/bin/adla_convert"
"$adla_convert" --model-type tflite \
--model "$model_path" \
--inputs input_1 --input-shapes "256,256,3" \
--quantize-dtype int16 \
--source-file dataset_coco.txt \
--channel-mean-value "0,0,0,255" \
--target-platform "$target_platform" \
--disable-per-channel false

View file

@ -0,0 +1,50 @@
../../../resource/coco_dataset/000000000139.jpg
../../../resource/coco_dataset/000000000285.jpg
../../../resource/coco_dataset/000000000632.jpg
../../../resource/coco_dataset/000000000724.jpg
../../../resource/coco_dataset/000000000776.jpg
../../../resource/coco_dataset/000000000785.jpg
../../../resource/coco_dataset/000000000802.jpg
../../../resource/coco_dataset/000000000872.jpg
../../../resource/coco_dataset/000000000885.jpg
../../../resource/coco_dataset/000000001000.jpg
../../../resource/coco_dataset/000000001268.jpg
../../../resource/coco_dataset/000000001296.jpg
../../../resource/coco_dataset/000000001353.jpg
../../../resource/coco_dataset/000000001425.jpg
../../../resource/coco_dataset/000000001490.jpg
../../../resource/coco_dataset/000000001503.jpg
../../../resource/coco_dataset/000000001532.jpg
../../../resource/coco_dataset/000000001584.jpg
../../../resource/coco_dataset/000000001675.jpg
../../../resource/coco_dataset/000000001761.jpg
../../../resource/coco_dataset/000000001818.jpg
../../../resource/coco_dataset/000000001993.jpg
../../../resource/coco_dataset/000000002006.jpg
../../../resource/coco_dataset/000000002149.jpg
../../../resource/coco_dataset/000000002153.jpg
../../../resource/coco_dataset/000000002157.jpg
../../../resource/coco_dataset/000000002261.jpg
../../../resource/coco_dataset/000000002299.jpg
../../../resource/coco_dataset/000000002431.jpg
../../../resource/coco_dataset/000000002473.jpg
../../../resource/coco_dataset/000000002532.jpg
../../../resource/coco_dataset/000000002587.jpg
../../../resource/coco_dataset/000000002592.jpg
../../../resource/coco_dataset/000000002685.jpg
../../../resource/coco_dataset/000000002923.jpg
../../../resource/coco_dataset/000000003156.jpg
../../../resource/coco_dataset/000000003255.jpg
../../../resource/coco_dataset/000000003501.jpg
../../../resource/coco_dataset/000000003553.jpg
../../../resource/coco_dataset/000000003661.jpg
../../../resource/coco_dataset/000000003845.jpg
../../../resource/coco_dataset/000000003934.jpg
../../../resource/coco_dataset/000000004134.jpg
../../../resource/coco_dataset/000000004395.jpg
../../../resource/coco_dataset/000000004495.jpg
../../../resource/coco_dataset/000000004765.jpg
../../../resource/coco_dataset/000000004795.jpg
../../../resource/coco_dataset/000000005001.jpg
../../../resource/coco_dataset/000000005037.jpg
../../../resource/coco_dataset/000000005060.jpg

Binary file not shown.


0
examples/blazepose_landmark/py/.gitkeep Normal file → Executable file
View file

View file

@ -0,0 +1,301 @@
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import numpy as np
import os
import glob
import argparse
import cv2
from pathlib import Path
from amlnnlite.api import AMLNNLite
import math
def preprocess(img_path, detections, new_shape=(256, 256), data_format='NCHW', s=0.003921568859368563, zp=-128):
    original_img = cv2.imread(str(img_path))
    if original_img is None:
        raise ValueError(f"can't read image: {img_path}")

    if detections.ndim != 2 or detections.shape[0] == 0:
        raise ValueError("No detections input, please run blazepose_detect and generate the detections first.")
    # Only the first detection is used: [box (4 values), kp0_x, kp0_y, kp1_x, kp1_y, ...]
    detection = detections[0]
    x_center, y_center = (float(v) for v in detection[4:6])
    x_scale, y_scale = (float(v) for v in detection[6:8])
    print(f"---------center {x_center}, {y_center}, x_scale {x_scale}, y_scale {y_scale}")

    # Square ROI: twice the keypoint distance, enlarged by 25%
    box_size = (((x_scale - x_center) ** 2 + (y_scale - y_center) ** 2) ** 0.5) * 2
    box_size *= 1.25

    # Rotation that brings the keypoint axis upright, wrapped to [-pi, pi)
    angle = (np.pi * 90 / 180) - math.atan2(-(y_scale - y_center), x_scale - x_center)
    rotation = angle - 2 * np.pi * np.floor((angle - (-np.pi)) / (2 * np.pi))
    rotated_rect = ((x_center, y_center), (box_size, box_size), float(rotation * 180. / np.pi))
    pts1 = cv2.boxPoints(rotated_rect)

    h, w = new_shape
    pts2 = np.float32([[0, h], [0, 0], [w, 0], [w, h]])
    M = cv2.getPerspectiveTransform(pts1, pts2)
    processed_img = cv2.warpPerspective(original_img, M, (w, h), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REPLICATE)

    rgb_img = cv2.cvtColor(processed_img, cv2.COLOR_BGR2RGB)
    normalized_img = rgb_img.astype(np.float32) / 255.0

    if data_format == 'NCHW':
        # HWC -> CHW -> BCHW (ONNX default format)
        input_tensor = np.transpose(normalized_img, (2, 0, 1))
        input_tensor = np.expand_dims(input_tensor, axis=0)
    elif data_format == 'NHWC':
        # HWC -> BHWC (TFLITE default format)
        input_tensor = np.expand_dims(normalized_img, axis=0)
    else:
        raise ValueError(f"Unsupported data format: {data_format}. Only 'NCHW' and 'NHWC' are supported.")

    # Quantize to int16
    input_tensor = np.round(input_tensor / s + zp).astype(np.int16)
    return input_tensor, original_img, [x_center, y_center, rotation, box_size]


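def tensor_to_landmark(landmarks):
    num_landmarks = 39
    # Each row is flattened as num_landmarks * num_dimensions (195 = 39 * 5, 117 = 39 * 3)
    num_dimensions = landmarks.shape[1] // num_landmarks
    output = landmarks.reshape(-1, num_landmarks, num_dimensions).copy()
    if num_dimensions > 3:
        # Visibility and presence are logits; convert them to probabilities
        output[..., 3:5] = 1.0 / (1.0 + np.exp(-output[..., 3:5]))
    return output

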
def refine_landmark(landmarks, heatmap):  # landmarks: 39*5, heatmap: 64*64*39
    min_confidence = 0.5
    kernel_size = 9
    # Half-width of the window, so the window covers kernel_size x kernel_size cells
    offset = (kernel_size - 1) // 2
    hm_h, hm_w, _ = heatmap.shape
    for i, lm in enumerate(landmarks):
        col = int(lm[0] * hm_w)
        row = int(lm[1] * hm_h)
        if not (0 <= col < hm_w and 0 <= row < hm_h):
            continue
        c0 = max(0, col - offset)
        c1 = min(hm_w, col + offset + 1)
        r0 = max(0, row - offset)
        r1 = min(hm_h, row + offset + 1)

        val_sum = 0.0
        weighted_col = 0.0
        weighted_row = 0.0
        max_conf = 0.0
        for r in range(r0, r1):
            for c in range(c0, c1):
                conf = 1.0 / (1.0 + np.exp(-heatmap[r, c, i]))
                val_sum += conf
                max_conf = max(max_conf, conf)
                weighted_col += c * conf
                weighted_row += r * conf
        # Replace x, y with the confidence-weighted centroid of the window
        if max_conf >= min_confidence and val_sum > 0:
            lm[0] = weighted_col / (hm_w * val_sum)
            lm[1] = weighted_row / (hm_h * val_sum)
    return landmarks


def postprocess(outputs, params, data_format='NCHW'):
    x_center, y_center, rotation, box_size = params
    flag = landmark_tensor = world_landmark_tensor = segment = heatmap_tensor = None

    # Identify each output tensor by its shape
    for out in outputs:
        if len(out.shape) == 2:
            if out.shape == (1, 1):
                flag = out
            elif out.shape[1] == 195:
                landmark_tensor = out
            elif out.shape[1] == 117:
                world_landmark_tensor = out
        elif len(out.shape) == 4 and out.shape[3] == 1 and out.shape[1] == 256:
            segment = out
        elif len(out.shape) == 4 and out.shape[1] == 64 and out.shape[3] == 39:
            heatmap_tensor = out

    if landmark_tensor is None:
        raise ValueError("Landmark output (shape [1, 195]) not found in model outputs.")

    raw_landmarks = tensor_to_landmark(landmark_tensor)

    # Normalize ROI pixel coordinates to [0, 1]
    h = w = 256
    raw_landmarks[:, :, 0] = raw_landmarks[:, :, 0] / w
    raw_landmarks[:, :, 1] = raw_landmarks[:, :, 1] / h
    raw_landmarks[:, :, 2] = raw_landmarks[:, :, 2] / w

    # Refines landmarks with the heatmap tensor.
    all_landmarks = raw_landmarks[0]
    if heatmap_tensor is not None:
        all_landmarks = refine_landmark(all_landmarks, heatmap_tensor[0])

    print(f"rotation {rotation}")
    cosa = math.cos(rotation)
    sina = math.sin(rotation)

    # Projects the landmarks from the rotated ROI back to the full image.
    for landmark in all_landmarks:
        x = landmark[0] - 0.5
        y = landmark[1] - 0.5
        landmark[0] = ((cosa * x - sina * y) * box_size + x_center)
        landmark[1] = ((sina * x + cosa * y) * box_size + y_center)
        landmark[2] = landmark[2] * box_size

    # Rotates the world landmarks by the ROI rotation (computed but not returned).
    if world_landmark_tensor is not None:
        all_world_landmarks = tensor_to_landmark(world_landmark_tensor)[0]
        for landmark in all_world_landmarks:
            x = landmark[0]
            y = landmark[1]
            landmark[0] = cosa * x - sina * y
            landmark[1] = sina * x + cosa * y

    # Array of shape [39, 5]: x, y, z, visibility, presence
    return all_landmarks


def get_class_color(class_id):
    import colorsys
    hue = (class_id * 137.508) % 360
    rgb = colorsys.hsv_to_rgb(hue/360.0, 0.8, 0.9)
    bgr = (int(rgb[2]*255), int(rgb[1]*255), int(rgb[0]*255))
    return bgr


POSE_CONNECTIONS = [
# Face
(0, 1),(1, 2),(2, 3),(3, 7),
(0, 4),(4, 5),(5, 6),(6, 8),
# Mouth
(9, 10),
# Shoulders
(11, 12),
# Right arm
(11, 13), (13, 15), (15, 17), (15, 19), (15, 21), (17, 19),
# Left arm
(12, 14), (14, 16), (16, 18), (16, 20), (16, 22), (18, 20),
# Torso
(11, 23), (12, 24), (23, 24),
# Right leg
(23, 25), (25, 27), (27, 29), (27, 31), (29, 31),
# Left leg
(24, 26), (26, 28), (28, 30), (28, 32), (30, 32)
]
def draw_landmarks(img, landmarks, save_path, score_threshold=0.5):
    # landmarks: array of shape [N, 5] (x, y, z, visibility, presence) in image pixels,
    # as returned by postprocess()
    result_img = img.copy()
    for point in landmarks:
        x, y, score = int(point[0]), int(point[1]), point[3]
        if score < score_threshold:
            continue
        cv2.circle(result_img, (x, y), 3, (0, 255, 0), -1)

    for i0, i1 in POSE_CONNECTIONS:
        if i0 >= len(landmarks) or i1 >= len(landmarks):
            continue
        if landmarks[i0][3] < score_threshold or landmarks[i1][3] < score_threshold:
            continue
        p0 = (int(landmarks[i0][0]), int(landmarks[i0][1]))
        p1 = (int(landmarks[i1][0]), int(landmarks[i1][1]))
        cv2.line(result_img, p0, p1, (255, 0, 0), 2)

    cv2.imwrite(save_path, result_img)
    return result_img


def read_detections_from_txt(txt_path):
    with open(txt_path, "r") as f:
        detections = [[float(x) for x in line.split()] for line in f if line.strip()]
    return np.array(detections, dtype=np.float32)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--model-path', default='./blazepose_landmark_int8_A311D2.adla')
    parser.add_argument('--run-cycles', default=1, type=int)
    args = parser.parse_args()

    # Initialize AMLNNLite
    amlnn = AMLNNLite()
    amlnn.config(
        model_path=args.model_path,  # Model file path; supports ADLA and quantized TFLite models
        run_cycles=args.run_cycles
    )
    amlnn.init()

    # Find all image files in the current directory
    image_dir = "./"
    image_extensions = ["*.jpg", "*.jpeg", "*.png", "*.bmp"]
    image_files = []
    for ext in image_extensions:
        image_files.extend(glob.glob(os.path.join(image_dir, ext)))
        image_files.extend(glob.glob(os.path.join(image_dir, ext.upper())))

    if not image_files:
        print("No image files found in", image_dir)
        amlnn.uninit()
        return

    print(f"Found {len(image_files)} image files to process:")
    for img_file in image_files:
        print(f" - {os.path.basename(img_file)}")
    print()

    # Process each image
    for i, image_path in enumerate(image_files, 1):
        print("=" * 60)
        print(f"Processing image {i}/{len(image_files)}: {os.path.basename(image_path)}")
        print("=" * 60)
        try:
            # Detections come from blazepose_detect, stored next to the image as <name>.txt
            txt_path = os.path.splitext(image_path)[0] + ".txt"
            detections = read_detections_from_txt(txt_path=txt_path)

            # Preprocess input
            input_tensor, original_img, params = preprocess(image_path, detections, new_shape=(256, 256), data_format='NHWC', s=0.000030518509447574615, zp=0)

            # Run inference
            outputs = amlnn.inference(inputs=[input_tensor])

            # Postprocess results
            landmarks = postprocess(outputs, params, data_format='NHWC')

            # Print landmark summary
            visible = int(np.sum(landmarks[:, 3] >= 0.5))
            print(f" Found {len(landmarks)} landmarks ({visible} with score >= 0.5)")

            # Save result image
            model_name = Path(args.model_path).stem
            result_dir = f"{model_name}_result"
            os.makedirs(result_dir, exist_ok=True)
            img_name = Path(image_path).stem
            save_path = os.path.join(result_dir, f"{img_name}_result.jpg")
            draw_landmarks(original_img, landmarks, str(save_path), score_threshold=0.5)
            print(f" Result saved to: {save_path}")
        except Exception as e:
            print(f"Error processing {os.path.basename(image_path)}: {e}")
        print()

    # Optional visualization
    amlnn.visualize()

    # Release resources
    amlnn.uninit()


if __name__ == "__main__":
    main()
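
The step in `postprocess` that maps ROI-normalized landmarks back to image pixels is a rotation about the ROI centre followed by a scale and a translation. The helper below is a minimal standalone sketch of that mapping; the name `roi_to_image` and the sample numbers are illustrative assumptions, not part of the script.

```python
import math


def roi_to_image(x, y, x_center, y_center, rotation, box_size):
    """Map a landmark from ROI-normalized [0, 1] coordinates to image pixels."""
    # Shift so the ROI centre is the origin
    dx, dy = x - 0.5, y - 0.5
    cos_a, sin_a = math.cos(rotation), math.sin(rotation)
    # Rotate by the ROI angle, scale by the ROI size, translate to the ROI centre
    img_x = (cos_a * dx - sin_a * dy) * box_size + x_center
    img_y = (sin_a * dx + cos_a * dy) * box_size + y_center
    return img_x, img_y


# The ROI centre always maps to (x_center, y_center), whatever the rotation
print(roi_to_image(0.5, 0.5, 320.0, 240.0, 0.3, 200.0))
# With no rotation, the right edge midpoint lands box_size / 2 to the right
print(roi_to_image(1.0, 0.5, 320.0, 240.0, 0.0, 200.0))
```

Checking the centre and one edge point like this is a cheap way to confirm the sign convention of the rotation before comparing against the C++ affine path.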

Binary file not shown.


0
examples/clip/cpp/.gitkeep Normal file → Executable file
View file

View file

@ -26,8 +26,8 @@
// Initialize network from file
void* init_network_file(const char *model_path);
// Run vision model inference
std::vector<float> run_vision_model(void* context, const std::vector<float>& input_data);
// Run image model inference
std::vector<float> run_image_model(void* context, const std::vector<float>& input_data);
// Run text model inference
std::vector<float> run_text_model(void* context, const std::vector<int64_t>& input_ids);

View file

@ -33,7 +33,7 @@ struct ProfilingTimer
{
uint64_t init_start, init_end;
uint64_t preprocess_start, preprocess_end;
uint64_t vision_infer_start, vision_infer_end;
uint64_t image_infer_start, image_infer_end;
uint64_t text_infer_start, text_infer_end;
};
@ -71,10 +71,10 @@ std::vector<std::string> parse_texts(const std::string& input)
void print_usage(const char* prog_name)
{
printf("Usage: %s <vision_model> <text_model> <tokenizer_dir> [--profiling]\n", prog_name);
printf("Usage: %s <image_model> <text_model> <tokenizer_dir> [--profiling]\n", prog_name);
printf("\n");
printf("Arguments:\n");
printf(" vision_model: Path to vision model (.adla)\n");
printf(" image_model: Path to image model (.adla)\n");
printf(" text_model: Path to text model (.adla)\n");
printf(" tokenizer_dir: Path to directory containing vocab.json and merges.txt\n");
printf(" --profiling: Enable performance profiling output (optional)\n");
@ -96,7 +96,7 @@ int main(int argc, char ** argv)
return -1;
}
const char* vision_model_path = argv[1];
const char* image_model_path = argv[1];
const char* text_model_path = argv[2];
const char* tokenizer_dir = argv[3];
@ -119,11 +119,11 @@ int main(int argc, char ** argv)
}
// Initialize models
printf("[Info] Initializing vision model: %s\n", vision_model_path);
printf("[Info] Initializing image model: %s\n", image_model_path);
timer.init_start = get_time_count();
void* vision_context = init_network_file(vision_model_path);
if (vision_context == NULL) {
printf("[Error] Failed to initialize vision model.\n");
void* image_context = init_network_file(image_model_path);
if (image_context == NULL) {
printf("[Error] Failed to initialize image model.\n");
return -1;
}
@ -131,7 +131,7 @@ int main(int argc, char ** argv)
void* text_context = init_network_file(text_model_path);
if (text_context == NULL) {
printf("[Error] Failed to initialize text model.\n");
destroy_network(vision_context);
destroy_network(image_context);
return -1;
}
timer.init_end = get_time_count();
@ -218,14 +218,14 @@ int main(int argc, char ** argv)
}
timer.preprocess_end = get_time_count();
// Run vision model
timer.vision_infer_start = get_time_count();
std::vector<float> image_embedding = run_vision_model(vision_context, image_input);
// Run image model
timer.image_infer_start = get_time_count();
std::vector<float> image_embedding = run_image_model(image_context, image_input);
if (image_embedding.empty()) {
printf("[Error] Vision model inference failed.\n");
printf("[Error] Image model inference failed.\n");
continue;
}
timer.vision_infer_end = get_time_count();
timer.image_infer_end = get_time_count();
// L2 normalize image embedding
image_embedding = l2_normalize(image_embedding);
@ -264,7 +264,8 @@ int main(int argc, char ** argv)
continue;
}
printf("[Info] Text embeddings size: %zu x %zu\n", text_embeddings.size(),
printf("[Info] Text embeddings size: %zu x %zu\n",
text_embeddings.size(),
text_embeddings.empty() ? 0 : text_embeddings[0].size());
// ==================== Compute Similarity ====================
@ -302,11 +303,11 @@ int main(int argc, char ** argv)
if (profiling) {
uint64_t preprocess_time = (timer.preprocess_end - timer.preprocess_start) / 1000000;
uint64_t vision_time = (timer.vision_infer_end - timer.vision_infer_start) / 1000000;
uint64_t image_time = (timer.image_infer_end - timer.image_infer_start) / 1000000;
uint64_t text_total_time = (timer.text_infer_end - timer.text_infer_start) / 1000000;
printf("\n[Profiling]\n");
printf(" Image preprocess: %lums\n", preprocess_time);
printf(" Vision inference: %lums\n", vision_time);
printf(" Image inference: %lums\n", image_time);
for (size_t i = 0; i < texts.size() && i < text_infer_times.size(); ++i) {
printf(" Text inference[%zu]: %lums '%s'\n", i, text_infer_times[i], texts[i].c_str());
}
@ -316,9 +317,9 @@ int main(int argc, char ** argv)
}
// Cleanup
ret = destroy_network(vision_context);
ret = destroy_network(image_context);
if (ret != 0) {
printf("[Error] Failed to destroy vision model.\n");
printf("[Error] Failed to destroy image model.\n");
}
ret = destroy_network(text_context);
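
The demo L2-normalizes the image and text embeddings before computing their similarity, which makes the dot product equal to the cosine similarity. Below is a minimal pure-Python sketch of that idea; the helper names here are illustrative and are not the functions used in `main.cpp`, whose bodies are not shown in this diff.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Dot product of the two L2-normalized vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


image_embedding = [1.0, 2.0, 2.0]    # made-up values for illustration
text_embeddings = [[2.0, 4.0, 4.0],  # same direction as the image embedding
                   [2.0, -1.0, 0.0]] # orthogonal to the image embedding
scores = [cosine_similarity(image_embedding, t) for t in text_embeddings]
print(scores)
```

Normalizing once per embedding and reusing the result avoids recomputing norms when one image is scored against many text prompts.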

View file

@ -27,9 +27,9 @@
#include "nn_sdk.h"
// Global DMA config for models
static aml_memory_config_t vision_mem_config;
static aml_memory_data_t vision_mem_data;
static void* vision_context_flag = nullptr;
static aml_memory_config_t image_mem_config;
static aml_memory_data_t image_mem_data;
static void* image_context_flag = nullptr;
static aml_memory_config_t text_mem_config;
static aml_memory_data_t text_mem_data;
@ -84,10 +84,11 @@ void* init_network_file(const char *model_path)
return qcontext;
}
std::vector<float> run_vision_model(void* qcontext, const std::vector<float>& input_data)
std::vector<float> run_image_model(void* qcontext, const std::vector<float>& input_data)
{
int ret = 0;
nn_input inData;
nn_output *outdata = NULL;
aml_output_config_t outconfig;
@ -96,19 +97,19 @@ std::vector<float> run_vision_model(void* qcontext, const std::vector<float>& in
inData.size = input_data.size() * sizeof(float);
// Use DMA
if (!vision_context_flag) {
vision_mem_config.cache_type = AML_WITH_CACHE;
vision_mem_config.memory_type = AML_VIRTUAL_ADDR;
vision_mem_config.direction = AML_MEM_DIRECTION_READ_WRITE;
vision_mem_config.index = 0;
vision_mem_config.mem_size = inData.size;
aml_util_mallocBuffer(qcontext, &vision_mem_config, &vision_mem_data);
aml_util_swapExternalInputBuffer(qcontext, &vision_mem_config, &vision_mem_data);
vision_context_flag = qcontext;
if (!image_context_flag) {
image_mem_config.cache_type = AML_WITH_CACHE;
image_mem_config.memory_type = AML_VIRTUAL_ADDR;
image_mem_config.direction = AML_MEM_DIRECTION_READ_WRITE;
image_mem_config.index = 0;
image_mem_config.mem_size = inData.size;
aml_util_mallocBuffer(qcontext, &image_mem_config, &image_mem_data);
aml_util_swapExternalInputBuffer(qcontext, &image_mem_config, &image_mem_data);
image_context_flag = qcontext;
}
inData.input_type = INPUT_DMA_DATA;
memcpy(vision_mem_data.viraddr, input_data.data(), vision_mem_config.mem_size);
memcpy(image_mem_data.viraddr, input_data.data(), image_mem_config.mem_size);
inData.input = NULL;
memset(&outconfig, 0, sizeof(aml_output_config_t));
@ -117,7 +118,7 @@ std::vector<float> run_vision_model(void* qcontext, const std::vector<float>& in
outdata = (nn_output*)aml_module_output_get(qcontext, outconfig);
if (outdata == NULL || outdata->out[0].buf == NULL) {
printf("Vision model inference failed.\n");
printf("Image model inference failed.\n");
return {};
}
@ -178,10 +179,10 @@ int destroy_network(void *qcontext)
{
int ret = 0;
if (vision_context_flag == qcontext) {
printf("Free vision model memory.\n");
aml_util_freeBuffer(qcontext, &vision_mem_config, &vision_mem_data);
vision_context_flag = nullptr;
if (image_context_flag == qcontext) {
printf("Free image model memory.\n");
aml_util_freeBuffer(qcontext, &image_mem_config, &image_mem_data);
image_context_flag = nullptr;
} else if (text_context_flag == qcontext) {
printf("Free text model memory.\n");
aml_util_freeBuffer(qcontext, &text_mem_config, &text_mem_data);

View file

@ -1,3 +1,19 @@
/*
* Copyright (C) 2026 Amlogic, Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef MODEL_INVOKE_H
#define MODEL_INVOKE_H

View file

@ -102,7 +102,7 @@ std::vector<float> preprocess_image(const std::string& image_path) {
}
}
// Return NHWC format (batch dimension will be added in caller)
// get NHWC
return cropped;
}

0
examples/clip/model/.gitkeep Normal file → Executable file
View file

0
examples/clip/py/.gitkeep Normal file → Executable file
View file

View file

@ -1,21 +1,18 @@
# -*- coding: utf-8 -*-
"""
Copyright (C) 2024-2025 Amlogic, Inc. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""
# This inference script is designed for CLIP model using AMLNNLite.
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os
import argparse

0
examples/mobilenet/cpp/.gitkeep Normal file → Executable file
View file

View file

@ -1,4 +1,20 @@
#!/bin/bash
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
set -e
usage() {

0
examples/mobilenet/model/.gitkeep Normal file → Executable file
View file

View file

@ -240,7 +240,7 @@ miniature pinscher
Greater Swiss Mountain dog
Bernese mountain dog
Appenzeller
EntleBucher
Entlebucher
boxer
bull mastiff
Tibetan mastiff
@ -420,7 +420,7 @@ balloon
ballpoint
Band Aid
banjo
bannister
banister
barbell
barber chair
barbershop
@ -468,7 +468,7 @@ bulletproof vest
bullet train
butcher shop
cab
caldron
cauldron
candle
cannon
canoe

0
examples/mobilenet/py/.gitkeep Normal file → Executable file
View file

View file

@ -1,3 +1,19 @@
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import numpy as np
import os
import glob

0
examples/ppocr-det/cpp/.gitkeep Normal file → Executable file
View file

View file

@ -2569,7 +2569,7 @@ void Clipper::ProcessHorizontal(TEdge *horzEdge) {
while (e) {
// this code block inserts extra coords into horizontal edges (in output
// polygons) whereever maxima touch these horizontal edges. This helps
// polygons) wherever maxima touch these horizontal edges. This helps
//'simplifying' polygons (ie if the Simplify property is set).
if (m_Maxima.size() > 0) {
if (dir == dLeftToRight) {
@ -4109,10 +4109,9 @@ inline double DistanceSqrd(const IntPoint &pt1, const IntPoint &pt2) {
double DistanceFromLineSqrd(const IntPoint &pt, const IntPoint &ln1,
const IntPoint &ln2) {
// The equation of a line in general form (Ax + By + C = 0)
// given 2 points (x1,y1) & (x2,y2) is ...
//(y1 - y2)x + (x2 - x1)y + (y1 - y2)x2 - (x2 - x1)y2 = 0
// A = (y1 - y2); B = (x2 - x1); C = (y1 - y2)x2 - (x2 - x1)y2
// perpendicular distance of point (x0,y0) = (Ax0 + By0 + C)/Sqrt(A^2 + B^2)
// given 2 points (x1,y1) & (x2,y2) is ...
//(y1 - y2)x + (x2 - x1)y + (y2 - y1)x1 - (x2 - x1)y1 = 0
// A = (y1 - y2); B = (x2 - x1); C = (y2 - y1)x1 - (x2 - x1)y1
// perpendicular distance of point (x0,y0) = (Ax0 + By0 + C)/Sqrt(A^2 + B^2)
// see http://en.wikipedia.org/wiki/Perpendicular_distance
double A = double(ln1.Y - ln2.Y);
double B = double(ln2.X - ln1.X);
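
The comment in this hunk describes a squared point-to-line distance. A minimal Python sketch of the same calculation follows, as a quick way to sanity-check the formula. It is not the Clipper implementation: the function name `distance_from_line_sqrd` and the tuple-based points are assumptions made for illustration, and `C` is chosen here so that the line passes through `ln1`.

```python
def distance_from_line_sqrd(pt, ln1, ln2):
    """Squared perpendicular distance from pt to the line through ln1 and ln2.

    Points are (x, y) tuples. Hypothetical helper for illustration only.
    """
    x0, y0 = pt
    x1, y1 = ln1
    x2, y2 = ln2

    # General form Ax + By + C = 0 for the line through (x1, y1) and (x2, y2).
    a = y1 - y2
    b = x2 - x1
    if a == 0 and b == 0:
        raise ValueError("ln1 and ln2 must be distinct points")

    # C is fixed by requiring that (x1, y1) lies on the line.
    c = -(a * x1 + b * y1)

    # Squared distance avoids the square root: (Ax0 + By0 + C)^2 / (A^2 + B^2).
    numerator = a * x0 + b * y0 + c
    return (numerator * numerator) / (a * a + b * b)


if __name__ == "__main__":
    # Point (2, 3) against the x-axis: the distance is 3, so the square is 9.
    print(distance_from_line_sqrd((2, 3), (0, 0), (4, 0)))
```

Working with the squared distance keeps the comparison cheap, since callers can compare against a squared tolerance instead of taking a square root.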

0
examples/ppocr-det/model/.gitkeep Normal file → Executable file
View file

0
examples/ppocr-det/py/.gitkeep Normal file → Executable file
View file

View file

@ -41,16 +41,16 @@
```
cd model
Usage: ./adla_covnert.sh model_path adla_tookkit_path target_platform
Usage: ./adla_convert.sh model_path adla_toolkit_path target_platform
example
```
| Parameter | Discription |
| Parameter | Description |
| ----------------- | ------------------------------------------------------------ |
| model_path | onnx model path |
| adla_tookkit_path | path to adla_toolkit |
| adla_toolkit_path | path to adla_toolkit |
| target_platform | Specify target platform. for A311D2 : PRODUCT_PID0XA003. for S905X5: PRODUCT_PID0XA005 |

View file

@ -38,16 +38,16 @@
```
cd model
Usage: ./adla_covnert.sh model_path adla_tookkit_path target_platform
Usage: ./adla_convert.sh model_path adla_toolkit_path target_platform
example
```
| Parameter | Discription |
| Parameter | Description |
| ----------------- | ------------------------------------------------------------ |
| model_path | onnx model path |
| adla_tookkit_path | path to adla_toolkit |
| adla_toolkit_path | path to adla_toolkit |
| target_platform | Specify target platform. for A311D2 : PRODUCT_PID0XA003. for S905X5: PRODUCT_PID0XA005 |

0
examples/whisper/cpp/.gitkeep Normal file → Executable file
View file

View file

@ -391,7 +391,7 @@ If the return value differs from bytesToWrite, it indicates an error.
typedef size_t (* drwav_write_proc)(void* pUserData, const void* pData, size_t bytesToWrite);
/*
Callback for when data needs to be seeked.
Callback for when data needs to be sought.
pUserData [in] The user data that was passed to drwav_init() and family.
offset [in] The number of bytes to move, relative to the origin. Will never be negative.
@ -415,10 +415,10 @@ pChunkHeader [in] A pointer to an object containing basic header informatio
container [in] Whether or not the WAV file is a RIFF or Wave64 container. If you're unsure of the difference, assume RIFF.
pFMT [in] A pointer to the object containing the contents of the "fmt" chunk.
Returns the number of bytes read + seeked.
Returns the number of bytes read + sought.
To read data from the chunk, call onRead(), passing in pReadSeekUserData as the first parameter. Do the same for seeking with onSeek(). The return value must
be the total number of bytes you have read _plus_ seeked.
be the total number of bytes you have read _plus_ sought.
Use the `container` argument to discriminate the fields in `pChunkHeader->id`. If the container is `drwav_container_riff` or `drwav_container_rf64` you should
use `id.fourcc`, otherwise you should use `id.guid`.
@ -499,7 +499,7 @@ typedef struct
/* A pointer to the function to call when data needs to be written. Only used when the drwav object is opened in write mode. */
drwav_write_proc onWrite;
/* A pointer to the function to call when the wav file needs to be seeked. */
/* A pointer to the function to call when the wav file needs to be sought. */
drwav_seek_proc onSeek;
/* The user data to pass to callbacks. */
@ -3561,16 +3561,16 @@ DRWAV_API size_t drwav_read_raw(drwav* pWav, size_t bytesToRead, void* pBufferOu
/* When we get here we may need to read-and-discard some data. */
while (bytesRead < bytesToRead) {
drwav_uint8 buffer[4096];
size_t bytesSeeked;
size_t bytessought;
size_t bytesToSeek = (bytesToRead - bytesRead);
if (bytesToSeek > sizeof(buffer)) {
bytesToSeek = sizeof(buffer);
}
bytesSeeked = pWav->onRead(pWav->pUserData, buffer, bytesToSeek);
bytesRead += bytesSeeked;
bytessought = pWav->onRead(pWav->pUserData, buffer, bytesToSeek);
bytesRead += bytessought;
if (bytesSeeked < bytesToSeek) {
if (bytessought < bytesToSeek) {
break; /* Reached the end. */
}
}

0
examples/whisper/model/.gitkeep Normal file → Executable file
View file

0
examples/whisper/py/.gitkeep Normal file → Executable file
View file

0
examples/yoloe/cpp/.gitkeep Normal file → Executable file
View file

0
examples/yoloe/model/.gitkeep Normal file → Executable file
View file

0
examples/yoloe/py/.gitkeep Normal file → Executable file
View file

View file

@ -48,18 +48,18 @@
```
cd model
Usage: ./adla_covnert.sh model_path adla_tookkit_path target_platform
Usage: ./adla_convert.sh model_path adla_toolkit_path target_platform
example
./adla_covnert.sh yolov11m.onnx /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
./adla_covnert.sh yolov11s.onnx /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
./adla_covnert.sh yolov11n.onnx /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
./adla_convert.sh yolov11m.onnx /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
./adla_convert.sh yolov11s.onnx /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
./adla_convert.sh yolov11n.onnx /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
```
| Parameter | Discription |
| Parameter | Description |
| ----------------- | ------------------------------------------------------------ |
| model_path | onnx model path |
| adla_tookkit_path | path to adla_toolkit |
| adla_toolkit_path | path to adla_toolkit |
| target_platform | Specify target platform. for A311D2 : PRODUCT_PID0XA003. for S905X5: PRODUCT_PID0XA005 |
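
The parameter table above maps each chip to a platform ID. As a rough sketch of how a wrapper might assemble the conversion command from a chip name, the snippet below builds the argument list for `./adla_convert.sh`. The names `PLATFORM_IDS` and `build_convert_command` are hypothetical and not part of the toolkit; only the script name, the argument order, and the two platform IDs come from the README.

```python
# Chip name -> target_platform value, as listed in the README table.
PLATFORM_IDS = {
    "A311D2": "PRODUCT_PID0XA003",
    "S905X5": "PRODUCT_PID0XA005",
}


def build_convert_command(model_path, toolkit_path, chip):
    """Return the argv list for adla_convert.sh (hypothetical helper)."""
    try:
        target_platform = PLATFORM_IDS[chip.upper()]
    except KeyError:
        known = ", ".join(sorted(PLATFORM_IDS))
        raise ValueError(f"unknown chip {chip!r}; expected one of: {known}") from None
    # Argument order follows the usage line: model_path adla_toolkit_path target_platform
    return ["./adla_convert.sh", model_path, toolkit_path, target_platform]


if __name__ == "__main__":
    cmd = build_convert_command(
        "yolov11n.onnx", "/xxxx/adla-toolkit-binary-3.2.9.3", "S905X5"
    )
    print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True, cwd="model")` if the script should actually be executed from the `model` directory.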

View file

@ -23,7 +23,7 @@ import cv2
from pathlib import Path
from amlnnlite.api import AMLNNLite
class_names = {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
class_names = {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'doughnut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
def letterbox(img, new_shape=(640, 640), color=(114, 114, 114)):
shape = img.shape[:2]

View file

@ -48,19 +48,19 @@
```
cd model
Usage: ./adla_covnert.sh model_path adla_tookkit_path target_platform
Usage: ./adla_convert.sh model_path adla_toolkit_path target_platform
example
./adla_covnert.sh yolov8m.onnx /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
./adla_covnert.sh yolov8s.onnx /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
./adla_covnert.sh yolov8n.onnx /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
./adla_convert.sh yolov8m.onnx /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
./adla_convert.sh yolov8s.onnx /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
./adla_convert.sh yolov8n.onnx /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
```
| Parameter | Discription |
| Parameter | Description |
| ----------------- | ------------------------------------------------------------ |
| model_path | onnx model path |
| adla_tookkit_path | path to adla_toolkit |
| target_platform | Specify target platform. for A311D2 : PRODUCT_PID0XA003. for S905X5: PRODUCT_PID0XA005 |
| adla_toolkit_path | path to adla_toolkit |
| target_platform | Specify target platform. for A311D2 : PRODUCT_PID0XA003. for S905X5: PRODUCT_PID0XA005 |
@ -78,7 +78,7 @@ example
```bash
# Build for arm64-v8a
cd examples/yolov8/cpp
AMLNN_HOME=/path/to/amlnn-toolkit ./build-android.sh -a arm64-v8a
./build-android.sh -a arm64-v8a
```
The executable will be generated at `build/android/yolov8_demo` (Note: executable name may vary, verify in build folder).

0
examples/yolov8/model/.gitkeep Normal file → Executable file
View file

View file

@ -1,8 +1,24 @@
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# 1. $1: set ADLA_TOOL_PATH
# 2. $2: set target-plaftorm
# 2. $2: set target-platform
# for A311D2 target-platform is PRODUCT_PID0XA003
# for S905X5 target-platform is PRODUCT_PID0XA005
# Usage: ./adla_covnert.sh yolov8m.onnx /XXX/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
# Usage: ./adla_convert.sh yolov8m.onnx /XXX/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
model_path=$1
ADLA_TOOL_PATH=$2
@ -10,7 +26,7 @@ target_platform=$3
echo "model_path:[$model_path]"
echo "ADLA_TOOL_PATH:[$ADLA_TOOL_PATH]"
echo "target-plaftorm:[$target_platform]"
echo "target-platform:[$target_platform]"
adla_convert=${ADLA_TOOL_PATH}/bin/adla_convert

0
examples/yolov8/py/.gitkeep Normal file → Executable file
View file

View file

@ -1,3 +1,19 @@
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import numpy as np
import os
import glob
@ -206,7 +222,7 @@ def main():
# Initialize AMLNNLite
amlnn = AMLNNLite()
amlnn.config(
model_path=args.model_path, # Model file path, Support ADLD and quantized TFlite models
model_path=args.model_path, # Model file path; supports ADLA and quantized TFLite models
run_cycles=args.run_cycles
)
amlnn.init()

0
examples/yoloworld/cpp/.gitkeep Normal file → Executable file
View file

0
examples/yoloworld/model/.gitkeep Normal file → Executable file
View file

0
examples/yoloworld/py/.gitkeep Normal file → Executable file
View file

View file

@ -1,3 +1,19 @@
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import numpy as np
import os
import glob

View file

@ -1,4 +1,21 @@
#TODO
#!/bin/bash
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
set -e
usage() {

0
examples/yolox/py/.gitkeep Normal file → Executable file
View file

View file

@ -1,4 +1,20 @@
# -*- coding: utf-8 -*-
#
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import numpy as np
import os
import glob
@ -369,7 +385,7 @@ def main():
# Initialize AMLNNLite
amlnn = AMLNNLite()
amlnn.config(
model_path=args.model_path, # Model file path, Support ADLD and quantized TFlite models
model_path=args.model_path, # Model file path; supports ADLA and quantized TFLite models
run_cycles=args.run_cycles
)
amlnn.init()

0
resource/.gitkeep Normal file → Executable file
View file