docs: Update README and compilation guides for clarity and consistency, including path corrections and improved formatting. Add copyright notices to source files and adjust file permissions for several scripts and directories.
This commit is contained in:
parent
f960c5030d
commit
bd891a96dd
136 changed files with 14413 additions and 9399 deletions
|
|
@ -0,0 +1,104 @@
|
|||
# blazepose_landmark
|
||||
|
||||
## 1.Overview
|
||||
|
||||
BlazePose Landmark builds upon the detected ROI to predict 33 precise body keypoints, enabling full-body pose estimation with high accuracy and temporal stability. Its efficient design makes it suitable for applications such as fitness tracking, motion analysis, augmented reality, and real-time human–computer interaction.
|
||||
|
||||
## 2.Model Download
|
||||
|
||||
- **Open Source model**
|
||||
|
||||
- **Open Source projects:** https://github.com/google-ai-edge/mediapipe/tree/master
|
||||
|
||||
- **Download weights**
|
||||
|
||||
wget https://storage.googleapis.com/mediapipe-assets/pose_landmark_full.tflite
|
||||
|
||||
## 3. Model Conversion
|
||||
|
||||
```
|
||||
cd model
|
||||
Usage: ./adla_convert.sh model_path adla_toolkit_path target_platform
|
||||
|
||||
example
|
||||
./adla_convert.sh pose_detection.tflite /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
|
||||
./adla_convert.sh pose_detection.tflite /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
|
||||
./adla_convert.sh pose_detection.tflite /xxxx/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
|
||||
```
|
||||
|
||||
| Parameter | Description |
|
||||
| ----------------- | ------------------------------------------------------------ |
|
||||
| model_path | onnx model path |
|
||||
| adla_toolkit_path | path to adla_toolkit |
|
||||
| target_platform | Specify target platform. for A311D2: PRODUCT_PID0XA003. for S905X5: PRODUCT_PID0XA005 |
|
||||
|
||||
|
||||
|
||||
## 4. Demo Run
|
||||
|
||||
### CPP
|
||||
|
||||
#### 1. Compile
|
||||
|
||||
**Prerequisites:**
|
||||
|
||||
- Android NDK (r25e recommended)
|
||||
- `ANDROID_NDK_PATH` environment variable set
|
||||
|
||||
**Build:**
|
||||
```bash
|
||||
# Build for arm64-v8a
|
||||
cd examples/blazepose_landmark/cpp
|
||||
./build-android.sh -a arm64-v8a
|
||||
```
|
||||
|
||||
The executable will be generated at `build/android/blazepose_landmark_demo` (Note: executable name may vary, verify in build folder).
|
||||
|
||||
#### 2. Run
|
||||
|
||||
```bash
|
||||
# Push executable to device
|
||||
adb push build/android/blazepose_landmark_demo /data/local/tmp/
|
||||
adb push model/blazepose_landmark_full_int16_A311D2.adla /data/local/tmp/
|
||||
adb push test_image.jpg /data/local/tmp/
|
||||
|
||||
# Run on device
|
||||
adb shell
|
||||
cd /data/local/tmp
|
||||
chmod +x blazepose_landmark_demo
|
||||
export LD_LIBRARY_PATH=/vendor/lib64 or (/vendor/lib)
|
||||
|
||||
# Usage: ./blazepose_landmark_demo <model_path> <image_path>
|
||||
./blazepose_landmark_demo blazepose_landmark_full_int16_A311D2.adla test_image.jpg"
|
||||
```
|
||||
|
||||
**Note:** Replace `blazepose_landmark_full_int16_A311D2.adla` with your actual model file path.
|
||||
|
||||
### Python
|
||||
|
||||
**Prerequisites:**
|
||||
- Python 3.10
|
||||
- Required packages: `numpy`, `opencv-python`, `amlnnlite`
|
||||
|
||||
**Install dependencies:**
|
||||
```bash
|
||||
pip install numpy opencv-python amlnnlite-1.0.0-cp310-cp310-linux_aarch64.whl
|
||||
```
|
||||
|
||||
**Run on device:**
|
||||
```bash
|
||||
python blazepose_landmark.py --model-path ./blazepose_landmark_full_int16_A311D2.adla
|
||||
```
|
||||
|
||||
The script will automatically process all image files (`.jpg`, `.jpeg`, `.png`, `.bmp`) in the current directory and save results to a `{model_name}_result` folder.
|
||||
|
||||
## 5.Results
|
||||
The program will print the detection count and inference time. The result image with bounding boxes will be saved to the specified output path (`result.jpg` by default).
|
||||
|
||||
|
||||
You can pull the result image back to view it:
|
||||
```bash
|
||||
adb pull result.jpg.
|
||||
```
|
||||

|
||||
|
||||
0
examples/blazepose_landmark/cpp/.gitkeep
Normal file → Executable file
0
examples/blazepose_landmark/cpp/.gitkeep
Normal file → Executable file
77
examples/blazepose_landmark/cpp/build-android.sh
Executable file
77
examples/blazepose_landmark/cpp/build-android.sh
Executable file
|
|
@ -0,0 +1,77 @@
|
|||
#!/bin/bash
|
||||
set -e
|
||||
|
||||
#
|
||||
# Copyright (C) 2024–2025 Amlogic, Inc. All rights reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
usage() {
|
||||
echo "Usage: $0 [-a <target_abi>]"
|
||||
echo " -a <target_abi> : Target ABI (default: arm64-v8a)"
|
||||
echo " -h : Show this help message"
|
||||
exit 1
|
||||
}
|
||||
|
||||
# Default values
|
||||
TARGET_ABI=arm64-v8a
|
||||
|
||||
# Parse arguments
|
||||
while getopts 'a:h' opt; do
|
||||
case "$opt" in
|
||||
a)
|
||||
TARGET_ABI=$OPTARG
|
||||
;;
|
||||
h)
|
||||
usage
|
||||
;;
|
||||
*)
|
||||
usage
|
||||
;;
|
||||
esac
|
||||
done
|
||||
|
||||
if [ -z "${ANDROID_NDK_PATH}" ]; then
|
||||
if [ -n "${ANDROID_NDK}" ]; then
|
||||
ANDROID_NDK_PATH=${ANDROID_NDK}
|
||||
elif [ -n "${ANDROID_NDK_HOME}" ]; then
|
||||
ANDROID_NDK_PATH=${ANDROID_NDK_HOME}
|
||||
else
|
||||
echo "Error: ANDROID_NDK_PATH is not set."
|
||||
echo "Please set ANDROID_NDK_PATH to your Android NDK directory."
|
||||
exit 1
|
||||
fi
|
||||
fi
|
||||
|
||||
ROOT_PWD=$(cd "$(dirname $0)" && pwd)
|
||||
BUILD_DIR=${ROOT_PWD}/build/android
|
||||
|
||||
echo "Building for Android..."
|
||||
echo "NDK_PATH: ${ANDROID_NDK_PATH}"
|
||||
echo "TARGET_ABI: ${TARGET_ABI}"
|
||||
echo "BUILD_DIR: ${BUILD_DIR}"
|
||||
|
||||
mkdir -p ${BUILD_DIR}
|
||||
cd ${BUILD_DIR}
|
||||
|
||||
cmake ../../src \
|
||||
-DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_PATH}/build/cmake/android.toolchain.cmake \
|
||||
-DANDROID_ABI=${TARGET_ABI} \
|
||||
-DANDROID_PLATFORM=android-24 \
|
||||
-DCMAKE_BUILD_TYPE=Release \
|
||||
-DOpenCV_DIR=${ROOT_PWD}/../../../dependency/opencv/opencv-android-sdk-build/sdk/native/jni/abi-${TARGET_ABI}
|
||||
|
||||
make -j4
|
||||
|
||||
echo "Build complete. Executable in ${BUILD_DIR}/blazepose_landmark_demo"
|
||||
168
examples/blazepose_landmark/cpp/build-linux.sh
Executable file
168
examples/blazepose_landmark/cpp/build-linux.sh
Executable file
|
|
@ -0,0 +1,168 @@
|
|||
#!/bin/bash
|
||||
set -e
|
||||
|
||||
#
|
||||
# Copyright (C) 2024–2025 Amlogic, Inc. All rights reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
usage() {
|
||||
echo "Usage: $0 [-m <mode>] [-a <target_arch>] [-b <arch_bits>] [-s <yocto_sdk_root>] [-t <toolchain_file>]"
|
||||
echo " -m <mode> : Build mode: 'linux' or 'yocto' (default: linux)"
|
||||
echo " -a <target> : Target arch for linux mode (default: aarch64)"
|
||||
echo " -b <arch_bits> : Arch bits for yocto mode: 32 or 64 (default: 64)"
|
||||
echo " -s <sdk_root> : Yocto SDK root path (overrides YOCTO_SDK_ROOT env var)"
|
||||
echo " -t <toolchain> : CMake toolchain file (overrides TOOLCHAIN_FILE env var)"
|
||||
echo " -h : Show this help message"
|
||||
exit 1
|
||||
}
|
||||
|
||||
# Default values
|
||||
BUILD_MODE=linux
|
||||
TARGET_ARCH=aarch64
|
||||
ARCH_BITS=64
|
||||
CLI_SDK_ROOT=""
|
||||
CLI_TOOLCHAIN_FILE=""
|
||||
|
||||
# Parse arguments
|
||||
while getopts 'm:a:b:s:t:h' opt; do
|
||||
case "$opt" in
|
||||
m)
|
||||
BUILD_MODE=$OPTARG
|
||||
;;
|
||||
a)
|
||||
TARGET_ARCH=$OPTARG
|
||||
;;
|
||||
b)
|
||||
ARCH_BITS=$OPTARG
|
||||
;;
|
||||
s)
|
||||
CLI_SDK_ROOT=$OPTARG
|
||||
;;
|
||||
t)
|
||||
CLI_TOOLCHAIN_FILE=$OPTARG
|
||||
;;
|
||||
h)
|
||||
usage
|
||||
;;
|
||||
*)
|
||||
usage
|
||||
;;
|
||||
esac
|
||||
done
|
||||
|
||||
ROOT_PWD=$(cd "$(dirname $0)" && pwd)
|
||||
|
||||
# ===========================================================================
|
||||
# Yocto build
|
||||
# ===========================================================================
|
||||
if [[ "${BUILD_MODE}" == "yocto" ]]; then
|
||||
|
||||
if [[ "${ARCH_BITS}" != "32" && "${ARCH_BITS}" != "64" ]]; then
|
||||
echo "Unsupported ARCH_BITS \"${ARCH_BITS}\". Must be 32 or 64." >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Configurable via environment variables (CLI args > env vars > defaults)
|
||||
CMAKE_BIN="${CMAKE_BIN:-cmake}"
|
||||
YOCTO_SDK_ROOT="${CLI_SDK_ROOT:-${YOCTO_SDK_ROOT:-/data/yuandian/tools/poky/4.0.20}}"
|
||||
TOOLCHAIN_FILE="${CLI_TOOLCHAIN_FILE:-${TOOLCHAIN_FILE:-${ROOT_PWD}/../../cmake/yocto-toolchain.cmake}}"
|
||||
|
||||
# Export variables for CMake
|
||||
export YOCTO_SDK_ROOT
|
||||
export ARCH_BITS
|
||||
|
||||
BUILD_DIR="${ROOT_PWD}/build/yocto/${ARCH_BITS}"
|
||||
|
||||
echo "==> Building Yocto ${ARCH_BITS}-bit"
|
||||
echo " toolchain : ${TOOLCHAIN_FILE}"
|
||||
echo " SDK root : ${YOCTO_SDK_ROOT}"
|
||||
echo " BUILD_DIR : ${BUILD_DIR}"
|
||||
|
||||
mkdir -p "${BUILD_DIR}"
|
||||
rm -rf "${BUILD_DIR}"
|
||||
|
||||
# Select OpenCV based on target architecture
|
||||
if [[ "${ARCH_BITS}" == "32" ]]; then
|
||||
OPENCV_DIR="${ROOT_PWD}/../../../dependency/opencv/opencv-linux-armhf/share/OpenCV"
|
||||
else
|
||||
OPENCV_DIR="${ROOT_PWD}/../../../dependency/opencv/opencv-linux-aarch64/share/OpenCV"
|
||||
fi
|
||||
|
||||
"${CMAKE_BIN}" \
|
||||
-S "${ROOT_PWD}/src" \
|
||||
-B "${BUILD_DIR}" \
|
||||
-DCMAKE_TOOLCHAIN_FILE="${TOOLCHAIN_FILE}" \
|
||||
-DYOCTO_SDK_ROOT="${YOCTO_SDK_ROOT}" \
|
||||
-DARCH_BITS="${ARCH_BITS}" \
|
||||
-DCMAKE_BUILD_TYPE=Release \
|
||||
-DOpenCV_DIR="${OPENCV_DIR}"
|
||||
|
||||
"${CMAKE_BIN}" --build "${BUILD_DIR}" --config Release
|
||||
|
||||
# Strip (best-effort)
|
||||
HOST_SYSROOT="${YOCTO_SDK_ROOT}/sysroots/x86_64-pokysdk-linux"
|
||||
if [[ "${ARCH_BITS}" == "32" ]]; then
|
||||
CROSS_TRIPLE="arm-poky-linux-gnueabi"
|
||||
else
|
||||
CROSS_TRIPLE="aarch64-poky-linux"
|
||||
fi
|
||||
STRIP_TOOL="${HOST_SYSROOT}/usr/bin/${CROSS_TRIPLE}/${CROSS_TRIPLE}-strip"
|
||||
if [[ -x "${STRIP_TOOL}" ]]; then
|
||||
"${STRIP_TOOL}" --strip-unneeded "${BUILD_DIR}/blazepose_landmark_demo"
|
||||
else
|
||||
echo "warning: strip tool not found; keeping debug info." >&2
|
||||
fi
|
||||
|
||||
echo "Build complete. Executable in ${BUILD_DIR}/blazepose_landmark_demo"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# ===========================================================================
|
||||
# Standard Linux cross-compile build
|
||||
# ===========================================================================
|
||||
|
||||
# Default to aarch64-linux-gnu if GCC_COMPILER is not set
|
||||
GCC_COMPILER=${GCC_COMPILER:-aarch64-linux-gnu}
|
||||
|
||||
# Set compilers
|
||||
export CC=${GCC_COMPILER}-gcc
|
||||
export CXX=${GCC_COMPILER}-g++
|
||||
|
||||
# Validate compiler
|
||||
if ! command -v ${CC} &> /dev/null; then
|
||||
echo "Error: Compiler ${CC} not found."
|
||||
echo "Please set GCC_COMPILER environment variable to your cross-compiler path prefix."
|
||||
echo "Example: export GCC_COMPILER=/path/to/toolchain/bin/aarch64-linux-gnu"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
BUILD_DIR=${ROOT_PWD}/build/linux
|
||||
|
||||
echo "Building for Linux..."
|
||||
echo "COMPILER: ${CC}"
|
||||
echo "TARGET_ARCH: ${TARGET_ARCH}"
|
||||
echo "BUILD_DIR: ${BUILD_DIR}"
|
||||
|
||||
mkdir -p ${BUILD_DIR}
|
||||
cd ${BUILD_DIR}
|
||||
|
||||
cmake ../../src \
|
||||
-DCMAKE_SYSTEM_NAME=Linux \
|
||||
-DCMAKE_SYSTEM_PROCESSOR=${TARGET_ARCH} \
|
||||
-DCMAKE_BUILD_TYPE=Release
|
||||
|
||||
make -j4
|
||||
|
||||
echo "Build complete. Executable in ${BUILD_DIR}/blazepose_landmark_demo"
|
||||
36
examples/blazepose_landmark/cpp/src/CMakeLists.txt
Executable file
36
examples/blazepose_landmark/cpp/src/CMakeLists.txt
Executable file
|
|
@ -0,0 +1,36 @@
|
|||
cmake_minimum_required(VERSION 3.10...3.27)
|
||||
project(blazepose_landmark_demo)
|
||||
|
||||
set(CMAKE_CXX_STANDARD 17)
|
||||
|
||||
list(APPEND CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/../../../../cmake")
|
||||
find_package(AMLNN REQUIRED)
|
||||
include_directories(${AMLNN_INCLUDE_DIR})
|
||||
link_directories(${AMLNN_LIBRARY_DIR})
|
||||
|
||||
include_directories(${CMAKE_SOURCE_DIR}/../../../../common)
|
||||
|
||||
# Set 3rdparty path
|
||||
set(3RDPARTY_DIR "${CMAKE_SOURCE_DIR}/../../../../dependency")
|
||||
|
||||
if(CMAKE_SYSTEM_NAME STREQUAL "Android")
|
||||
# Android needs log
|
||||
link_libraries(log)
|
||||
endif()
|
||||
|
||||
# Find OpenCV
|
||||
message(STATUS "OpenCV_DIR: ${OpenCV_DIR}")
|
||||
find_package(OpenCV REQUIRED)
|
||||
include_directories(${OpenCV_INCLUDE_DIRS})
|
||||
|
||||
add_executable(blazepose_landmark_demo
|
||||
main.cpp
|
||||
postprocess.cpp
|
||||
postprocess.h
|
||||
${CMAKE_SOURCE_DIR}/../../../../common/model_loader.cpp
|
||||
)
|
||||
|
||||
target_link_libraries(blazepose_landmark_demo
|
||||
${OpenCV_LIBS}
|
||||
${AMLNN_LIBRARY}
|
||||
)
|
||||
141
examples/blazepose_landmark/cpp/src/main.cpp
Executable file
141
examples/blazepose_landmark/cpp/src/main.cpp
Executable file
|
|
@ -0,0 +1,141 @@
|
|||
/*
|
||||
* Copyright (C) 2024–2025 Amlogic, Inc. All rights reserved.
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
#include <iostream>
|
||||
#include <string>
|
||||
#include <vector>
|
||||
#include <chrono>
|
||||
#include <tuple>
|
||||
#include <iomanip>
|
||||
#include <fstream>
|
||||
#include <opencv2/opencv.hpp>
|
||||
#include "postprocess.h"
|
||||
#include "model_loader.h"
|
||||
|
||||
const std::string DEFAULT_OUTPUT_PATH = "./result.jpg";
|
||||
const int MODEL_INPUT_WIDTH = 256;
|
||||
const int MODEL_INPUT_HEIGHT = 256;
|
||||
const float SCORE_THRESHOLD = 0.5f;
|
||||
|
||||
int main(int argc, char **argv)
|
||||
{
|
||||
std::string model_path;
|
||||
std::string image_path;
|
||||
|
||||
if (argc != 3)
|
||||
{
|
||||
printf("%s <model_path> <image_path>\n", argv[0]);
|
||||
return -1;
|
||||
}
|
||||
|
||||
if (argc > 1)
|
||||
model_path = argv[1];
|
||||
if (argc > 2)
|
||||
image_path = argv[2];
|
||||
|
||||
std::cout << "Blazepose Detect Demo" << std::endl;
|
||||
std::cout << "Model: " << model_path << std::endl;
|
||||
std::cout << "Image: " << image_path << std::endl;
|
||||
std::cout << "Output: " << DEFAULT_OUTPUT_PATH << std::endl;
|
||||
|
||||
// 1. Load Image
|
||||
cv::Mat img = cv::imread(image_path);
|
||||
if (img.empty())
|
||||
{
|
||||
std::cerr << "Failed to load image from " << image_path << std::endl;
|
||||
return -1;
|
||||
}
|
||||
// Load detections
|
||||
// n * 13 detections
|
||||
// image_path -> txt_path
|
||||
std::vector<std::vector<float>> detections;
|
||||
|
||||
std::string txt_path = image_path.substr(0, image_path.find_last_of('.'));
|
||||
txt_path += ".txt";
|
||||
std::ifstream ifs(txt_path);
|
||||
for (std::string line; std::getline(ifs, line);)
|
||||
{
|
||||
std::istringstream iss(line);
|
||||
std::vector<float> det;
|
||||
float val;
|
||||
while (iss >> val)
|
||||
det.push_back(val);
|
||||
if (!det.empty())
|
||||
detections.push_back(det);
|
||||
}
|
||||
|
||||
// 2. Initialize Network
|
||||
void *context = init_network(model_path.c_str());
|
||||
if (!context)
|
||||
{
|
||||
std::cerr << "Failed to initialize network." << std::endl;
|
||||
return -1;
|
||||
}
|
||||
|
||||
// 3. Preprocess
|
||||
auto start_time = std::chrono::high_resolution_clock::now();
|
||||
|
||||
auto [preprocessed, affine] = preprocess(img, detections, std::make_tuple(MODEL_INPUT_HEIGHT, MODEL_INPUT_WIDTH));
|
||||
|
||||
// Quantize to int16 (model expects quantized input)
|
||||
cv::Mat quantized_img = quantize_input(preprocessed, 0.000030518509447574615f);
|
||||
|
||||
// 4. Set input and run inference
|
||||
nn_input inData;
|
||||
memset(&inData, 0, sizeof(nn_input));
|
||||
inData.input_type = BINARY_RAW_DATA;
|
||||
inData.input = quantized_img.data;
|
||||
inData.input_index = 0;
|
||||
inData.size = quantized_img.total() * quantized_img.elemSize();
|
||||
|
||||
if (aml_module_input_set(context, &inData) != 0)
|
||||
{
|
||||
std::cerr << "Failed to set input." << std::endl;
|
||||
uninit_network(context);
|
||||
return -1;
|
||||
}
|
||||
|
||||
aml_output_config_t outconfig;
|
||||
memset(&outconfig, 0, sizeof(aml_output_config_t));
|
||||
outconfig.typeSize = sizeof(aml_output_config_t);
|
||||
outconfig.format = AML_OUTDATA_FLOAT32;
|
||||
|
||||
nn_output *outdata = (nn_output *)aml_module_output_get(context, outconfig);
|
||||
if (!outdata)
|
||||
{
|
||||
std::cerr << "Failed to run network." << std::endl;
|
||||
uninit_network(context);
|
||||
return -1;
|
||||
}
|
||||
|
||||
// 5. Postprocess
|
||||
std::vector<BlazePoseLandmark> landmarks = postprocess(outdata, affine);
|
||||
auto end_time = std::chrono::high_resolution_clock::now();
|
||||
std::chrono::duration<double, std::milli> inference_time = end_time - start_time;
|
||||
|
||||
std::cout << "Inference time: " << inference_time.count() << " ms" << std::endl;
|
||||
std::cout << "Landmarks: " << landmarks.size() << std::endl;
|
||||
|
||||
// 6. Draw and Save
|
||||
cv::Mat result_img = draw_landmarks(img, landmarks, SCORE_THRESHOLD);
|
||||
cv::imwrite(DEFAULT_OUTPUT_PATH, result_img);
|
||||
std::cout << "Result saved to " << DEFAULT_OUTPUT_PATH << std::endl;
|
||||
|
||||
// 7. Cleanup
|
||||
uninit_network(context);
|
||||
|
||||
return 0;
|
||||
}
|
||||
337
examples/blazepose_landmark/cpp/src/postprocess.cpp
Executable file
337
examples/blazepose_landmark/cpp/src/postprocess.cpp
Executable file
|
|
@ -0,0 +1,337 @@
|
|||
/*
|
||||
* Copyright (C) 2024–2025 Amlogic, Inc. All rights reserved.
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
#include "postprocess.h"
|
||||
#include <iostream>
|
||||
#include <cmath>
|
||||
#include <algorithm>
|
||||
#include <unordered_map>
|
||||
|
||||
#define LOGI(...) \
|
||||
do \
|
||||
{ \
|
||||
printf(__VA_ARGS__); \
|
||||
printf("\n"); \
|
||||
} while (0)
|
||||
#define LOGE(...) \
|
||||
do \
|
||||
{ \
|
||||
fprintf(stderr, __VA_ARGS__); \
|
||||
fprintf(stderr, "\n"); \
|
||||
} while (0)
|
||||
|
||||
// SHOW class names (1 classes)
|
||||
const char *SHOW_CLASSES[1] = {"lm"};
|
||||
|
||||
inline float sigmoid(float x)
|
||||
{
|
||||
return 1.0f / (1.0f + std::exp(-x));
|
||||
}
|
||||
|
||||
struct ROI
|
||||
{
|
||||
float x_center;
|
||||
float y_center;
|
||||
float scale;
|
||||
float theta;
|
||||
};
|
||||
|
||||
ROI detection_to_roi(const std::vector<float> &detection, int kp1 = 0, int kp2 = 1)
|
||||
{
|
||||
float theta0 = 90.f * M_PI / 180.f;
|
||||
float dscale = 1.1f; // 1.0 * 256 / 224; // 1.1f;
|
||||
float dy = 0.f;
|
||||
|
||||
float x_center = detection[4 + 2 * kp1];
|
||||
float y_center = detection[4 + 2 * kp1 + 1];
|
||||
float x1 = detection[4 + 2 * kp2];
|
||||
float y1 = detection[4 + 2 * kp2 + 1];
|
||||
|
||||
float roi_scale = std::sqrt((x_center - x1) * (x_center - x1) + (y_center - y1) * (y_center - y1)) * 2.f;
|
||||
y_center += dy * roi_scale;
|
||||
roi_scale *= dscale;
|
||||
|
||||
float theta = std::atan2(detection[4 + 2 * kp1 + 1] - detection[4 + 2 * kp2 + 1], detection[4 + 2 * kp1] - detection[4 + 2 * kp2]) - theta0;
|
||||
|
||||
return {x_center, y_center, roi_scale, theta};
|
||||
}
|
||||
|
||||
cv::Mat extract_roi(cv::Mat &frame, const ROI &roi, int resolution, cv::Mat &affine)
|
||||
{
|
||||
cv::Point2f src_pts[3];
|
||||
src_pts[0] = cv::Point2f(-roi.scale / 2.f, -roi.scale / 2.f); // will map to (0,0)
|
||||
src_pts[1] = cv::Point2f(-roi.scale / 2.f, roi.scale / 2.f); // will map to (0,res-1)
|
||||
src_pts[2] = cv::Point2f(roi.scale / 2.f, -roi.scale / 2.f); // will map to (res-1,0)
|
||||
float cos_theta = std::cos(roi.theta);
|
||||
float sin_theta = std::sin(roi.theta);
|
||||
for (int i = 0; i < 3; i++)
|
||||
{
|
||||
float x = src_pts[i].x;
|
||||
float y = src_pts[i].y;
|
||||
src_pts[i].x = roi.x_center + x * cos_theta - y * sin_theta;
|
||||
src_pts[i].y = roi.y_center + x * sin_theta + y * cos_theta;
|
||||
}
|
||||
cv::Point2f dst_pts[3] = {
|
||||
cv::Point2f(0.f, 0.f),
|
||||
cv::Point2f(0.f, resolution - 1.f),
|
||||
cv::Point2f(resolution - 1.f, 0.f)};
|
||||
|
||||
cv::Mat roi_img;
|
||||
cv::Mat M = cv::getAffineTransform(src_pts, dst_pts);
|
||||
cv::invertAffineTransform(M, affine);
|
||||
cv::warpAffine(frame, roi_img, M, cv::Size(resolution, resolution));
|
||||
|
||||
return roi_img;
|
||||
}
|
||||
|
||||
std::tuple<cv::Mat, cv::Mat> preprocess(cv::Mat img, std::vector<std::vector<float>> &detections, std::tuple<int, int> new_shape)
|
||||
{
|
||||
cv::Mat img_rgb;
|
||||
|
||||
if (img.empty())
|
||||
{
|
||||
LOGE("Preprocess received empty image");
|
||||
return {};
|
||||
}
|
||||
|
||||
// Convert to RGB
|
||||
if (img.channels() == 4)
|
||||
cv::cvtColor(img, img_rgb, cv::COLOR_RGBA2RGB);
|
||||
else if (img.channels() == 3)
|
||||
cv::cvtColor(img, img_rgb, cv::COLOR_BGR2RGB);
|
||||
else
|
||||
img_rgb = img.clone();
|
||||
|
||||
ROI roi = detection_to_roi(detections[0]); // get the first bounding box
|
||||
cv::Mat affine;
|
||||
cv::Mat roi_img = extract_roi(img_rgb, roi, IMAGE_SIZE, affine);
|
||||
|
||||
cv::Mat img_float;
|
||||
roi_img.convertTo(img_float, CV_32F, 1.0 / 255.0);
|
||||
|
||||
return std::make_tuple(img_float, affine);
|
||||
}
|
||||
|
||||
cv::Mat quantize_input(const cv::Mat &float_img, float scale, int16_t zero_point)
|
||||
{
|
||||
if (float_img.empty() || float_img.type() != CV_32FC3)
|
||||
{
|
||||
LOGE("quantize_input: Invalid input image (must be CV_32FC3)");
|
||||
return cv::Mat();
|
||||
}
|
||||
|
||||
cv::Mat quantized_img(float_img.rows, float_img.cols, CV_16SC3);
|
||||
const float *src_ptr = (const float *)float_img.data;
|
||||
int16_t *dst_ptr = (int16_t *)quantized_img.data;
|
||||
|
||||
int total_elements = float_img.total() * float_img.channels();
|
||||
// for (int i = 0; i < total_elements; ++i)
|
||||
// {
|
||||
// dst_ptr[i] = static_cast<int16_t>(std::round(src_ptr[i] / scale + zero_point));
|
||||
// }
|
||||
for (int i = 0; i < total_elements; ++i)
|
||||
{
|
||||
int32_t q = static_cast<int32_t>(std::round(src_ptr[i] / scale));
|
||||
q = std::max(-32768, std::min(32767, q));
|
||||
dst_ptr[i] = static_cast<int16_t>(q);
|
||||
}
|
||||
|
||||
return quantized_img;
|
||||
}
|
||||
|
||||
void blazepose_postprocess(const float *landmarks, float *normalized_landmarks)
|
||||
{
|
||||
if (!landmarks || !normalized_landmarks)
|
||||
return;
|
||||
|
||||
for (int j = 0; j < NUM_LANDMARKS; j++)
|
||||
{
|
||||
float x = landmarks[j * LANDMARK_FEATURE_DIM + 0] / IMAGE_SIZE;
|
||||
float y = landmarks[j * LANDMARK_FEATURE_DIM + 1] / IMAGE_SIZE;
|
||||
float z = landmarks[j * LANDMARK_FEATURE_DIM + 2] / IMAGE_SIZE;
|
||||
float visibility = landmarks[j * LANDMARK_FEATURE_DIM + 3];
|
||||
float presence = landmarks[j * LANDMARK_FEATURE_DIM + 4];
|
||||
|
||||
float score = sigmoid(fminf(visibility, presence));
|
||||
normalized_landmarks[j * LANDMARK_OUT_DIM + 0] = x;
|
||||
normalized_landmarks[j * LANDMARK_OUT_DIM + 1] = y;
|
||||
normalized_landmarks[j * LANDMARK_OUT_DIM + 2] = z;
|
||||
normalized_landmarks[j * LANDMARK_OUT_DIM + 3] = score;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Denormalize landmarks: map normalized coordinates back to original image using affine
|
||||
* @param landmarks Input/Output: [NUM_LANDMARKS * LANDMARK_OUT_DIM], first three dimensions are x, y, z
|
||||
* @param affine Input: [2 x 3] affine matrix (CV_32F)
|
||||
*/
|
||||
void blazepose_denorm_landmarks(float *landmarks, const cv::Mat &affine)
|
||||
{
|
||||
if (!landmarks || affine.empty() || affine.rows != 2 || affine.cols != 3)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
const double *a = affine.ptr<double>();
|
||||
double a00 = a[0], a01 = a[1], a02 = a[2];
|
||||
double a10 = a[3], a11 = a[4], a12 = a[5];
|
||||
for (int j = 0; j < NUM_LANDMARKS; j++)
|
||||
{
|
||||
float *p = landmarks + j * LANDMARK_OUT_DIM;
|
||||
// scale to input resolution
|
||||
float x = p[0] * IMAGE_SIZE;
|
||||
float y = p[1] * IMAGE_SIZE;
|
||||
float z = p[2] * IMAGE_SIZE;
|
||||
|
||||
// apply affine transform
|
||||
float new_x = a00 * x + a01 * y + a02;
|
||||
float new_y = a10 * x + a11 * y + a12;
|
||||
|
||||
p[0] = new_x;
|
||||
p[1] = new_y;
|
||||
p[2] = z;
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
std::vector<BlazePoseLandmark> postprocess(nn_output *outdata, const cv::Mat &affine)
|
||||
{
|
||||
// keep all outputs, even if unused
|
||||
float *world_landmarks = (float *)outdata->out[0].buf;
|
||||
float *heatmap = (float *)outdata->out[1].buf;
|
||||
float *flags = (float *)outdata->out[2].buf;
|
||||
float *landmarks = (float *)outdata->out[4].buf;
|
||||
|
||||
float *normalized_landmarks =
|
||||
new float[NUM_LANDMARKS * LANDMARK_OUT_DIM]();
|
||||
|
||||
blazepose_postprocess(landmarks, normalized_landmarks);
|
||||
|
||||
// refine_landmark_from_heatmap(normalized_landmarks, 39, heatmap, 64, 64);
|
||||
|
||||
blazepose_denorm_landmarks(normalized_landmarks, affine);
|
||||
|
||||
std::vector<BlazePoseLandmark> pose_res;
|
||||
pose_res.reserve(1);
|
||||
|
||||
BlazePoseLandmark pose;
|
||||
pose.landmarks.resize(NUM_LANDMARKS);
|
||||
|
||||
for (int i = 0; i < NUM_LANDMARKS; ++i)
|
||||
{
|
||||
int base = i * LANDMARK_OUT_DIM;
|
||||
|
||||
double x = normalized_landmarks[base + 0]; // x
|
||||
double y = normalized_landmarks[base + 1]; // y
|
||||
double z = normalized_landmarks[base + 2]; // z
|
||||
double score = normalized_landmarks[base + 3]; // score
|
||||
|
||||
pose.landmarks[i] = {x, y, z, score};
|
||||
}
|
||||
|
||||
pose_res.push_back(pose);
|
||||
|
||||
delete[] normalized_landmarks;
|
||||
normalized_landmarks = nullptr;
|
||||
|
||||
return pose_res;
|
||||
}
|
||||
|
||||
static const std::vector<std::pair<int, int>> POSE_CONNECTIONS = {
|
||||
// Face
|
||||
{0, 1},
|
||||
{1, 2},
|
||||
{2, 3},
|
||||
{3, 7},
|
||||
{0, 4},
|
||||
{4, 5},
|
||||
{5, 6},
|
||||
{6, 8},
|
||||
// Mouth
|
||||
{9, 10},
|
||||
// Shoulders
|
||||
{11, 12},
|
||||
// Right arm
|
||||
{11, 13},
|
||||
{13, 15},
|
||||
{15, 17},
|
||||
{15, 19},
|
||||
{15, 21},
|
||||
{17, 19},
|
||||
// Left arm
|
||||
{12, 14},
|
||||
{14, 16},
|
||||
{16, 18},
|
||||
{16, 20},
|
||||
{16, 22},
|
||||
{18, 20},
|
||||
// Torso
|
||||
{11, 23},
|
||||
{12, 24},
|
||||
{23, 24},
|
||||
// Right leg
|
||||
{23, 25},
|
||||
{25, 27},
|
||||
{27, 29},
|
||||
{27, 31},
|
||||
{29, 31},
|
||||
// Left leg
|
||||
{24, 26},
|
||||
{26, 28},
|
||||
{28, 30},
|
||||
{28, 32},
|
||||
{30, 32}};
|
||||
|
||||
cv::Mat draw_landmarks(cv::Mat image, const std::vector<BlazePoseLandmark> &landmarks, float score_threshold)
|
||||
{
|
||||
cv::Mat out = image.clone();
|
||||
|
||||
for (const auto &lm : landmarks)
|
||||
{
|
||||
const auto &lms = lm.landmarks;
|
||||
|
||||
for (size_t i = 0; i < lms.size(); ++i)
|
||||
{
|
||||
int x = static_cast<int>(lms[i][0]);
|
||||
int y = static_cast<int>(lms[i][1]);
|
||||
double v = lms[i][3];
|
||||
|
||||
if (v < score_threshold)
|
||||
continue;
|
||||
|
||||
cv::circle(out, cv::Point(x, y), 3, cv::Scalar(0, 255, 0), -1);
|
||||
}
|
||||
|
||||
for (const auto &conn : POSE_CONNECTIONS)
|
||||
{
|
||||
int i0 = conn.first;
|
||||
int i1 = conn.second;
|
||||
|
||||
if (i0 >= lms.size() || i1 >= lms.size())
|
||||
continue;
|
||||
|
||||
if (lms[i0][3] < score_threshold || lms[i1][3] < score_threshold)
|
||||
continue;
|
||||
|
||||
cv::Point p0(static_cast<int>(lms[i0][0]), static_cast<int>(lms[i0][1]));
|
||||
cv::Point p1(static_cast<int>(lms[i1][0]), static_cast<int>(lms[i1][1]));
|
||||
|
||||
cv::line(out, p0, p1, cv::Scalar(255, 0, 0), 2);
|
||||
}
|
||||
}
|
||||
|
||||
return out;
|
||||
}
|
||||
50
examples/blazepose_landmark/cpp/src/postprocess.h
Executable file
50
examples/blazepose_landmark/cpp/src/postprocess.h
Executable file
|
|
@ -0,0 +1,50 @@
|
|||
/*
|
||||
* Copyright (C) 2024–2025 Amlogic, Inc. All rights reserved.
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
#ifndef _AMLNN_BLAZEPOSE_LANDMARK_POSTPROCESS_H_
|
||||
#define _AMLNN_BLAZEPOSE_LANDMARK_POSTPROCESS_H_
|
||||
|
||||
#include <opencv2/opencv.hpp>
|
||||
#include <vector>
|
||||
#include <tuple>
|
||||
#include <string>
|
||||
|
||||
#include "model_loader.h"
|
||||
|
||||
#define NUM_LANDMARKS 33
|
||||
#define LANDMARK_OUT_DIM 4
|
||||
#define LANDMARK_FEATURE_DIM 5
|
||||
#define IMAGE_SIZE 256
|
||||
// BlazePoseLandmark result structure
|
||||
|
||||
struct BlazePoseLandmark
|
||||
{
|
||||
std::vector<std::vector<double>> landmarks; // [N][x,y,z,v]
|
||||
};
|
||||
|
||||
// Preprocess image with letterbox resizing
|
||||
std::tuple<cv::Mat, cv::Mat> preprocess(cv::Mat img, std::vector<std::vector<float>> &detections, std::tuple<int, int> new_shape);
|
||||
|
||||
// Quantize float32 image to int8 for model input
|
||||
cv::Mat quantize_input(const cv::Mat &float_img, float scale = 0.000030518509447574615f, int16_t zero_point = 0);
|
||||
|
||||
// Postprocess blazepose_landmark outputs with DFL decoding
|
||||
std::vector<BlazePoseLandmark> postprocess(nn_output *outdata, const cv::Mat &affine);
|
||||
|
||||
// Draw detections on image
|
||||
cv::Mat draw_landmarks(cv::Mat image, const std::vector<BlazePoseLandmark> &landmarks, float score_threshold = 0.5);
|
||||
|
||||
#endif // _AMLNN_BLAZEPOSE_LANDMARK_POSTPROCESS_H_
|
||||
0
examples/blazepose_landmark/model/.gitkeep
Normal file → Executable file
0
examples/blazepose_landmark/model/.gitkeep
Normal file → Executable file
40
examples/blazepose_landmark/model/adla_convert.sh
Executable file
40
examples/blazepose_landmark/model/adla_convert.sh
Executable file
|
|
@ -0,0 +1,40 @@
|
|||
#
|
||||
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
# 1. $1: set ADLA_TOOL_PATH
|
||||
# 2. $2: set target-platform
|
||||
# for A311D2 target-platform is PRODUCT_PID0XA003
|
||||
# for S905X5 target-platform is PRODUCT_PID0XA005
|
||||
# Usage: ./adla_convert.sh pose_landmark_full.tflite /XXX/adla-toolkit-binary-3.2.9.3 PRODUCT_PID0XA005
|
||||
|
||||
model_path=$1
|
||||
ADLA_TOOL_PATH=$2
|
||||
target_platform=$3
|
||||
|
||||
echo "model_path:[$model_path]"
|
||||
echo "ADLA_TOOL_PATH:[$ADLA_TOOL_PATH]"
|
||||
echo "target-platform:[$target_platform]"
|
||||
|
||||
adla_convert=${ADLA_TOOL_PATH}/bin/adla_convert
|
||||
|
||||
$adla_convert --model-type tflite \
|
||||
--model $model_path \
|
||||
--inputs input_1 --input-shapes "256,256,3" \
|
||||
--quantize-dtype int16 \
|
||||
--source-file dataset_coco.txt \
|
||||
--channel-mean-value "0,0,0,255" \
|
||||
--target-platform $target_platform \
|
||||
--disable-per-channel false
|
||||
50
examples/blazepose_landmark/model/dataset_coco.txt
Executable file
50
examples/blazepose_landmark/model/dataset_coco.txt
Executable file
|
|
@ -0,0 +1,50 @@
|
|||
../../../resource/coco_dataset/000000000139.jpg
|
||||
../../../resource/coco_dataset/000000000285.jpg
|
||||
../../../resource/coco_dataset/000000000632.jpg
|
||||
../../../resource/coco_dataset/000000000724.jpg
|
||||
../../../resource/coco_dataset/000000000776.jpg
|
||||
../../../resource/coco_dataset/000000000785.jpg
|
||||
../../../resource/coco_dataset/000000000802.jpg
|
||||
../../../resource/coco_dataset/000000000872.jpg
|
||||
../../../resource/coco_dataset/000000000885.jpg
|
||||
../../../resource/coco_dataset/000000001000.jpg
|
||||
../../../resource/coco_dataset/000000001268.jpg
|
||||
../../../resource/coco_dataset/000000001296.jpg
|
||||
../../../resource/coco_dataset/000000001353.jpg
|
||||
../../../resource/coco_dataset/000000001425.jpg
|
||||
../../../resource/coco_dataset/000000001490.jpg
|
||||
../../../resource/coco_dataset/000000001503.jpg
|
||||
../../../resource/coco_dataset/000000001532.jpg
|
||||
../../../resource/coco_dataset/000000001584.jpg
|
||||
../../../resource/coco_dataset/000000001675.jpg
|
||||
../../../resource/coco_dataset/000000001761.jpg
|
||||
../../../resource/coco_dataset/000000001818.jpg
|
||||
../../../resource/coco_dataset/000000001993.jpg
|
||||
../../../resource/coco_dataset/000000002006.jpg
|
||||
../../../resource/coco_dataset/000000002149.jpg
|
||||
../../../resource/coco_dataset/000000002153.jpg
|
||||
../../../resource/coco_dataset/000000002157.jpg
|
||||
../../../resource/coco_dataset/000000002261.jpg
|
||||
../../../resource/coco_dataset/000000002299.jpg
|
||||
../../../resource/coco_dataset/000000002431.jpg
|
||||
../../../resource/coco_dataset/000000002473.jpg
|
||||
../../../resource/coco_dataset/000000002532.jpg
|
||||
../../../resource/coco_dataset/000000002587.jpg
|
||||
../../../resource/coco_dataset/000000002592.jpg
|
||||
../../../resource/coco_dataset/000000002685.jpg
|
||||
../../../resource/coco_dataset/000000002923.jpg
|
||||
../../../resource/coco_dataset/000000003156.jpg
|
||||
../../../resource/coco_dataset/000000003255.jpg
|
||||
../../../resource/coco_dataset/000000003501.jpg
|
||||
../../../resource/coco_dataset/000000003553.jpg
|
||||
../../../resource/coco_dataset/000000003661.jpg
|
||||
../../../resource/coco_dataset/000000003845.jpg
|
||||
../../../resource/coco_dataset/000000003934.jpg
|
||||
../../../resource/coco_dataset/000000004134.jpg
|
||||
../../../resource/coco_dataset/000000004395.jpg
|
||||
../../../resource/coco_dataset/000000004495.jpg
|
||||
../../../resource/coco_dataset/000000004765.jpg
|
||||
../../../resource/coco_dataset/000000004795.jpg
|
||||
../../../resource/coco_dataset/000000005001.jpg
|
||||
../../../resource/coco_dataset/000000005037.jpg
|
||||
../../../resource/coco_dataset/000000005060.jpg
|
||||
BIN
examples/blazepose_landmark/model/test.png
Executable file
BIN
examples/blazepose_landmark/model/test.png
Executable file
Binary file not shown.
|
After Width: | Height: | Size: 2.4 MiB |
0
examples/blazepose_landmark/py/.gitkeep
Normal file → Executable file
0
examples/blazepose_landmark/py/.gitkeep
Normal file → Executable file
301
examples/blazepose_landmark/py/balzepose_landmark.py
Executable file
301
examples/blazepose_landmark/py/balzepose_landmark.py
Executable file
|
|
@ -0,0 +1,301 @@
|
|||
#
|
||||
# Copyright (C) 2026 Amlogic, Inc. All rights reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http:#www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import numpy as np
|
||||
import os
|
||||
import glob
|
||||
import argparse
|
||||
import cv2
|
||||
from pathlib import Path
|
||||
from amlnnlite.api import AMLNNLite
|
||||
import math
|
||||
|
||||
def preprocess(img_path, detections, new_shape=(256, 256), data_format='NCHW', s=0.003921568859368563, zp=-128):
|
||||
original_img = cv2.imread(str(img_path))
|
||||
if original_img is None:
|
||||
raise ValueError(f"can't read image: {img_path}")
|
||||
|
||||
im_h, im_w, _ = original_img.shape
|
||||
if detections.shape[0] > 0:
|
||||
detections = detections[:1, :]
|
||||
else:
|
||||
raise ValueError("No detections input, please run blazepose_detect and generate the detections first.")
|
||||
|
||||
x_center, y_center = detections[4:6]
|
||||
x_scale, y_scale = detections[6:8]
|
||||
print(f"---------center {x_center}, {y_center}, x_scale {x_scale}, y_scale {y_scale}")
|
||||
|
||||
box_size = (((x_scale - x_center) ** 2 + (y_scale - y_center) ** 2) ** 0.5) * 2
|
||||
box_size *= 1.25
|
||||
|
||||
angle = (np.pi * 90 / 180) - math.atan2(-(y_scale - y_center), x_scale - x_center)
|
||||
rotation = angle - 2 * np.pi * np.floor((angle - (-np.pi)) / (2 * np.pi))
|
||||
|
||||
rotated_rect = ((x_center, y_center), (box_size, box_size), rotation * 180. / np.pi)
|
||||
pts1 = cv2.boxPoints(rotated_rect)
|
||||
|
||||
h, w = new_shape
|
||||
pts2 = np.float32([[0, h], [0, 0], [w, 0], [w, h]])
|
||||
M = cv2.getPerspectiveTransform(pts1, pts2)
|
||||
processed_img = cv2.warpPerspective(original_img, M, (w, h), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REPLICATE)
|
||||
|
||||
rgb_img = cv2.cvtColor(processed_img, cv2.COLOR_BGR2RGB)
|
||||
normalized_img = rgb_img.astype(np.float32) / 255.0
|
||||
|
||||
if data_format == 'NCHW':
|
||||
# HWC -> CHW -> BCHW (ONNX default format)
|
||||
input_tensor = np.transpose(normalized_img, (2, 0, 1))
|
||||
input_tensor = np.expand_dims(input_tensor, axis=0)
|
||||
elif data_format == 'NHWC':
|
||||
# HWC -> BHWC (TFLITE default format)
|
||||
input_tensor = np.expand_dims(normalized_img, axis=0)
|
||||
else:
|
||||
raise ValueError(f"Unsupported data format: {data_format}. Only 'NCHW' and 'NHWC' are supported.")
|
||||
|
||||
# Quantize to int16
|
||||
input_tensor = np.round(input_tensor / s + zp).astype(np.int16)
|
||||
|
||||
return input_tensor, original_img, [x_center, y_center, rotation, box_size]
|
||||
|
||||
def tensor_to_landmark(landmarks):
|
||||
num_landmarks = 39
|
||||
num_dimensions = landmarks.shape[1] # num_landmarks
|
||||
output = landmarks.reshape(-1, num_landmarks, num_dimensions).copy()
|
||||
if num_dimensions > 3:
|
||||
output[..., 3:5] = 1.0 / (1.0 + np.exp(-output[..., 3:5]))
|
||||
return output
|
||||
|
||||
def refine_landmark(landmarks, heatmap):#39*5, 64*64*39
|
||||
min_confidence = 0.5
|
||||
kernel_size = 9
|
||||
offset = kernel_size
|
||||
|
||||
hm_h, hm_w, _ = heatmap.shape
|
||||
|
||||
for i, lm in enumerate(landmarks):
|
||||
col = int(lm[0] * hm_w)
|
||||
row = int(lm[1] * hm_h)
|
||||
|
||||
if not (0 <= col < hm_w and 0 <= row < hm_h):
|
||||
continue
|
||||
|
||||
c0 = max(0, col - offset)
|
||||
c1 = min(hm_w, col + offset + 1)
|
||||
r0 = max(0, row - offset)
|
||||
r1 = min(hm_h, row + offset + 1)
|
||||
|
||||
val_sum = 0.0
|
||||
weighted_col = 0.0
|
||||
weighted_row = 0.0
|
||||
max_conf = 0.0
|
||||
|
||||
for r in range(r0, r1):
|
||||
for c in range(c0, c1):
|
||||
conf = 1.0 / (1.0 + np.exp(-heatmap[r, c, i]))
|
||||
val_sum += conf
|
||||
max_conf = max(max_conf, conf)
|
||||
weighted_col += c * conf
|
||||
weighted_row += r * conf
|
||||
|
||||
if max_conf >= min_confidence and val_sum > 0:
|
||||
lm[0] = weighted_col / (hm_w * val_sum)
|
||||
lm[1] = weighted_row / (hm_h * val_sum)
|
||||
|
||||
return landmarks
|
||||
|
||||
def postprocess(outputs, params, data_format='NCHW'):
|
||||
x_center, y_center, rotation, box_size = params
|
||||
flag, landmark_tensor, world_landmark_tensor, segment, heatmap_tensor = [], [], [], [], []
|
||||
for out in outputs:
|
||||
if len(out.shape) == 2:
|
||||
if out.shape == (1, 1) or out.shape == (1,):
|
||||
flag = out
|
||||
elif out.shape[1] == 195:
|
||||
landmark_tensor = out
|
||||
elif out.shape[1] == 117:
|
||||
world_landmark_tensor = out
|
||||
elif len(out.shape) == 4 and out.shape[3] == 1 and out.shape[1] == 256:
|
||||
segment = out
|
||||
elif len(out.shape) == 4 and out.shape[1] == 64 and out.shape[3] == 39:
|
||||
heatmap_tensor = out
|
||||
|
||||
raw_landmarks = tensor_to_landmark(landmark_tensor)
|
||||
all_world_landmarks = tensor_to_landmark(world_landmark_tensor)
|
||||
|
||||
h = w = 256
|
||||
raw_landmarks[:, :, 0] = raw_landmarks[:, :, 0] / w
|
||||
raw_landmarks[:, :, 1] = raw_landmarks[:, :, 1] / h
|
||||
raw_landmarks[:, :, 2] = raw_landmarks[:, :, 2] / w
|
||||
|
||||
# Refines landmarks with the heatmap tensor.
|
||||
all_landmarks = refine_landmark(raw_landmarks[0], heatmap_tensor[0])
|
||||
all_world_landmarks = all_world_landmarks[0]
|
||||
|
||||
print(f"rotation {rotation}")
|
||||
cosa = math.cos(rotation)
|
||||
sina = math.sin(rotation)
|
||||
for landmark in all_landmarks:
|
||||
x = landmark[0] - 0.5
|
||||
y = landmark[1] - 0.5
|
||||
landmark[0] = ((cosa * x - sina * y) * box_size + x_center)
|
||||
landmark[1] = ((sina * x + cosa * y) * box_size + y_center)
|
||||
landmark[2] = landmark[2] * box_size
|
||||
|
||||
# Projects the world landmarks from the letterboxed ROI to the full image.
|
||||
for landmark in all_world_landmarks:
|
||||
x = landmark[0]
|
||||
y = landmark[1]
|
||||
landmark[0] = cosa * x - sina * y
|
||||
landmark[1] = sina * x + cosa * y
|
||||
|
||||
return all_landmarks
|
||||
|
||||
def get_class_color(class_id):
|
||||
import colorsys
|
||||
hue = (class_id * 137.508) % 360
|
||||
rgb = colorsys.hsv_to_rgb(hue/360.0, 0.8, 0.9)
|
||||
bgr = (int(rgb[2]*255), int(rgb[1]*255), int(rgb[0]*255))
|
||||
return bgr
|
||||
|
||||
POSE_CONNECTIONS = [
|
||||
# Face
|
||||
(0, 1),(1, 2),(2, 3),(3, 7),
|
||||
(0, 4),(4, 5),(5, 6),(6, 8),
|
||||
# Mouth
|
||||
(9, 10),
|
||||
# Shoulders
|
||||
(11, 12),
|
||||
# Right arm
|
||||
(11, 13), (13, 15), (15, 17), (15, 19), (15, 21), (17, 19),
|
||||
# Left arm
|
||||
(12, 14), (14, 16), (16, 18), (16, 20), (16, 22), (18, 20),
|
||||
# Torso
|
||||
(11, 23), (12, 24), (23, 24),
|
||||
# Right leg
|
||||
(23, 25), (25, 27), (27, 29), (27, 31), (29, 31),
|
||||
# Left leg
|
||||
(24, 26), (26, 28), (28, 30), (28, 32), (30, 32)
|
||||
]
|
||||
|
||||
def draw_landmarks(img, landmarks, save_path, score_threshold=0.5):
|
||||
result_img = img.copy()
|
||||
for lm in landmarks:
|
||||
lms = lm.landmarks
|
||||
|
||||
for point in lms:
|
||||
x, y, score = int(point[0]), int(point[1]), point[3]
|
||||
if score < score_threshold:
|
||||
continue
|
||||
cv2.circle(result_img, (x, y), 3, (0, 255, 0), -1)
|
||||
|
||||
for i0, i1 in POSE_CONNECTIONS:
|
||||
if i0 >= len(lms) or i1 >= len(lms):
|
||||
continue
|
||||
if lms[i0][3] < score_threshold or lms[i1][3] < score_threshold:
|
||||
continue
|
||||
p0 = (int(lms[i0][0]), int(lms[i0][1]))
|
||||
p1 = (int(lms[i1][0]), int(lms[i1][1]))
|
||||
cv2.line(result_img, p0, p1, (255, 0, 0), 2)
|
||||
|
||||
cv2.imwrite(save_path, result_img)
|
||||
return result_img
|
||||
|
||||
def read_detections_from_txt(txt_path):
|
||||
with open(txt_path, "r") as f:
|
||||
detections = [[float(x) for x in line.split()] for line in f if line.strip()]
|
||||
return np.array(detections, dtype=np.float32)
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument('--model-path', default='./blazepose_landmark_int8_A311D2.adla')
|
||||
parser.add_argument('--run-cycles', default= 1, type=int)
|
||||
args = parser.parse_args()
|
||||
|
||||
# Initialize AMLNNLite
|
||||
amlnn = AMLNNLite()
|
||||
amlnn.config(
|
||||
model_path=args.model_path, # Model file path, Support ADLA and quantized TFlite models
|
||||
run_cycles=args.run_cycles
|
||||
)
|
||||
amlnn.init()
|
||||
|
||||
# Find all image files in the 01_export_model directory
|
||||
image_dir = "./"
|
||||
image_extensions = ["*.jpg", "*.jpeg", "*.png", "*.bmp"]
|
||||
image_files = []
|
||||
for ext in image_extensions:
|
||||
image_files.extend(glob.glob(os.path.join(image_dir, ext)))
|
||||
image_files.extend(glob.glob(os.path.join(image_dir, ext.upper())))
|
||||
|
||||
if not image_files:
|
||||
print("No image files found in", image_dir)
|
||||
amlnn.uninit()
|
||||
return
|
||||
|
||||
print(f"Found {len(image_files)} image files to process:")
|
||||
for img_file in image_files:
|
||||
print(f" - {os.path.basename(img_file)}")
|
||||
print()
|
||||
|
||||
# Process each image
|
||||
for i, image_path in enumerate(image_files, 1):
|
||||
txt_path = os.path.splitext(image_path)[0] + ".txt"
|
||||
detections = read_detections_from_txt(txt_path=txt_path)
|
||||
print(f"=" * 60)
|
||||
print(f"Processing image {i}/{len(image_files)}: {os.path.basename(image_path)}")
|
||||
print(f"=" * 60)
|
||||
|
||||
try:
|
||||
# Preprocess input
|
||||
input_tensor, original_img, params = preprocess(image_path, detections, new_shape=(256, 256), data_format='NHWC', s=0.000030518509447574615, zp=0)
|
||||
|
||||
# Run inference
|
||||
outputs = amlnn.inference(inputs=[input_tensor])
|
||||
|
||||
# Postprocess results
|
||||
landmarks = postprocess(outputs, params, data_format='NHWC')
|
||||
|
||||
# Print detection results
|
||||
if landmarks:
|
||||
print(f" Detected {len(landmarks)} objects:")
|
||||
for i, lm in enumerate(landmarks, 1):
|
||||
print(f" {i}. {lm['class_name']} ({lm['confidence']:.2f})")
|
||||
else:
|
||||
print(" No objects detected")
|
||||
|
||||
# Save result image
|
||||
model_name = Path(args.model_path).stem
|
||||
result_dir = f"{model_name}_result"
|
||||
os.makedirs(result_dir, exist_ok=True)
|
||||
img_name = Path(image_path).stem
|
||||
save_path = os.path.join(result_dir, f"{img_name}_result.jpg")
|
||||
draw_landmarks(original_img, landmarks, str(save_path), score_threshold=0.5)
|
||||
print(f" Result saved to: {save_path}")
|
||||
|
||||
except Exception as e:
|
||||
print(f"Error processing {os.path.basename(image_path)}: {e}")
|
||||
|
||||
print()
|
||||
|
||||
# Optional visualization
|
||||
amlnn.visualize()
|
||||
|
||||
# Release resources
|
||||
amlnn.uninit()
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
|
||||
BIN
examples/blazepose_landmark/result.jpg
Executable file
BIN
examples/blazepose_landmark/result.jpg
Executable file
Binary file not shown.
|
After Width: | Height: | Size: 468 KiB |
Loading…
Add table
Add a link
Reference in a new issue