Chengfang Lu e72bc061c5 feat: HSAP platform v2 — modular navigation, quality review, audit log, world model simulation

Major changes:
- New frontend (platform/web/): Vite + React 18 + TypeScript + Tailwind
- 4-module navigation: Data Labeling Submission / Model Management / Fleet Management / System Administration
- Data catalog with charts (DMS/ADAS/Lane 3-tab view)
- Quality review workflow (annotation QA): Good/Fine/Bad scoring with auto-advance
- Audit enhancements: batch operations, rejection categories, Feishu notifications
- Operation audit log
- World model simulation studio
- Dataset version management with snapshots and diff
- ADAS 7-class dataset integration (138K images organized + compressed)
- User management with Feishu integration and pagination
- CRUD/search/filter on all pages, card layout redesign
- PIL-optimized image overlay rendering
- Auto-snapshot on build, in_review workflow stage
- Removed embedded algorithm code (now in workspace)

2026-06-03 11:40:21 +08:00

🚀 YOLO26 RKNN Export Adaptation

⚡ Optimized for Rockchip NPU Performance

This repository includes optimized RKNN export support for YOLO26 models, designed for high-performance inference on Rockchip NPU devices.

✨ Key Features

🎯 Raw Output Export: Models export without post-processing (no NMS, no sigmoid, no decode)
⚡ CPU Post-processing: Move decode/NMS operations to CPU for better NPU utilization
🔧 Multi-task Support: Works with Detection, Segmentation, OBB, and Pose models
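
Because the exported model contains no NMS, the host application has to suppress overlapping boxes itself. The following is a minimal NumPy sketch of greedy NMS, not the code shipped in this repository; the function name `nms` and the default `iou_thres=0.45` are illustrative assumptions.

```python
import numpy as np


def nms(boxes, scores, iou_thres=0.45):
    """Greedy non-maximum suppression (illustrative sketch).

    boxes:  array of shape [N, 4] in (x1, y1, x2, y2) format
    scores: array of shape [N]
    Returns the indices of the kept boxes, highest score first.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(scores)[::-1]  # candidate indices, best score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the current best box with every remaining box
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter + 1e-9)
        # Drop every remaining box that overlaps the kept box too much
        order = rest[iou <= iou_thres]
    return keep
```

In practice this would typically be run per class, or on boxes offset by class ID, so that objects of different classes do not suppress each other.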

📋 Export Format

Detection Model Output Structure:

Input:  images [1, 3, 640, 640]

Outputs (6 tensors for 3 detection heads):
├─ output0_reg [1, 4*reg_max, 80, 80]  # Head 0 regression (raw DFL output)
├─ output0_cls [1, nc, 80, 80]         # Head 0 classification (raw logits)
├─ output1_reg [1, 4*reg_max, 40, 40]  # Head 1 regression
├─ output1_cls [1, nc, 40, 40]         # Head 1 classification
├─ output2_reg [1, 4*reg_max, 20, 20]  # Head 2 regression
└─ output2_cls [1, nc, 20, 20]         # Head 2 classification
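
The raw tensors above still need to be decoded on the CPU. The sketch below shows one way to decode a single head with NumPy. It rests on several assumptions that the source does not confirm: strides of 8, 16, and 32 (inferred from 640/80, 640/40, and 640/20), a 0.5 cell-center offset, regression values interpreted as left/top/right/bottom distances in grid units, and a DFL expectation step only when `reg_max > 1`. The name `decode_head` and the `conf_thres` default are hypothetical.

```python
import numpy as np


def decode_head(reg, cls, stride, conf_thres=0.25):
    """Decode one raw head (illustrative sketch, assumptions in the lead-in).

    reg: [1, 4*reg_max, H, W] raw regression output
    cls: [1, nc, H, W] raw class logits
    Returns (boxes [M, 4] as x1, y1, x2, y2 in input pixels, scores [M], class_ids [M]).
    """
    _, channels, h, w = reg.shape
    reg_max = channels // 4
    reg = reg[0].reshape(4, reg_max, h * w)
    if reg_max > 1:
        # Assumed DFL step: softmax over the bins, then the expected bin index
        e = np.exp(reg - reg.max(axis=1, keepdims=True))
        prob = e / e.sum(axis=1, keepdims=True)
        bins = np.arange(reg_max).reshape(1, reg_max, 1)
        dist = (prob * bins).sum(axis=1)  # [4, H*W]
    else:
        # Assumed: with reg_max == 1 the raw values are already distances
        dist = reg[:, 0, :]

    # Cell centers in grid units, flattened row by row
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    cx = xs.reshape(-1) + 0.5
    cy = ys.reshape(-1) + 0.5

    # Distances (left, top, right, bottom) to corner coordinates in pixels
    boxes = np.stack(
        [(cx - dist[0]) * stride, (cy - dist[1]) * stride,
         (cx + dist[2]) * stride, (cy + dist[3]) * stride],
        axis=1,
    )

    # Sigmoid on the raw logits, then keep the best class per cell
    scores = 1.0 / (1.0 + np.exp(-cls[0].reshape(cls.shape[1], -1)))
    class_ids = scores.argmax(axis=0)
    best = scores.max(axis=0)
    keep = best > conf_thres
    return boxes[keep], best[keep], class_ids[keep]
```

A full pipeline would call this once per head (for example with strides 8, 16, and 32), concatenate the results, and then apply NMS on the CPU.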

🔨 Usage

Step 1: Export ONNX Model

# Export YOLO26 model to RKNN-compatible ONNX format
yolo export model=yolo26n.pt format=rknn

Step 2: Convert to RKNN Model

The rknn_export/ directory in this repository contains complete RKNN conversion tools:

convert.py: Conversion script from ONNX to RKNN
datasets/: Quantization calibration dataset

Environment Setup

⚠️ Important: It's recommended to create a new virtual environment, as some dependencies of rknn-toolkit2 conflict with ultralytics

# Install RKNN-Toolkit2
pip install -U rknn-toolkit2

Using the Conversion Script

View help information:

python rknn_export/convert.py -h

Required Arguments:

--model-path: Path to ONNX model file (.onnx file exported in Step 1)
--platform: Target platform, options:
- rk3562, rk3566, rk3568, rk3576, rk3588
- rv1126b, rv1109, rv1126, rk1808

Optional Arguments:

--dtype: Quantization data type (default: i8)
- i8 or fp: For rk3562, rk3566, rk3568, rk3576, rk3588, rv1126b
- u8 or fp: For rv1109, rv1126, rk1808
--rknn-path: Output path for RKNN model (default: ./<model_name>.rknn)
--data-path: Path to quantization calibration dataset (default: datasets/COCO/coco_subset_20.txt)
- For custom data, prepare a txt file containing image paths
--batch-size: Batch size (default: 1)
- Can be adjusted to match the number of NPU cores (e.g., RK3588 has 3 cores, so it can be set to 3)
- ⚠️ Note: The batch size is fixed at conversion time and determines the model's output dimensions
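
The platform and dtype rules listed above can be checked before starting a conversion. The helper below is a hypothetical sketch that mirrors those rules and the documented default output path; it is not taken from convert.py, and the function names are illustrative.

```python
from pathlib import Path

# Platform groups as listed under --dtype above
I8_PLATFORMS = {"rk3562", "rk3566", "rk3568", "rk3576", "rk3588", "rv1126b"}
U8_PLATFORMS = {"rv1109", "rv1126", "rk1808"}


def allowed_dtypes(platform):
    """Return the dtype values documented for a platform."""
    if platform in I8_PLATFORMS:
        return ("i8", "fp")
    if platform in U8_PLATFORMS:
        return ("u8", "fp")
    raise ValueError(f"unsupported platform: {platform}")


def check_args(platform, dtype):
    """Raise ValueError if the dtype is not documented for the platform."""
    if dtype not in allowed_dtypes(platform):
        raise ValueError(
            f"dtype {dtype!r} is not valid for {platform}; "
            f"expected one of {allowed_dtypes(platform)}"
        )


def default_rknn_path(model_path):
    """Mirror the documented default: ./<model_name>.rknn"""
    return f"./{Path(model_path).stem}.rknn"
```

For example, `check_args("rk3588", "i8")` passes silently, while `check_args("rv1126", "i8")` raises because that platform is documented with `u8` or `fp`.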

Example Commands

# Basic conversion (RK3588 platform, INT8 quantization)
python rknn_export/convert.py \
  --model-path best.onnx \
  --platform rk3588 \
  --dtype i8

# Specify output path and quantization dataset
python rknn_export/convert.py \
  --model-path yolo26n.onnx \
  --platform rk3588 \
  --dtype i8 \
  --rknn-path ./models/yolo26n_rk3588.rknn \
  --data-path ./my_dataset/images.txt

# Multi-core batch processing (RK3588)
python rknn_export/convert.py \
  --model-path best.onnx \
  --platform rk3588 \
  --dtype i8 \
  --batch-size 3
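
The file passed to --data-path is a text file of image paths. One way to generate such a file for a custom dataset is sketched below. The one-path-per-line layout, the use of absolute paths, and the set of image suffixes are assumptions, and `write_image_list` is a hypothetical helper that is not part of this repository.

```python
from pathlib import Path

# Assumed set of image extensions to include
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def write_image_list(image_dir, list_path, limit=None):
    """Write one absolute image path per line and return the number written."""
    paths = sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if limit is not None:
        paths = paths[:limit]  # a small subset is often enough for calibration
    Path(list_path).write_text("".join(f"{p.resolve()}\n" for p in paths))
    return len(paths)
```

Calling `write_image_list("./my_dataset", "./my_dataset/images.txt", limit=20)` would produce a list comparable in size to the default 20-image COCO subset.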

Upon completion, the script prints:

rknn model saved to: ./best.rknn

For more deployment examples, refer to: RKNN Model Zoo

📝 Implementation Details

Modified Files

ultralytics/engine/exporter.py: Enhanced export_rknn() method
- Uses optimal ONNX opset version
- Embeds all weights in single file
- Sets meaningful output tensor names
ultralytics/nn/modules/head.py: Updated Detect, Segment, OBB, Pose classes
- Added RKNN-specific forward logic
- Returns raw predictions without activation functions
ultralytics/nn/autobackend.py: Added RKNN inference support notes

Training & Inference

✅ Training: Not affected - all modifications only apply during export
✅ Standard Export: Other export formats (ONNX, TensorRT, etc.) work as before
✅ RKNN Export: Special handling only when format=rknn
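
The export-only behavior described above can be pictured with a toy example. The class below is a hypothetical stand-in written in NumPy, not the actual Ultralytics `Detect` implementation; the `export` and `format` attribute names are assumptions made for illustration.

```python
import numpy as np


class ToyDetectHead:
    """Hypothetical stand-in for a detection head (not the Ultralytics class)."""

    def __init__(self):
        self.export = False  # assumed flag: set while a model is being exported
        self.format = None   # assumed attribute: target export format name

    def forward(self, reg_raw, cls_logits):
        if self.export and self.format == "rknn":
            # RKNN export path: return raw predictions, no activation applied
            return reg_raw, cls_logits
        # Every other path: apply sigmoid to the class logits as usual
        return reg_raw, 1.0 / (1.0 + np.exp(-cls_logits))
```

With both flags at their defaults, as during training or a standard export, the head behaves as it normally would; only the combination `export=True` and `format="rknn"` switches to raw outputs.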

🎯 Performance Benefits

Faster Inference: For these models, running post-processing (decode, NMS) on the CPU is faster than running it on the NPU
Better NPU Utilization: NPU focuses on backbone and head computations
Flexible Deployment: Easy to customize post-processing logic

Ultralytics creates cutting-edge, state-of-the-art (SOTA) YOLO models built on years of foundational research in computer vision and AI. Constantly updated for performance and flexibility, our models are fast, accurate, and easy to use. They excel at object detection, tracking, instance segmentation, image classification, and pose estimation tasks.

Find detailed documentation in the Ultralytics Docs. Get support via GitHub Issues. Join discussions on Discord, Reddit, and the Ultralytics Community Forums!

Request an Enterprise License for commercial use at Ultralytics Licensing.

📄 Documentation

See below for quickstart installation and usage examples. For comprehensive guidance on training, validation, prediction, and deployment, refer to our full Ultralytics Docs.

Install

Install the ultralytics package, including all requirements, in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

For alternative installation methods, including Conda, Docker, and building from source via Git, please consult the Quickstart Guide.

Usage

CLI

You can use Ultralytics YOLO directly from the Command Line Interface (CLI) with the yolo command:

# Predict using a pretrained YOLO model (e.g., YOLO26n) on an image
yolo predict model=yolo26n.pt source='https://ultralytics.com/images/bus.jpg'

The yolo command supports various tasks and modes, accepting additional arguments like imgsz=640. Explore the YOLO CLI Docs for more examples.

Python

Ultralytics YOLO can also be integrated directly into your Python projects. It accepts the same configuration arguments as the CLI:

from ultralytics import YOLO

# Load a pretrained YOLO26n model
model = YOLO("yolo26n.pt")

# Train the model on the COCO8 dataset for 100 epochs
train_results = model.train(
    data="coco8.yaml",  # Path to dataset configuration file
    epochs=100,  # Number of training epochs
    imgsz=640,  # Image size for training
    device="cpu",  # Device to run on (e.g., 'cpu', 0, [0,1,2,3])
)

# Evaluate the model's performance on the validation set
metrics = model.val()

# Perform object detection on an image
results = model("path/to/image.jpg")  # Predict on an image
results[0].show()  # Display results

# Export the model to ONNX format for deployment
path = model.export(format="onnx")  # Returns the path to the exported model

Discover more examples in the YOLO Python Docs.

✨ Models

Ultralytics supports a wide range of YOLO models, from early versions like YOLOv3 to the latest YOLO26. The tables below showcase YOLO26 models pretrained on the COCO dataset for Detection, Segmentation, and Pose Estimation. Additionally, Classification models pretrained on the ImageNet dataset are available. Tracking mode is compatible with all Detection, Segmentation, and Pose models. All models are automatically downloaded from the latest Ultralytics release upon first use.

Detection (COCO)

Explore the Detection Docs for usage examples. These models are trained on the COCO dataset, featuring 80 object classes.

Model	Size (pixels)	mAP val 50-95	mAP val 50-95 (e2e)	Speed CPU ONNX (ms)	Speed T4 TensorRT10 (ms)	Params (M)	FLOPs (B)
YOLO26n	640	40.9	40.1	38.9 ± 0.7	1.7 ± 0.0	2.4	5.4
YOLO26s	640	48.6	47.8	87.2 ± 0.9	2.5 ± 0.0	9.5	20.7
YOLO26m	640	53.1	52.5	220.0 ± 1.4	4.7 ± 0.1	20.4	68.2
YOLO26l	640	55.0	54.4	286.2 ± 2.0	6.2 ± 0.2	24.8	86.4
YOLO26x	640	57.5	56.9	525.8 ± 4.0	11.8 ± 0.2	55.7	193.9

mAP val values refer to single-model single-scale performance on the COCO val2017 dataset. See YOLO Performance Metrics for details.
Reproduce with yolo val detect data=coco.yaml device=0
Speed metrics are averaged over COCO val images using an Amazon EC2 P4d instance. CPU speeds measured with ONNX export. GPU speeds measured with TensorRT export.
Reproduce with yolo val detect data=coco.yaml batch=1 device=0|cpu

Segmentation (COCO)

Refer to the Segmentation Docs for usage examples. These models are trained on COCO-Seg, including 80 classes.

Model	Size (pixels)	mAP box 50-95 (e2e)	mAP mask 50-95 (e2e)	Speed CPU ONNX (ms)	Speed T4 TensorRT10 (ms)	Params (M)	FLOPs (B)
YOLO26n-seg	640	39.6	33.9	53.3 ± 0.5	2.1 ± 0.0	2.7	9.1
YOLO26s-seg	640	47.3	40.0	118.4 ± 0.9	3.3 ± 0.0	10.4	34.2
YOLO26m-seg	640	52.5	44.1	328.2 ± 2.4	6.7 ± 0.1	23.6	121.5
YOLO26l-seg	640	54.4	45.5	387.0 ± 3.7	8.0 ± 0.1	28.0	139.8
YOLO26x-seg	640	56.5	47.0	787.0 ± 6.8	16.4 ± 0.1	62.8	313.5

mAP val values are for single-model single-scale on the COCO val2017 dataset. See YOLO Performance Metrics for details.
Reproduce with yolo val segment data=coco.yaml device=0
Speed metrics are averaged over COCO val images using an Amazon EC2 P4d instance. CPU speeds measured with ONNX export. GPU speeds measured with TensorRT export.
Reproduce with yolo val segment data=coco.yaml batch=1 device=0|cpu

Classification (ImageNet)

Consult the Classification Docs for usage examples. These models are trained on ImageNet, covering 1000 classes.

Model	Size (pixels)	Acc top1	Acc top5	Speed CPU ONNX (ms)	Speed T4 TensorRT10 (ms)	Params (M)	FLOPs (B) at 224
YOLO26n-cls	224	71.4	90.1	5.0 ± 0.3	1.1 ± 0.0	2.8	0.5
YOLO26s-cls	224	76.0	92.9	7.9 ± 0.2	1.3 ± 0.0	6.7	1.6
YOLO26m-cls	224	78.1	94.2	17.2 ± 0.4	2.0 ± 0.0	11.6	4.9
YOLO26l-cls	224	79.0	94.6	23.2 ± 0.3	2.8 ± 0.0	14.1	6.2
YOLO26x-cls	224	79.9	95.0	41.4 ± 0.9	3.8 ± 0.0	29.6	13.6

acc values represent model accuracy on the ImageNet dataset validation set.
Reproduce with yolo val classify data=path/to/ImageNet device=0
Speed metrics are averaged over ImageNet val images using an Amazon EC2 P4d instance. CPU speeds measured with ONNX export. GPU speeds measured with TensorRT export.
Reproduce with yolo val classify data=path/to/ImageNet batch=1 device=0|cpu

Pose (COCO)

See the Pose Estimation Docs for usage examples. These models are trained on COCO-Pose, focusing on the 'person' class.

Model	Size (pixels)	mAP pose 50-95 (e2e)	mAP pose 50 (e2e)	Speed CPU ONNX (ms)	Speed T4 TensorRT10 (ms)	Params (M)	FLOPs (B)
YOLO26n-pose	640	57.2	83.3	40.3 ± 0.5	1.8 ± 0.0	2.9	7.5
YOLO26s-pose	640	63.0	86.6	85.3 ± 0.9	2.7 ± 0.0	10.4	23.9
YOLO26m-pose	640	68.8	89.6	218.0 ± 1.5	5.0 ± 0.1	21.5	73.1
YOLO26l-pose	640	70.4	90.5	275.4 ± 2.4	6.5 ± 0.1	25.9	91.3
YOLO26x-pose	640	71.6	91.6	565.4 ± 3.0	12.2 ± 0.2	57.6	201.7

mAP val values are for single-model single-scale on the COCO Keypoints val2017 dataset. See YOLO Performance Metrics for details.
Reproduce with yolo val pose data=coco-pose.yaml device=0
Speed metrics are averaged over COCO val images using an Amazon EC2 P4d instance. CPU speeds measured with ONNX export. GPU speeds measured with TensorRT export.
Reproduce with yolo val pose data=coco-pose.yaml batch=1 device=0|cpu

Oriented Bounding Boxes (DOTAv1)

Check the OBB Docs for usage examples. These models are trained on DOTAv1, including 15 classes.

Model	Size (pixels)	mAP test 50-95 (e2e)	mAP test 50 (e2e)	Speed CPU ONNX (ms)	Speed T4 TensorRT10 (ms)	Params (M)	FLOPs (B)
YOLO26n-obb	1024	52.4	78.9	97.7 ± 0.9	2.8 ± 0.0	2.5	14.0
YOLO26s-obb	1024	54.8	80.9	218.0 ± 1.4	4.9 ± 0.1	9.8	55.1
YOLO26m-obb	1024	55.3	81.0	579.2 ± 3.8	10.2 ± 0.3	21.2	183.3
YOLO26l-obb	1024	56.2	81.6	735.6 ± 3.1	13.0 ± 0.2	25.6	230.0
YOLO26x-obb	1024	56.7	81.7	1485.7 ± 11.5	30.5 ± 0.9	57.6	516.5

mAP test values are for single-model multiscale performance on the DOTAv1 test set.
Reproduce with yolo val obb data=DOTAv1.yaml device=0 split=test and submit merged results to the DOTA evaluation server.
Speed metrics are averaged over DOTAv1 val images using an Amazon EC2 P4d instance. CPU speeds measured with ONNX export. GPU speeds measured with TensorRT export.
Reproduce with yolo val obb data=DOTAv1.yaml batch=1 device=0|cpu

🧩 Integrations

Our key integrations with leading AI platforms extend the functionality of Ultralytics' offerings, enhancing tasks like dataset labeling, training, visualization, and model management. Discover how Ultralytics, in collaboration with partners like Weights & Biases, Comet ML, Roboflow, and Intel OpenVINO, can optimize your AI workflow. Explore more at Ultralytics Integrations.

Ultralytics Platform 🌟	Weights & Biases	Comet	Neural Magic
Streamline YOLO workflows: Label, train, and deploy effortlessly with Ultralytics Platform. Try now!	Track experiments, hyperparameters, and results with Weights & Biases.	Free forever, Comet ML lets you save YOLO models, resume training, and interactively visualize predictions.	Run YOLO inference up to 6x faster with Neural Magic DeepSparse.

🤝 Contribute

We thrive on community collaboration! Ultralytics YOLO wouldn't be the SOTA framework it is without contributions from developers like you. Please see our Contributing Guide to get started. We also welcome your feedback—share your experience by completing our Survey. A huge Thank You 🙏 to everyone who contributes!

We look forward to your contributions to help make the Ultralytics ecosystem even better!

📜 License

Ultralytics offers two licensing options to suit different needs:

AGPL-3.0 License: This OSI-approved open-source license is perfect for students, researchers, and enthusiasts. It encourages open collaboration and knowledge sharing. See the LICENSE file for full details.
Ultralytics Enterprise License: Designed for commercial use, this license allows for the seamless integration of Ultralytics software and AI models into commercial products and services, bypassing the open-source requirements of AGPL-3.0. If your use case involves commercial deployment, please contact us via Ultralytics Licensing.

📞 Contact

For bug reports and feature requests related to Ultralytics software, please visit GitHub Issues. For questions, discussions, and community support, join our active communities on Discord, Reddit, and the Ultralytics Community Forums. We're here to help with all things Ultralytics!
