2026-06-24 09:35:46 +08:00

6.5 KiB

Executable File

Two-ROI Exported Model Inference

tools/model_inference contains a self-contained inference pipeline for the exported two-ROI ONNX or TorchScript model.

Layout

run_two_roi_exported_onnx_infer.py: Compatibility entry point kept at the original path.
core/: Core inference pipeline, decode logic, geometry helpers, and shared types.
adapters/: Input-source adapters for video directories, PDCL clip exports, and event-id resolution.
scripts/: Shell launchers grouped by usage mode.
data_tools/: Small preprocessing helpers for CSV/XLSX conversion.
docs/: Design notes and usage background documents.
examples/: Sample JSON/CSV/XLSX/txt inputs used by the helper scripts.

Files

core/run_two_roi_exported_onnx_infer.py: Main implementation. Reads one clip-export directory, runs two-ROI ONNX or TorchScript inference, decodes 2D/3D results, and saves visualizations plus predictions.json.
core/two_roi_infer_utils.py: Minimal local utilities for ROI crop, calibration handling, 2D decode, top-k selection, and common serialization helpers.
core/two_roi_3d_utils.py: Minimal local 3D geometry, projection, yaw decoding, and 3D drawing helpers.
scripts/run_two_roi_exported_onnx_infer.sh: Example shell wrapper.

External Dependencies

This package does not depend on ultralytics at runtime.

Required Python packages:

numpy
opencv-python
pyyaml
onnxruntime for .onnx models
torch for .torchscript models
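
The split between onnxruntime and torch can be checked before a run. The following is a minimal sketch, not part of the package: required_backend and missing_packages are hypothetical helper names, and the import names cv2 and yaml are the modules installed by opencv-python and pyyaml.

```python
import importlib.util
from pathlib import Path

# Suffix-to-package mapping taken from the dependency list above.
BACKEND_BY_SUFFIX = {".onnx": "onnxruntime", ".torchscript": "torch"}

# Import names of the always-required packages:
# opencv-python is imported as cv2, pyyaml as yaml.
COMMON_IMPORTS = ("numpy", "cv2", "yaml")


def required_backend(model_path):
    """Return the runtime package needed for the given exported model file."""
    suffix = Path(model_path).suffix.lower()
    if suffix not in BACKEND_BY_SUFFIX:
        raise ValueError(f"unsupported model suffix: {suffix!r}")
    return BACKEND_BY_SUFFIX[suffix]


def missing_packages(model_path):
    """List import names that cannot be found in the current environment."""
    needed = COMMON_IMPORTS + (required_backend(model_path),)
    return [name for name in needed if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print(required_backend("merged_model.onnx"))
    print(missing_packages("merged_model.onnx"))
```

Only the backend that matches the model suffix is checked, so an ONNX-only deployment does not need torch installed.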

Expected Input Layout

The script can take a clip-export directory directly.

Expected structure:

clip_export_xxx/
├── images/
│   ├── *.png
│   └── ...
├── calib/
│   └── L2_calib/
│       └── camera4.json
├── manifest.json
└── calib_summary.json

The script automatically reads:

images from images/
calibration from calib/L2_calib/camera4.json or calib/camera4.json
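
The lookup described above can be sketched as follows. This is an illustration only: find_calibration and list_frames are hypothetical names, and trying calib/L2_calib/camera4.json before calib/camera4.json is an assumed order, not one confirmed by the script.

```python
from pathlib import Path

# Candidate calibration locations, relative to the clip-export directory.
# The preference order is an assumption made for this sketch.
CALIB_CANDIDATES = ("calib/L2_calib/camera4.json", "calib/camera4.json")


def find_calibration(case_dir):
    """Return the first calibration file that exists under case_dir."""
    case = Path(case_dir)
    for rel in CALIB_CANDIDATES:
        candidate = case / rel
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(
        f"no camera4.json under {case} (tried: {', '.join(CALIB_CANDIDATES)})"
    )


def list_frames(case_dir):
    """Return the PNG frames in images/, sorted by file name."""
    return sorted((Path(case_dir) / "images").glob("*.png"))
```

Running both helpers on a new clip-export directory is a quick way to confirm the layout before starting inference.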

Expected Exported Model Outputs

The exported model should be the raw-head merged artifact produced by tools/model_merging/merge_models_of_2roi_yolo26.py. The same output contract is used for both ONNX and TorchScript.

Required output tensor names:

roi0_boxes_head_raw
roi0_scores_head_raw
roi0_preds_3d_head_raw
roi1_boxes_head_raw
roi1_scores_head_raw
roi1_preds_3d_head_raw

Optional output tensor names:

roi0_preds_edge_head_raw
roi1_preds_edge_head_raw

If the merged model is exported with --edge-head-mode drop, the runtime keeps the same 2D/3D decode path and automatically disables edge-yaw reconstruction.
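
The output contract can be verified with a few lines of plain Python before decoding. The sketch below is an assumption about how such a check might look: check_output_contract is a hypothetical helper, and treating edge-yaw as enabled only when both edge outputs are present is a choice made here for illustration.

```python
# Required and optional tensor names, copied from the lists above.
REQUIRED_OUTPUTS = (
    "roi0_boxes_head_raw",
    "roi0_scores_head_raw",
    "roi0_preds_3d_head_raw",
    "roi1_boxes_head_raw",
    "roi1_scores_head_raw",
    "roi1_preds_3d_head_raw",
)
OPTIONAL_EDGE_OUTPUTS = (
    "roi0_preds_edge_head_raw",
    "roi1_preds_edge_head_raw",
)


def check_output_contract(output_names):
    """Validate the model's output names and report whether edge heads exist.

    Raises ValueError if a required output is missing. Returns True when
    both optional edge outputs are present (assumed rule), otherwise False.
    """
    names = set(output_names)
    missing = [name for name in REQUIRED_OUTPUTS if name not in names]
    if missing:
        raise ValueError(f"exported model is missing outputs: {missing}")
    return all(name in names for name in OPTIONAL_EDGE_OUTPUTS)
```

For an ONNX model, the names can be collected with `[o.name for o in session.get_outputs()]` on an onnxruntime InferenceSession. A return value of False corresponds to the --edge-head-mode drop case.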

Basic Usage

python tools/model_inference/run_two_roi_exported_onnx_infer.py \
  --case-dir tools/pdcl_inference/clip_exports/clip_export_G1M3_G1Q3_6284_019cb7f4-a944-7c22-5427-5b75b25545c7 \
  --exported-model runs/export/train_mono3d_two_roi_202603251430/merged_model.onnx \
  --output-dir /tmp/two_roi_exported_model_run

For CNCAP JSON batch video inference:

python tools/model_inference/run_two_roi_exported_onnx_infer.py \
  --cncap-json-file tools/model_inference/examples/cncap/G1M3_AFS1616_CNCAP-202411.json \
  --cncap-path-prefix-src /mnt/hfs/project-G1M3 \
  --cncap-path-prefix-dst /mnt/G1M3 \
  --exported-model runs/export/train_mono3d_two_roi_20260403-raw-fuse/merged_model.onnx \
  --output-dir /tmp/two_roi_exported_model_cncap_run

Shell Wrapper

bash tools/model_inference/scripts/run_two_roi_exported_onnx_infer.sh

Update the paths in the shell script before handing it to downstream users if needed.


Important Arguments

--case-dir: Clip-export directory containing images/ and either calib/L2_calib/camera4.json or calib/camera4.json.
--cncap-json-file: CNCAP JSON file containing values entries that point to sigmastar.1 directories. The script rewrites each mounted path prefix, resolves camera4.bin plus test_data/calibs/camera4.json, and then reuses the video-case inference flow.
--exported-model: Merged raw-head exported model path. Supports .onnx and .torchscript.
--output-dir: Directory used to save visualization images and predictions.json.
--roi0-model, --roi1-model: Training checkpoint paths, used only for metadata-free ROI preset alignment. No external framework needs them any longer, but the current CLI contract still keeps them as plain path fields. Keep them aligned with your deployment pair.
--roi0-roi, --roi1-roi: ROI crop sizes before resize.
--roi0-imgsz, --roi1-imgsz: ROI input tensor sizes used by the exported model. If omitted, the script first tries the export manifest.
--classes: Optional class-id filter.
--max-images: Limit the number of images for quick smoke tests.
--providers: Optional ONNX Runtime providers, for example CUDAExecutionProvider CPUExecutionProvider. Only used for .onnx models.
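
The path-prefix rewrite applied to CNCAP entries (see --cncap-path-prefix-src and --cncap-path-prefix-dst in the CNCAP usage example) can be sketched as below. rewrite_prefix is a hypothetical helper, and leaving non-matching paths unchanged is an assumption, not confirmed behavior of the script.

```python
def rewrite_prefix(path, src, dst):
    """Replace a leading src prefix with dst, matching whole path components.

    Paths that do not start with src are returned unchanged (assumed behavior).
    """
    src = src.rstrip("/")
    dst = dst.rstrip("/")
    # Require a component boundary so that /mnt/a does not match /mnt/ab.
    if path == src or path.startswith(src + "/"):
        return dst + path[len(src):]
    return path


if __name__ == "__main__":
    # The sub-path below is a made-up placeholder, not a real dataset path.
    print(
        rewrite_prefix(
            "/mnt/hfs/project-G1M3/some_case/sigmastar.1",
            "/mnt/hfs/project-G1M3",
            "/mnt/G1M3",
        )
    )
```

Matching on a component boundary avoids rewriting a sibling directory whose name merely begins with the same characters.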

Outputs

The script writes:

one visualization image per input frame
predictions.json

predictions.json contains per-frame, per-ROI prediction records including:

2D box
confidence
class id and class name
yaw
edge-yaw diagnostics
decoded 3D center
ROI crop bounds
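
For downstream code it can help to give these records a typed shape. The sketch below is purely illustrative: the real key names and nesting in predictions.json are not documented here, so every field name in PredictionRecord is a hypothetical stand-in for the items listed above, and filter_by_class only mimics the idea of the --classes filter.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class PredictionRecord:
    """One detection in one ROI of one frame (hypothetical field names)."""

    frame: str
    roi: str
    box_2d: List[float]
    confidence: float
    class_id: int
    class_name: str
    yaw: float
    center_3d: List[float]
    roi_bounds: List[int]
    edge_yaw: Dict[str, object] = field(default_factory=dict)


def filter_by_class(records, class_ids: Optional[List[int]] = None):
    """Keep records whose class_id is in class_ids; keep all if None."""
    if class_ids is None:
        return list(records)
    wanted = set(class_ids)
    return [record for record in records if record.class_id in wanted]


def to_json(records):
    """Serialize records to a JSON string with stable key order."""
    return json.dumps([asdict(record) for record in records], sort_keys=True)
```

Check the actual predictions.json produced by a run and adjust the field names before relying on this shape.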

Notes

This pipeline intentionally runs decode and postprocess outside the exported graph.
It is useful for downstream deployment and migration because the runtime path only depends on common Python packages.
If the model's export mode changes, make sure the output tensor names still match the names listed above.

Known Residuals

Validation against the batch PyTorch reference path tools/pdcl_inference/two_roi_inference.py on the first 20 frames of clip_export_G1M3_G1Q3_6284_019cb7f4-a944-7c22-5427-5b75b25545c7 shows that the self-contained ONNX path matches the 3D branch decisions after bbox-based matching:

visible_face_type mismatch: 0
visible_face_types mismatch: 0
edge_yaw_confident mismatch: 0
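
The exact matching rule used in this validation is not stated. As one plausible reading of "bbox-based matching", the sketch below pairs detections greedily by IoU; iou, match_by_bbox, and the 0.5 IoU cutoff are all assumptions introduced for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def match_by_bbox(ref_boxes, test_boxes, min_iou=0.5):
    """Greedy one-to-one matching, highest IoU first.

    Returns (matches, unmatched_ref), where matches is a list of
    (ref_index, test_index) pairs and unmatched_ref lists reference
    detections with no partner, such as a batch-only detection.
    """
    pairs = sorted(
        (
            (iou(ref, test), i, j)
            for i, ref in enumerate(ref_boxes)
            for j, test in enumerate(test_boxes)
        ),
        reverse=True,
    )
    used_ref, used_test, matches = set(), set(), []
    for score, i, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if i in used_ref or j in used_test:
            continue
        used_ref.add(i)
        used_test.add(j)
        matches.append((i, j))
    unmatched_ref = [i for i in range(len(ref_boxes)) if i not in used_ref]
    return matches, unmatched_ref
```

Per-detection fields such as visible_face_type can then be compared only across matched pairs, while unmatched reference detections are reported separately as count mismatches.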

Two near-threshold count mismatches are still treated as known residuals. In both cases the PyTorch batch path keeps one extra cls_id=6 detection with confidence just above the 0.25 threshold, while the ONNX path drops it:

019cb7f4-a944-7c22-5427-5b75b25545c7_80364.png, roi0: batch-only detection, conf=0.252197
019cb7f4-a944-7c22-5427-5b75b25545c7_80370.png, roi0: batch-only detection, conf=0.251094

Current interpretation:

These residuals are consistent with small ONNX vs PyTorch numerical drift around the confidence threshold.
The implementation is intentionally kept unchanged; no extra confidence epsilon is applied just to eliminate these edge cases.
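
To see how little drift is needed to flip these two decisions, the margins above the threshold can be computed directly. This is a small arithmetic sketch: the strict greater-than comparison and the drift value in the usage lines are assumptions, and no ONNX-side confidences are implied.

```python
CONF_THRESHOLD = 0.25

# Batch-path confidences of the two residual detections reported above.
BATCH_ONLY_CONFS = {
    "019cb7f4-a944-7c22-5427-5b75b25545c7_80364.png": 0.252197,
    "019cb7f4-a944-7c22-5427-5b75b25545c7_80370.png": 0.251094,
}


def margin(conf, threshold=CONF_THRESHOLD):
    """Distance of a confidence above (positive) or below the threshold."""
    return conf - threshold


def survives(conf, drift, threshold=CONF_THRESHOLD):
    """Whether a detection is kept after adding a numerical drift.

    Uses a strict comparison, which is an assumption of this sketch.
    """
    return conf + drift > threshold


if __name__ == "__main__":
    for name, conf in BATCH_ONLY_CONFS.items():
        # A hypothetical drift of -0.003 is enough to drop either detection.
        print(name, f"margin={margin(conf):.6f}", survives(conf, -0.003))
```

Both margins are below 0.0025, so a backend difference in the third decimal place of the score is sufficient to change the detection count.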
