Example: Using 2-Level Path Structure for Model Evaluation
This example shows how to evaluate models when detections and ground truth are organized in a 2-level directory structure (dataset/configuration name, then test case).
Scenario
You have organized your test data with an additional level of hierarchy:
/data/detections/
├── G1M3_AFS1616/              # Level 1: Dataset/configuration name
│   ├── case_001/              # Level 2: Individual test cases
│   │   └── txt_results/
│   │       ├── frame001.txt
│   │       └── frame002.txt
│   └── case_002/
│       └── txt_results/
│           └── frame001.txt
└── G1M3_AFS1920/
    └── case_003/
        └── txt_results/
            ├── frame001.txt
            └── frame002.txt
/data/ground_truth/
├── G1M3_AFS1616/
│   ├── case_001/
│   │   └── labels/
│   │       ├── frame001.txt
│   │       └── frame002.txt
│   └── case_002/
│       └── labels/
│           └── frame001.txt
└── G1M3_AFS1920/
    └── case_003/
        └── labels/
            ├── frame001.txt
            └── frame002.txt
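The pairing rule implied by this layout can be sketched in a few lines of Python. This is only an illustration, not the code in eval.py: the function names `discover_cases` and `pair_frames` are hypothetical, and the sketch assumes a frame is evaluated only when the same relative case path and the same file name exist under both roots.

```python
from pathlib import Path


def discover_cases(root, path_depth, leaf_dir):
    """Map each relative case path to the frame names found in its leaf_dir."""
    root = Path(root)
    # path_depth=1 -> root/<case>, path_depth=2 -> root/<level1>/<case>
    pattern = "/".join(["*"] * path_depth)
    cases = {}
    for case_dir in sorted(root.glob(pattern)):
        frames_dir = case_dir / leaf_dir
        if frames_dir.is_dir():
            key = case_dir.relative_to(root).as_posix()
            cases[key] = sorted(p.stem for p in frames_dir.glob("*.txt"))
    return cases


def pair_frames(det_root, gt_root, path_depth=2):
    """Return (case, frame) pairs present in both detections and ground truth."""
    det = discover_cases(det_root, path_depth, "txt_results")
    gt = discover_cases(gt_root, path_depth, "labels")
    pairs = []
    for case in sorted(det.keys() & gt.keys()):
        for stem in sorted(set(det[case]) & set(gt[case])):
            pairs.append((case, stem))
    return pairs
```

Calling `pair_frames("/data/detections", "/data/ground_truth", path_depth=2)` on the layout above would list one pair per frame that has both a detection file and a label file. With `path_depth=1` the same call finds nothing, because `G1M3_AFS1616/` has no `txt_results/` directly under it.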
Step 1: Create Configuration File
Create eval_tools/configs/eval_config_2level_example.yaml:
# Evaluation Configuration for 2-Level Path Structure
dataset:
  det_path: "/data/detections"
  gt_path: "/data/ground_truth"
  path_depth: 2  # Enable 2-level structure
image:
  width: 1920
  height: 1080
model:
  input_size: 704
  min_box_size_at_input_scale: 8
performance:
  num_workers: 32
roi_gt:
  enabled: true
  calib_root: "/data/ground_truth"
  roi_config: [1920, 960]
roi:
  enabled: false
  region: [0, 120, 1920, 1080]
  input_size: [704, 352]
classes:
  3d_classes: [0, 1, 2, 3]
  2d_classes: [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
  class_names:
    0: "vehicle"
    1: "pedestrian"
    2: "bicycle"
    3: "rider"
    4: "roadblock"
    5: "head"
    6: "tsr"
    7: "guideboard"
    8: "plate"
    9: "wheel"
    10: "tl_border"
    11: "tl_wick"
    12: "tl_num"
    13: "tricycle"
matching:
  iou_threshold: 0.5
metrics_2d:
  enabled: true
  conf_threshold: 0.3
  ap_method: "voc2010"
metrics_3d:
  enabled: true
  heading_tolerance: "both"
  distance_ranges:
    - [0, 20]
    - [20, 40]
    - [40, 60]
    - [60, 80]
    - [80, 100]
    - [100, 999]
  lateral_distance_ranges:
    - [-50, -40]
    - [-40, -30]
    - [-30, -20]
    - [-20, -10]
    - [-10, 0]
    - [0, 10]
    - [10, 20]
    - [20, 30]
    - [30, 40]
    - [40, 50]
output:
  save_path: "evaluation_results/2level_example/{timestamp}"
  formats: ["json", "txt"]
  print_details: true
  per_case_reports: true
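Before launching a long run it can help to sanity-check a configuration like the one above. The Python sketch below is a hypothetical helper, not part of eval_tools. It works on the config as a plain dict (however you load it) and encodes three assumptions of mine: class IDs should not appear in both 3d_classes and 2d_classes, every class ID should have a name, and consecutive distance ranges should share a boundary, as they do in the example.

```python
def check_config(cfg):
    """Return a list of human-readable problems found in an evaluation config dict."""
    problems = []

    depth = cfg["dataset"]["path_depth"]
    if depth not in (1, 2):
        problems.append(f"path_depth must be 1 or 2, got {depth!r}")

    classes = cfg["classes"]
    ids_3d, ids_2d = set(classes["3d_classes"]), set(classes["2d_classes"])
    overlap = ids_3d & ids_2d
    if overlap:
        problems.append(f"class ids in both 3d_classes and 2d_classes: {sorted(overlap)}")

    # class_names may sit under `classes` or at the top level; accept either.
    names = classes.get("class_names") or cfg.get("class_names", {})
    unnamed = (ids_3d | ids_2d) - set(names)
    if unnamed:
        problems.append(f"class ids without a name: {sorted(unnamed)}")

    for key in ("distance_ranges", "lateral_distance_ranges"):
        ranges = cfg["metrics_3d"][key]
        for low, high in ranges:
            if low >= high:
                problems.append(f"{key}: empty range [{low}, {high}]")
        # Assumption: each range starts where the previous one ends.
        for (_, high), (next_low, _) in zip(ranges, ranges[1:]):
            if high != next_low:
                problems.append(f"{key}: gap or overlap between {high} and {next_low}")
    return problems


example = {
    "dataset": {"path_depth": 2},
    "classes": {
        "3d_classes": [0, 1, 2, 3],
        "2d_classes": [4, 5],
        "class_names": {0: "vehicle", 1: "pedestrian", 2: "bicycle",
                        3: "rider", 4: "roadblock", 5: "head"},
    },
    "metrics_3d": {
        "distance_ranges": [[0, 20], [20, 40], [40, 60]],
        "lateral_distance_ranges": [[-10, 0], [0, 10]],
    },
}
print(check_config(example) or "config looks consistent")
```

An empty list means none of the checks fired. It does not prove that the evaluator will accept the file.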
Step 2: Run Evaluation
Method 1: Using Config File
python eval_tools/core/eval.py \
    --config eval_tools/configs/eval_config_2level_example.yaml \
    --save-detailed-matches
Method 2: Command Line Override
python eval_tools/core/eval.py \
    --config eval_tools/configs/eval_config_2level_example.yaml \
    --path-depth 2 \
    --det-path /data/detections \
    --gt-path /data/ground_truth \
    --output-dir evaluation_results/custom_output
Method 3: Without Config File
python eval_tools/core/eval.py \
    --det-path /data/detections \
    --gt-path /data/ground_truth \
    --path-depth 2 \
    --img-width 1920 \
    --img-height 1080 \
    --iou-threshold 0.5 \
    --conf-threshold 0.3 \
    --heading-tolerance both \
    --output-dir evaluation_results/2level_test
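The three methods differ only in where each setting comes from. One common way to get the "command line overrides config" behavior of Method 2 is sketched below. It is an assumption about how such a merge can be done, not a copy of the parser in eval.py. The flag names are taken from the commands above, and `build_parser` and `apply_overrides` are hypothetical names.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="CLI override sketch")
    # default=None lets us tell "flag not given" apart from a real value.
    parser.add_argument("--det-path", default=None)
    parser.add_argument("--gt-path", default=None)
    parser.add_argument("--path-depth", type=int, choices=(1, 2), default=None)
    return parser


def apply_overrides(config, args):
    """Return a copy of config whose dataset section reflects any flags given."""
    merged = {**config, "dataset": dict(config.get("dataset", {}))}
    for key in ("det_path", "gt_path", "path_depth"):
        value = getattr(args, key)
        if value is not None:
            merged["dataset"][key] = value
    return merged


file_config = {"dataset": {"det_path": "/data/detections",
                           "gt_path": "/data/ground_truth",
                           "path_depth": 1}}
cli_args = build_parser().parse_args(["--path-depth", "2"])
print(apply_overrides(file_config, cli_args)["dataset"])
```

Settings that are not passed on the command line keep the value from the file, so a shared config can be reused across runs that differ in one or two paths.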
Step 3: Compare Two Models with 2-Level Paths
Create Config for Model 1
eval_tools/configs/eval_config_model1_2level.yaml:
dataset:
  det_path: "/data/model1/detections"
  gt_path: "/data/ground_truth"
  path_depth: 2
# ... other settings ...
Create Config for Model 2
eval_tools/configs/eval_config_model2_2level.yaml:
dataset:
  det_path: "/data/model2/detections"
  gt_path: "/data/ground_truth"
  path_depth: 2
# ... other settings ...
Update Comparison Script
Edit eval_tools/model_comparison/compare_models_with_common_matches.sh:
#!/bin/bash
# Stop at the first failing step so later steps never read stale results
set -e
# Configuration
MODEL1_CONFIG="eval_tools/configs/eval_config_model1_2level.yaml"
MODEL2_CONFIG="eval_tools/configs/eval_config_model2_2level.yaml"
OUTPUT_BASE="evaluation_results/2level_comparison"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
MODEL1_NAME="model1"
MODEL2_NAME="model2"
# Step 1: Evaluate Model 1
echo "Step 1: Evaluating Model 1..."
MODEL1_OUTPUT="${OUTPUT_BASE}/${MODEL1_NAME}/${TIMESTAMP}"
python eval_tools/core/eval.py \
    --config "${MODEL1_CONFIG}" \
    --output-dir "${MODEL1_OUTPUT}" \
    --save-detailed-matches
# Step 2: Evaluate Model 2
echo "Step 2: Evaluating Model 2..."
MODEL2_OUTPUT="${OUTPUT_BASE}/${MODEL2_NAME}/${TIMESTAMP}"
python eval_tools/core/eval.py \
    --config "${MODEL2_CONFIG}" \
    --output-dir "${MODEL2_OUTPUT}" \
    --save-detailed-matches
# Step 3: Find common matches
echo "Step 3: Finding common matches..."
COMMON_MATCHES_DIR="${OUTPUT_BASE}/common_matches_${TIMESTAMP}"
mkdir -p "${COMMON_MATCHES_DIR}"
python eval_tools/model_comparison/find_common_matches.py \
    --model1-matches "${MODEL1_OUTPUT}/detailed_3d_matches.json" \
    --model2-matches "${MODEL2_OUTPUT}/detailed_3d_matches.json" \
    --output "${COMMON_MATCHES_DIR}/common_matches.json" \
    --model1-name "${MODEL1_NAME}" \
    --model2-name "${MODEL2_NAME}"
# Step 4: Compare models
echo "Step 4: Comparing models..."
COMPARISON_DIR="${OUTPUT_BASE}/comparison_${TIMESTAMP}"
python eval_tools/model_comparison/compare_models.py \
    --model1 "${MODEL1_OUTPUT}/evaluation_report.json" \
    --model2 "${MODEL2_OUTPUT}/evaluation_report.json" \
    --model1-name "${MODEL1_NAME}" \
    --model2-name "${MODEL2_NAME}" \
    --common-matches "${COMMON_MATCHES_DIR}/common_matches.json" \
    --output-dir "${COMPARISON_DIR}"
echo "✓ Comparison complete!"
echo "Results: ${COMPARISON_DIR}/comparison_report.txt"
Run Comparison
bash eval_tools/model_comparison/compare_models_with_common_matches.sh
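The idea behind the "common matches" step is to compare 3D errors only on ground-truth objects that both models matched, so neither model is penalized or favored by objects the other one missed. The Python sketch below illustrates that intersection. The schema of detailed_3d_matches.json is not shown in this example, so the record fields (`case`, `frame`, `gt_index`, `lat_err`) and the function names are hypothetical.

```python
def match_key(record):
    """Identify a ground-truth object by case, frame, and index (assumed fields)."""
    return (record["case"], record["frame"], record["gt_index"])


def common_matches(model1_matches, model2_matches):
    """Keep only ground-truth objects that were matched by both models."""
    by_key = {match_key(r): r for r in model2_matches}
    common = []
    for r1 in model1_matches:
        r2 = by_key.get(match_key(r1))
        if r2 is not None:
            common.append({"key": match_key(r1), "model1": r1, "model2": r2})
    return common


m1 = [
    {"case": "G1M3_AFS1616/case_001", "frame": "frame001", "gt_index": 0, "lat_err": 0.20},
    {"case": "G1M3_AFS1616/case_001", "frame": "frame001", "gt_index": 1, "lat_err": 0.35},
]
m2 = [
    {"case": "G1M3_AFS1616/case_001", "frame": "frame001", "gt_index": 1, "lat_err": 0.30},
]
for pair in common_matches(m1, m2):
    print(pair["key"], pair["model1"]["lat_err"], pair["model2"]["lat_err"])
```

Including the Level 1 name in the case key, as in the sample records, keeps two cases with the same name under different datasets from being confused.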
Expected Output
================================================================================
YOLOv5-3D Model Evaluation
================================================================================
Detection path: /data/detections
Ground truth path: /data/ground_truth
Path depth: 2
Output directory: evaluation_results/2level_example/20260211_143022
Image size: 1920x1080
IoU threshold: 0.5
Confidence threshold: 0.3
AP method: voc2010
Heading tolerance: both
Number of workers: 32
Evaluate 2D: True
Evaluate 3D: True
Save detailed matches: Yes
================================================================================
Loading data...
Found 3 case(s) in detection root: /data/detections (path_depth=2)
Loaded 5 image pairs for evaluation
==================================================
Evaluating 2D Detection Metrics
==================================================
Processing case [1/3]: case_001 (2 frames)
case_001: 100%|████████████████████| 2/2 [00:00<00:00, 45.23it/s]
Processing case [2/3]: case_002 (1 frames)
case_002: 100%|████████████████████| 1/1 [00:00<00:00, 48.12it/s]
Processing case [3/3]: case_003 (2 frames)
case_003: 100%|████████████████████| 2/2 [00:00<00:00, 46.87it/s]
==================================================
Evaluating 3D Detection Metrics
==================================================
Processing case [1/3]: case_001 (2 frames)
case_001: 100%|████████████████████| 2/2 [00:00<00:00, 42.15it/s]
Processing case [2/3]: case_002 (1 frames)
case_002: 100%|████████████████████| 1/1 [00:00<00:00, 43.89it/s]
Processing case [3/3]: case_003 (2 frames)
case_003: 100%|████████████████████| 2/2 [00:00<00:00, 41.76it/s]
Detailed 3D matches saved to: evaluation_results/2level_example/20260211_143022/detailed_3d_matches.json
JSON report saved to: evaluation_results/2level_example/20260211_143022/evaluation_report.json
Text report saved to: evaluation_results/2level_example/20260211_143022/evaluation_report.txt
Per-case reports saved to: evaluation_results/2level_example/20260211_143022/per_case_reports/ (3 cases)
================================================================================
EVALUATION SUMMARY - OVERALL
================================================================================
2D Metrics:
Precision: 0.8542
Recall: 0.8123
mAP: 0.8234
3D Metrics:
vehicle [overall]: Lat=0.234m, Long=0.456m, Head=0.123rad (relaxed=0.098rad, rev=12) (n=145)
pedestrian [overall]: Lat=0.189m, Long=0.312m, Head=0.234rad (relaxed=0.187rad, rev=8) (n=67)
================================================================================
✓ Evaluation completed successfully!
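The mAP value in the summary depends on the configured ap_method. Assuming "voc2010" refers to the all-point interpolation that PASCAL VOC adopted from 2010 onward (rather than the earlier 11-point method), average precision for one class can be sketched as the area under the precision envelope. This is an illustration of that technique, not the implementation in eval.py, and the function name is hypothetical.

```python
def average_precision_all_points(recalls, precisions):
    """All-point interpolated AP.

    recalls must be sorted in ascending order; precisions[i] is the
    precision measured at recalls[i].
    """
    mrec = [0.0] + list(recalls) + [1.0]
    mpre = [0.0] + list(precisions) + [0.0]
    # Make precision non-increasing by sweeping from right to left.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum rectangle areas wherever recall increases.
    ap = 0.0
    for i in range(1, len(mrec)):
        ap += (mrec[i] - mrec[i - 1]) * mpre[i]
    return ap


print(average_precision_all_points([0.5, 1.0], [1.0, 0.5]))  # 0.75
```

mAP is then the mean of the per-class AP values over the evaluated classes.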
Troubleshooting
Issue: No cases found
Check:
- Verify path_depth is set correctly (1 or 2)
- Ensure directory structure matches the expected format
- Check that Level 1 directory names match between det_path and gt_path
Issue: Some cases are skipped
Check:
- Each case must have a txt_results/ subdirectory (for detections)
- Each case must have a labels/ subdirectory (for ground truth)
- Frame names must match between detections and labels
Issue: GT case directory not found
Check:
- Level 1 directory names must be identical in both paths
- Case names must be identical in both paths
- Path structure must be consistent
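The checks in this section can be automated. The Python sketch below walks both roots and reports which of the three situations applies to each case. It is a hypothetical diagnostic helper, not part of eval_tools, and it assumes the txt_results/ and labels/ naming described above.

```python
from pathlib import Path


def list_cases(root, path_depth):
    """Relative paths of all directories exactly path_depth levels below root."""
    root = Path(root)
    pattern = "/".join(["*"] * path_depth)
    return {p.relative_to(root).as_posix() for p in root.glob(pattern) if p.is_dir()}


def frame_stems(directory):
    return {p.stem for p in Path(directory).glob("*.txt")}


def diagnose(det_root, gt_root, path_depth=2):
    """Report cases missing from ground truth, missing subdirectories, and unmatched frames."""
    det_root, gt_root = Path(det_root), Path(gt_root)
    det_cases = list_cases(det_root, path_depth)
    gt_cases = list_cases(gt_root, path_depth)
    report = {
        "gt_case_not_found": sorted(det_cases - gt_cases),
        "missing_txt_results": [],
        "missing_labels": [],
        "unmatched_frames": {},
    }
    for case in sorted(det_cases & gt_cases):
        det_dir = det_root / case / "txt_results"
        gt_dir = gt_root / case / "labels"
        if not det_dir.is_dir():
            report["missing_txt_results"].append(case)
        if not gt_dir.is_dir():
            report["missing_labels"].append(case)
        if det_dir.is_dir() and gt_dir.is_dir():
            # Frames present on one side only.
            diff = frame_stems(det_dir) ^ frame_stems(gt_dir)
            if diff:
                report["unmatched_frames"][case] = sorted(diff)
    return report
```

If `list_cases` returns an empty set for the detection root, the most likely cause is a path_depth that does not match the directory layout.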
Benefits of 2-Level Structure
- Better Organization: Group related test cases by dataset or configuration
- Flexible Comparison: Compare models across different datasets easily
- Scalability: Handle large numbers of test cases more efficiently
- Backward Compatible: Existing 1-level structures continue to work
Migration from 1-Level to 2-Level
If you have an existing 1-level structure and want to migrate:
# Original structure
/data/detections/case1/txt_results/
/data/detections/case2/txt_results/
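# Create 2-level structure
mkdir -p /data/detections_2level/dataset_A
mv /data/detections/case1 /data/detections_2level/dataset_A/
mv /data/detections/case2 /data/detections_2level/dataset_A/
# Restructure the ground truth root the same way, using the same Level 1 name (dataset_A),
# because Level 1 and case names must be identical in both paths
# Update config
# path_depth: 1 → path_depth: 2
# det_path: /data/detections → /data/detections_2level
# gt_path: point it at the restructured ground truth root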
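For more than a handful of cases, the same migration can be scripted. The Python sketch below moves every case directory under a new Level 1 directory and supports a dry run. The function name and the `dataset_A` label are illustrative, and it assumes the destination root lies outside the source root.

```python
import shutil
from pathlib import Path


def migrate_to_two_level(src_root, dst_root, level1_name, dry_run=True):
    """Move src_root/<case> to dst_root/<level1_name>/<case>; return planned moves."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    target = dst_root / level1_name
    moves = [(case, target / case.name)
             for case in sorted(src_root.iterdir()) if case.is_dir()]
    if not dry_run:
        target.mkdir(parents=True, exist_ok=True)
        for src, dst in moves:
            shutil.move(str(src), str(dst))
    return moves


# Preview first, then rerun with dry_run=False for both roots, for example:
# migrate_to_two_level("/data/detections", "/data/detections_2level", "dataset_A")
# migrate_to_two_level("/data/ground_truth", "/data/ground_truth_2level", "dataset_A")
```

Running it with the same `level1_name` for detections and ground truth keeps the two trees aligned, which the evaluator requires.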