# Example: Using 2-Level Path Structure for Model Evaluation

This example demonstrates how to evaluate models with 2-level directory structures.

## Scenario

You have organized your test data with an additional level of hierarchy:

```
/data/detections/
    ├── G1M3_AFS1616/          # Level 1: Dataset/configuration name
    │   ├── case_001/          # Level 2: Individual test cases
    │   │   └── txt_results/
    │   │       ├── frame001.txt
    │   │       └── frame002.txt
    │   └── case_002/
    │       └── txt_results/
    │           └── frame001.txt
    └── G1M3_AFS1920/
        └── case_003/
            └── txt_results/
                └── frame001.txt

/data/ground_truth/
    ├── G1M3_AFS1616/
    │   ├── case_001/
    │   │   └── labels/
    │   │       ├── frame001.txt
    │   │       └── frame002.txt
    │   └── case_002/
    │       └── labels/
    │           └── frame001.txt
    └── G1M3_AFS1920/
        └── case_003/
            └── labels/
                └── frame001.txt
```

## Step 1: Create Configuration File

Create `eval_tools/configs/eval_config_2level_example.yaml`:

```yaml
# Evaluation Configuration for 2-Level Path Structure
dataset:
  det_path: "/data/detections"
  gt_path: "/data/ground_truth"
  path_depth: 2  # Enable 2-level structure

image:
  width: 1920
  height: 1080

model:
  input_size: 704
  min_box_size_at_input_scale: 8

performance:
  num_workers: 32

roi_gt:
  enabled: true
  calib_root: "/data/ground_truth"
  roi_config: [1920, 960]

roi:
  enabled: false
  region: [0, 120, 1920, 1080]
  input_size: [704, 352]

classes:
  3d_classes: [0, 1, 2, 3]
  2d_classes: [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
  class_names:
    0: "vehicle"
    1: "pedestrian"
    2: "bicycle"
    3: "rider"
    4: "roadblock"
    5: "head"
    6: "tsr"
    7: "guideboard"
    8: "plate"
    9: "wheel"
    10: "tl_border"
    11: "tl_wick"
    12: "tl_num"
    13: "tricycle"

matching:
  iou_threshold: 0.5

metrics_2d:
  enabled: true
  conf_threshold: 0.3
  ap_method: "voc2010"

metrics_3d:
  enabled: true
  heading_tolerance: "both"
  distance_ranges:
    - [0, 20]
    - [20, 40]
    - [40, 60]
    - [60, 80]
    - [80, 100]
    - [100, 999]
  lateral_distance_ranges:
    - [-50, -40]
    - [-40, -30]
    - [-30, -20]
    - [-20, -10]
    - [-10, 0]
    - [0, 10]
    - [10, 20]
    - [20, 30]
    - [30, 40]
    - [40, 50]

output:
  save_path: "evaluation_results/2level_example/{timestamp}"
  formats: ["json", "txt"]
  print_details: true
  per_case_reports: true
```

## Step 2: Run Evaluation

### Method 1: Using Config File

```bash
python eval_tools/core/eval.py \
  --config eval_tools/configs/eval_config_2level_example.yaml \
  --save-detailed-matches
```

### Method 2: Command Line Override

```bash
python eval_tools/core/eval.py \
  --config eval_tools/configs/eval_config_2level_example.yaml \
  --path-depth 2 \
  --det-path /data/detections \
  --gt-path /data/ground_truth \
  --output-dir evaluation_results/custom_output
```

### Method 3: Without Config File

```bash
python eval_tools/core/eval.py \
  --det-path /data/detections \
  --gt-path /data/ground_truth \
  --path-depth 2 \
  --img-width 1920 \
  --img-height 1080 \
  --iou-threshold 0.5 \
  --conf-threshold 0.3 \
  --heading-tolerance both \
  --output-dir evaluation_results/2level_test
```

## Step 3: Compare Two Models with 2-Level Paths

### Create Config for Model 1

`eval_tools/configs/eval_config_model1_2level.yaml`:

```yaml
dataset:
  det_path: "/data/model1/detections"
  gt_path: "/data/ground_truth"
  path_depth: 2

# ... other settings ...
```

### Create Config for Model 2

`eval_tools/configs/eval_config_model2_2level.yaml`:

```yaml
dataset:
  det_path: "/data/model2/detections"
  gt_path: "/data/ground_truth"
  path_depth: 2

# ... other settings ...
```

### Update Comparison Script

Edit `eval_tools/model_comparison/compare_models_with_common_matches.sh`:

```bash
#!/bin/bash

set -e

# Configuration
MODEL1_CONFIG="eval_tools/configs/eval_config_model1_2level.yaml"
MODEL2_CONFIG="eval_tools/configs/eval_config_model2_2level.yaml"
OUTPUT_BASE="evaluation_results/2level_comparison"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)

MODEL1_NAME="model1"
MODEL2_NAME="model2"

# Step 1: Evaluate Model 1
echo "Step 1: Evaluating Model 1..."
MODEL1_OUTPUT="${OUTPUT_BASE}/${MODEL1_NAME}/${TIMESTAMP}"
python eval_tools/core/eval.py \
  --config ${MODEL1_CONFIG} \
  --output-dir ${MODEL1_OUTPUT} \
  --save-detailed-matches

# Step 2: Evaluate Model 2
echo "Step 2: Evaluating Model 2..."
MODEL2_OUTPUT="${OUTPUT_BASE}/${MODEL2_NAME}/${TIMESTAMP}"
python eval_tools/core/eval.py \
  --config ${MODEL2_CONFIG} \
  --output-dir ${MODEL2_OUTPUT} \
  --save-detailed-matches

# Step 3: Find common matches
echo "Step 3: Finding common matches..."
COMMON_MATCHES_DIR="${OUTPUT_BASE}/common_matches_${TIMESTAMP}"
mkdir -p ${COMMON_MATCHES_DIR}

python eval_tools/model_comparison/find_common_matches.py \
  --model1-matches ${MODEL1_OUTPUT}/detailed_3d_matches.json \
  --model2-matches ${MODEL2_OUTPUT}/detailed_3d_matches.json \
  --output ${COMMON_MATCHES_DIR}/common_matches.json \
  --model1-name "${MODEL1_NAME}" \
  --model2-name "${MODEL2_NAME}"

# Step 4: Compare models
echo "Step 4: Comparing models..."
COMPARISON_DIR="${OUTPUT_BASE}/comparison_${TIMESTAMP}"

python eval_tools/model_comparison/compare_models.py \
  --model1 ${MODEL1_OUTPUT}/evaluation_report.json \
  --model2 ${MODEL2_OUTPUT}/evaluation_report.json \
  --model1-name "${MODEL1_NAME}" \
  --model2-name "${MODEL2_NAME}" \
  --common-matches ${COMMON_MATCHES_DIR}/common_matches.json \
  --output-dir ${COMPARISON_DIR}

echo "✓ Comparison complete!"
echo "Results: ${COMPARISON_DIR}/comparison_report.txt"
```

### Run Comparison

```bash
bash eval_tools/model_comparison/compare_models_with_common_matches.sh
```

## Expected Output

```
================================================================================
YOLOv5-3D Model Evaluation
================================================================================
Detection path: /data/detections
Ground truth path: /data/ground_truth
Path depth: 2
Output directory: evaluation_results/2level_example/20260211_143022
Image size: 1920x1080
IoU threshold: 0.5
Confidence threshold: 0.3
AP method: voc2010
Heading tolerance: both
Number of workers: 32
Evaluate 2D: True
Evaluate 3D: True
Save detailed matches: Yes
================================================================================

Loading data...
Found 3 case(s) in detection root: /data/detections (path_depth=2)
Loaded 5 image pairs for evaluation

==================================================
Evaluating 2D Detection Metrics
==================================================

Processing case [1/3]: case_001 (2 frames)
  case_001: 100%|████████████████████| 2/2 [00:00<00:00, 45.23it/s]

Processing case [2/3]: case_002 (1 frames)
  case_002: 100%|████████████████████| 1/1 [00:00<00:00, 48.12it/s]

Processing case [3/3]: case_003 (2 frames)
  case_003: 100%|████████████████████| 2/2 [00:00<00:00, 46.87it/s]

==================================================
Evaluating 3D Detection Metrics
==================================================

Processing case [1/3]: case_001 (2 frames)
  case_001: 100%|████████████████████| 2/2 [00:00<00:00, 42.15it/s]

Processing case [2/3]: case_002 (1 frames)
  case_002: 100%|████████████████████| 1/1 [00:00<00:00, 43.89it/s]

Processing case [3/3]: case_003 (2 frames)
  case_003: 100%|████████████████████| 2/2 [00:00<00:00, 41.76it/s]

Detailed 3D matches saved to: evaluation_results/2level_example/20260211_143022/detailed_3d_matches.json
JSON report saved to: evaluation_results/2level_example/20260211_143022/evaluation_report.json
Text report saved to: evaluation_results/2level_example/20260211_143022/evaluation_report.txt
Per-case reports saved to: evaluation_results/2level_example/20260211_143022/per_case_reports/ (3 cases)

================================================================================
EVALUATION SUMMARY - OVERALL
================================================================================

2D Metrics:
  Precision: 0.8542
  Recall:    0.8123
  mAP:       0.8234

3D Metrics:
  vehicle [overall]: Lat=0.234m, Long=0.456m, Head=0.123rad (relaxed=0.098rad, rev=12) (n=145)
  pedestrian [overall]: Lat=0.189m, Long=0.312m, Head=0.234rad (relaxed=0.187rad, rev=8) (n=67)
================================================================================

✓ Evaluation completed successfully!
```

## Troubleshooting

### Issue: No cases found

**Check:**
1. Verify `path_depth` is set correctly (1 or 2)
2. Ensure directory structure matches the expected format
3. Check that level1 directory names match between det_path and gt_path

### Issue: Some cases are skipped

**Check:**
1. Each case must have `txt_results/` subdirectory (for detections)
2. Each case must have `labels/` subdirectory (for ground truth)
3. Frame names must match between detections and labels

### Issue: GT case directory not found

**Check:**
1. Level1 directory names must be identical in both paths
2. Case names must be identical in both paths
3. Path structure must be consistent

## Benefits of 2-Level Structure

1. **Better Organization**: Group related test cases by dataset or configuration
2. **Flexible Comparison**: Compare models across different datasets easily
3. **Scalability**: Handle large numbers of test cases more efficiently
4. **Backward Compatible**: Existing 1-level structures continue to work

## Migration from 1-Level to 2-Level

If you have existing 1-level structure and want to migrate:

```bash
# Original structure
/data/detections/case1/txt_results/
/data/detections/case2/txt_results/

# Create 2-level structure
mkdir -p /data/detections_2level/dataset_A
mv /data/detections/case1 /data/detections_2level/dataset_A/
mv /data/detections/case2 /data/detections_2level/dataset_A/

# Update config
# path_depth: 1 → path_depth: 2
# det_path: /data/detections → /data/detections_2level
```