# Ground 2D Detection Training Guide

This guide explains how to train YOLO26 models on custom ground 2D detection datasets with difficulty-based loss weighting.

## Overview

The ground 2D detection implementation supports:
- Custom annotation format with string class names
- Difficulty scores for each bounding box
- Difficulty-based loss weighting (loss_weight = 1.0 / (1.0 + difficulty))
- Minimum box size filtering
- Optional YUV444 color space support

## Dataset Structure

### Directory Layout
```
dataset/
├── images/
│   ├── train/
│   │   ├── img001.jpg
│   │   ├── img002.jpg
│   │   └── ...
│   └── val/
│       ├── img101.jpg
│       ├── img102.jpg
│       └── ...
├── labels/
│   ├── train/
│   │   ├── img001.txt
│   │   ├── img002.txt
│   │   └── ...
│   └── val/
│       ├── img101.txt
│       ├── img102.txt
│       └── ...
└── dataset.yaml
```

### Label Format

Each label file contains one line per object, with 7 whitespace-separated columns:
```
class_name x_center y_center width height difficulty1 difficulty2
```

Example:
```
car 0.5 0.5 0.3 0.2 0.0 0.0
pedestrian 0.3 0.4 0.1 0.15 0.5 0.5
bicycle 0.7 0.6 0.15 0.2 1.0 1.0
```

Where:
- `class_name`: String class name (e.g., "car", "pedestrian")
- `x_center, y_center, width, height`: Box center and size, normalized to [0, 1]
- `difficulty1, difficulty2`: Non-negative difficulty values that are summed into one per-box score (`difficulty = difficulty1 + difficulty2`)
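
As a minimal sketch of how one such line could be turned into a numeric record, the snippet below splits the 7 columns, looks up the class ID in a `class_map` dictionary, and sums the two difficulty values. The function name `parse_ground_label_line` and the returned tuple layout are hypothetical; they illustrate the format and are not taken from the actual `verify_image_label_ground()` implementation.

```python
def parse_ground_label_line(line, class_map):
    """Parse one 7-column ground label line into (class_id, box, difficulty).

    Hypothetical helper for illustration only; not part of ultralytics.
    """
    parts = line.split()
    if len(parts) != 7:
        raise ValueError(f"expected 7 columns, got {len(parts)}: {line!r}")
    class_name = parts[0]
    x_center, y_center, width, height, diff1, diff2 = map(float, parts[1:])
    class_id = class_map[class_name]  # raises KeyError for names missing from class_map
    difficulty = diff1 + diff2  # the two difficulty columns are combined by summing
    return class_id, (x_center, y_center, width, height), difficulty


class_map = {"car": 0, "pedestrian": 1, "bicycle": 2, "truck": 3}
print(parse_ground_label_line("pedestrian 0.3 0.4 0.1 0.15 0.5 0.5", class_map))
# (1, (0.3, 0.4, 0.1, 0.15), 1.0)
```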

### Dataset YAML Configuration

Create a `dataset.yaml` file:

```yaml
# Dataset paths
path: /path/to/dataset
train: images/train
val: images/val

# Class mapping: string names to numeric IDs
class_map:
  car: 0
  pedestrian: 1
  bicycle: 2
  truck: 3

# Optional parameters
min_wh: 2.0  # Keep boxes whose width or height is at least this many pixels
use_yuv444: false  # Use YUV444 color space (default: false)
```

## Training

### Python API

```python
from ultralytics.models.yolo.detect import GroundDetectionTrainer

# Initialize trainer
trainer = GroundDetectionTrainer(
    overrides={
        "model": "yolo26n.pt",  # or yolo26s.pt, yolo26m.pt, etc.
        "data": "path/to/dataset.yaml",
        "epochs": 100,
        "imgsz": 640,
        "batch": 16,
        "device": 0,
    }
)

# Start training
trainer.train()
```

### Command Line

```bash
# Using GroundDetectionTrainer directly
python -c "from ultralytics.models.yolo.detect import GroundDetectionTrainer; \
trainer = GroundDetectionTrainer(overrides={'model': 'yolo26n.pt', 'data': 'dataset.yaml', 'epochs': 100}); \
trainer.train()"
```

## Difficulty-Based Loss Weighting

The implementation applies difficulty-based weighting to all loss components:

```python
loss_weight = 1.0 / (1.0 + difficulty)
```

Examples:
- difficulty = 0.0 (easy): weight = 1.0
- difficulty = 1.0 (medium): weight = 0.5
- difficulty = 2.0 (hard): weight ≈ 0.33

Because the weight shrinks as difficulty grows, easier, more detectable objects dominate the loss, while harder examples still contribute with a reduced weight.
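
The formula can be checked with a few lines of plain Python. This is only a sketch: the helper name `difficulty_to_weight` and the per-box loss values are made up, and multiplying each per-box loss by its weight is an assumption about how the weight is applied, since this guide does not show the internals of `v8DetectionLossGround`.

```python
def difficulty_to_weight(difficulty):
    """Return the loss weight 1 / (1 + difficulty) for a non-negative difficulty."""
    if difficulty < 0:
        raise ValueError("difficulty must be non-negative")
    return 1.0 / (1.0 + difficulty)


# Illustrative per-box losses (made-up numbers) and their difficulty scores.
per_box_loss = [0.8, 0.8, 0.8]
difficulties = [0.0, 1.0, 2.0]

weights = [difficulty_to_weight(d) for d in difficulties]
# Assumed application: scale each box's loss by its own weight.
weighted_loss = [w * loss for w, loss in zip(weights, per_box_loss)]

print([round(w, 2) for w in weights])  # [1.0, 0.5, 0.33]
```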

## Class Mapping

The `class_map` in the YAML allows flexible class merging:

```yaml
class_map:
  car: 0
  suv: 0      # Maps to same class as car
  van: 0      # Maps to same class as car
  bus: 1
  truck: 2
  pedestrian: 3
```
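
To see what a merged mapping implies, the sketch below groups the source names by numeric ID and derives a class count. Treating the class count as `max ID + 1` and requiring contiguous IDs are assumptions made for illustration; the guide does not state how the trainer derives the number of classes.

```python
class_map = {"car": 0, "suv": 0, "van": 0, "bus": 1, "truck": 2, "pedestrian": 3}

# Group the source class names by the numeric ID they map to.
merged = {}
for name, class_id in class_map.items():
    merged.setdefault(class_id, []).append(name)

# Assumption: IDs are contiguous from 0, so the class count is max ID + 1.
num_classes = max(class_map.values()) + 1
assert sorted(merged) == list(range(num_classes)), "class IDs should be contiguous from 0"

print(num_classes)  # 4
print(merged[0])    # ['car', 'suv', 'van']
```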

## Minimum Box Size Filtering

Boxes are filtered out only when both width and height are smaller than `min_wh` pixels:

```yaml
min_wh: 2.0  # Filter boxes only if both sides are smaller than 2 px
```

This removes boxes that are too small to detect reliably, while keeping thin boxes that are small in only one dimension.
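
The rule can be sketched as a small predicate. The function name `keep_box` is hypothetical, and converting the normalized size to pixels using the image's own width and height is an assumption; the guide does not say which image size the dataset uses for this comparison.

```python
def keep_box(width_norm, height_norm, img_w, img_h, min_wh=2.0):
    """Return True unless both pixel sides of the box are smaller than min_wh."""
    width_px = width_norm * img_w
    height_px = height_norm * img_h
    # A box is dropped only when BOTH sides fall below the threshold.
    return width_px >= min_wh or height_px >= min_wh


print(keep_box(0.001, 0.001, 640, 480))  # False: 0.64 px by 0.48 px
print(keep_box(0.001, 0.1, 640, 480))    # True: narrow, but 48 px tall
```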

## YUV444 Color Space

If your images are in YUV444 format, enable conversion:

```yaml
use_yuv444: true
```

The dataset then converts each image from YUV444 to BGR as it is loaded.

## Validation

Validate the trained model with the standard YOLO validation API:

```python
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")
results = model.val(data="dataset.yaml")
```

## Inference

Use the trained model for inference:

```python
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")
results = model.predict("path/to/image.jpg")
```

## Key Implementation Files

- `ultralytics/utils/instance.py`: Extended `Instances` class with difficulty support
- `ultralytics/data/utils.py`: Added `verify_image_label_ground()` function
- `ultralytics/data/dataset.py`: Added `YOLOGroundDataset` class
- `ultralytics/utils/loss.py`: Added `v8DetectionLossGround` class
- `ultralytics/data/build.py`: Modified `build_yolo_dataset()` to detect ground datasets
- `ultralytics/models/yolo/detect/train.py`: Added `GroundDetectionTrainer` class
- `ultralytics/cfg/datasets/ground_template.yaml`: Dataset configuration template

## Troubleshooting

### Issue: "class_map not found in data"
Make sure your dataset YAML has a `class_map` dictionary instead of a `names` list.

### Issue: "labels require 6 columns"
Check that every line in your label files has exactly 7 columns (class_name + 4 coordinates + 2 difficulty values).

### Issue: "negative label values"
Ensure all coordinates and difficulty values are non-negative.

### Issue: "non-normalized coordinates"
Coordinates must be normalized to the [0, 1] range.
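
To locate the offending lines before training, a standalone check along the following lines may help. It is a sketch only: `check_label_line` is a hypothetical helper, its messages merely echo the issue titles above, and the unknown-class check is an added assumption; the real checks live in `verify_image_label_ground()` and may differ.

```python
def check_label_line(line, class_map):
    """Return a list of problems found in one ground label line (empty if none)."""
    parts = line.split()
    if len(parts) != 7:
        return [f"expected 7 columns, got {len(parts)}"]
    problems = []
    if parts[0] not in class_map:
        problems.append(f"unknown class name {parts[0]!r}")
    try:
        values = [float(p) for p in parts[1:]]
    except ValueError:
        return problems + ["non-numeric coordinate or difficulty"]
    if any(v < 0 for v in values):
        problems.append("negative label values")
    if any(v > 1 for v in values[:4]):  # only the 4 box values must be in [0, 1]
        problems.append("non-normalized coordinates")
    return problems


class_map = {"car": 0, "pedestrian": 1}
print(check_label_line("car 1.5 0.5 0.3 0.2 -1.0 0.0", class_map))
# ['negative label values', 'non-normalized coordinates']
```

Running this over every line of every file under `labels/` and printing the file name with any non-empty result narrows a failing dataset down to specific lines.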