feat: initial HSAP platform
Huaxu Sentinel Active Safety Platform with embedded algorithm code, Docker Compose setup, and vendored dataset scaffolds for clone-and-run. Co-authored-by: Cursor <cursoragent@cursor.com>
250
algorithms/dms_yolo/code/docs/en/datasets/obb/dota-v2.md
Normal file
@@ -0,0 +1,250 @@
---
comments: true
description: Explore the DOTA dataset for object detection in aerial images, featuring 1.7M Oriented Bounding Boxes across 18 categories. Ideal for aerial image analysis.
keywords: DOTA dataset, object detection, aerial images, oriented bounding boxes, OBB, DOTA v1.0, DOTA v1.5, DOTA v2.0, multiscale detection, Ultralytics
---

# DOTA Dataset with OBB

[DOTA](https://captain-whu.github.io/DOTA/index.html) is a specialized dataset for [object detection](https://www.ultralytics.com/glossary/object-detection) in aerial images. Part of the DOTA series of datasets, it offers annotated images capturing a diverse array of aerial scenes with [Oriented Bounding Boxes (OBB)](https://docs.ultralytics.com/datasets/obb/).

![DOTA aerial imagery with oriented bounding boxes](https://cdn.jsdelivr.net/gh/ultralytics/assets@main/docs/dota-dataset-classes-visual.avif)

## Key Features

<p align="center">
  <br>
  <iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/JjQ-URE0LJE"
    title="YouTube video player" frameborder="0"
    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
    allowfullscreen>
  </iframe>
  <br>
  <strong>Watch:</strong> How to Train Ultralytics YOLO26 on the DOTA Dataset for Oriented Bounding Boxes in Google Colab
</p>

- Images are collected from various sensors and platforms, with sizes ranging from 800 × 800 to 20,000 × 20,000 pixels.
- Features more than 1.7M oriented bounding boxes across 18 categories.
- Supports multiscale object detection thanks to the wide spread of object sizes per image.
- Instances are annotated by experts using arbitrary (8 d.o.f.) quadrilaterals, capturing objects of different scales, orientations, and shapes.

## Dataset Versions

### DOTA-v1.0

- Contains 15 common categories.
- Comprises 2,806 images with 188,282 instances.
- Split ratios: 1/2 for training, 1/6 for validation, and 1/3 for testing.

### DOTA-v1.5

- Incorporates the same images as DOTA-v1.0.
- Very small instances (less than 10 pixels) are also annotated.
- Addition of a new category: "container crane".
- A total of 403,318 instances.
- Released for the [DOAI Challenge 2019 on Object Detection in Aerial Images](https://captain-whu.github.io/DOAI2019/challenge.html).

### DOTA-v2.0

- Collections from Google Earth, GF-2 Satellite, and other aerial images.
- Contains 18 common categories.
- Comprises 11,268 images with 1,793,658 instances.
- New categories introduced: "airport" and "helipad".
- Image splits:
    - Training: 1,830 images with 268,627 instances.
    - Validation: 593 images with 81,048 instances.
    - Test-dev: 2,792 images with 353,346 instances.
    - Test-challenge: 6,053 images with 1,090,637 instances.
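As a quick consistency check, the four DOTA-v2.0 splits listed above should add up to the dataset totals. The snippet below is only an arithmetic sketch over the numbers quoted in this section; the dictionary and its key names are illustrative and not part of any Ultralytics API.

```python
# (images, instances) per DOTA-v2.0 split, copied from the list above.
splits = {
    "train": (1_830, 268_627),
    "val": (593, 81_048),
    "test-dev": (2_792, 353_346),
    "test-challenge": (6_053, 1_090_637),
}

total_images = sum(images for images, _ in splits.values())
total_instances = sum(instances for _, instances in splits.values())
print(f"images={total_images:,} instances={total_instances:,}")

# Share of all images that falls into each split.
for name, (images, _) in splits.items():
    print(f"{name}: {images / total_images:.1%} of images")
```

The sums match the 11,268 images and 1,793,658 instances quoted for DOTA-v2.0.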

## Dataset Structure

DOTA exhibits a structured layout tailored for OBB object detection challenges:

- **Images**: A vast collection of high-resolution aerial images capturing diverse terrains and structures.
- **Oriented Bounding Boxes**: Annotations in the form of rotated rectangles encapsulating objects irrespective of their orientation, ideal for capturing objects like airplanes, ships, and buildings.

## Applications

DOTA serves as a benchmark for training and evaluating models specifically tailored for aerial image analysis. With the inclusion of OBB annotations, it provides a unique challenge, enabling the development of specialized [object detection](https://docs.ultralytics.com/tasks/detect/) models that cater to aerial imagery's nuances. The dataset is particularly valuable for applications in remote sensing, surveillance, and environmental monitoring.

## Dataset YAML

A dataset YAML (YAML Ain't Markup Language) file specifies image/label roots, class names, and other important metadata. Ultralytics maintains official YAML files for the two most commonly used releases:

- [`DOTAv1.yaml`](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/DOTAv1.yaml)
- [`DOTAv1.5.yaml`](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/DOTAv1.5.yaml)

Use the YAML that matches the release you downloaded, or author a custom YAML if you are working with DOTA-v2 or another derivative.

!!! example "DOTAv1.yaml"

    ```yaml
    --8<-- "ultralytics/cfg/datasets/DOTAv1.yaml"
    ```

## Split DOTA images

The raw imagery routinely exceeds 10,000 pixels on a side, so tiling is required before feeding the data to YOLO. Use the helper below to slice the source imagery into overlapping 1024 × 1024 crops at multiple scales while keeping the annotations in sync.

!!! example "Split images"

    === "Python"

        ```python
        from ultralytics.data.split_dota import split_test, split_trainval

        # Split train and val set, with labels.
        split_trainval(
            data_root="path/to/DOTAv1.0/",
            save_dir="path/to/DOTAv1.0-split/",
            rates=[0.5, 1.0, 1.5],  # multiscale
            gap=500,
        )
        # Split test set, without labels.
        split_test(
            data_root="path/to/DOTAv1.0/",
            save_dir="path/to/DOTAv1.0-split/",
            rates=[0.5, 1.0, 1.5],  # multiscale
            gap=500,
        )
        ```

!!! tip

    Keep the output directory organized in the standard YOLO layout (`images/train`, `labels/train`, etc.) so you can reference it directly from the dataset YAML.
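To build intuition for what overlapping tiling means, the sketch below computes where crops would start along one image axis for a given crop size and overlap (`gap`). It is a simplified illustration, not the `split_dota` implementation: the function name `window_starts` is hypothetical, and the per-rate scaling (`crop = 1024 / rate`, `gap = 500 / rate`) is an assumption about how multiscale rates could be applied.

```python
import math


def window_starts(length: int, crop: int, gap: int) -> list[int]:
    """Return start offsets of overlapping crops along one image axis."""
    if gap >= crop:
        raise ValueError("gap (overlap) must be smaller than the crop size")
    if length <= crop:
        return [0]  # the image fits in a single window
    step = crop - gap  # distance between consecutive window origins
    count = math.ceil((length - crop) / step) + 1
    starts = [i * step for i in range(count)]
    # Shift the last window back so it ends exactly at the image border.
    starts[-1] = length - crop
    return starts


# Hypothetical 4000-pixel-wide image, tiled at the three rates used above.
for rate in (0.5, 1.0, 1.5):
    crop, gap = int(1024 / rate), int(500 / rate)  # assumed per-rate scaling
    starts = window_starts(4000, crop, gap)
    print(f"rate={rate}: crop={crop}, gap={gap}, {len(starts)} windows per row")
```

A larger `gap` means more overlap between neighbouring tiles, which reduces the chance that an object is cut at every tile border, at the cost of producing more tiles.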

## Usage

To train a model on the DOTA v1 dataset, use the following code snippets. Always refer to your model's documentation for a thorough list of available arguments. For those looking to experiment with a smaller subset first, consider using the [DOTA8 dataset](https://docs.ultralytics.com/datasets/obb/dota8/), which contains just 8 images for quick testing.

!!! warning

    Please note that all images and associated annotations in the DOTAv1 dataset can be used for academic purposes, but commercial use is prohibited. Your understanding and respect for the dataset creators' wishes are greatly appreciated!

!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Create a new YOLO26n-OBB model from scratch
        model = YOLO("yolo26n-obb.yaml")

        # Train the model on the DOTAv1 dataset
        results = model.train(data="DOTAv1.yaml", epochs=100, imgsz=1024)
        ```

    === "CLI"

        ```bash
        # Train on the DOTAv1 dataset, starting from pretrained YOLO26n-OBB weights
        yolo obb train data=DOTAv1.yaml model=yolo26n-obb.pt epochs=100 imgsz=1024
        ```

## Sample Data and Annotations

A sample from the dataset illustrates its depth:

![DOTA aerial dataset sample with oriented boxes](https://cdn.jsdelivr.net/gh/ultralytics/assets@main/docs/instances-DOTA.avif)

- **DOTA examples**: This snapshot underlines the complexity of aerial scenes and the significance of Oriented [Bounding Box](https://www.ultralytics.com/glossary/bounding-box) annotations, capturing objects in their natural orientation.

The dataset's richness offers invaluable insights into object detection challenges exclusive to aerial imagery. The [DOTA-v2.0 dataset](https://www.ultralytics.com/blog/exploring-the-best-computer-vision-datasets-in-2025) has become particularly popular for remote sensing and aerial surveillance projects due to its comprehensive annotations and diverse object categories.

## Citations and Acknowledgments

If you use DOTA in your work, please cite the relevant research papers:

!!! quote ""

    === "BibTeX"

        ```bibtex
        @article{9560031,
          author={Ding, Jian and Xue, Nan and Xia, Gui-Song and Bai, Xiang and Yang, Wen and Yang, Michael and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
          journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
          title={Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges},
          year={2021},
          volume={},
          number={},
          pages={1-1},
          doi={10.1109/TPAMI.2021.3117983}
        }
        ```

A special note of gratitude to the team behind the DOTA datasets for their commendable effort in curating this dataset. For an exhaustive understanding of the dataset and its nuances, please visit the [official DOTA website](https://captain-whu.github.io/DOTA/index.html).

## FAQ

### What is the DOTA dataset and why is it important for object detection in aerial images?

The [DOTA dataset](https://captain-whu.github.io/DOTA/index.html) is a specialized dataset focused on object detection in aerial images. It features Oriented Bounding Boxes (OBB), providing annotated images from diverse aerial scenes. DOTA's diversity in object orientation, scale, and shape across its 1.7M annotations and 18 categories makes it ideal for developing and evaluating models tailored for aerial imagery analysis, such as those used in surveillance, environmental monitoring, and disaster management.

### How does the DOTA dataset handle different scales and orientations in images?

DOTA utilizes Oriented Bounding Boxes (OBB) for annotation, which are represented by rotated rectangles encapsulating objects regardless of their orientation. This method ensures that objects, whether small or at different angles, are accurately captured. The dataset's multiscale images, ranging from 800 × 800 to 20,000 × 20,000 pixels, further support the detection of both small and large objects. This approach is particularly valuable for aerial imagery where objects appear at various angles and scales.

### How can I train a model using the DOTA dataset?

To train a model on the DOTA dataset, you can use the following example with [Ultralytics YOLO](https://docs.ultralytics.com/tasks/obb/):

!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Create a new YOLO26n-OBB model from scratch
        model = YOLO("yolo26n-obb.yaml")

        # Train the model on the DOTAv1 dataset
        results = model.train(data="DOTAv1.yaml", epochs=100, imgsz=1024)
        ```

    === "CLI"

        ```bash
        # Train on the DOTAv1 dataset, starting from pretrained YOLO26n-OBB weights
        yolo obb train data=DOTAv1.yaml model=yolo26n-obb.pt epochs=100 imgsz=1024
        ```

For more details on how to split and preprocess the DOTA images, refer to the [split DOTA images section](#split-dota-images).

### What are the differences between DOTA-v1.0, DOTA-v1.5, and DOTA-v2.0?

- **DOTA-v1.0**: Includes 15 common categories across 2,806 images with 188,282 instances. The dataset is split into training, validation, and testing sets.
- **DOTA-v1.5**: Builds upon DOTA-v1.0 by annotating very small instances (less than 10 pixels) and adding a new category, "container crane," totaling 403,318 instances.
- **DOTA-v2.0**: Expands further with annotations from Google Earth and GF-2 Satellite, featuring 11,268 images and 1,793,658 instances. It includes new categories like "airport" and "helipad."

For a detailed comparison and additional specifics, check the [dataset versions section](#dataset-versions).

### How can I prepare high-resolution DOTA images for training?

DOTA images, which can be very large, are split into smaller tiles for manageable training. Here's a Python snippet to split images:

!!! example

    === "Python"

        ```python
        from ultralytics.data.split_dota import split_test, split_trainval

        # split train and val set, with labels.
        split_trainval(
            data_root="path/to/DOTAv1.0/",
            save_dir="path/to/DOTAv1.0-split/",
            rates=[0.5, 1.0, 1.5],  # multiscale
            gap=500,
        )
        # split test set, without labels.
        split_test(
            data_root="path/to/DOTAv1.0/",
            save_dir="path/to/DOTAv1.0-split/",
            rates=[0.5, 1.0, 1.5],  # multiscale
            gap=500,
        )
        ```

This process facilitates better training efficiency and model performance. For detailed instructions, visit the [split DOTA images section](#split-dota-images).
141
algorithms/dms_yolo/code/docs/en/datasets/obb/dota8.md
Normal file
@@ -0,0 +1,141 @@
---
comments: true
description: Explore the DOTA8 dataset - a small, versatile oriented object detection dataset ideal for testing and debugging object detection models using Ultralytics YOLO26.
keywords: DOTA8 dataset, Ultralytics, YOLO26, object detection, debugging, training models, oriented object detection, dataset YAML
---

# DOTA8 Dataset

## Introduction

[Ultralytics](https://www.ultralytics.com/) DOTA8 is a small but versatile oriented [object detection](https://www.ultralytics.com/glossary/object-detection) dataset composed of the first 8 images of the split DOTAv1 set, 4 for training and 4 for validation. This dataset is ideal for testing and debugging object detection models, or for experimenting with new detection approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training larger datasets.

## Dataset Structure

- **Images**: 8 aerial tiles (4 train, 4 val) sourced from DOTAv1.
- **Classes**: Inherits the 15 DOTAv1 categories such as plane, ship, and large vehicle.
- **Labels**: YOLO-format oriented bounding boxes saved as `.txt` files, one per image, in the matching `labels/` subdirectory.
- **Recommended layout**:

    ```
    datasets/dota8/
    ├── images/
    │   ├── train/
    │   └── val/
    └── labels/
        ├── train/
        └── val/
    ```
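Because images and labels live in parallel directories, a mismatch between the two is an easy mistake to make when assembling a dataset by hand. The helper below is a minimal sketch for spotting images that have no label file with the same stem; the function name `images_missing_labels` is hypothetical and it assumes the `images/<split>` and `labels/<split>` layout shown above. An image without a label file is not necessarily an error, since it may be an intentional background image.

```python
from pathlib import Path


def images_missing_labels(root: Path, split: str) -> list[str]:
    """Return stems of files in images/<split> that lack labels/<split>/<stem>.txt."""
    image_dir = root / "images" / split
    label_dir = root / "labels" / split
    missing = []
    for image_path in sorted(image_dir.iterdir()):
        label_path = label_dir / f"{image_path.stem}.txt"
        if image_path.is_file() and not label_path.exists():
            missing.append(image_path.stem)
    return missing
```

For example, `images_missing_labels(Path("datasets/dota8"), "train")` would return an empty list when every training tile has a label file.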

This dataset is intended for use with [Ultralytics Platform](https://platform.ultralytics.com/) and [YOLO26](https://github.com/ultralytics/ultralytics).

## Dataset YAML

A YAML (YAML Ain't Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the DOTA8 dataset, the `dota8.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/dota8.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/dota8.yaml).

!!! example "ultralytics/cfg/datasets/dota8.yaml"

    ```yaml
    --8<-- "ultralytics/cfg/datasets/dota8.yaml"
    ```

## Usage

To train a YOLO26n-obb model on the DOTA8 dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.

!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO("yolo26n-obb.pt")  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data="dota8.yaml", epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo obb train data=dota8.yaml model=yolo26n-obb.pt epochs=100 imgsz=640
        ```

## Sample Images and Annotations

Here are some examples of images from the DOTA8 dataset, along with their corresponding annotations:

<img src="https://cdn.jsdelivr.net/gh/ultralytics/assets@main/docs/mosaiced-training-batch.avif" alt="DOTA8 oriented bounding box dataset training mosaic" width="800">

- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.

The example showcases the variety and complexity of the images in the DOTA8 dataset and the benefits of using mosaicing during the training process.

## Citations and Acknowledgments

If you use the DOTA dataset in your research or development work, please cite the following paper:

!!! quote ""

    === "BibTeX"

        ```bibtex
        @article{9560031,
          author={Ding, Jian and Xue, Nan and Xia, Gui-Song and Bai, Xiang and Yang, Wen and Yang, Michael and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
          journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
          title={Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges},
          year={2021},
          volume={},
          number={},
          pages={1-1},
          doi={10.1109/TPAMI.2021.3117983}
        }
        ```

A special note of gratitude to the team behind the DOTA datasets for their commendable effort in curating this dataset. For an exhaustive understanding of the dataset and its nuances, please visit the [official DOTA website](https://captain-whu.github.io/DOTA/index.html).

## FAQ

### What is the DOTA8 dataset and how can it be used?

The DOTA8 dataset is a small, versatile oriented object detection dataset made up of the first 8 images from the DOTAv1 split set, with 4 images designated for training and 4 for validation. It's ideal for testing and debugging object detection models like Ultralytics YOLO26. Due to its manageable size and diversity, it helps in identifying pipeline errors and running sanity checks before training on larger datasets. Learn more about object detection with [Ultralytics YOLO26](https://github.com/ultralytics/ultralytics).

### How do I train a YOLO26 model using the DOTA8 dataset?

To train a YOLO26n-obb model on the DOTA8 dataset for 100 epochs with an image size of 640, you can use the following code snippets. For comprehensive argument options, refer to the model [Training](../../modes/train.md) page.

!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO("yolo26n-obb.pt")  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data="dota8.yaml", epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo obb train data=dota8.yaml model=yolo26n-obb.pt epochs=100 imgsz=640
        ```

### What are the key features of the DOTA dataset and where can I access the YAML file?

The DOTA dataset is known for its large-scale benchmark and the challenges it presents for object detection in aerial images. The DOTA8 subset is a smaller, manageable dataset ideal for initial tests. You can access the `dota8.yaml` file, which contains paths, classes, and configuration details, at this [GitHub link](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/dota8.yaml).

### How does mosaicing enhance model training with the DOTA8 dataset?

Mosaicing combines multiple images into one during training, increasing the variety of objects and contexts within each batch. This improves a model's ability to generalize to different object sizes, aspect ratios, and scenes. This technique can be visually demonstrated through a training batch composed of mosaiced DOTA8 dataset images, helping in robust model development. Explore more about mosaicing and training techniques on our [Training](../../modes/train.md) page.

### Why should I use Ultralytics YOLO26 for object detection tasks?

Ultralytics YOLO26 provides state-of-the-art real-time object detection capabilities, including features like oriented bounding boxes (OBB), [instance segmentation](https://www.ultralytics.com/glossary/instance-segmentation), and a highly versatile training pipeline. It's suitable for various applications and offers pretrained models for efficient fine-tuning. Explore further about the advantages and usage in the [Ultralytics YOLO26 documentation](https://github.com/ultralytics/ultralytics).
157
algorithms/dms_yolo/code/docs/en/datasets/obb/index.md
Normal file
@@ -0,0 +1,157 @@
---
comments: true
description: Discover OBB dataset formats for Ultralytics YOLO models. Learn about their structure, application, and format conversions to enhance your object detection training.
keywords: Oriented Bounding Box, OBB Datasets, YOLO, Ultralytics, Object Detection, Dataset Formats
---

# Oriented Bounding Box (OBB) Datasets Overview

Training a precise [object detection](https://www.ultralytics.com/glossary/object-detection) model with oriented bounding boxes (OBB) requires a well-annotated dataset. This guide explains the various OBB dataset formats compatible with Ultralytics YOLO models, offering insights into their structure, application, and methods for format conversions.

## Supported OBB Dataset Formats

### YOLO OBB Format

The YOLO OBB format designates bounding boxes by their four corner points with coordinates normalized between 0 and 1. It follows this format:

```bash
class_index x1 y1 x2 y2 x3 y3 x4 y4
```

Internally, YOLO processes losses and outputs in the `xywhr` format, which represents the [bounding box](https://www.ultralytics.com/glossary/bounding-box)'s center point (xy), width, height, and rotation.

<p align="center"><img width="800" src="https://cdn.jsdelivr.net/gh/ultralytics/assets@main/docs/obb-format-examples.avif" alt="Oriented bounding box annotation format examples"></p>

An example of a `*.txt` label file for the above image, which contains an object of class `0` in OBB format, could look like:

```bash
0 0.780811 0.743961 0.782371 0.74686 0.777691 0.752174 0.776131 0.749758
```
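To make the relationship between the corner-point label format and `xywhr` concrete, the sketch below parses one label line and derives a center, width, height, and rotation from the four corners. It is an illustration only: the function names are hypothetical, it assumes the corners are listed in order around a true rectangle, and the conversion Ultralytics performs internally may differ (for example in how the rectangle is fitted or how the angle range is normalized).

```python
import math


def parse_obb_line(line: str) -> tuple[int, list[tuple[float, float]]]:
    """Split a YOLO OBB label line into a class index and four (x, y) corners."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    cls = int(fields[0])
    coords = [float(v) for v in fields[1:]]
    return cls, list(zip(coords[0::2], coords[1::2]))


def corners_to_xywhr(points, img_w: int, img_h: int):
    """Approximate (cx, cy, w, h, r) in pixels for an ordered rectangular box."""
    px = [(x * img_w, y * img_h) for x, y in points]  # de-normalize to pixels
    cx = sum(x for x, _ in px) / 4
    cy = sum(y for _, y in px) / 4
    (x1, y1), (x2, y2), (x3, y3) = px[0], px[1], px[2]
    w = math.hypot(x2 - x1, y2 - y1)  # length of the first edge
    h = math.hypot(x3 - x2, y3 - y2)  # length of the adjacent edge
    r = math.atan2(y2 - y1, x2 - x1)  # angle of the first edge, in radians
    return cx, cy, w, h, r


label = "0 0.780811 0.743961 0.782371 0.74686 0.777691 0.752174 0.776131 0.749758"
cls, corners = parse_obb_line(label)
print(cls, corners_to_xywhr(corners, img_w=1024, img_h=1024))
```

The image size of 1024 × 1024 is an arbitrary choice for the demonstration; label files store only normalized coordinates, so the real image dimensions are needed to recover pixel values.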

### Dataset YAML format

The Ultralytics framework uses a YAML file format to define the dataset and model configuration for training OBB models. Here is an example of the YAML format used for defining an OBB dataset:

!!! example "ultralytics/cfg/datasets/dota8.yaml"

    ```yaml
    --8<-- "ultralytics/cfg/datasets/dota8.yaml"
    ```

## Usage

To train a model using these OBB formats:

!!! example

    === "Python"

        ```python
        from ultralytics import YOLO

        # Create a new YOLO26n-OBB model from scratch
        model = YOLO("yolo26n-obb.yaml")

        # Train the model on the DOTAv1 dataset
        results = model.train(data="DOTAv1.yaml", epochs=100, imgsz=1024)
        ```

    === "CLI"

        ```bash
        # Train on the DOTAv1 dataset, starting from pretrained YOLO26n-OBB weights
        yolo obb train data=DOTAv1.yaml model=yolo26n-obb.pt epochs=100 imgsz=1024
        ```

## Supported Datasets

Currently, the following datasets with oriented bounding boxes are supported:

- [DOTA-v1](dota-v2.md#dota-v10): The first version of the DOTA dataset, providing a comprehensive set of aerial images with oriented bounding boxes for object detection.
- [DOTA-v1.5](dota-v2.md#dota-v15): An intermediate version of the DOTA dataset, offering additional annotations and improvements over DOTA-v1 for enhanced object detection tasks.
- [DOTA-v2](dota-v2.md#dota-v20): DOTA (A Large-scale Dataset for Object Detection in Aerial Images) version 2, emphasizes detection from aerial perspectives and contains oriented bounding boxes with 1.7 million instances and 11,268 images.
- [DOTA8](dota8.md): A small, 8-image subset of the full DOTA dataset suitable for testing workflows and Continuous Integration (CI) checks of OBB training in the `ultralytics` repository.

### Incorporating your own OBB dataset

For those looking to introduce their own datasets with oriented bounding boxes, ensure compatibility with the "YOLO OBB format" mentioned above. Convert your annotations to this required format and detail the paths, classes, and class names in a corresponding YAML configuration file.
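Before training on a custom dataset, it can help to check that every label line really follows the YOLO OBB format described earlier. The function below is a minimal sketch of such a check; `check_obb_label` is a hypothetical helper, not an Ultralytics function, and it assumes the rules stated in this guide: nine whitespace-separated fields per line, an integer class index, and coordinates normalized to the range 0 to 1.

```python
def check_obb_label(text: str, num_classes: int) -> list[str]:
    """Return a list of human-readable problems found in one label file's text."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) != 9:
            problems.append(f"line {lineno}: expected 9 fields, found {len(fields)}")
            continue
        try:
            cls = int(fields[0])
            coords = [float(v) for v in fields[1:]]
        except ValueError:
            problems.append(f"line {lineno}: could not parse class index or coordinates")
            continue
        if not 0 <= cls < num_classes:
            problems.append(f"line {lineno}: class index {cls} outside 0..{num_classes - 1}")
        if any(not 0.0 <= c <= 1.0 for c in coords):
            problems.append(f"line {lineno}: coordinate outside the normalized range [0, 1]")
    return problems


sample = "0 0.1 0.1 0.2 0.1 0.2 0.2 0.1 0.2\n3 10 20 30 20 30 40 10 40\n"
for problem in check_obb_label(sample, num_classes=15):
    print(problem)
```

In this sample the second line is flagged because its coordinates look like raw pixel values rather than normalized ones, a common symptom of labels that were never converted.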

## Convert Label Formats

### DOTA Dataset Format to YOLO OBB Format

Transitioning labels from the DOTA dataset format to the YOLO OBB format can be achieved with this script:

!!! example

    === "Python"

        ```python
        from ultralytics.data.converter import convert_dota_to_yolo_obb

        convert_dota_to_yolo_obb("path/to/DOTA")
        ```

This conversion mechanism is instrumental for datasets in the DOTA format, ensuring alignment with the [Ultralytics YOLO](../../models/yolo26.md) OBB format.
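Conceptually, the conversion maps each DOTA annotation, which stores eight pixel coordinates followed by a category name and a difficulty flag, onto a class index plus normalized coordinates. The sketch below shows that mapping for a single line. It is not the `convert_dota_to_yolo_obb` implementation: the function name, the two-entry `CLASS_IDS` mapping, and the example image size are all hypothetical.

```python
# Hypothetical two-class subset; a real mapping must cover every category in use.
CLASS_IDS = {"plane": 0, "ship": 1}


def dota_line_to_yolo_obb(line: str, img_w: int, img_h: int, class_ids: dict[str, int]) -> str:
    """Convert one 'x1 y1 ... x4 y4 category difficult' line to YOLO OBB text."""
    parts = line.split()
    if len(parts) < 9:
        raise ValueError("expected 8 coordinates followed by a category name")
    coords = [float(v) for v in parts[:8]]
    cls = class_ids[parts[8]]
    # Even positions are x values (divide by width), odd positions are y values.
    normalized = [c / (img_w if i % 2 == 0 else img_h) for i, c in enumerate(coords)]
    return f"{cls} " + " ".join(f"{v:.6g}" for v in normalized)


print(dota_line_to_yolo_obb("100 200 300 200 300 400 100 400 plane 0", 1000, 800, CLASS_IDS))
```

Note that normalization needs each image's width and height, which is why a converter has to read the images (or their metadata) alongside the annotation files.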

It's imperative to validate the compatibility of the dataset with your model and adhere to the necessary format conventions. Properly structured datasets are pivotal for training efficient object detection models with oriented bounding boxes.

## FAQ

### What are Oriented Bounding Boxes (OBB) and how are they used in Ultralytics YOLO models?

Oriented Bounding Boxes (OBB) are a type of bounding box annotation where the box can be rotated to align more closely with the object being detected, rather than just being axis-aligned. This is particularly useful in aerial or satellite imagery where objects might not be aligned with the image axes. In [Ultralytics YOLO](../../tasks/obb.md) models, OBBs are represented by their four corner points in the YOLO OBB format. This allows for more accurate object detection since the bounding boxes can rotate to fit the objects better.

### How do I convert my existing DOTA dataset labels to YOLO OBB format for use with Ultralytics YOLO26?

You can convert DOTA dataset labels to YOLO OBB format using the [`convert_dota_to_yolo_obb`](../../reference/data/converter.md) function from Ultralytics. This conversion ensures compatibility with the Ultralytics YOLO models, enabling you to leverage the OBB capabilities for enhanced object detection. Here's a quick example:

```python
from ultralytics.data.converter import convert_dota_to_yolo_obb

convert_dota_to_yolo_obb("path/to/DOTA")
```

This script will reformat your DOTA annotations into a YOLO-compatible format.

### How do I train a YOLO26 model with oriented bounding boxes (OBB) on my dataset?

Training a YOLO26 model with OBBs involves ensuring your dataset is in the YOLO OBB format and then using the [Ultralytics API](../../usage/python.md) to train the model. Here's an example in both Python and CLI:

!!! example

    === "Python"

        ```python
        from ultralytics import YOLO

        # Create a new YOLO26n-OBB model from scratch
        model = YOLO("yolo26n-obb.yaml")

        # Train the model on the custom dataset
        results = model.train(data="your_dataset.yaml", epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        # Train a new YOLO26n-OBB model on the custom dataset
        yolo obb train data=your_dataset.yaml model=yolo26n-obb.yaml epochs=100 imgsz=640
        ```

This ensures your model leverages the detailed OBB annotations for improved detection [accuracy](https://www.ultralytics.com/glossary/accuracy).

### What datasets are currently supported for OBB training in Ultralytics YOLO models?

Currently, Ultralytics supports the following datasets for OBB training:

- [DOTA-v1](dota-v2.md): The first version of the DOTA dataset, providing a comprehensive set of aerial images with oriented bounding boxes for object detection.
- [DOTA-v1.5](dota-v2.md): An intermediate version of the DOTA dataset, offering additional annotations and improvements over DOTA-v1 for enhanced object detection tasks.
- [DOTA-v2](dota-v2.md): This dataset includes 1.7 million instances with oriented bounding boxes and 11,268 images, primarily focusing on aerial object detection.
- [DOTA8](dota8.md): A smaller, 8-image subset of the DOTA dataset used for testing and [continuous integration](../../help/CI.md) (CI) checks.

These datasets are tailored for scenarios where OBBs offer a significant advantage, such as aerial and satellite image analysis.

### Can I use my own dataset with oriented bounding boxes for YOLO26 training, and if so, how?

Yes, you can use your own dataset with oriented bounding boxes for YOLO26 training. Ensure your dataset annotations are converted to the YOLO OBB format, which involves defining bounding boxes by their four corner points. You can then create a [YAML configuration file](../../usage/cfg.md) specifying the dataset paths, classes, and other necessary details. For more information on creating and configuring your datasets, refer to the [Supported Datasets](#supported-datasets) section.