Initial monocular 3D code

2026-06-24 09:35:46 +08:00
commit 04a5895b6b
1153 changed files with 340700 additions and 0 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,74 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Common commands
+
+- Install editable package: `pip install -e .`
+- Install a CI-like local test environment: `uv pip install --system -e ".[export,solutions]" pytest-cov --extra-index-url https://download.pytorch.org/whl/cpu --index-strategy unsafe-best-match`
+- Install docs/lint dependencies: `uv pip install --system -e ".[dev]" ruff black --extra-index-url https://download.pytorch.org/whl/cpu`
+- Sanity check the environment: `yolo checks`
+
+### Lint and docs
+
+- Lint with the same Ruff rules used in CI: `ruff check --extend-select F,I,D,UP,RUF,FA --target-version py39 --ignore D100,D104,D203,D205,D212,D213,D401,D406,D407,D413,RUF001,RUF002,RUF012 .`
+- Apply the same Ruff autofixes used in docs CI: `ruff check --fix --unsafe-fixes --extend-select F,I,D,UP,RUF,FA --target-version py39 --ignore D100,D104,D203,D205,D212,D213,D401,D406,D407,D413,RUF001,RUF002,RUF012 .`
+- Rebuild autogenerated API reference docs: `python docs/build_reference.py`
+- Build docs locally: `python docs/build_docs.py`
+
+### Tests
+
+- Run the default test suite: `pytest --cov=ultralytics/ --cov-report=xml tests/`
+- Run slow tests too: `pytest --slow --cov=ultralytics/ --cov-report=xml tests/`
+- Run one test file: `pytest tests/test_engine.py -v -s`
+- Run one test by node id: `pytest tests/test_engine.py::test_name -v -s`
+- Run the GPU-focused test file: `pytest tests/test_cuda.py -sv`
+
+### Repo-specific training entry points
+
+- Ground 2D training: `python train_mono2d.py --model yolo26s.pt --data ultralytics/cfg/datasets/mono2d_ground.yaml --epochs 100 --device 0`
+- Ground 2D DDP training: `python -m torch.distributed.run --nproc_per_node 4 train_mono2d.py --model yolo26s.pt --data ultralytics/cfg/datasets/mono2d_ground.yaml --epochs 100`
+- Ground 3D training from pretrained 2D weights: `python train_mono3d.py --pretrained yolo26s-pretrain.pt --epochs 100 --device 0`
+- Ground 3D DDP training from pretrained 2D weights: `python -m torch.distributed.run --nproc_per_node 4 train_mono3d.py --pretrained yolo26s-pretrain.pt --epochs 100`
+- Resume 3D training: `python train_mono3d.py --resume runs/detect/train_mono3d/weights/last.pt`
+
+## Big-picture architecture
+
+- This repo keeps the standard Ultralytics split of `cfg -> engine -> task package -> nn/data/utils`. This checkout adds a substantial custom ground-detection branch for 2D and 3D training that builds on the upstream abstractions instead of replacing them.
+- CLI entry is `ultralytics.cfg:entrypoint`. `ultralytics/cfg/__init__.py` defines supported tasks/modes and default task-to-model/data mappings.
+- The public Python orchestration layer is `ultralytics/engine/model.py`. `Model.train()`, `val()`, `predict()`, `track()`, and `export()` all route into task-specific trainer/validator/predictor classes.
+- Task binding lives in `ultralytics/models/yolo/model.py` via `task_map`. That is the main switchboard from a high-level `YOLO(...)` object to the task-specific model/trainer/validator/predictor implementations.
+- Model YAMLs under `ultralytics/cfg/models/` are converted into PyTorch graphs by `ultralytics/nn/tasks.py`. Standard heads live in `ultralytics/nn/modules/head.py`; this fork adds `Detect3D` there.
+- Shared runtime loops are in `ultralytics/engine/trainer.py`, `ultralytics/engine/validator.py`, and `ultralytics/engine/predictor.py`. Most repo-specific behavior plugs in by subclassing these rather than rewriting the loops.
+- Dataset selection and dataloader wiring are in `ultralytics/data/build.py`. Core dataset classes are in `ultralytics/data/dataset.py`.
+
+## Ground 2D custom path
+
+- `train_mono2d.py` is the practical entry point for the custom ground 2D workflow.
+- `ultralytics/models/yolo/detect/train.py` defines `GroundDetectionModel`, `GroundDetectionTrainer`, and `GroundDetectionValidator`.
+- `ultralytics/data/dataset.py` adds `YOLOGroundDataset`, which extends the normal YOLO dataset flow with class mapping, difficulty-aware targets, min-box filtering, and optional YUV444 support.
+- Dataset YAMLs that include `class_map` are routed into `YOLOGroundDataset` by `ultralytics/data/build.py`.
+- Ground-specific losses live in `ultralytics/utils/loss.py` (`v8DetectionLossGround` / `E2EGroundLoss`).
+- YUV444 images are not converted in the loader; they are converted to BGR in trainer/validator preprocessing. If image colors or TensorBoard examples look wrong, inspect `GroundDetectionTrainer.preprocess_batch()` and `GroundDetectionValidator.preprocess()` before touching the dataset loader.
+
+## Ground 3D custom path
+
+- `train_mono3d.py` is the main entry point for the joint 2D+3D workflow.
+- `ultralytics/models/yolo/detect/train.py` also defines `Ground3DDetectionModel`, `Ground3DDetectionTrainer`, and `Ground3DDetectionValidator`.
+- The 3D model config is `ultralytics/cfg/models/26/yolo26-3d.yaml`; it uses `Detect3D` from `ultralytics/nn/modules/head.py`, which extends the normal detection head with a dedicated 3D branch.
+- `ultralytics/data/dataset.py` adds `YOLOGround3DDataset`, which loads mixed 2D/3D labels, performs ROI crop or virtual-camera transforms, and returns standard 2D fields plus `labels_3d` and calibration data.
+- Geometry-heavy preprocessing lives in `ultralytics/data/ground3d_augment.py`. That file is the source of truth for ROI cropping, virtual-camera simulation, calibration updates, cut-in/cut-out handling, and depth scaling.
+- 3D loss, validation metrics, and visualization are split across `ultralytics/utils/loss.py`, `ultralytics/utils/metrics_3d.py`, and `ultralytics/utils/plotting_3d.py`.
+- 3D metrics and TensorBoard visualizations depend on calibration stored in `batch["calib"]`. Do not re-read calibration from disk inside callbacks or validators.
+
+## Fast navigation hints
+
+- If a command or API call is being routed somewhere unexpected, start with `ultralytics/cfg/__init__.py`, then `ultralytics/engine/model.py`, then `ultralytics/models/yolo/model.py`.
+- If a model YAML change is not taking effect, inspect `ultralytics/nn/tasks.py` and the relevant head in `ultralytics/nn/modules/`.
+- If a batch field looks wrong, trace `ultralytics/data/dataset.py` -> `ultralytics/data/build.py` -> the relevant trainer `preprocess_batch()`.
+- If 3D training, metrics, or TensorBoard visualizations look inconsistent, trace `YOLOGround3DDataset.get_image_and_label()` -> `Ground3DDetectionTrainer` / `Ground3DDetectionValidator` -> `Detect3D` -> `ultralytics/utils/loss.py` / `ultralytics/utils/metrics_3d.py` / `ultralytics/utils/plotting_3d.py`.
+
+## Repository notes
+
+- There is no `Makefile` or committed `.pre-commit-config.yaml` in this repo. The most authoritative local commands come from `pyproject.toml`, `train_mono2d.py`, `train_mono3d.py`, and the GitHub Actions workflows.
+- API reference docs under `docs/en/reference/` are generated; if source signatures or docstrings change, rebuild them with `python docs/build_reference.py`.
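
The "Ground 3D custom path" notes in the file above say that ROI cropping comes with calibration updates. As an illustration of why the two must stay in sync, here is a minimal sketch of the standard pinhole-camera bookkeeping for a crop followed by a resize. It is not taken from `ultralytics/data/ground3d_augment.py`: the `Intrinsics` container and the `crop_then_resize` and `project` helpers are hypothetical names, and the sketch assumes an undistorted pinhole model and ignores half-pixel center conventions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole intrinsics in pixels (hypothetical container, not the repo's calib format)."""

    fx: float
    fy: float
    cx: float
    cy: float


def crop_then_resize(k, x0, y0, crop_w, crop_h, out_w, out_h):
    """Return intrinsics for an image cropped at (x0, y0) and resized to (out_w, out_h)."""
    sx = out_w / crop_w
    sy = out_h / crop_h
    # Cropping only shifts the principal point; resizing scales all four terms.
    return Intrinsics(
        fx=k.fx * sx,
        fy=k.fy * sy,
        cx=(k.cx - x0) * sx,
        cy=(k.cy - y0) * sy,
    )


def project(k, x, y, z):
    """Project a camera-frame 3D point (z > 0) to pixel coordinates."""
    return (k.fx * x / z + k.cx, k.fy * y / z + k.cy)


if __name__ == "__main__":
    full = Intrinsics(fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
    roi = crop_then_resize(full, x0=320, y0=180, crop_w=1280, crop_h=720, out_w=640, out_h=360)
    print(roi)
    print(project(roi, 2.0, 1.0, 20.0))
```

If the image is transformed but the intrinsics are not, projected 3D boxes no longer line up with the pixels. That would be one plausible reason to read calibration only from `batch["calib"]`, as the notes advise, rather than re-reading the untransformed values from disk.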