Welcome to PytorchAutoDrive benchmark
The FLOPs and parameter counts in this benchmark rely entirely on thop to identify the underlying basic ops, so they might be inaccurate. A FLOPs count is an estimate to begin with, though; the aim here is simply to provide a relatively fair benchmark for comparing different methods.
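To make concrete why such counts are estimates, the following minimal sketch tallies parameters and multiply-accumulates (MACs) for a single 2D convolution. The helper `conv2d_cost` is hypothetical and written only for illustration; it is not thop's API. Counters differ in their conventions (for example, whether bias additions are included, or whether one MAC counts as one or two operations), which is one reason different tools report different numbers.

```python
def conv2d_cost(in_ch, out_ch, kernel, out_h, out_w, bias=True, groups=1):
    """Estimate (params, macs) for one 2D convolution layer.

    Hypothetical helper for illustration; one MAC = one multiply-accumulate.
    """
    kh, kw = kernel
    # Each output channel owns a (in_ch / groups) x kh x kw filter.
    weights = out_ch * (in_ch // groups) * kh * kw
    params = weights + (out_ch if bias else 0)
    # Every weight is applied once per output pixel.
    macs = weights * out_h * out_w
    return params, macs


# A 3x3 convolution, 3 -> 64 channels, with an assumed 360 x 640 output map.
params, macs = conv2d_cost(3, 64, (3, 3), 360, 640)
print(params, macs)

# Doubling both spatial dimensions quadruples the MACs; params are unchanged.
params_2x, macs_2x = conv2d_cost(3, 64, (3, 3), 720, 1280)
print(macs_2x // macs)
```

Because the MAC count is proportional to the number of output pixels, doubling both height and width roughly quadruples it. The segmentation table below shows the same pattern (FCN: 216.42 G at 256 x 512 versus 865.69 G at 512 x 1024), while the parameter count stays fixed.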
Lane detection performance
| method | backbone | resolution | FPS | FLOPs (G) | Params (M) |
|---|---|---|---|---|---|
| Baseline | VGG16 | 360 x 640 | 56.36 | 214.50 | 20.37 |
| Baseline | ResNet18 | 360 x 640 | 148.59 | 85.24 | 12.04 |
| Baseline | ResNet34 | 360 x 640 | 79.97 | 159.60 | 22.15 |
| Baseline | ResNet50 | 360 x 640 | 50.58 | 177.62 | 24.57 |
| Baseline | ResNet101 | 360 x 640 | 27.41 | 314.36 | 43.56 |
| Baseline | ERFNet | 360 x 640 | 85.87 | 26.32 | 2.67 |
| Baseline | ENet | 360 x 640 | 56.63 | 4.26 | 0.95 |
| Baseline | MobileNetV2 | 360 x 640 | 126.54 | 4.49 | 2.06 |
| Baseline | MobileNetV3-Large | 360 x 640 | 104.34 | 3.63 | 3.30 |
| SCNN | VGG16 | 360 x 640 | 21.18 | 218.64 | 20.96 |
| SCNN | ResNet18 | 360 x 640 | 21.12 | 89.38 | 12.63 |
| SCNN | ResNet34 | 360 x 640 | 20.77 | 163.74 | 22.74 |
| SCNN | ResNet50 | 360 x 640 | 19.59 | 181.76 | 25.16 |
| SCNN | ResNet101 | 360 x 640 | 13.50 | 318.50 | 44.15 |
| SCNN | ERFNet | 360 x 640 | 18.40 | 30.46 | 3.26 |
| LSTR | ResNet18s | 360 x 640 | 98.13 | 1.15 | 0.77 |
| LSTR | ResNet18s-2x | 360 x 640 | 97.27 | 4.05 | 3.05 |
| LSTR | ResNet18s | 1080 x 1920 | 91.23 | 10.20 | 0.77 |
| LSTR | ResNet18s | 2160 x 4320 | 23.60 | 40.75 | 0.77 |
| LSTR | ResNet34 | 360 x 640 | 63.52 | 34.54 | 22.34 |
| RESA | ResNet18 | 360 x 640 | 67.66 | 61.35 | 6.61 |
| RESA | ResNet34 | 360 x 640 | 54.49 | 101.74 | 11.99 |
| RESA | ResNet50 | 360 x 640 | 44.80 | 105.71 | 12.46 |
| RESA | ResNet101 | 360 x 640 | 25.14 | 242.45 | 31.46 |
| RESA | MobileNetV2 | 360 x 640 | 60.53 | 12.80 | 4.63 |
| RESA | MobileNetV3-Large | 360 x 640 | 54.39 | 11.95 | 5.88 |
| LaneATT | ResNet18 | 360 x 640 | 198.29 | 18.67 | 12.02 |
| LaneATT | ResNet34 | 360 x 640 | 133.84 | 36.01 | 22.12 |
| BézierLaneNet | ResNet18 | 360 x 640 | 212.83 | 14.77 | 4.10 |
| BézierLaneNet | ResNet34 | 360 x 640 | 149.52 | 29.85 | 9.49 |
| Baseline | VGG16 | 288 x 800 | 55.31 | 214.50 | 20.15 |
| Baseline | ResNet18 | 288 x 800 | 136.28 | 85.22 | 11.82 |
| Baseline | ResNet34 | 288 x 800 | 72.42 | 159.60 | 21.93 |
| Baseline | ResNet50 | 288 x 800 | 49.41 | 177.60 | 24.35 |
| Baseline | ResNet101 | 288 x 800 | 27.19 | 314.34 | 43.34 |
| Baseline | ERFNet | 288 x 800 | 88.76 | 26.26 | 2.68 |
| Baseline | ENet | 288 x 800 | 57.99 | 4.12 | 0.96 |
| Baseline | MobileNetV2 | 288 x 800 | 129.24 | 4.41 | 2.00 |
| Baseline | MobileNetV3-Large | 288 x 800 | 107.83 | 3.56 | 3.25 |
| Baseline | RepVGG-A0 | 288 x 800 | 162.61 | 207.81 | 9.06 |
| Baseline | RepVGG-A1 | 288 x 800 | 117.30 | 339.83 | 13.54 |
| Baseline | RepVGG-B0 | 288 x 800 | 103.68 | 390.83 | 15.09 |
| Baseline | RepVGG-B1g2 | 288 x 800 | 36.91 | 1166.76 | 42.20 |
| Baseline | RepVGG-B2 | 288 x 800 | 18.98 | 2310.13 | 81.23 |
| Baseline | Swin-Tiny | 288 x 800 | 51.90 | 44.24 | 27.72 |
| SCNN | VGG16 | 288 x 800 | 21.40 | 218.62 | 20.74 |
| SCNN | ResNet18 | 288 x 800 | 20.80 | 89.34 | 12.42 |
| SCNN | ResNet34 | 288 x 800 | 19.77 | 163.72 | 22.52 |
| SCNN | ResNet50 | 288 x 800 | 18.88 | 181.72 | 24.94 |
| SCNN | ResNet101 | 288 x 800 | 13.42 | 318.46 | 43.94 |
| SCNN | ERFNet | 288 x 800 | 18.80 | 30.40 | 3.27 |
| SCNN | RepVGG-A1 | 288 x 800 | 20.53 | 343.96 | 14.13 |
| RESA | ResNet18 | 288 x 800 | 69.58 | 61.33 | 6.62 |
| RESA | ResNet34 | 288 x 800 | 55.61 | 101.72 | 12.01 |
| RESA | ResNet50 | 288 x 800 | 46.75 | 105.70 | 12.48 |
| RESA | ResNet101 | 288 x 800 | 26.08 | 242.44 | 31.47 |
| RESA | MobileNetV2 | 288 x 800 | 59.49 | 12.55 | 4.63 |
| RESA | MobileNetV3-Large | 288 x 800 | 53.85 | 11.70 | 5.88 |
| LSTR | ResNet34 | 288 x 800 | 65.39 | 33.86 | 22.34 |
| BézierLaneNet | ResNet18 | 288 x 800 | 210.79 | 14.66 | 4.10 |
| BézierLaneNet | ResNet34 | 288 x 800 | 144.65 | 29.54 | 9.49 |
Segmentation performance
| method | resolution | FPS | FLOPs (G) | Params (M) |
|---|---|---|---|---|
| FCN | 256 x 512 | 43.32 | 216.42 | 51.95 |
| FCN | 512 x 1024 | 12.06 | 865.69 | 51.95 |
| FCN | 1024 x 2048 | 3.06 | 3462.77 | 51.95 |
| ERFNet | 256 x 512 | 91.20 | 15.03 | 2.07 |
| ERFNet | 512 x 1024 | 85.51 | 60.11 | 2.07 |
| ERFNet | 1024 x 2048 | 21.53 | 240.44 | 2.07 |
| ENet | 256 x 512 | 59.31 | 2.72 | 0.35 |
| ENet | 512 x 1024 | 55.69 | 10.88 | 0.35 |
| ENet | 1024 x 2048 | 30.88 | 43.53 | 0.35 |
| DeeplabV2 | 256 x 512 | 44.87 | 180.59 | 43.90 |
| DeeplabV2 | 512 x 1024 | 12.93 | 722.37 | 43.90 |
| DeeplabV2 | 1024 x 2048 | 3.23 | 2889.49 | 43.90 |
| DeeplabV3 | 256 x 512 | 35.26 | 241.65 | 58.63 |
| DeeplabV3 | 512 x 1024 | 10.26 | 966.61 | 58.63 |
| DeeplabV3 | 1024 x 2048 | 2.56 | 3866.45 | 58.63 |
All results are the maximum of 3 runs on an RTX 2080Ti.
Lane detection post-processing is not counted.
LaneATT NMS is not counted yet.
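The "maximum of 3 runs" protocol can be sketched as follows. This is a simplified, framework-free illustration and not the code in tools/profiling.py: the function `best_fps`, its parameters, and the warm-up count are all assumptions. When timing a GPU model in PyTorch, the step would also need to call `torch.cuda.synchronize()` before the clock is read, since CUDA kernels launch asynchronously.

```python
import time


def best_fps(step, iters=100, times=3, warmup=10):
    """Return the highest frames-per-second over `times` timed runs.

    `step` is any zero-argument callable that processes one frame.
    """
    for _ in range(warmup):  # untimed warm-up calls
        step()
    best = 0.0
    for _ in range(times):
        start = time.perf_counter()
        for _ in range(iters):
            step()
        elapsed = time.perf_counter() - start
        best = max(best, iters / elapsed)
    return best


# Stand-in workload; a real benchmark would run the model's forward pass here.
calls = []
fps = best_fps(lambda: calls.append(sum(range(1000))), iters=50, times=3, warmup=5)
print(f"best of 3 runs: {fps:.2f} FPS over {len(calls)} calls")
```

Taking the maximum rather than the mean reports best-case throughput, which reduces the influence of transient system noise on the comparison.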
Profiling Models Yourself
With --mode=simple, a random tensor stands in for the real image.
This bypasses the DataLoader, so the measurement reflects the model's best-case speed.
It is also the setting used for the benchmark above.
python tools/profiling.py --mode=simple \
--config=<config file path> \
--times=3 \
--height=<image height in pixels> \
--width=<image width in pixels>
The config mechanism and command-line overrides via --cfg-options work the same way as in training/testing.
With --mode=real, the DataLoader is set to batch_size=1 and num_workers=0 to simulate a real camera transmitting frames to the model one at a time. Just pass --mode=real and, preferably, provide a trained model via --checkpoint.
For detailed instructions and the available command-line shortcuts, run:
python tools/profiling.py --help