# Welcome to PytorchAutoDrive benchmark

*The FLOPs and parameter counts in this benchmark rely entirely on [thop](https://github.com/Lyken17/pytorch-OpCounter) to identify the underlying basic ops, so they may be inaccurate. A FLOPs count is only an [estimate](https://discuss.pytorch.org/t/correct-way-to-calculate-flops-in-model/67198/6) to begin with; the aim here is simply to provide a relatively fair benchmark for comparing different methods.*

## Lane detection performance

| method | backbone | resolution | FPS | FLOPs (G) | Params (M) |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Baseline | VGG16 | 360 x 640 | 56.36 | 214.50 | 20.37 |
| Baseline | ResNet18 | 360 x 640 | 148.59 | 85.24 | 12.04 |
| Baseline | ResNet34 | 360 x 640 | 79.97 | 159.60 | 22.15 |
| Baseline | ResNet50 | 360 x 640 | 50.58 | 177.62 | 24.57 |
| Baseline | ResNet101 | 360 x 640 | 27.41 | 314.36 | 43.56 |
| Baseline | ERFNet | 360 x 640 | 85.87 | 26.32 | 2.67 |
| Baseline | ENet | 360 x 640 | 56.63 | 4.26 | 0.95 |
| Baseline | MobileNetV2 | 360 x 640 | 126.54 | 4.49 | 2.06 |
| Baseline | MobileNetV3-Large | 360 x 640 | 104.34 | 3.63 | 3.30 |
| SCNN | VGG16 | 360 x 640 | 21.18 | 218.64 | 20.96 |
| SCNN | ResNet18 | 360 x 640 | 21.12 | 89.38 | 12.63 |
| SCNN | ResNet34 | 360 x 640 | 20.77 | 163.74 | 22.74 |
| SCNN | ResNet50 | 360 x 640 | 19.59 | 181.76 | 25.16 |
| SCNN | ResNet101 | 360 x 640 | 13.50 | 318.50 | 44.15 |
| SCNN | ERFNet | 360 x 640 | 18.40 | 30.46 | 3.26 |
| LSTR | ResNet18s | 360 x 640 | 98.13 | 1.15 | 0.77 |
| LSTR | ResNet18s-2x | 360 x 640 | 97.27 | 4.05 | 3.05 |
| LSTR | ResNet18s | 1080 x 1920 | 91.23 | 10.20 | 0.77 |
| LSTR | ResNet18s | 2160 x 4320 | 23.60 | 40.75 | 0.77 |
| LSTR | ResNet34 | 360 x 640 | 63.52 | 34.54 | 22.34 |
| RESA | ResNet18 | 360 x 640 | 67.66 | 61.35 | 6.61 |
| RESA | ResNet34 | 360 x 640 | 54.49 | 101.74 | 11.99 |
| RESA | ResNet50 | 360 x 640 | 44.80 | 105.71 | 12.46 |
| RESA | ResNet101 | 360 x 640 | 25.14 | 242.45 | 31.46 |
| RESA | MobileNetV2 | 360 x 640 | 60.53 | 12.80 | 4.63 |
| RESA | MobileNetV3-Large | 360 x 640 | 54.39 | 11.95 | 5.88 |
| LaneATT | ResNet18 | 360 x 640 | 198.29 | 18.67 | 12.02 |
| LaneATT | ResNet34 | 360 x 640 | 133.84 | 36.01 | 22.12 |
| BézierLaneNet | ResNet18 | 360 x 640 | 212.83 | 14.77 | 4.10 |
| BézierLaneNet | ResNet34 | 360 x 640 | 149.52 | 29.85 | 9.49 |
| Baseline | VGG16 | 288 x 800 | 55.31 | 214.50 | 20.15 |
| Baseline | ResNet18 | 288 x 800 | 136.28 | 85.22 | 11.82 |
| Baseline | ResNet34 | 288 x 800 | 72.42 | 159.60 | 21.93 |
| Baseline | ResNet50 | 288 x 800 | 49.41 | 177.60 | 24.35 |
| Baseline | ResNet101 | 288 x 800 | 27.19 | 314.34 | 43.34 |
| Baseline | ERFNet | 288 x 800 | 88.76 | 26.26 | 2.68 |
| Baseline | ENet | 288 x 800 | 57.99 | 4.12 | 0.96 |
| Baseline | MobileNetV2 | 288 x 800 | 129.24 | 4.41 | 2.00 |
| Baseline | MobileNetV3-Large | 288 x 800 | 107.83 | 3.56 | 3.25 |
| Baseline | RepVGG-A0 | 288 x 800 | 162.61 | 207.81 | 9.06 |
| Baseline | RepVGG-A1 | 288 x 800 | 117.30 | 339.83 | 13.54 |
| Baseline | RepVGG-B0 | 288 x 800 | 103.68 | 390.83 | 15.09 |
| Baseline | RepVGG-B1g2 | 288 x 800 | 36.91 | 1166.76 | 42.20 |
| Baseline | RepVGG-B2 | 288 x 800 | 18.98 | 2310.13 | 81.23 |
| Baseline | Swin-Tiny | 288 x 800 | 51.90 | 44.24 | 27.72 |
| SCNN | VGG16 | 288 x 800 | 21.40 | 218.62 | 20.74 |
| SCNN | ResNet18 | 288 x 800 | 20.80 | 89.34 | 12.42 |
| SCNN | ResNet34 | 288 x 800 | 19.77 | 163.72 | 22.52 |
| SCNN | ResNet50 | 288 x 800 | 18.88 | 181.72 | 24.94 |
| SCNN | ResNet101 | 288 x 800 | 13.42 | 318.46 | 43.94 |
| SCNN | ERFNet | 288 x 800 | 18.80 | 30.40 | 3.27 |
| SCNN | RepVGG-A1 | 288 x 800 | 20.53 | 343.96 | 14.13 |
| RESA | ResNet18 | 288 x 800 | 69.58 | 61.33 | 6.62 |
| RESA | ResNet34 | 288 x 800 | 55.61 | 101.72 | 12.01 |
| RESA | ResNet50 | 288 x 800 | 46.75 | 105.70 | 12.48 |
| RESA | ResNet101 | 288 x 800 | 26.08 | 242.44 | 31.47 |
| RESA | MobileNetV2 | 288 x 800 | 59.49 | 12.55 | 4.63 |
| RESA | MobileNetV3-Large | 288 x 800 | 53.85 | 11.70 | 5.88 |
| LSTR | ResNet34 | 288 x 800 | 65.39 | 33.86 | 22.34 |
| BézierLaneNet | ResNet18 | 288 x 800 | 210.79 | 14.66 | 4.10 |
| BézierLaneNet | ResNet34 | 288 x 800 | 144.65 | 29.54 | 9.49 |

## Segmentation performance

| method | resolution | FPS | FLOPs (G) | Params (M) |
| :---: | :---: | :---: | :---: | :---: |
| FCN | 256 x 512 | 43.32 | 216.42 | 51.95 |
| FCN | 512 x 1024 | 12.06 | 865.69 | 51.95 |
| FCN | 1024 x 2048 | 3.06 | 3462.77 | 51.95 |
| ERFNet | 256 x 512 | 91.20 | 15.03 | 2.07 |
| ERFNet | 512 x 1024 | 85.51 | 60.11 | 2.07 |
| ERFNet | 1024 x 2048 | 21.53 | 240.44 | 2.07 |
| ENet | 256 x 512 | 59.31 | 2.72 | 0.35 |
| ENet | 512 x 1024 | 55.69 | 10.88 | 0.35 |
| ENet | 1024 x 2048 | 30.88 | 43.53 | 0.35 |
| DeeplabV2 | 256 x 512 | 44.87 | 180.59 | 43.90 |
| DeeplabV2 | 512 x 1024 | 12.93 | 722.37 | 43.90 |
| DeeplabV2 | 1024 x 2048 | 3.23 | 2889.49 | 43.90 |
| DeeplabV3 | 256 x 512 | 35.26 | 241.65 | 58.63 |
| DeeplabV3 | 512 x 1024 | 10.26 | 966.61 | 58.63 |
| DeeplabV3 | 1024 x 2048 | 2.56 | 3866.45 | 58.63 |

*All results are the maximum over 3 runs on an RTX 2080 Ti.*

*Lane detection post-processing is not counted.*

*LaneATT NMS is not counted yet.*

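One way to sanity-check the FLOPs column is to compare it with the input size: for a fully convolutional model the op count is expected to grow roughly in proportion to the number of pixels, while the parameter count stays fixed. The sketch below checks this on the FCN rows of the segmentation table; the numbers are copied from that table, and the helper name `scaling` is only illustrative.

```python
# (height, width, GFLOPs, params in M), copied from the FCN rows of the table.
fcn_rows = [
    (256, 512, 216.42, 51.95),
    (512, 1024, 865.69, 51.95),
    (1024, 2048, 3462.77, 51.95),
]


def scaling(rows):
    """Return (pixel ratio, FLOPs ratio) of each row relative to the first row."""
    base_h, base_w, base_flops, _ = rows[0]
    return [
        ((h * w) / (base_h * base_w), flops / base_flops)
        for h, w, flops, _ in rows
    ]


for (h, w, _, params), (pixel_ratio, flops_ratio) in zip(fcn_rows, scaling(fcn_rows)):
    print(f"{h} x {w}: pixels x{pixel_ratio:.0f}, "
          f"FLOPs x{flops_ratio:.2f}, params {params} M")
```

The two ratios agree closely for every row, which is the behavior one would expect; the same check can be repeated with the other rows of the table.
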
## Profiling Models Yourself

With `--mode=simple`, a random tensor is used in place of a real image.
This avoids the DataLoader entirely, so the measurement reflects the best performance the model can reach.

**This is also the setting for the above benchmark.**

```
python tools/profiling.py --mode=simple \
                          --config=<config file path> \
                          --times=3 \
                          --height=<image height in pixels> \
                          --width=<image width in pixels>
```

`tools/profiling.py` uses the same config mechanism and the same command-line overrides via `--cfg-options` as training/testing.

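The idea behind a `--mode=simple` measurement with `--times=3` (time repeated forward passes on a fixed input, then keep the best of several runs) can be sketched in plain Python. This is an illustrative assumption, not the code in `tools/profiling.py`: `measure_fps`, `fake_forward`, and the `iterations` and `warmup` parameters are hypothetical names. A real GPU measurement would also need to synchronize the device (for example with `torch.cuda.synchronize()`) before reading the clock.

```python
import time


def measure_fps(run_once, iterations=100, times=3, warmup=10):
    """Return the best frames-per-second observed over `times` timed runs."""
    for _ in range(warmup):  # warm-up calls are excluded from timing
        run_once()
    best = 0.0
    for _ in range(times):
        start = time.perf_counter()
        for _ in range(iterations):
            run_once()
        elapsed = time.perf_counter() - start
        # Guard against a zero reading on very coarse clocks.
        best = max(best, iterations / max(elapsed, 1e-12))
    return best


def fake_forward():
    """Hypothetical stand-in for one model forward pass on a random tensor."""
    sum(i * i for i in range(1000))


fps = measure_fps(fake_forward, iterations=50, times=3)
print(f"{fps:.2f} FPS")
```
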
With `--mode=real`, the DataLoader is set to `batch_size=1` and `num_workers=0` to simulate a real camera transmitting frames to the model. Just use `--mode=real`, and you will probably want to provide an actual model with `--checkpoint`.

For detailed instructions and the available command-line shortcuts, run:

```
python tools/profiling.py --help
```