Research Article
An Evaluation of Deep Learning Methods for Small Object Detection
Table 3
Comparative results on the small object dataset.
| Method | Backbone | Clock | Faucet | Jar | Mouse | Outlet | Plate | Switch | Telephone | Tissue box | Toilet paper | mAP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| YOLO 416 [16] | Darknet-19 | 22.8 | 30.8 | 4 | 52 | 20.4 | 13.1 | 13 | 6.1 | 0 | 35.3 | 19.39 |
| YOLO 448 [16] | Darknet-19 | 23 | 36.9 | 9 | 52.5 | 18.4 | 13.6 | 17.5 | 4.2 | 0 | 34.3 | 20.13 |
| YOLO 480 [16] | Darknet-19 | 34.2 | 37.3 | 9.1 | 53.3 | 21.4 | 13.6 | 15.8 | 9.1 | 9.1 | 34.2 | 23.71 |
| YOLO 512 [16] | Darknet-19 | 23.1 | 36.6 | 6.1 | 59.8 | 24.6 | 14.2 | 15.7 | 9.1 | 4.5 | 32.4 | 22.61 |
| YOLO 554 [16] | Darknet-19 | 23.4 | 37.2 | 9.1 | 60.1 | 27.2 | 13.4 | 19.9 | 9.1 | 4.5 | 34.5 | 23.84 |
| YOLO 640 [16] | Darknet-19 | 20.2 | 36.2 | 3.2 | 59.8 | 27.8 | 11.7 | 18.1 | 8.2 | 4.5 | 35.6 | 22.53 |
| YOLO 800 [16] | Darknet-19 | 27.6 | 36 | 2.3 | 60.2 | 32.8 | 13.1 | 23.3 | 9.1 | 9.1 | 26.7 | 24.02 |
| YOLO 1024 [16] | Darknet-19 | 21.7 | 29.3 | 1.4 | 58.3 | 26.4 | 11.8 | 17.5 | 9.1 | 9.1 | 15.7 | 20.03 |
| YOLO 320 | Darknet-53 | 26.22 | 38.38 | 4.55 | 56.46 | 36.42 | 13.34 | 24.8 | 10.65 | 4.55 | 42.96 | 25.83 |
| YOLO 416 | Darknet-53 | 28.47 | 47.15 | 10.83 | 60.49 | 43.15 | 15.87 | 30.73 | 15.15 | 2.62 | 48.3 | 30.28 |
| YOLO 608 | Darknet-53 | 29.98 | 47.89 | 10.76 | 65.88 | **48.02** | 18.09 | **31.22** | 14.62 | **17.99** | 46.56 | **33.1** |
| YOLO 320 | ResNet-50 | 19.57 | 25.73 | 0.67 | 45.17 | 14.37 | 9.38 | 13.84 | 9.09 | 9.09 | 23.7 | 17.06 |
| YOLO 416 | ResNet-50 | 23.78 | 36.65 | 0.4 | 54.23 | 18.37 | 13.75 | 19.78 | 9.84 | 9.42 | 35.68 | 22.19 |
| YOLO 608 | ResNet-50 | 26.92 | 40.65 | 1.77 | 61.86 | 29.18 | 15.04 | 20.24 | 10.09 | 13.29 | 36.01 | 25.5 |
| YOLO 320 | ResNet-101 | 20.52 | 27.9 | 0.57 | 44.68 | 16.98 | 13.05 | 13.66 | 9.66 | 9.09 | 24.36 | 18.05 |
| YOLO 416 | ResNet-101 | 25.72 | 35.6 | 3.03 | 55.73 | 22.4 | 15.61 | 17.26 | 9.32 | 3.03 | 38.71 | 22.64 |
| YOLO 608 | ResNet-101 | 28.79 | 44.59 | 9.42 | 62.18 | 33.34 | 15.53 | 23.88 | 13.24 | 15.83 | 39.17 | 28.6 |
| YOLO 320 | ResNet-152 | 21.64 | 27.56 | 3.03 | 48.06 | 17.39 | 11.12 | 14.51 | 9.09 | 4.55 | 31.88 | 18.88 |
| YOLO 416 | ResNet-152 | 25.7 | 36.54 | 0.89 | 53.81 | 20.6 | 14.13 | 20.21 | 11.49 | 0.29 | 33.06 | 21.67 |
| YOLO 608 | ResNet-152 | 26.01 | 44.54 | 4.55 | 61 | 31.76 | 13.02 | 22.67 | 12.35 | 9.93 | 39.99 | 26.58 |
| SSD300 [16] | ResNet-101 | 5.5 | 9.1 | 0 | 25.5 | 6.1 | 4.5 | 0 | 4.5 | 9.1 | 18.2 | 8.25 |
| SSD300 [16] | VGG16 | 9.1 | 17.1 | 0 | 26.1 | 9.1 | 9.1 | 0 | 4.5 | 0 | 16.7 | 9.16 |
| SSD512 [16] | VGG16 | 9.1 | 17.1 | 0 | 43 | 9.1 | 9.1 | 9.1 | 9.1 | 0 | 7.6 | 11.32 |
| RetinaNet | ResNet-50-FPN | 30.7 | 49.3 | 2 | 65.5 | 21.3 | 16.1 | 8.5 | 12.9 | 1 | 25.7 | 23.3 |
| RetinaNet | ResNet-101-FPN | 30.6 | 48.7 | 7.1 | 64.7 | 20 | 15.9 | 11.8 | 10.7 | 2.9 | 38.7 | 25.1 |
| RetinaNet | ResNeXT-101-32×8d-FPN | **35.5** | **55** | **12.1** | **66.5** | 23.9 | **18.4** | 9.8 | **16.2** | 9.4 | **53.7** | 30 |
| RetinaNet | ResNeXT-101-64×4d-FPN | 31.4 | 50.2 | 8.9 | 66.3 | 20.8 | 15.3 | 9.4 | 14 | 2.2 | 32.4 | 25.1 |
| R-CNN [13] | RPN prop. + VGG16 | 31.9 | 31.3 | 4.2 | 56.8 | 31.1 | 9.3 | 14.2 | 16.4 | *23.4* | 29.4 | 24.8 |
| R-CNN [13] | Alexnet, 7, 300 pro | 32.4 | 27.2 | 5.1 | 56.9 | 28 | 9.8 | 13.6 | 12.4 | 17.9 | 35.6 | 23.9 |
| R-CNN [13] | VGG16, 7, 300 pro | 37.3 | 30.3 | 7.2 | 60.6 | 41.5 | 15.8 | 21.5 | 13.7 | 22 | 33.3 | 28.4 |
| R-CNN [13] | ContextNet (Alexnet, 7) | 32.7 | 26.8 | 4.6 | 56.4 | 26.3 | 9.9 | 12.9 | 12.2 | 18.7 | 34 | 23.5 |
| Fast RCNN | ResNet-50-C4 | 32.4 | 46.3 | 6.5 | 65.8 | 38.3 | 20.1 | 25.3 | 16.6 | 14.1 | 52 | 31.7 |
| Fast RCNN | ResNet-50-FPN | 37.4 | 47.3 | 7.3 | 68.9 | 46.7 | 21 | 32.1 | 17.1 | 9.3 | 45.9 | 33.3 |
| Fast RCNN | ResNet-101-FPN | 39.3 | 50.3 | 10.6 | 68.3 | 47.1 | 20.4 | 33.3 | 18.6 | 15.4 | 51.4 | 35.5 |
| Fast RCNN | ResNeXT-101-32×8d-FPN | 47.5 | 54.8 | 10.3 | 71.8 | 54 | 21.4 | 34.4 | 21.7 | 17.7 | 53.5 | 38.7 |
| Fast RCNN | ResNeXT-101-64×4d-FPN | 45.4 | 55.7 | 10.9 | *72.5* | 53.3 | *24* | 36.9 | *22.9* | 16 | 58.1 | 39.6 |
| Faster R-CNN [16] | VGG16 | 23.76 | 37.65 | 8.03 | 54 | 16.16 | 11.88 | 15.12 | 9.1 | 6.25 | 37.29 | 21.92 |
| Faster RCNN | ResNet-50-C4 | 32.2 | 44.6 | 6.6 | 65.9 | 35.2 | 17.5 | 25.7 | 19.6 | 13.7 | 40 | 30.1 |
| Faster RCNN | ResNet-50-FPN | 35.7 | 49.9 | 7.3 | 68.4 | 48.9 | 18.8 | 29.6 | 14.7 | 11.4 | 53.3 | 33.8 |
| Faster RCNN | ResNet-101-FPN | 39.8 | 49.2 | 4.9 | 68.2 | 47 | 18.5 | 29.7 | 14 | 12.9 | 52.2 | 33.7 |
| Faster RCNN | ResNeXT-101-32×8d-FPN | *49.8* | 56.6 | 11.4 | 72.1 | *56.3* | 23.2 | *37* | 20.8 | 18.8 | 58.7 | 40.5 |
| Faster RCNN | ResNeXT-101-64×4d-FPN | 49.6 | *58.6* | *12.2* | *72.5* | 54.5 | 23.2 | 36.9 | 20.8 | 20.1 | *63.1* | *41.2* |

The values in bold represent the best in one-stage methods, and the ones in italics represent the highest in two-stage methods.
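
As a reading aid, the mAP column can be related to the per-class columns with a short check. The sketch below is a minimal Python example, assuming that mAP here is the unweighted mean of the ten per-class AP values; the helper name `mean_ap` and the dictionary layout are illustrative and do not come from the paper. The numbers are copied from two rows of Table 3.

```python
from statistics import mean

# Per-class AP values copied from two rows of Table 3, in column order:
# clock, faucet, jar, mouse, outlet, plate, switch,
# telephone (Tel.), tissue box (t. box), toilet paper (t. paper).
# Each entry pairs the per-class values with the mAP reported in the table.
rows = {
    "YOLO 608 / Darknet-53": (
        [29.98, 47.89, 10.76, 65.88, 48.02, 18.09, 31.22, 14.62, 17.99, 46.56],
        33.1,
    ),
    "Faster RCNN / ResNet-50-FPN": (
        [35.7, 49.9, 7.3, 68.4, 48.9, 18.8, 29.6, 14.7, 11.4, 53.3],
        33.8,
    ),
}


def mean_ap(per_class_ap):
    """Unweighted mean of per-class AP (assumed definition of mAP)."""
    return mean(per_class_ap)


for name, (aps, reported) in rows.items():
    # Print the recomputed mean next to the value reported in the table.
    print(f"{name}: computed {mean_ap(aps):.2f}, reported {reported}")
```

For these two rows the recomputed mean agrees with the reported mAP to within rounding. Not every row need agree exactly, since the table values are themselves rounded.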