Research Article

An Evaluation of Deep Learning Methods for Small Object Detection

Table 4

The comparative results on subsets of PASCAL VOC 2007.

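| Approach | Method | VOC_MRA_0.058 | VOC_MRA_0.10 | VOC_MRA_0.20 | VOC_WH20 |
|---|---|---|---|---|---|
| One stage | YOLOv2 416 [16] | 3.02 | 31.38 | 42.89 | 18.52 |
| One stage | YOLOv2 448 [16] | 4.47 | 32.9 | 60.15 | 21.96 |
| One stage | YOLOv2 480 [16] | 4.26 | 33.48 | 60.78 | 26.67 |
| One stage | YOLOv2 512 [16] | 5.42 | 35.74 | 61.12 | 24.63 |
| One stage | YOLOv2 544 [16] | 6.97 | 36.56 | 63 | 26.62 |
| One stage | YOLOv2 640 [16] | 7.7 | 37.97 | 61.29 | 23.41 |
| One stage | YOLOv2 800 [16] | 10.24 | 37.3 | 61.91 | 26.9 |
| One stage | YOLOv2 1024 [16] | 10.69 | 29.93 | 55.14 | 28.97 |
| One stage | YOLOv3 320 | 7.18 | 34.58 | 60.36 | 20.4 |
| One stage | YOLOv3 416 | 10.2 | 38.97 | 62.53 | 24.12 |
| One stage | YOLOv3 608 | 11.7 | 42.65 | 68.56 | 28.86 |
| One stage | SSD 300 [16] | 1.71 | 32.76 | 46.26 | 16.91 |
| One stage | SSD 512 [16] | 2.9 | 43.46 | 57.11 | 19.87 |
| One stage | RetinaNet-ResNet-50-FPN | 8.84 | 41.5 | 50.2 | 28.14 |
| One stage | RetinaNet-ResNet-101-FPN | 8.95 | 42.5 | 51.9 | 27.46 |
| One stage | RetinaNet-ResNeXT-101-32 × 8d-FPN | 10.29 | 45.4 | 54.5 | 30.08 |
| One stage | RetinaNet-ResNeXT-101-64 × 4d-FPN | 10.71 | 45.5 | 55.1 | 31.32 |
| Two stage | Fast RCNN-ResNet-50-C4 | 0.23 | 13.2 | 49.9 | 3.93 |
| Two stage | Fast RCNN-ResNet-50-FPN | 0.63 | 13.5 | 55.6 | 3.45 |
| Two stage | Fast RCNN-ResNet-101-FPN | 0.39 | 15.9 | 57.6 | 3.12 |
| Two stage | Fast RCNN-ResNeXT-101-32 × 8d-FPN | 0.51 | 14.4 | 57.9 | 3.33 |
| Two stage | Fast RCNN-ResNeXT-101-64 × 4d-FPN | 0.29 | 14.2 | 57.3 | 3.76 |
| Two stage | Faster RCNN-ResNet-50-C4 | 6.98 | 39.9 | 48.7 | 26.04 |
| Two stage | Faster RCNN-ResNet-50-FPN | 10.74 | 45.6 | 56.3 | 29.79 |
| Two stage | Faster RCNN-ResNet-101-FPN | 10.63 | 46.9 | 57.6 | 30.57 |
| Two stage | Faster RCNN-ResNeXT-101-32 × 8d-FPN | 11.64 | 47.3 | 57.6 | 32.12 |
| Two stage | Faster RCNN-ResNeXT-101-64 × 4d-FPN | 10.54 | 47.1 | 56.9 | 31.64 |
| Two stage | Faster RCNN-VGG16 [16] | 5.73 | 35.58 | 44.14 | 41.11 |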

This table illustrates how well models adapt to different scales of objects. The values in bold represent the best in one-stage methods, and the ones in italics represent the highest in two-stage methods.
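
The per-approach best values that the caption refers to can be recovered directly from the reported numbers. The following is a minimal sketch under stated assumptions: the tuple layout, the `"one"`/`"two"` approach labels, and the helper name `best_per_subset` are hypothetical and chosen only for illustration; the scores are transcribed from Table 4.

```python
# Column names as they appear in Table 4.
COLUMNS = ["VOC_MRA_0.058", "VOC_MRA_0.10", "VOC_MRA_0.20", "VOC_WH20"]

# Each row: (approach, method, score per column in COLUMNS order).
# The layout and the "one"/"two" labels are illustrative assumptions.
ROWS = [
    ("one", "YOLOv2 416", 3.02, 31.38, 42.89, 18.52),
    ("one", "YOLOv2 448", 4.47, 32.9, 60.15, 21.96),
    ("one", "YOLOv2 480", 4.26, 33.48, 60.78, 26.67),
    ("one", "YOLOv2 512", 5.42, 35.74, 61.12, 24.63),
    ("one", "YOLOv2 544", 6.97, 36.56, 63.0, 26.62),
    ("one", "YOLOv2 640", 7.7, 37.97, 61.29, 23.41),
    ("one", "YOLOv2 800", 10.24, 37.3, 61.91, 26.9),
    ("one", "YOLOv2 1024", 10.69, 29.93, 55.14, 28.97),
    ("one", "YOLOv3 320", 7.18, 34.58, 60.36, 20.4),
    ("one", "YOLOv3 416", 10.2, 38.97, 62.53, 24.12),
    ("one", "YOLOv3 608", 11.7, 42.65, 68.56, 28.86),
    ("one", "SSD 300", 1.71, 32.76, 46.26, 16.91),
    ("one", "SSD 512", 2.9, 43.46, 57.11, 19.87),
    ("one", "RetinaNet-ResNet-50-FPN", 8.84, 41.5, 50.2, 28.14),
    ("one", "RetinaNet-ResNet-101-FPN", 8.95, 42.5, 51.9, 27.46),
    ("one", "RetinaNet-ResNeXT-101-32 × 8d-FPN", 10.29, 45.4, 54.5, 30.08),
    ("one", "RetinaNet-ResNeXT-101-64 × 4d-FPN", 10.71, 45.5, 55.1, 31.32),
    ("two", "Fast RCNN-ResNet-50-C4", 0.23, 13.2, 49.9, 3.93),
    ("two", "Fast RCNN-ResNet-50-FPN", 0.63, 13.5, 55.6, 3.45),
    ("two", "Fast RCNN-ResNet-101-FPN", 0.39, 15.9, 57.6, 3.12),
    ("two", "Fast RCNN-ResNeXT-101-32 × 8d-FPN", 0.51, 14.4, 57.9, 3.33),
    ("two", "Fast RCNN-ResNeXT-101-64 × 4d-FPN", 0.29, 14.2, 57.3, 3.76),
    ("two", "Faster RCNN-ResNet-50-C4", 6.98, 39.9, 48.7, 26.04),
    ("two", "Faster RCNN-ResNet-50-FPN", 10.74, 45.6, 56.3, 29.79),
    ("two", "Faster RCNN-ResNet-101-FPN", 10.63, 46.9, 57.6, 30.57),
    ("two", "Faster RCNN-ResNeXT-101-32 × 8d-FPN", 11.64, 47.3, 57.6, 32.12),
    ("two", "Faster RCNN-ResNeXT-101-64 × 4d-FPN", 10.54, 47.1, 56.9, 31.64),
    ("two", "Faster RCNN-VGG16", 5.73, 35.58, 44.14, 41.11),
]


def best_per_subset(rows, approach):
    """Return {column: (method, score)} holding the top score for one approach."""
    selected = [r for r in rows if r[0] == approach]
    best = {}
    for i, column in enumerate(COLUMNS):
        # Scores start at tuple index 2, after approach and method.
        top = max(selected, key=lambda r: r[2 + i])
        best[column] = (top[1], top[2 + i])
    return best


if __name__ == "__main__":
    for approach in ("one", "two"):
        print(approach, best_per_subset(ROWS, approach))
```

Calling `best_per_subset(ROWS, "one")` and `best_per_subset(ROWS, "two")` gives the top-scoring method for each subset within each family, which makes it easy to check which entries the caption's highlighting should mark.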