Table 3: Results of Fast RCNN using the convolution-based regionwise classifier (CRC). “Inc” denotes an inception block and “Res” denotes a residual block. Training time is the time taken for 20 iterations.

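| Network | CRC | Training time | Model size | mAP (%) |
|---|---|---|---|---|
| GoogleNet | Inc_5a, 5b, GAP | 15 s | 23.2 MB | 65.8 |
| | Inc_5b, GAP | 7 s | | 62.9 |
| | GAP | 5 s | | 41.2 |
| Inception_v2 | Inc_4e, 5a, 5b, GAP | 17 s | 39.3 MB | 67.0 |
| | Inc_5a, 5b, GAP | 14 s | | 66.6 |
| | Inc_5b, GAP | 10 s | | 65.6 |
| | GAP | 7 s | | 42.9 |
| Inception_v3 | Inc_b, c1, c2, GAP | 27 s | 84.1 MB | 69.8 |
| | Inc_c1, c2, GAP | 23 s | | 69.4 |
| | Inc_c2, GAP | 18 s | | 68.9 |
| | GAP | 12 s | | 49.7 |
| ResNet_50 | Res_5a, 5b, 5c, GAP | 17 s | 90.7 MB | 71.5 |
| | Res_5b, 5c, GAP | 13 s | | 70.9 |
| | Res_5c, GAP | 10 s | | 69.5 |
| | GAP | 8 s | | 50.4 |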