SENetCount: An Optimized Encoder-Decoder Architecture with Squeeze-and-Excitation for Crowd Counting
Table 5. Ablation experiments.
| Networks | Part_A MAE | Part_A RMSE | Part_B MAE | Part_B RMSE | UCF_QNRF MAE | UCF_QNRF RMSE | Mall MAE | Mall RMSE |
|---|---|---|---|---|---|---|---|---|
| SE-ResNetCount50 | 76.0 | 120.5 | 7.7 | 12.2 | 111.1 | 194.1 | 1.22 | 1.58 |
| SE-ResNeXtCount50 | 71.9 | 118.7 | 8.0 | 13.3 | 104.9 | 182.6 | 1.20 | 1.52 |
| SE-ResNeXtCount50+SSIM | 71.7 | 118.0 | 8.2 | 13.8 | 103.6 | 181.5 | 1.19 | 1.53 |
| SE-ResNeXtCount50+MS-SSIM | 71.8 | 117.0 | 8.0 | 13.2 | 104.8 | 182.9 | 1.19 | 1.52 |
| SE-ResNeXtCount101 | 71.8 | 115.4 | 7.4 | 12.6 | 107.9 | 203.3 | 1.20 | 1.57 |
| SE-ResNeXtCount101+SSIM | 71.0 | 115.4 | 7.5 | 12.5 | 108.0 | 206.4 | 1.19 | 1.55 |
| SE-ResNeXtCount101+MS-SSIM | 71.0 | 115.0 | 7.3 | 12.1 | 107.7 | 201.1 | 1.15 | 1.49 |
SE-ResNetCount50 and SE-ResNeXtCount50/101 use SE-ResNet and SE-ResNeXt, respectively, as the backbone network, and all of them adopt the DASPP and FFM modules shown in Figure 1. +SSIM and +MS-SSIM indicate that the objective loss function combines the Euclidean loss with the SSIM or MS-SSIM index, respectively. For both MAE and RMSE, lower values are better.
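As a reading aid for the columns of Table 5 and the +SSIM variants, the following minimal Python sketch shows how MAE and RMSE over per-image counts are conventionally computed, and one way a Euclidean loss could be combined with an SSIM term. Several parts are assumptions rather than details taken from this section: the function names, the `weight` value, the `1 - SSIM` form of the structural term, and the single-window SSIM computed over the whole map (practical SSIM implementations use local sliding windows, and MS-SSIM additionally averages over several scales).

```python
import math


def mae(pred_counts, true_counts):
    """Mean absolute error between predicted and ground-truth crowd counts."""
    n = len(true_counts)
    return sum(abs(p - t) for p, t in zip(pred_counts, true_counts)) / n


def rmse(pred_counts, true_counts):
    """Root mean squared error between predicted and ground-truth crowd counts."""
    n = len(true_counts)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred_counts, true_counts)) / n)


def global_ssim(x, y, dynamic_range=1.0):
    """SSIM evaluated on one window covering the whole (flattened) map.

    Simplification for illustration only: no local Gaussian windows.
    """
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Standard SSIM stabilising constants.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def combined_loss(pred_map, true_map, weight=0.001):
    """Euclidean loss plus a weighted (1 - SSIM) term.

    `weight` is a hypothetical balancing factor, not a value from the paper.
    """
    euclidean = 0.5 * sum((p - t) ** 2 for p, t in zip(pred_map, true_map))
    structural = 1.0 - global_ssim(pred_map, true_map)
    return euclidean + weight * structural
```

In this sketch, the metrics take one count per test image (the sum of each density map), while `combined_loss` takes a predicted and a ground-truth density map flattened into lists of equal length.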