Wireless Communications and Mobile Computing

Research Article

Multiactivation Pooling Method in Convolutional Neural Networks for Image Recognition

Table 1

Baseline architectures. Models A and E are two baseline architectures [13] with slightly different numbers of filters in some layers. The others are architectures with large-scale pooling layers or MAP layers. Conv3 indicates a filter kernel size of 3×3. The ReLU gate is not shown for simplicity. The structures for small-size datasets (CIFAR, SVHN) and large-scale datasets (ImageNet) differ in the input layers and fully connected layers. Parameters to the right of the slash belong to large-scale datasets.


Network architectures based on VGG
VGG-11				VGG-13
A	B	C	D	E	F	G	H

Small-size datasets: Input (32×32 RGB image)/Large-scale datasets: Input (224×224 RGB image)

conv3-64	conv3-64	conv3-64	conv3-64	conv3-64	conv3-64	conv3-64	conv3-64
2×2 maxpool	conv3-128	conv3-128	conv3-128	conv3-64	conv3-64	conv3-64	conv3-64
	conv3-128	conv3-128	conv3-128	2×2 maxpool	conv3-128	conv3-128	conv3-128
	4×4 maxpool/MAP	conv3-128	conv3-128		conv3-128	conv3-128	conv3-128
conv3-128		conv3-256	conv3-256	conv3-128	conv3-256	conv3-256	conv3-256
2×2 maxpool		8×8 maxpool/MAP	conv3-256	conv3-128	4×4 maxpool/MAP	conv3-256	conv3-256
			16×16 maxpool/MAP	2×2 maxpool		conv3-256	conv3-256
						8×8 maxpool/MAP	conv3-512
conv3-256				conv3-256			16×16 maxpool/MAP
conv3-256				conv3-256
2×2 maxpool				2×2 maxpool
	conv3-128				conv3-256
	conv3-256				conv3-256
	conv3-256				conv3-512
conv3-512	4×4 maxpool/MAP			conv3-512	4×4 maxpool/MAP
conv3-512				conv3-512
2×2 maxpool		conv3-256		2×2 maxpool		conv3-512
		conv3-512				conv3-512
		conv3-512				conv3-512
conv3-512	conv3-512	4×4 avgpool	conv3-512	conv3-512	conv3-512	4×4 avgpool	conv3-512
conv3-512	conv3-512		conv3-512	conv3-512	conv3-512		conv3-512
2×2 avgpool	2×2 avgpool		2×2 avgpool	2×2 avgpool	2×2 avgpool		2×2 avgpool

fc1-512×512/25088×4096

fc2-512×512/4096×4096

fc3-512×10/4096×1000

Softmax-10/1000
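
The fc1 input widths in Table 1 (512 for small-size datasets, 25088 for large-scale datasets) can be checked from the layer lists with a short sketch. It assumes details the table does not state: every conv3 layer preserves spatial size (stride 1, padding 1), every pooling or MAP layer uses a stride equal to its window size, and a MAP layer downsamples exactly like a max-pooling layer of the same size. The layer strings and the helper name `flattened_features` are illustrative and are not taken from the article.

```python
# Layer lists transcribed from Table 1; "poolK" stands for any K×K
# maxpool, MAP, or avgpool layer (assumed stride K, non-overlapping).
MODELS = {
    "A": ["conv3-64", "pool2", "conv3-128", "pool2",
          "conv3-256", "conv3-256", "pool2",
          "conv3-512", "conv3-512", "pool2",
          "conv3-512", "conv3-512", "pool2"],
    "B": ["conv3-64", "conv3-128", "conv3-128", "pool4",
          "conv3-128", "conv3-256", "conv3-256", "pool4",
          "conv3-512", "conv3-512", "pool2"],
    "C": ["conv3-64", "conv3-128", "conv3-128", "conv3-128",
          "conv3-256", "pool8",
          "conv3-256", "conv3-512", "conv3-512", "pool4"],
    "D": ["conv3-64", "conv3-128", "conv3-128", "conv3-128",
          "conv3-256", "conv3-256", "pool16",
          "conv3-512", "conv3-512", "pool2"],
}


def flattened_features(layers, input_size):
    """Return channels * height * width entering the first fc layer."""
    size, channels = input_size, 3  # square RGB input
    for layer in layers:
        if layer.startswith("conv3-"):
            # Assumed: 3×3 conv with padding 1 keeps the spatial size.
            channels = int(layer.split("-")[1])
        elif layer.startswith("pool"):
            # Assumed: window size K with stride K divides the size by K.
            size //= int(layer[len("pool"):])
        else:
            raise ValueError(f"unknown layer: {layer}")
    return channels * size * size


for name, layers in MODELS.items():
    print(name, flattened_features(layers, 32), flattened_features(layers, 224))
```

Under these assumptions all four VGG-11 variants reduce a 32×32 input to a 1×1×512 map and a 224×224 input to a 7×7×512 map, which matches the fc1 sizes 512×512 and 25088×4096 in the table.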