Research Article

An Enhanced Visual Attention Siamese Network That Updates Template Features Online

Figure 4

Structure of the stacked hourglass network. The size of the input feature map is […]. Downsampling and upsampling are each marked by their own symbol in the figure; upsampling uses nearest-neighbor interpolation. In each blue box, the upper < > gives the number of input channels and the lower < > gives the number of output channels. Elementwise addition is likewise marked by its own symbol. The size of the output feature map is […]. The stacked hourglass network does not depend on the input size: only the numbers of input and output channels need to be specified. It can also extract progressively deeper features.
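The data flow described in the caption (downsample, process at lower resolution, upsample with nearest-neighbor interpolation, merge by elementwise addition) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes a single-channel feature map stored as a list of lists, 2x2 max pooling as the downsampling operator (the caption does not name one), an identity skip branch in place of learned convolution blocks, and input sides divisible by 2^depth. The names `downsample`, `upsample_nearest`, `add`, and `hourglass` are hypothetical.

```python
def downsample(x):
    """Halve height and width with 2x2 max pooling (assumed operator)."""
    h, w = len(x), len(x[0])
    return [[max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]


def upsample_nearest(x):
    """Double height and width by repeating each value (nearest neighbor)."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def add(a, b):
    """Elementwise addition of two equally sized feature maps."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def hourglass(x, depth):
    """Recursive hourglass: skip branch plus upsampled low-resolution branch.

    A real network would apply learned convolution blocks on both branches;
    here both are the identity so that only the resampling is visible.
    """
    if depth == 0:
        return x
    skip = x                                   # branch kept at this resolution
    low = hourglass(downsample(x), depth - 1)  # branch at half resolution
    return add(skip, upsample_nearest(low))


if __name__ == "__main__":
    print(hourglass([[1, 2], [3, 4]], depth=1))  # [[5, 6], [7, 8]]
```

No spatial size is hard-coded in the sketch, so the same function accepts a 2x2 map or a 4x12 map. In a convolutional version, only the channel counts would be fixed in advance, which matches the caption's point about input-size independence.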