Research Article
Earf-YOLO: An Efficient Attention Receptive Field Model for Recognizing Symbols of Zhuang Minority Patterns
Figure 4
(a) The main architecture of GLocalT with its global-local training mechanism. (b) The gated axial attention layer, a multi-head attention block in which gates control how much information the position embeddings contribute to the key, query, and value. (c) The gated axial transformer used in GLocalT, which contains gated axial attention layers encoded along the height, width, and channel.
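The gating described for panel (b) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the commonly used gated axial attention formulation, in which scalar gates scale the positional terms added to the attention logits and to the values. The function name, the parameter names (`gq`, `gk`, `gv1`, `gv2`), and the single-head, unscaled form are all assumptions made for illustration, and only one axis (a single row or column of a feature map) is shown.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def gated_axial_attention_1d(x, Wq, Wk, Wv, rq, rk, rv, gq, gk, gv1, gv2):
    """Single-head gated attention along one axis (hypothetical sketch).

    x          : (L, C) features of one row or column
    Wq, Wk, Wv : (C, D) projection matrices
    rq, rk, rv : (L, L, D) relative position embeddings
    gq, gk     : scalar gates on the positional logit terms
    gv1, gv2   : scalar gates on the content and positional value terms
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv               # each (L, D)
    content = q @ k.T                              # (L, L) query-key logits
    pos_q = np.einsum("id,ijd->ij", q, rq)         # query-position logits
    pos_k = np.einsum("jd,ijd->ij", k, rk)         # key-position logits
    # Gates decide how much the position embeddings influence the weights.
    attn = softmax(content + gq * pos_q + gk * pos_k, axis=-1)
    # Gates also balance the content values against the positional values.
    return gv1 * (attn @ v) + gv2 * np.einsum("ij,ijd->id", attn, rv)


def attend_along_height(feat, **params):
    """Apply the 1D layer to every column of an (H, W, C) feature map."""
    cols = [gated_axial_attention_1d(feat[:, w, :], **params)
            for w in range(feat.shape[1])]
    return np.stack(cols, axis=1)                  # (H, W, D)
```

Setting `gq`, `gk`, and `gv2` to zero and `gv1` to one reduces the layer to plain content attention along the axis, which is one way to see what the gates add. An analogous helper over rows would give the width pass; how the channel axis mentioned in the caption is handled is not specified there, so it is left out of this sketch.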