Research Article

Earf-YOLO: An Efficient Attention Receptive Field Model for Recognizing Symbols of Zhuang Minority Patterns

Figure 4

(a) Main architecture of GLocalT with the global-local training mechanism. (b) Gated axial attention layer: a multi-head attention block in which gates control how much information the position embeddings contribute to the key, query, and value. (c) Gated axial transformer used in GLocalT, which contains gated axial attention layers encoded along the height, width, and channel.
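The gating idea in panel (b) can be illustrated with a minimal sketch of single-head attention along one axis. The caption does not give the equations, so the form used here (scalar gates scaling the positional terms of the query, key, and value) is an assumption based on a common formulation of gated axial attention. The function name `gated_axial_attention_1d` and the gate names `g_q`, `g_k`, `g_v1`, `g_v2` are hypothetical, not taken from the article.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def gated_axial_attention_1d(q, k, v, r_q, r_k, r_v, g_q, g_k, g_v1, g_v2):
    """Single-head attention along one axis of length n (assumed formulation).

    q, k, v       : lists of n vectors, each of length d.
    r_q, r_k, r_v : r[i][j] is a length-d position embedding for
                    query position i and key position j.
    g_q, g_k      : scalar gates on the positional terms of the logits.
    g_v1, g_v2    : scalar gates on the value and its positional term.
    """
    n = len(q)
    d = len(v[0])
    out = []
    for i in range(n):
        # Content term plus gated positional terms for query and key.
        logits = [
            dot(q[i], k[j])
            + g_q * dot(q[i], r_q[i][j])
            + g_k * dot(k[j], r_k[i][j])
            for j in range(n)
        ]
        weights = softmax(logits)
        # Weighted sum of gated values and gated positional values.
        out.append([
            sum(weights[j] * (g_v1 * v[j][c] + g_v2 * r_v[i][j][c])
                for j in range(n))
            for c in range(d)
        ])
    return out
```

In a full axial layer, a function of this kind would be applied to every row along one axis and then to every column along the next (the caption also lists the channel axis). Setting a gate to zero removes the corresponding positional contribution, which is how the gates limit the influence of position embeddings.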