Research Article

Earf-YOLO: An Efficient Attention Receptive Field Model for Recognizing Symbols of Zhuang Minority Patterns

Figure 3

The network structure of Earf-YOLO. (a) The backbone is based on CSPDarkNet53, with SRFBs replacing redundant convolutions; (b) the neck uses a PANet-like structure, to which four SRFBs and four global-local transformers are added; (c) the transformer predict stage uses three global transformers to obtain the final predictions. The labels 1, 2, and 3 in the figure denote the model's three prediction outputs.