Research Article
Focal CTC Loss for Chinese Optical Character Recognition on Unbalanced Datasets
Figure 6
Distribution of labels in the Chinese-ocr dataset. The first bar represents the 300 most frequently used words in the dataset. Most Chinese words account for only a small fraction of all word occurrences.