Special Issue: Advanced Pattern Recognition Systems for Multimedia Data
A Multimedia Learning for Chinese Character Image Recognition via Human-Computer Interaction Network
As science and technology continue to develop, Chinese character image recognition technology is being used in a wide range of fields. This computer-based technology is a practical way of automatically recognizing images of text. When used in Chinese character education, it provides a new form of human–computer interaction for students. In addition, multimedia technology can provide a rich learning environment by presenting information about Chinese characters in the form of pictures, sounds, and videos, thus compensating for the disadvantages of learning Chinese characters by rote in the traditional educational process. The combination of Chinese character image recognition technology and multimedia technology can not only enrich the process of learning Chinese characters but also strengthen students’ motivation to learn, thus providing a new and more modern approach to Chinese character education. Building on a study of Chinese character image recognition technology, this research combines it with multimedia information to achieve both the recognition of Chinese character images and the multimedia presentation of information about the recognized characters. The combined technology can provide a useful reference for course design and for learners of Chinese.
Character recognition technology uses computers to quickly identify words on paper and to link symbol identities to images of characters. It is an important branch of pattern recognition and an important part of the new generation of intelligent computer interfaces [2–4]. Nowadays, it is applied in a wide variety of fields. For instance, in information processing [5–8], character recognition allows existing documents to be converted into a corresponding electronic format and transmitted via the Internet, thus increasing the efficiency of computer use. In traffic management [9–12], it enables the automatic extraction of vehicle registration numbers, supporting intelligent and automated traffic management. In identification [13, 14], it supports the automatic extraction of fingerprint information from fingerprint images, which leads to the automatic identification of individuals.
As one of the most widely used scripts in the world, Chinese characters are an expression of the Chinese ideographic script. They also reflect Chinese culture and are integrated with the Chinese way of thinking and cultural spirit. Therefore, in the process of learning Chinese characters, learners are not only introduced to the sounds, shapes, and meanings of the characters but are also naturally exposed to the culture of the Chinese nation, which contributes to the transmission of national culture. In addition, Chinese character education is a response to the demands of globalization to highlight national identity and provide a strong impetus for the rise of the Chinese nation. Although the assumption is not constant, it still can be used in this field. However, if Chinese characters are treated only as forms and meanings that are difficult to understand, rote memorization becomes the default way for young children to learn them, which is hardly conducive to the overall development of the child. In other words, in the field of Chinese language teaching, most teaching focuses only on paper-and-pencil strategies for learning Chinese characters. Today, with the rapid development of modern educational technology, multimedia technology provides new support for children’s learning of Chinese characters. It compensates for the shortcomings of traditional education and reduces the difficulty of learning Chinese characters, making them more accessible to young children. Nevertheless, traditional multimedia technology merely presents Chinese characters in the form of images, sound, video, and animation, which means that it can only present educational content and has no ability to process real-world information. Meanwhile, as digital products such as cameras and mobile phones become more popular in people’s daily lives, they are increasingly used for digital image capture.
Therefore, it is essential to combine character recognition technology with multimedia technology to form a multimedia learning-oriented Chinese character image recognition technology.
Chinese character image recognition is a type of text recognition. The development of text image recognition has been quite rapid. From the 1950s onwards, European and American countries began to research western character recognition technology in order to facilitate the processing of large amounts of text material by computers. According to the literature, the study of the recognition of printed Chinese characters started in the 1960s. In 1966, Casey and Nagy used the template matching method to recognize 1000 printed Chinese characters. Since then, Chinese character recognition technology has developed considerably, and a number of systems have been built for the recognition of printed Chinese characters in single and multiple fonts. Most of these systems require a large amount of specialized hardware and are extremely expensive, so they have not been widely used.
With the rapid development of computer technology, there are now a large number of studies on Chinese character image recognition technology, and the performance of this technology keeps improving. Zhong et al. combined GoogLeNet with directional feature maps to achieve higher performance in handwritten Chinese character recognition. Dong et al. used a support vector machine to create an improved Chinese character recognition system, which performs well. Li et al. proposed weighted average pooling to reduce the number of parameters in the fully connected layer without loss of accuracy, and their results suggested that the accuracy of Chinese character recognition is extremely high. Xiao et al. developed fast and compact convolutional neural networks for Chinese character recognition, which can greatly improve the speed of recognition. Yu et al. used a linear combination of trigonometric functions to construct a discrete dynamic system, and the improved system can cope with the issue of system convergence for Chinese character images. Coates et al. applied unsupervised feature learning to character recognition. Wang built an improved Chinese character recognition model based on a hidden Markov model and the scale-invariant feature transform algorithm, and the simulation results suggested that this model can identify images of rare Chinese characters with much higher recognition accuracy. In general, the recognition rate of printed Chinese characters has reached over 98%. However, research on Chinese character image recognition systems for different writing styles and different image qualities is still at the development stage, and a lot of work remains to be done on recognizing images of Chinese characters with different printing qualities.
This research therefore focuses on applying stroke density feature extraction to recognize images of Chinese characters. A conflict processing mechanism is then used to resolve the conflicts between Chinese characters that arise during image recognition. Finally, a learning mechanism is adopted to extract the features of unrecognized Chinese characters and to enter the associated multimedia information.
2. Stroke Density Feature Extraction
The stroke density feature is widely used for character recognition. It describes how sparse or dense a character’s strokes are and reflects the basic topology of the character. It can be extracted by drawing straight lines from the center of mass of the character through the entire character space horizontally, vertically, and at a certain angle to the horizontal. Alternatively, lines can be drawn from the lower left and upper left corners of the character’s bounding rectangle in the horizontal, diagonal, and vertical directions, and the intersections of each line with the character strokes are taken as the feature. The algorithm is simple and reflects the overall structure of Chinese characters well. Hence, this research uses the stroke density feature to classify the image. Because the extracted feature vector differs from one extraction method to another, this study applies stroke density features in four directions: straight lines are drawn through the image along the diagonals, horizontally, vertically, and at an incline to the horizontal. The number of intersection points between each penetrating line and the character strokes is recorded as the stroke density feature vector. The methods are described below.
2.1. Diagonal Cross Stroke Density Feature Extraction
Diagonal cross stroke density feature extraction obtains the stroke density by drawing straight lines along the diagonal directions of the Chinese character image. The total number of intersections between these penetrating lines and the character strokes is recorded, and this count is the feature extracted by the method. It gives an effective preliminary estimate of the number of strokes in a Chinese character. This study also investigated the upright cross feature extraction method beforehand and found that it has many disadvantages for differentiating Chinese characters. For example, for characters with a left–right structure, the cross tends to pass through the gap in the middle, giving a stroke density of 0 in the vertical direction. The diagonal cross is therefore more suitable for extracting stroke density features (Figures 1 and 2).
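The counting step can be illustrated with a minimal sketch. It assumes a square binary image stored as a list of rows in which 1 marks a stroke pixel, and it assumes that an intersection is one run of stroke pixels met along a line. The function names `count_crossings` and `diagonal_cross_density` are hypothetical and are not taken from the paper.

```python
def count_crossings(pixels):
    """Count runs of stroke pixels (0 -> 1 transitions) along a sampled line."""
    crossings = 0
    previous = 0
    for value in pixels:
        if value == 1 and previous == 0:
            crossings += 1
        previous = value
    return crossings


def diagonal_cross_density(image):
    """Total crossings along both diagonals of a square binary image."""
    size = len(image)
    main_diagonal = [image[i][i] for i in range(size)]
    anti_diagonal = [image[i][size - 1 - i] for i in range(size)]
    return count_crossings(main_diagonal) + count_crossings(anti_diagonal)


# A 5x5 test pattern: one vertical and one horizontal stroke through the centre.
plus_sign = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
diagonal_total = diagonal_cross_density(plus_sign)  # each diagonal meets one run
```

Counting runs rather than individual stroke pixels keeps the feature independent of stroke thickness, which is one plausible reading of "intersection" here.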
2.2. Horizontal Stroke Density Feature Extraction
The horizontal stroke density feature extraction method divides the Chinese character image into parts and draws horizontal lines through the text area at the divisions. The number of intersection points between each line and the character strokes is recorded, and the counts for all lines together form the feature vector.
The larger the number of parts, the greater the distinguishability of the feature vector, but also the greater the computational effort of the system, as can be seen in Figure 3.
2.3. Vertical Stroke Density Feature Extraction
The vertical stroke density feature extraction method divides the image vertically into parts and draws vertical lines through the text area at the divisions. The number of intersection points between each line and the character strokes is recorded, and the counts for all lines together form the feature vector.
The larger the number of parts, the greater the distinguishability of the feature vector, but also the greater the computational effort of the system, as can be seen in Figure 4.
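The horizontal and vertical extractions of Sections 2.2 and 2.3 can be sketched together. This is an illustration only: it assumes a binary image stored as a list of rows (1 marks a stroke pixel), assumes that dividing the image into `parts` pieces yields `parts - 1` interior scan lines, and counts one intersection per run of stroke pixels. All function names are hypothetical.

```python
def count_crossings(pixels):
    """Count runs of stroke pixels (0 -> 1 transitions) along a sampled line."""
    crossings = 0
    previous = 0
    for value in pixels:
        if value == 1 and previous == 0:
            crossings += 1
        previous = value
    return crossings


def scan_line_positions(length, parts):
    """Indices of the parts - 1 lines that split `length` pixels into parts."""
    return [(k * length) // parts for k in range(1, parts)]


def horizontal_density(image, parts):
    """One crossing count per horizontal scan line (a row of the image)."""
    rows = scan_line_positions(len(image), parts)
    return [count_crossings(image[r]) for r in rows]


def vertical_density(image, parts):
    """One crossing count per vertical scan line (a column of the image)."""
    columns = scan_line_positions(len(image[0]), parts)
    return [count_crossings([row[c] for row in image]) for c in columns]


sample = [
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
horizontal_features = horizontal_density(sample, 3)  # scan rows 2 and 4
vertical_features = vertical_density(sample, 3)      # scan columns 2 and 4
```

Increasing `parts` lengthens both feature vectors, which matches the trade-off between distinguishability and computational effort described above.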
2.4. Horizontally Inclined Stroke Density Feature Extraction
The horizontally inclined stroke density extraction method divides the image evenly into parts along its diagonal and draws inclined lines through the text area at the divisions. The number of intersection points between each line and the character strokes is recorded, and the counts for all lines together form the feature vector.
The larger the number of parts, the greater the distinguishability of the feature vector, but also the greater the computational effort of the system, as can be seen in Figure 5.
3. Chinese Character Image Classifier
Pattern recognition classification is the process of assigning an object to a category based on the feature vector it presents. The question of how to make a reasonable decision is the central problem of classifier design, i.e., what criteria and methods to use to divide the feature space of a given dimension into decision domains. There are two basic approaches to the design of classifiers: the template matching method and the discriminant function method, each with its own advantages and disadvantages. A template matching classifier uses each sample in the training set as a standard template, compares the sample to be tested with the standard templates, finds the most similar (nearest neighbor) template, and assigns the category of that template to the sample. In principle, the template matching classifier is the simplest and most effective, but it is not well suited to large sample pools because of its large data storage and computational requirements. The discriminant function method is generally divided into two types: the probabilistic method and the geometric method. Probabilistic classifiers require a priori knowledge of the probability distribution of the samples to be identified, which is the necessary basis for the discriminant function, so they cannot be applied to samples for which no such knowledge is available. Geometric classifiers, on the other hand, do not rely on knowledge of the conditional probability density. The geometric approach decomposes the feature space into subspaces corresponding to different classes and uses a linear separation function to decide between them, which keeps the calculation simple.
This research adopts grey-scale projection template matching algorithm to perform the classification. The detailed process of this algorithm is as follows.
3.1. Obtain Grey-Scale Projection
It is necessary to obtain the horizontal and vertical grey-scale projections of both the template image and the image to be matched. The horizontal and vertical projections of the template are obtained first, followed by those of the selected image, which together give the projection sequence P. Note that the template image and the image to be matched must be the same size.
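A minimal sketch of the projection step follows. It assumes that the horizontal projection is the list of row sums of grey values, that the vertical projection is the list of column sums, and that the projection sequence is the two lists concatenated. The paper does not spell out these details, and the function names are hypothetical.

```python
def grey_projections(image):
    """Return (horizontal, vertical) projections as row sums and column sums."""
    horizontal = [sum(row) for row in image]
    vertical = [sum(column) for column in zip(*image)]
    return horizontal, vertical


def projection_sequence(image):
    """Concatenate both projections into a single sequence for matching."""
    horizontal, vertical = grey_projections(image)
    return horizontal + vertical


# A tiny 2x3 grey-scale image (values are grey levels).
tiny_image = [
    [10, 20, 30],
    [0, 5, 5],
]
tiny_sequence = projection_sequence(tiny_image)
```

Because the sequence length depends on the image height and width, two sequences are only comparable when both images have the same size, which is why the size requirement above matters.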
3.2. Check Similarity
In the process of finding the similarity between sequences, the minimum distance method is used to determine the similarity of two images. The minimum distance method is a simple template matching method based on a vector space model. The basic idea is to calculate the distance between the feature vector of the image to be tested and the feature vector of each template. The final decision is based on which template feature vector lies closest to the image to be tested and on whether the two actually match.
In this research, the Euclidean distance is used. For two projection sequences of the same length, it is the square root of the sum of the squared differences between their corresponding elements.
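For reference, the standard Euclidean distance between two sequences can be written as follows. The symbols are assumptions introduced here for illustration: P denotes the projection sequence of the image to be tested, Q that of a template, p_i and q_i their i-th elements, and n their common length.

```latex
d(P, Q) = \sqrt{\sum_{i=1}^{n} \left( p_i - q_i \right)^2}
```

The distance is 0 only when the two sequences are identical, and it grows as the projections diverge.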
In the process of classification, there is also the problem that the image to be tested may not match any of the images in the template library, i.e., even though a smallest distance can always be found, the recognition result is still incorrect. A threshold value is therefore set to restrict the decision. When the distance is greater than the threshold, the image to be tested does not match the template image. When the distance is less than the threshold, the template library may contain an image that matches the image to be tested, and the next step is to make a judgement on these candidate images. The classifier design is shown in Figure 6.
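The minimum distance decision with a rejection threshold can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the template library is modelled as a dictionary from character to projection sequence, a distance exactly equal to the threshold is accepted, and the names `euclidean_distance` and `match_template` are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_template(sequence, templates, threshold):
    """Return the label of the nearest template, or None when rejected."""
    best_label = None
    best_distance = math.inf
    for label, template in templates.items():
        distance = euclidean_distance(sequence, template)
        if distance < best_distance:
            best_label, best_distance = label, distance
    if best_distance > threshold:
        return None  # nearest template is still too far away: no match
    return best_label


library = {"一": [3, 0, 0], "二": [3, 0, 3]}
accepted = match_template([3, 0, 1], library, threshold=1.5)
rejected = match_template([9, 9, 9], library, threshold=1.5)
```

Returning `None` for a rejected image gives the caller an explicit signal to pass the image on to later stages instead of accepting a wrong nearest neighbor.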
4. Conflict Processing Mechanism
The conflict processing mechanism is designed to deal with situations where the recognition result is not unique, is misidentified or even rejected, and to take some measures to further determine or declare failure. In the process of using classifiers for Chinese character recognition, there is a mismatch between the image to be tested and the image features of the feature template library. There are two main types of mismatch: one character with multiple codes and one code with multiple characters. The conflict processing mechanism of this research is focused on these two cases.
4.1. One Character with Multiple Codes
This kind of conflict processing mechanism refers to the processing mechanism to be used when a Chinese character image is matched with features in the feature database and there are multiple different codes corresponding to a single character, i.e., one character with multiple codes. There are two scenarios, depending on where the conflict arises in different stroke density ranges.
The first conflict arises in class A, where the stroke density classifier cannot uniquely identify a Chinese character image, i.e., there are multiple different codes for the same character. In this case, all the stroke-density features in the feature library that lie within class A are searched. If there is an image of a character with the same code in the feature library, then the problem is transformed into a one-code-multiple-character situation and can be handled using the one-code-multiple-character processing mechanism. If the encoding feature of the character with the duplicate code is not the same as any of the features in the library, then the feature is stored in the stroke density feature library and the character identifier is set to the “duplicate code” flag.
The second conflict arises in class B, where the stroke density classifier cannot uniquely identify a character image, i.e., a character has multiple different codes. In this case, the stroke density features of all characters in the library whose feature codes lie in the class B range are searched. If the feature of the character with the duplicate code is not the same as that of any other character in the library, the feature is stored in the stroke density feature library and the character identifier is set to the “duplicate code” flag.
4.2. One Code with Multiple Characters
In the case of multiple characters in a single code in the stroke density recognition stage, the processing mechanism can be divided into three categories, depending on the range of situations that arise.
The first type of conflict occurs in class A, where the stroke density classifier cannot uniquely identify a character because one code corresponds to multiple characters. In this case, recognition switches from the current stroke density feature to a further stroke density feature. If the recognition is successful, the result is transferred to the multimedia database and the multimedia resources are called. If the character is still not recognized, it is passed to the highest level of stroke density feature for another recognition attempt. If that attempt succeeds, the result is transferred to the multimedia database and the multimedia resources are called. If the character is still not recognized, it is matched against the grey-scale projection templates.
The second type of conflict occurs in class B, where the stroke density classifier cannot uniquely identify a Chinese character because one code corresponds to multiple characters. In this case, recognition switches to a further stroke density feature. If the recognition is successful, the result is transferred to the multimedia database and the multimedia resources are invoked. If the character is still not recognized, it is matched against the grey-scale projection templates.
The third type of conflict occurs in class C, where the stroke density classifier cannot uniquely identify a Chinese character because one code corresponds to multiple characters. The Chinese character image is then matched directly against the grey-scale projection templates.
The detailed processing mechanism is shown in Figure 7. First, the diagonal cross stroke density is divided into ranges according to the number of strokes. After that, there are two possible routes: the result is either transferred to the multimedia database or passed to the template matcher.
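The fallback order for one code with multiple characters can be sketched as a small cascade. The sketch reads the description as follows, which is an interpretation: class A tries two further stroke density stages, class B tries one, and class C goes straight to grey-scale template matching. The recognizers are passed in as hypothetical callables that return a character or `None`, and `resolve_conflict` is not a name from the paper.

```python
def resolve_conflict(conflict_class, image, density_stages, template_matcher):
    """Try further stroke density stages, then fall back to template matching.

    Returns (character, route); character is None when every stage fails.
    """
    stages_to_try = {"A": 2, "B": 1, "C": 0}[conflict_class]
    for recognise in density_stages[:stages_to_try]:
        result = recognise(image)
        if result is not None:
            return result, "stroke density"
    result = template_matcher(image)
    if result is not None:
        return result, "template matching"
    return None, "unresolved"


# Hypothetical stand-in recognizers for demonstration only.
stages = [lambda image: None, lambda image: "木"]
matcher = lambda image: "林"

class_a = resolve_conflict("A", [[0]], stages, matcher)
class_b = resolve_conflict("B", [[0]], stages, matcher)
class_c = resolve_conflict("C", [[0]], stages, matcher)
```

In a full system, a successful result would then be used to look up the multimedia resources for the recognized character, while an unresolved result would be handed to the learning mechanism.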
5. Learning Mechanism
Because the number of Chinese characters is huge, it is impossible to recognize every single character correctly during character recognition. Therefore, almost all truly practical recognition systems have a postprocessing stage, which may also include manual proofreading. This study uses a learning mechanism to improve the recognition function. The learning mechanism here means that when the recognition process cannot be completed, mainly because features, multimedia information, or Euclidean distance parameters are absent from the feature database, the content involved in the recognition process is entered into the database so that the recognition task can be completed when the same image is encountered again. The content to be learned consists of two main parts. The first part comprises the Chinese character features and Euclidean distance parameters that are relevant to recognition but do not yet exist in the feature database. The second part is the multimedia information that is entered into the database to be displayed when recognition is completed. Learning of the first part can be done by the system itself. For the second part, however, content-based retrieval of multimedia information is still very immature, so strong human intervention is required. The following analysis addresses the extraction of features and the determination of Euclidean distance parameters.
This research applies a grey-scale template learning mechanism. Before the grey-scale template learning mechanism is invoked, the first step is to determine whether the Chinese character image can be recognized by the stroke density method. If it can, there is no need for grey-scale projection feature extraction. If the content of the Chinese character image cannot be recognized from the existing stroke density features, the grey-scale template features are extracted, and the grey-scale projection feature vector of the image is stored in the template feature library. This completes the learning mechanism. The design of the grey-scale template-based learning mechanism is shown in Figure 8.
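The learning step can be sketched as follows. The sketch assumes that the template feature library is a dictionary from character to projection sequence, that the projection sequence is the row sums followed by the column sums, and that the correct character label is supplied from outside (for example by a person entering it). All names are hypothetical.

```python
def projection_sequence(image):
    """Row sums followed by column sums of a grey-scale image."""
    rows = [sum(row) for row in image]
    columns = [sum(column) for column in zip(*image)]
    return rows + columns


def learn_if_unrecognised(image, label, recognise_by_density, template_library):
    """Store a grey-scale template only when stroke density recognition fails.

    Returns True when a new template was learned, False otherwise.
    """
    if recognise_by_density(image) is not None:
        return False  # already recognizable: no projection features needed
    template_library[label] = projection_sequence(image)
    return True


template_library = {}
unknown_image = [
    [0, 255],
    [255, 255],
]
# A stand-in recognizer that always fails, to trigger learning.
learned = learn_if_unrecognised(unknown_image, "山", lambda image: None, template_library)
```

After this call the library holds one template for the new character, so the same image can be matched by grey-scale projection the next time it is presented.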
Chinese character image recognition for multimedia learning is a combination of text image recognition technology and a back-end database, which provides powerful support for Chinese character learning. It has important theoretical significance and practical value.
This research describes in detail the key techniques for image recognition. These include the extraction of features, the design of classifiers, the conflict processing mechanism applied when conflicts are encountered, and the learning mechanism for learning Chinese character images. Depending on the extracted features, the classifier combines a stroke density classifier with grey-scale projection template matching for the recognition of Chinese character images. In addition, different conflict processing mechanisms are used according to the problems that arise during the recognition process. The method is not only simple, fast, and accurate but also meets the recognition requirements. When a Chinese character cannot be recognized, it is first learned. The training of the Euclidean distance parameters adopts an automatic learning mechanism, which makes the determination of the Euclidean distance more reasonable and more in line with the recognition requirements.
In general, the Chinese character image recognition technology for multimedia learning basically realizes the multimedia presentation of Chinese character images. The paper provides new technical support for multimedia learning. The design of some of the main modules in this paper also offers theoretical exploration and inspiration for the construction of Chinese character image recognition systems.
The labeled dataset used to support the findings of this study is available from the corresponding author upon request.
Conflicts of Interest
The authors declare no conflicts of interest.
This work was supported by the General Project of Hunan Natural Science Foundation (No. 2018JJ2147), the Youth Project of Hunan Natural Science Foundation (No. 2018JJ3203), the Project of Hunan Science and Technology Department (No. 2019ZK4018), and the Hunan University of Science and Engineering Computer Application Special Subject Funding.
B. Chen and D. Peng, “The time course of graphic, phonological and semantic information processing in Chinese character recognition: 1,” Acta Psychology Sinica, vol. 33, no. 1, pp. 1–6, 2001.
Z. Rao, C. Zeng, M. Wu et al., “Research on a handwritten character recognition algorithm based on an extended nonlinear kernel residual network,” KSII Transactions on Internet and Information Systems, vol. 12, no. 1, pp. 413–435, 2018.
A. Athira, S. Lekshmi, P. Vijayan, and B. Kurian, “Smart parking system based on optical character recognition,” in Proceedings of the 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), pp. 1184–1188, Tirunelveli, India, March, 2019.
F. Xie, M. Zhang, J. Zhao, J. Yang, Y. Liu, and X. Yuan, “A robust license plate detection and character recognition algorithm based on a combined feature extraction model and BPNN,” Journal of Advanced Transportation, vol. 2018, Article ID 6737314, 14 pages, 2018.
S. Ghosh, A. Roy, and A. Gill, “A review of pattern recognition and its application in artificial intelligence and agriculture,” International Journal of Research and Analytical Reviews, vol. 5, no. 4, pp. 893–897, 2018.
Z. Zhong, L. Jin, and Z. Xie, “High performance offline handwritten Chinese character recognition using googlenet and directional feature maps,” in Proceedings of the 2015 13th International Conference on Document Analysis and Recognition, Washington, DC, USA, August, 2015.
Z. Y. Wang, “Chinese character recognition method based on image processing and hidden Markov model,” in Proceedings of the Fifth International Conference on Intelligent Systems Design and Engineering Applications, Hunan, China, June, 2014.