Abstract
With the continued growth of China's comprehensive national strength, its cultural soft power and international status have increased significantly, and Chinese has become one of the most widely studied languages in the world. However, the teaching resources for Chinese as a foreign language are far from sufficient to meet the rapid growth in demand. As high-performance, low-latency, high-bandwidth mobile edge computing continues to improve the network environment, using computer network technology to assist Chinese language learning is an effective way to meet this demand. Against this background, this paper designed and built a Chinese language learning system based on natural language processing (NLP) technology. The system can be roughly divided into three parts: the basic module, the learning module, and the tool module, which incorporates NLP technology. Together, these three modules provide learners with basic Chinese language teaching and a large collection of documents on the Chinese language and culture, which makes learning more convenient. A survey of international students who used the system showed that more than 90% of them believed that the content presented by the learning system was clear and scientific. The results indicate that the design of the Chinese learning system met the learning needs of learners.
1. Introduction
With the rapid development of computers and networks, human beings have entered the information age. This development has strengthened communication over networks, changed the way people study and work, and provided new approaches to problem-solving. For example, Internet technology enables the sharing of rich online learning resources, and multimedia technology provides more intuitive dynamic effects to assist Chinese learning. Because the Internet is open, interactive, and shared, more and more people choose to obtain learning resources and study through it, and online learning has gradually become an important learning method. The availability and reusability of Internet learning resources are increasingly clear advantages and point to the future direction of learning methods. Meanwhile, the rapid development of computer technology has also promoted research on natural language processing, and many new achievements have been made in this field, some of which have been widely used in language teaching practice with good results. Current research in natural language processing, especially the processing of Chinese, can therefore be used to make up for the shortcomings of current computer-based Chinese learning in China. Both computer technology and natural language processing technology thus play a very important role in language learning.
Natural language processing (NLP) is an important emerging field in modern linguistics and a research area that has attracted a lot of attention in artificial intelligence [1]. The goal of NLP research is to enable computers to understand human natural language. Natural language understanding makes human-machine interaction possible and allows machines to perform the command and control processing tasks that humans require, which is why NLP is crucial in the information age. NLP technology can also help learners of Chinese study in a more humane way. It is therefore important to develop a Chinese learning system that combines network technology and NLP technology to encourage the acquisition and reuse of Chinese learning resources. In this paper, NLP technology is combined with computer network technology and multimedia technology to create a fully functional, scalable Chinese learning system that is easy to manage and maintain. The system realizes online Chinese learning and resource sharing, which helps promote and carry forward China’s excellent culture and expands Sino-foreign exchanges, both of which are important for improving China’s international influence. This paper designs, implements, and tests a Chinese learning system with multimedia, easy interaction, and simple management using web-based multimedia and natural language processing technology.
2. Related Work
This paper aimed to design and implement a mobile Chinese language teaching system under edge computing, so as to improve the efficiency and level of Chinese language teaching. A number of studies on the establishment of teaching models were reviewed. Among them, Jiao focused on a mobile English teaching information service platform based on edge computing, constructing a mobile English teaching model based on “listening, reading, and listening” and designing and implementing the corresponding platform. According to that investigation, the platform could improve the efficiency of teachers’ course scheduling by 5% to 6% and the efficiency of students’ course selection by 2% to 3%. Gao reported findings on the learning strategies used by a group of Chinese learners of English as a foreign language (EFL) in a mobile technology-assisted environment. The research design was a situation-specific case study that used categories of learning strategies as a conceptual and analytical framework to guide data collection and analysis. The data showed that the mobile technology-assisted environment changed the way these learners employed specific learning strategies, which differed in type and frequency from those of typical teacher-led and test-oriented language classes [2]. Han introduced a LiveCode-based mobile app that includes a virtual tour of two traditional Chinese architectural sites, Nanyuan and the Humble Administrator’s Garden. Designed by its author for advanced Chinese language learners, the app provided an immersive language and culture learning experience. The app’s built-in tools, flip tips, authentic multimedia resources, and helpful links effectively combined culture with language learning.
Han’s article also discussed the pedagogical application, pilot results, impact, and future directions of mobile learning and location-based learning in Chinese language and culture education [3]. Although the above teaching systems all have their own advantages, most of them are designed for English teaching. They are not directly related to the Chinese language teaching model in this paper but share certain commonalities with it.
Mobile edge computing emerged in 2013 and is still at the stage of technology research, development, and industrialization. Although it is at an early stage, it has broad prospects as one of the core technologies of 5G. In this context, this paper established a Chinese language teaching system based on NLP technology. Related work on the main technologies applied is reviewed below.
The volume, accuracy, variability, and speed of data generated by ever-growing Internet-connected sensor networks pose challenges to the power management, scalability, and sustainability of cloud computing infrastructures. Increasing the data processing capabilities of edge computing devices with lower power requirements can reduce some of the overhead of cloud computing solutions. Krestinskaya et al. reviewed neuromorphic CMOS memristive architectures that could be integrated into edge computing devices and discussed why neuromorphic architectures are useful for such devices [4]. Semantic similarity has attracted great interest in NLP in recent years. Using NLP capabilities, Tripathi and Deshmukh described the design and implementation of a fully functional system, known as a reverse medical dictionary, intended to make rapid health treatment consultation more efficient. The reverse medical dictionary allows users to get instant guidance on their health issues through a smart healthcare system. In short, users can search the system for their disease and get an instant diagnosis at any time by sharing their symptoms [5]. In recent years, pretrained models (PTMs) have been widely used in most NLP applications, especially for high-resource languages such as English and Chinese. However, scarce resources hinder the progress of PTMs for low-resource languages. Jiang proposed transformer-based PTMs for Khmer and evaluated the resulting model on two downstream tasks. The experiments demonstrated the effectiveness of the Khmer language model [6]. Jusoh conducted a systematic literature review to identify the most prominent applications, techniques, and challenging problems in NLP. To keep the review within scope, 503 papers were excluded, and only the most prominent NLP applications, namely information extraction, question answering systems, and automatic text summarization, were selected for review.
The review concluded that the main challenge in NLP is the complexity of natural language itself, that is, the ambiguity that arises at all levels of language [7]. Classification is one of the most important techniques for supervised and semisupervised machine learning tasks, and many classification algorithms [8, 9] have been introduced into existing systems. Deshmukh proposed a new deep-learning-based document classification method using NLP and machine learning methods. Recurrent neural networks are used to classify individual objects according to their weights, and the method provides the final class labels for the entire test data set [10]. NLP technology has a wide range of applications and is one of the mainstream approaches to processing language vocabulary. However, research on Chinese language teaching systems based on NLP technology is relatively rare, and most of the NLP research mentioned above is mainly about text analysis.
3. Modeling Method of Chinese Language and Literature Teaching System
The research topic of this paper is to develop and apply a reasonable Chinese learning system based on NLP technology. The related technologies used in developing the system include natural language processing, word segmentation and tagging, sentiment analysis, and so on.
3.1. Fundamentals of Natural Language Processing
3.1.1. The Concept of Natural Language Processing
Natural language processing (NLP) is a discipline that focuses on natural language understanding and is used for human-computer information interaction. Natural languages are those commonly used by humans, such as Chinese, English, and Russian, and they are important tools for human communication and learning. In natural language processing, the language model applied to natural language is studied first; a framework for implementing the language model on the computer is then established, and improvement methods are put forward to refine the model continuously. Finally, the language model is applied to various practical systems and evaluated [11, 12].
3.1.2. The Process of Natural Language Processing
Research on natural language understanding can be divided into text understanding on the one hand and speech understanding on the other hand, and computer processing is increasingly involved in text understanding [13]. The analysis and understanding of language by computer is usually a hierarchical process, and linguists divide it into several types: pragmatic, phonetic, and semantic analysis. The basic model of the main steps of natural language processing is shown in Figure 1.

3.2. Basic Introduction to Natural Language Statistical Model
3.2.1. Bayesian Conditional Probabilities
The Bayesian formula shows how a conditional probability is calculated and how it relates to the original (prior) probability.
To judge whether the given phrase N belongs to set $C_1$ or set $C_2$, it is necessary to calculate the probability $P(C_1 \mid N)$ and the probability $P(C_2 \mid N)$ and compare the obtained values. If the probability of class $C_1$ is greater than that of class $C_2$, N belongs to class $C_1$; otherwise, it belongs to class $C_2$. The decision equation for conditional probability is as follows:
$$N \in \begin{cases} C_1, & P(C_1 \mid N) > P(C_2 \mid N) \\ C_2, & \text{otherwise} \end{cases}$$
In the case of multiple pattern sets, the attribution of words needs to be determined by the Bayesian formula. $P(C_i \mid N)$ can be calculated from $P(N \mid C_i)$, $P(C_i)$, and $P(N)$. The formula is as follows:
$$P(C_i \mid N) = \frac{P(N \mid C_i)\, P(C_i)}{P(N)}$$
Among them, the probability of occurrence of $C_i$ given the occurrence of N is $P(C_i \mid N)$, and the probability of occurrence of N in the model is $P(N)$.
It can be seen from the above process that the Bayes formula plays a very important role [14]. The probability of a given word N being a noun in the corpus is expressed as $P(\text{noun} \mid N)$. In this case, the maximum likelihood estimation method is needed to calculate this probability. Maximum likelihood estimation is a statistical method for finding the parameters of the probability density function associated with a sample set. The formula is expressed as follows:
$$P(j \mid i) = \frac{C(i, j)}{C(i)}$$
Assuming that i is a given word and j represents the category of the word, $C(i, j)$ is the total number of times i appears as class j in the specified library, and $C(i)$ is the total number of times i appears in the specified library. However, if $C(i, j) = 0$, then $P(j \mid i) = 0$ follows from the formula. This is the problem of data sparseness, which needs to be solved by data smoothing techniques [15, 16].
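As a minimal sketch of the estimation described above, the following Python snippet computes the maximum likelihood estimate of the probability that word i belongs to class j from counts, and then applies add-one (Laplace) smoothing, one common data smoothing technique, so that unseen word-class pairs do not receive zero probability. The toy corpus, the class names, and the function names are assumptions made for illustration; they are not taken from the system described in this paper.

```python
from collections import Counter

# Toy tagged corpus (hypothetical): (word, class) pairs.
TAGGED = [("学习", "verb"), ("学习", "noun"), ("学习", "verb"),
          ("中文", "noun"), ("系统", "noun")]
CLASSES = ["noun", "verb"]

pair_counts = Counter(TAGGED)                 # times word i appears as class j
word_counts = Counter(w for w, _ in TAGGED)   # times word i appears at all


def mle(word, cls):
    """Maximum likelihood estimate: count(i as j) / count(i)."""
    if word_counts[word] == 0:
        return 0.0
    return pair_counts[(word, cls)] / word_counts[word]


def add_one(word, cls):
    """Add-one smoothed estimate: (count(i as j) + 1) / (count(i) + #classes)."""
    return (pair_counts[(word, cls)] + 1) / (word_counts[word] + len(CLASSES))


def decide(word, estimate=add_one):
    """Bayes-style decision: pick the class with the larger estimated probability."""
    return max(CLASSES, key=lambda c: estimate(word, c))
```

In this toy data, "中文" never appears as a verb, so the unsmoothed estimate for that pair is exactly zero, while the smoothed estimate stays small but positive. Other smoothing schemes could be substituted for `add_one` without changing the decision step.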
3.2.2. N-Gram Model
The N-gram model is one of the most commonly used mathematical models in natural language processing [17, 18], and the N-gram model and the Markov chain are the basis of NLP technology. It is defined as follows:
If the sequence $w_1, w_2, \ldots, w_m$ is an R-order Markov chain, the probability of a certain element appearing is related only to the R − 1 elements that precede it, as follows:
$$P(w_i \mid w_1, \ldots, w_{i-1}) = P(w_i \mid w_{i-R+1}, \ldots, w_{i-1})$$
When the current state is given and the future is conditionally independent of the historical path, the random process has the Markov property. If natural language also satisfies the Markov property, the occurrence probability of a single word $w_i$ in a sentence $E = w_1 w_2 \cdots w_m$ can be calculated by the formula above with R = N, and the probability of the sentence can then be derived as follows:
$$P(E) = \prod_{i=1}^{m} P(w_i \mid w_{i-N+1}, \ldots, w_{i-1}) \tag{7}$$
In Formula (7), generally, the larger N is, the more accurate the model is, but the parameters used by the model and the required training set are also larger [19].
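In practical applications, considering the size of the corpus during training and simplifying the calculation process, N usually takes 2. Taking N as 3 as an example, the formula for calculating parameters by maximum likelihood estimation is as follows:
$$P(w_i \mid w_{i-2}, w_{i-1}) = \frac{C(w_{i-2}\, w_{i-1}\, w_i)}{C(w_{i-2}\, w_{i-1})}$$
where $C(\cdot)$ is the number of times the given word sequence occurs in the training corpus.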
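The N-gram estimation above can be sketched in a few lines of Python. The example below uses N = 2 (a bigram model) and estimates each conditional probability by maximum likelihood from counts, then multiplies them to score a sentence. The toy corpus, the sentence boundary markers `<s>` and `</s>`, and the function names are assumptions of this sketch, and the input is assumed to be already segmented into words.

```python
from collections import Counter

# Toy corpus of pre-segmented sentences (hypothetical data).
CORPUS = [["我", "学习", "中文"], ["我", "喜欢", "中文"], ["他", "学习", "中文"]]
BOS, EOS = "<s>", "</s>"   # sentence boundary markers (an assumption of this sketch)

unigrams, bigrams = Counter(), Counter()
for sent in CORPUS:
    tokens = [BOS] + sent + [EOS]
    unigrams.update(tokens[:-1])               # counts of each history word
    bigrams.update(zip(tokens, tokens[1:]))    # counts of each adjacent word pair


def bigram_prob(prev, word):
    """MLE estimate P(word | prev) = count(prev word) / count(prev)."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]


def sentence_prob(sent):
    """Product of bigram probabilities over the sentence (N = 2)."""
    tokens = [BOS] + sent + [EOS]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= bigram_prob(prev, word)
    return prob
```

A sentence made only of observed bigrams receives a nonzero score, whereas a single unseen bigram drives the whole product to zero, which illustrates why data smoothing is needed in practice.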
3.2.3. HMM Model
Applying a hidden Markov model (HMM) often involves the decoding problem [20]. Because this paper uses the Viterbi algorithm for part-of-speech tagging, only this solution is explained.
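Determining the state sequence with the maximum probability for a given observation sequence is a decoding problem. The specific formula of the decoding problem is shown as follows:
$$Q^{*} = \arg\max_{Q} P(Q \mid O, \lambda)$$
where $O = o_1 o_2 \cdots o_T$ is the observation sequence, $Q = q_1 q_2 \cdots q_T$ is a candidate state sequence, and $\lambda$ denotes the parameters of the HMM.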
However, if every candidate sequence is traversed in order to find the state sequence with the maximum probability, time and resources are wasted [21, 22]. To avoid this waste, a dynamic programming approach is used [23]. Instead of enumerating all sequences, it keeps only the best partial path into each state at each time step.
The optimal path to reach state $s_d$ at time t + 1 is shown in the following formula:
$$\delta_{t+1}(s_d) = \max_{c}\left[\delta_t(s_c)\, a_{cd}\right] b_d(o_{t+1})$$
where $\delta_t(s_c)$ is the probability of the best path that ends in state $s_c$ at time t, $a_{cd}$ is the transition probability from $s_c$ to $s_d$, and $b_d(o_{t+1})$ is the probability that state $s_d$ emits the observation $o_{t+1}$.
This approach is characterized by the selection of the optimal path: at every step, it judges whether the current choice is the best result. The cycle continues until the last state is traversed, so the sequence with the highest probability is selected [24, 25].
The Viterbi algorithm can solve the decoding problem in HMM model. The Viterbi algorithm is a dynamic programming algorithm that is used to find the Viterbi path that is most likely to produce a sequence of observed events, that is, the sequence of hidden states, especially in the HMM. At the same time, it can also be used for part-of-speech tagging, as follows:
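The input word sequence is $O = o_1 o_2 \cdots o_T$, and the output optimal part-of-speech tag sequence is $Q^{*} = q_1^{*} q_2^{*} \cdots q_T^{*}$. Let $s_1, \ldots, s_K$ be the possible tags, $\pi_d$ the initial probability of $s_d$, $a_{cd}$ the transition probability from $s_c$ to $s_d$, and $b_d(o)$ the probability that $s_d$ emits the word $o$.
(a) Initialize: $\delta_1(s_d) = \pi_d\, b_d(o_1)$ and $\psi_1(s_d) = 0$ for $1 \le d \le K$.
(b) Inductive calculation: for $1 \le t \le T - 1$, $\delta_{t+1}(s_d) = \max_{c}\left[\delta_t(s_c)\, a_{cd}\right] b_d(o_{t+1})$ and $\psi_{t+1}(s_d) = \arg\max_{s_c}\left[\delta_t(s_c)\, a_{cd}\right]$.
(c) Termination: $P^{*} = \max_{d} \delta_T(s_d)$ and $q_T^{*} = \arg\max_{s_d} \delta_T(s_d)$.
(d) Path backtracking: backtracking from the variables $\psi_t$ with $q_t^{*} = \psi_{t+1}(q_{t+1}^{*})$ for $t = T - 1, \ldots, 1$ to obtain the optimal path of the observation sequence, that is, the optimal part-of-speech tag sequence.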
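The four steps of Viterbi decoding can be sketched as a short Python function. This is an illustrative implementation under stated assumptions, not the code of the system described in this paper: the tag set, the initial, transition, and emission probabilities, and all names (`viterbi`, `STATES`, `START`, `TRANS`, `EMIT`) are hypothetical values chosen for the example, and words that a tag never emits are given probability zero.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (best_probability, best_tag_sequence) for a list of words."""
    # (a) Initialization: best score of each tag for the first word.
    delta = [{s: start_p[s] * emit_p[s].get(obs[0], 0.0) for s in states}]
    psi = [{}]
    # (b) Induction: extend the best partial path into each tag.
    for t in range(1, len(obs)):
        delta.append({})
        psi.append({})
        for d in states:
            best_prev = max(states, key=lambda c: delta[t - 1][c] * trans_p[c][d])
            delta[t][d] = (delta[t - 1][best_prev] * trans_p[best_prev][d]
                           * emit_p[d].get(obs[t], 0.0))
            psi[t][d] = best_prev
    # (c) Termination: pick the best final tag.
    last = max(states, key=lambda s: delta[-1][s])
    best_prob = delta[-1][last]
    # (d) Backtracking: follow the stored predecessors to recover the path.
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(psi[t][path[-1]])
    path.reverse()
    return best_prob, path


# Hypothetical model parameters for a three-tag toy example.
STATES = ["pronoun", "verb", "noun"]
START = {"pronoun": 0.6, "verb": 0.1, "noun": 0.3}
TRANS = {
    "pronoun": {"pronoun": 0.1, "verb": 0.7, "noun": 0.2},
    "verb": {"pronoun": 0.2, "verb": 0.1, "noun": 0.7},
    "noun": {"pronoun": 0.1, "verb": 0.5, "noun": 0.4},
}
EMIT = {
    "pronoun": {"我": 0.9},
    "verb": {"学习": 0.6, "喜欢": 0.4},
    "noun": {"学习": 0.2, "中文": 0.8},
}
```

Calling `viterbi(["我", "学习", "中文"], STATES, START, TRANS, EMIT)` tags the three words as pronoun, verb, and noun under these assumed parameters. For long sentences, an implementation would normally sum log probabilities instead of multiplying raw probabilities to avoid numerical underflow.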
3.3. Sentiment Analysis
The goal of text sentiment analysis is to determine what emotions are reflected in a text. Rule-based methods and text-classification-based methods are two of the most popular sentiment analysis techniques. The former match rules and sentiment dictionaries against the text. The latter treat sentiment analysis as a text classification problem: the texts in the training and test sets are first annotated, text features are extracted, and a classifier then assigns each text to a sentiment category, for example one of two categories when the analysis is binary. Sentiment analysis frequently employs the following classification algorithms [26].
(1) Naive Bayes classification: The basic concept of the naive Bayes algorithm is to calculate the joint probability of the text to be classified and each category and to classify the text according to this value [27].
(2) K-nearest neighbor classification: To judge the category of a document to be tested, the algorithm first searches the training set for the nearest neighboring documents and then obtains the candidate category scores of the document from the categories of these neighbors [28].
(3) Support vector machine classification: The support vector machine (SVM) classification algorithm is mainly used to solve binary pattern classification problems and is a general supervised machine learning tool based on statistical learning theory. The basic idea is to map the training data nonlinearly to a high-dimensional feature space and find the optimal classification hyperplane as the decision plane, thereby maximizing the separation margin between positive and negative examples.
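To make the first of these classifiers concrete, the following is a minimal sketch of a multinomial naive Bayes sentiment classifier in plain Python. It is not the classifier used in the system; the training texts, the labels, and the function names are assumptions for illustration, the input is assumed to be already segmented into words, and add-one smoothing is an added assumption so that unseen words do not zero out a category.

```python
import math
from collections import Counter, defaultdict

# Toy training set of pre-segmented texts with sentiment labels (hypothetical).
TRAIN = [
    (["很", "好", "喜欢"], "positive"),
    (["喜欢", "有趣"], "positive"),
    (["很", "差", "无聊"], "negative"),
    (["不", "喜欢", "无聊"], "negative"),
]


def train(samples):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in samples:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab


def classify(words, model):
    """Return the label maximizing log P(label) + sum of log P(word | label)."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for w in words:
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


MODEL = train(TRAIN)
```

For example, `classify(["有趣", "好"], MODEL)` scores the positive category higher because both words occur only in positive training texts. Working with log probabilities avoids underflow when texts are long.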
4. Chinese Learning System Based on NLP Technology
4.1. System Functional Requirements
A multimedia, interactive, easy-to-manage, and user-friendly Chinese learning system was developed based on the B/S (browser/server) architecture with ASP.NET 4.0 and SQL Server 2008. The B/S structure is a network structure mode that unifies the client, concentrates the core of the system's functionality on the server, and simplifies the development, maintenance, and use of the system. The system provides users with skills and useful tools for learning Chinese and is intended to promote communication. With the Internet as the carrier, the website not only makes full use of multimedia technology to present basic Chinese knowledge and diversified Chinese culture but also integrates natural language processing technology to implement learning support modules such as word learning, news summary, and emotion analysis, so that users can learn Chinese more efficiently and enjoyably and user satisfaction increases. The Chinese learning system must be practical, secure, reliable, scalable, stable, and easy to maintain while meeting the needs of its users.
4.2. Overall Design Framework of the System
As shown in Figure 2, the specific implementation functions of each module of the learning system are as follows:
Basic learning module: This module focuses on basic Chinese learning. It consists of four parts: phonology, pinyin learning, Chinese character learning, and idiom dictionary. It covers the pronunciation, stroke order, interpretation, word formation, sentence making, and idiom query of Chinese characters and paves the way for the reading and communication of subsequent chapters.
Reinforcement learning [29, 30] module: Following the principle of “learning begins in life,” this module chooses topics that are closely related to Chinese daily life, making it easier for system users to learn Chinese. The articles are mostly drawn from everyday life, current events, network dynamics, idiom stories, and so on. Users can select the link of any Chinese character in the text to consolidate and learn basic Chinese character knowledge, and they can also query a word to get detailed learning materials for it.
Chinese culture module: Learning the Chinese language requires an understanding of Chinese culture. This module primarily contains articles on four topics: cultural common sense, famous folklore, Tang poetry and Song poetry, and folk myths, all of which are intended to help users learn Chinese in the Chinese way of thinking. While browsing articles, users can practice the same basic skills as in the reinforcement learning module.
Tool assistant module: This module mainly provides registered users of the system with auxiliary tools for learning Chinese, such as vocabulary learning, news summaries, and sentiment analysis. It greatly simplifies Chinese learning.
Dynamic update module: System updates are displayed in real time on the home page, including the latest notices issued by the administrator, articles with high click-through rates, and recently read articles.
Download module: A variety of Chinese materials are provided to users for offline learning.
Message board module: This module provides an interactive platform for all registered users to facilitate mutual learning and improvement.
Backstage management module: As noted in the functional requirements analysis, this module makes it convenient for administrators to update resources, manage users, and maintain the system. It mainly includes topic management, user management, message board management, and resource management.

4.3. Database Design
Database design is one of the most widely used technologies and the fastest-growing field in computer technology. A good database design can effectively reduce data redundancy and improve data storage efficiency.
4.3.1. The Design Steps of the Database
Usually, database design can be separated into six phases: requirements analysis, conceptual design, logical design, physical design, database implementation, and database operation and maintenance, as shown in Figure 3.

In database design, the quality of data structure not only directly affects the efficiency of operation execution but also indirectly affects the efficiency of data acquisition, so ensuring a good database design is an important guarantee to improve efficiency.
The Chinese learning system uses SQL Server 2008 database to store and manage data.
4.3.2. The Main E-R Diagram Design of the Database
E-R diagram (entity-relationship diagram), also known as the entity-relationship model, is an effective way to describe the conceptual model of real-world relationships. It is usually composed of three elements: entity, attribute, and relationship. Generally, during design, an E-R diagram describing the conceptual structure data model can be drawn according to application requirements. The E-R diagram of the main functional entities of the Chinese learning system is shown in Figure 4.

As can be seen from Figure 4, the administrator can view and manage all information in the system, including subject content, user information, messages, and replies. Nonregistered users can only see part of the subject content. Registered users have full rights to view and use information but can manage only their own messages, replies, and user information.
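The access rules described for the three kinds of users can be sketched as two small permission checks. This is only an illustration in Python of the rules stated above, not the system's ASP.NET implementation; the role names, the `User` and `Item` types, and the `public` flag marking content visible to nonregistered users are all assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    name: str
    role: str             # "admin", "registered", or "guest" (hypothetical role names)


@dataclass(frozen=True)
class Item:
    owner: Optional[str]  # user name of the author; None for subject content
    public: bool          # True if visible to nonregistered users


def can_view(user: User, item: Item) -> bool:
    """Guests see only public items; registered users and admins see everything."""
    return item.public or user.role in ("admin", "registered")


def can_manage(user: User, item: Item) -> bool:
    """Admins manage everything; registered users manage only their own items."""
    if user.role == "admin":
        return True
    return user.role == "registered" and item.owner == user.name
```

Keeping the two checks as separate functions mirrors the distinction in the E-R design between viewing information and managing it.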
4.4. System Login Function Test
4.4.1. Login Module Function
User login tests serve two purposes: they check whether registration succeeded and whether the login module functions normally. One group of tests uses the correct user name and password, and the other groups use different wrong user names or wrong passwords, as shown in Table 1.
The test content and results are the same for system administrator login and registered user login, so the test is passed.
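The login test procedure described above, one correct credential pair and several incorrect ones, can be expressed as a table-driven check. The sketch below is a hypothetical Python stand-in for the login module: the `login` function, the user store, the sample account, and the test cases are assumptions for illustration and do not reproduce the cases in Table 1.

```python
import hashlib
import hmac


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


SALT = b"demo-salt"  # fixed only for this sketch; use a random per-user salt in practice
USERS = {"student01": hash_password("Hanyu2023", SALT)}  # hypothetical account


def login(username: str, password: str) -> bool:
    """Return True only when the user exists and the password hash matches."""
    stored = USERS.get(username)
    if stored is None:
        return False
    return hmac.compare_digest(stored, hash_password(password, SALT))


# (user name, password, expected result): one correct group, several wrong groups.
CASES = [
    ("student01", "Hanyu2023", True),
    ("student01", "wrong-password", False),
    ("unknown-user", "Hanyu2023", False),
    ("", "", False),
]


def run_cases():
    """Return (user name, password, passed) for every test case."""
    return [(u, p, login(u, p) == expected) for u, p, expected in CASES]
```

Each row in `CASES` plays the role of one row in a login test table, and the module passes when every entry returned by `run_cases()` reports `True`.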
4.4.2. Functional Test of Page View Module
The test of the page browsing module includes dynamic update module, basic learning module, topic browsing module, reinforcement learning module, Chinese culture module, resource download module, and friendly link test. The main test content is whether each page of the system is displayed correctly. The function test of the page browsing module is shown in Table 2.
4.4.3. Tool Module Test
The test of the tool module includes whether the three modules of word and sentence learning, news summary, and sentiment analysis can be used normally. The test results of the tool module are shown in Table 3.
4.4.4. Message Function Module Test
The test results of the message function module are shown in Table 4.
4.4.5. Other Tests and Results
(1) Compatibility test: The same page may be displayed differently in different browsers, so three common browsers, IE, Chrome, and Firefox, were selected for the system compatibility test. Checking the system in these three browsers showed that the differences in page display are not obvious and do not affect the user experience, so the compatibility test is passed.
(2) Friendly interaction test: If a user enters data outside the defined range or performs an incorrect operation while using the Chinese learning system, the system should provide friendly prompts that direct the user to the correct action. The friendly interaction test is passed because the interfaces for interactive user input have strong fault tolerance and good error correction and can prompt users to perform the correct operation. Furthermore, no issues such as slow response or crashes were found while repeatedly opening and closing the software and entering test data.
The problems found during testing have been addressed, the tests have been rerun, and good results have been achieved. In general, the tests indicate that the system has a user-friendly interface, complete functions, and good security and stability and that it meets the design requirements. User feedback will continue to be collected during future use, and further modifications and improvements will be made to make the system's functions more scientific and reasonable.
5. System Test
5.1. Investigation of Application Effect
Online survey data were used to reflect the user experience and further test the application effect. Questionnaires were issued to the international students who tried the system; a total of 24 questionnaires were distributed, and 24 valid questionnaires were recovered. The data were then integrated and statistically analyzed. The evaluation index is shown in Figure 5.

5.2. System Content Survey
Figure 6 shows the statistical results of the four evaluation indicators in the content survey. More than 90% of the students believe that the English-Chinese translation of the learning system is accurate and that the content is clear and scientific, which shows that the learning system is scientifically sound. More than 80% of the students believe that the learning system is rich in resources, diversified in form, and extensible in Chinese knowledge, although 17% of the students rate the richness of the learning system as average. The statistical results for topicality and correlation shown in Figure 6(b) indicate that more than 95% of the students believe that the topics of the learning system are clear, systematic, and logical, which further shows that the subject matter of the system meets the cognitive needs of learning. More than 85% of the students believe that the learning system can target specific learning objects, while 13% believe that the system is average in this respect. Overall, the students consider the design of the learning content satisfactory, but it still needs to be properly optimized.

5.3. Survey Results of the System Interface
Figure 7 shows the results of students’ evaluation of the learning system interface. Ninety-six percent of learners think the media information provided by the system is reasonable and appropriate and helps them understand knowledge points. This indicates that using appropriate media for the learning content can help learners grasp the content accurately and understand the knowledge points easily. Eighty-eight percent of the learners hold the opinion that the links of the system are obvious, easy to identify, accurate in their targets, and clearly labeled and that they accurately indicate the topic content, while 13% of the learners consider the design of the links to be average. This indicates that detailed, clear labels can guide learners correctly, but the design still has considerable room for improvement. Ninety-two percent of the learners think that the speed and effect of the system interface meet their general needs, which suggests that the technical design of the Chinese learning system meets the needs of the learners.

5.4. Learning Feedback Results
Among the evaluation indicators, the rationality of learning and the learners' feedback results were also evaluated. The statistical data are shown in Figure 8.

Figure 8(a) shows the statistical results for the reasonableness of learning. More than 90% of students believe that the design of the learning system is beneficial to the development of learners’ knowledge and skills. Figure 8(b) shows the results of learners’ feedback. Ninety-two percent of learners believe that timely feedback on test results helps them clearly understand the effect of their learning, while only 8% of learners say that it has little effect. This suggests that properly designed exercises and feedback can help learners understand their learning and knowledge acquisition, identify gaps in time, and reflect on and improve their recent learning.
5.5. Using Effect
Figure 9(a) shows the statistical results for the attitude indicator in the evaluation index: 87% of students believe that the learning system helps improve their confidence in communication and dialogue, and nearly 75% of the students consider that using the learning system helps enhance their interest in learning Chinese. Figure 9(b) shows the results of the knowledge and skills assessment: 79% of learners can learn practical conversational sentence patterns and knowledge from the system, 63% of learners consider that the system can promote fluency in Chinese communication, and 71% of learners think that the system can improve their Chinese expression and communication skills. The results indicate that respondents believe their abilities could be improved. This suggests that a chatbot-based Chinese learning system for foreigners is beneficial for learning Chinese.

5.6. Service Performance
The following conclusions can be drawn from Table 5.
Most users think that the design of each functional module of the system is reasonable, which shows that the design of the functional modules meets the learning needs of learners. Ninety-four percent of the students think that the system is safe and reliable in both information management and operation, and 81% of the students think that there are safeguards to protect user privacy and that the system resources are highly stable. A good response time and a long fault-free time indicate that the stability-related performance of the system can be considered “good” and that the system is operating well. In addition, 17% of learners rated the response time as average, indicating that this aspect of performance needs further improvement.
6. Conclusion
This paper proposed to combine the development of natural language processing technology with the evolution of Chinese language learning systems in order to promote Chinese language learning and spread excellent Chinese culture. It focused on the design and implementation of a Chinese learning system that integrates natural language processing technology. The system adopted the B/S architecture and could be accessed directly from a web browser without installing any client program. New functions could be added by simply extending the functional pages, and the system could be maintained and upgraded by updating the server-side pages. Several natural language processing techniques were introduced to enhance the system. Based on a detailed demand analysis of the Chinese learning system, this paper described the implementation of front-end functions such as user registration, login, personal information retention, content browsing, message dialogue, and natural language processing tools. In general, this paper designed and implemented a Chinese learning system that is easy to use, maintainable, secure, and extensible, guided by relatively advanced design concepts and mature technologies.
Although the system has most of the general functions required for Chinese learning, there are still some deficiencies to be improved due to the limited personal level and conditions. For example, integration with natural language processing needs further development.
Data Availability
The data used to support the findings of this study are available from the author upon request.
Conflicts of Interest
The author does not have any possible conflicts of interest.