[Retracted] Design and Application of English Grammar Error Correction System Based on Deep Learning

Hongli, Chen

doi:https://doi.org/10.1155/2021/4920461

Security and Communication Networks



This article has been retracted.

Special Issue

Massive Machine-Type Communications for Internet of Things


Research Article | Open Access

Volume 2021 | Article ID 4920461 | https://doi.org/10.1155/2021/4920461


Chen Hongli¹

Academic Editor: Jian Su

Received 12 Oct 2021

Revised 26 Oct 2021

Accepted 30 Oct 2021

Published 23 Nov 2021

Abstract

To address the low correction accuracy and long correction time of traditional English grammar error correction systems, this paper designs an English grammar error correction system based on deep learning. The design first analyzes the business requirements and functions of the system and then, based on the analysis results, defines the overall architecture, which comprises an English grammar error correction module, a service access module, and a feedback filtering module. A multilayer feedforward neural network is used to construct a language model that judges whether a language sequence is a normal sentence, so as to complete the correction of English grammatical errors. The experimental results show that the designed system corrects English grammatical errors with high accuracy and at high speed.

1. Introduction

With the rapid development of China’s society and economy, the demand for English language ability continues to grow. The number of English teachers, however, cannot keep pace with the rapidly increasing number of English learners, so computer-aided instruction (CAI) is of great significance [1]. Automatic marking of English compositions is an important application of CAI. Writing ability is a key indicator of English proficiency, and because writing mainly involves reorganizing language and expressing ideas, practice is the most important way to improve students’ command of the language [2]. However, reviewing students’ composition exercises is a heavy workload. In most cases, English teachers do not have enough energy to give each student sufficient personalized guidance, which to some extent hinders students’ progress. Correcting a large number of compositions is time-consuming and laborious, so there is an urgent need for a way to process these exercises at scale and return corrections to students promptly. Using computers to assist in composition review is a feasible approach that can effectively alleviate the shortage of English teachers [3]. It is also an important way to reduce teachers’ workload and improve teaching quality, given the large number of learners of English as a second language around the world.

Composition review comprises higher-level analysis of overall text structure and content, and lower-level detection and correction of spelling and sentence-level grammatical errors. Good commercial tools already exist for detecting and correcting spelling errors. Correcting grammatical errors in students’ English compositions, however, is tedious and difficult, and no mature solution exists at present. If a computer can quickly identify the grammatical errors in sentences and give reasonable modification suggestions in time, learners will have a better experience in English learning [4]. At present, commercial or open-source grammar checking tools are generally used. For example, the grammar checking function in Microsoft Word cannot detect common errors such as subject-predicate agreement errors and preposition misuse, and its recognition rate for other errors is low. The error detection rate of the automatic English grammar check provided by juku.com in China is also relatively low, which limits its practical value in English learning. A breakthrough on this problem would therefore greatly promote the application of computer-aided English learning.

In recent years, a number of English grammatical error correction systems have emerged at home and abroad. Literature [5] designed an English grammar error correction system based on data expansion and replication. First, a replication mechanism was introduced into the self-attention model to build a text grammar error correction model that maps an erroneous text sequence to a correct text sequence. Second, on the basis of a public data set, sequence-to-sequence learning was used to learn different forms of erroneous text from correct text. Finally, drawing on the characteristics of the written characters, a thesaurus of similar-shaped and homophonic forms was constructed, and error samples were built manually by thesaurus mapping to expand the training data. Literature [6] designed an English grammar error correction system based on the N-gram model, comprising a corpus writing module, a data preprocessing module, a grammar error labeling module, and a grammar error detection module. Grammar error features and corpus error correction methods were analyzed through machine learning and used for model training, and the data preprocessing module preprocessed the input data and removed noise to obtain high-quality data. The grammar error labeling module performed automatic segmentation and part-of-speech tagging before sentence segmentation to improve the correctness of word segmentation. The N-gram model was then established, and grammar errors were detected based on the CRF model. However, these two systems have low accuracy and a poor correction effect. Literature [7] designed an automatic written grammatical error detection system for English learners. First, three data-driven system design methods based on large-scale native-language and learner corpora were introduced; then the evaluation standards for language error detection systems were discussed; finally, suggestions for improving the accuracy of existing systems were put forward. Literature [8] designed an English grammar error correction system based on deep learning technology. It first introduced the theoretical basis of a seq2seq-based deep learning model and the corpus, then analyzed the seq2seq-based grammar error correction model, and finally presented the architecture of the grammar error correction algorithm model together with the operating framework and main principles of its core modules. These results show that the application of artificial intelligence to grammar error correction has gradually attracted the attention of researchers.

Existing research has made considerable progress, and this technology can both reduce teachers’ marking workload and support students’ autonomous learning. However, the above systems take a long time to correct English grammatical errors and have low correction efficiency. Therefore, there is an urgent need for a new system that solves these problems.

To address the problems of the above systems, this paper designs an English grammar error correction system based on deep learning and verifies its effectiveness and practicability through simulation experiments, laying a foundation for improving the quality of English teaching. Deep learning is adopted because of its good accuracy. The paper analyzes the business requirements and functions of the English grammar error correction system and then designs the overall architecture of the system according to the analysis results, including the English grammar error correction module, the service access module, and the feedback filtering module. A multilayer feedforward neural network is used to construct the language model that judges whether a language sequence is a normal sentence, so as to complete the correction of English grammatical errors. The contributions of this paper are as follows: (1) This paper proposes an English grammar error correction system based on deep learning, which helps to solve the problems of existing English grammar error correction systems. (2) This paper applies deep learning to these problems, which is a new attempt in this field.

The rest of this paper is structured as follows: Section 2 introduces the English grammar error correction system, Section 3 presents the simulation experiment analysis, and Section 4 gives the conclusion and future research directions.

2. English Grammar Error Correction System

2.1. System Business Requirement Function Analysis

The English grammar error correction system provides users with a simple, easy-to-use website on which they can enter sentences and have them corrected. The use case diagram is given in Figure 1 [9]. A user can log in, correct syntax errors, feed back modification suggestions, and view the original error correction. The user logs in with a user name and password; after a successful login, the user can enter the English sentence to be corrected and apply the system’s syntax error correction function to it.

According to the user use case diagram and the design objectives of English grammar error correction system, the main functions of English grammar error correction system are combined and analyzed below for subsequent system modeling and module division [10].

User management: it includes user login, user category, and permission management. User login: after accessing the website, the user enters the user name and password to log in. The system completes the verification of the user name and password and returns the login results as follows: the user is not registered, the user name or password is wrong, the login is successful, etc. User category and permission management: the system divides users into categories and gives them different permissions.

Syntax error correction: the system obtains the text entered by the user. If the text contains multiple sentences, it is first split into sentences and then segmented into words to obtain formatted sentences. The trained syntax error correction model corrects each sentence, and the error correction results of the individual sentences are combined and returned to the user [11].

Feedback mechanism: after correcting syntax errors and obtaining the results, if the user is not satisfied with the error correction results given by the system or has a better way to correct errors, the user can feed back his/her modification suggestions to the system, and the system will evaluate the proposed modification methods and give the results. The feedback filtering module is used to filter and obtain the adopted results and return the results to the user.

Text management: it mainly includes text storage, status modification, batch export, and other functions. Modification suggestions fed back by users are filtered by the system; those that are adopted are stored and given a status, which indicates whether they have already been exported to the error correction corpus. A batch export function is provided. After a system upgrade is triggered, the syntax error correction model needs to be retrained. Training first requires a corpus, so the latest accumulated suggestion texts are exported from the system in batch and their status is updated accordingly [12].

System upgrade trigger: rules are set to trigger system error correction, model training, model effect evaluation, model update, and other operations. The current rule is threshold-based: a system upgrade is triggered when the number of accumulated suggestion texts reaches a set threshold. A scheduled task counts the suggestion texts accumulated in the database, and if the count is greater than or equal to the threshold, the system is upgraded.
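The threshold rule described above can be sketched as follows. This is a minimal illustration, not the system's actual code: the `Suggestion` record, its `exported` flag, and the function name `should_upgrade` are hypothetical, and the in-memory list stands in for the database that the scheduled task would query.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """Hypothetical record for one adopted suggestion text."""
    text: str
    exported: bool  # True once the text has been exported to the corpus


def should_upgrade(suggestions, threshold):
    """Return True when the count of not-yet-exported suggestions
    is greater than or equal to the threshold."""
    pending = sum(1 for s in suggestions if not s.exported)
    return pending >= threshold


# Stand-in for the database table counted by the scheduled task.
store = [
    Suggestion("He goes to school.", exported=False),
    Suggestion("They are happy.", exported=False),
    Suggestion("She has a cat.", exported=True),
]
upgrade_at_2 = should_upgrade(store, threshold=2)
upgrade_at_3 = should_upgrade(store, threshold=3)
```

In a deployment, a scheduler would call a function like this periodically and start retraining when it returns true.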

Syntax error correction model update: the system upgrade needs to train the error correction model, obtain the accumulated suggestion text, and preprocess it to get the formatted text, which can be used as a new error correction corpus for model training. Conduct offline training for the error correction model. After the training, evaluate the error correction effect of the model. If the error correction effect of the new model is better than the historical model, replace the model and complete the system upgrade [13].
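The replace-only-if-better step can be illustrated with a short sketch. The held-out sentence pairs, the exact-match score, and the two toy "models" (plain string-replacement functions) are assumptions made for illustration; the paper does not specify how the error correction effect is measured.

```python
# Hypothetical held-out pairs: (erroneous sentence, reference correction).
HELD_OUT = [
    ("She have a cat.", "She has a cat."),
    ("He go home.", "He goes home."),
]


def make_evaluator(pairs):
    """Build a scoring function: fraction of sentences corrected exactly."""
    def evaluate(model):
        hits = sum(1 for source, gold in pairs if model(source) == gold)
        return hits / len(pairs)
    return evaluate


def select_model(current, candidate, evaluate):
    """Keep the candidate only if it scores strictly higher than the current model."""
    current_score = evaluate(current)
    candidate_score = evaluate(candidate)
    if candidate_score > current_score:
        return candidate, candidate_score
    return current, current_score


def old_model(sentence):
    return sentence.replace("have", "has")


def new_model(sentence):
    return sentence.replace("have", "has").replace("go ", "goes ")


evaluate = make_evaluator(HELD_OUT)
chosen, score = select_model(old_model, new_model, evaluate)
```

Requiring a strict improvement means a retrained model that merely ties the historical model does not replace it, which avoids needless model churn.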

Modification suggestion filtering: because users’ English levels vary, submitted suggestions may not be correct. The suggested text therefore needs to be filtered to screen out high-quality text and improve the quality of the corpus. The trained feedback filtering model is used to filter the text and return the results [14].

2.2. Overall System Architecture

According to the results of business requirements function analysis, the English grammar error correction system is modeled and divided into modules. In order to realize loose coupling, easy expansion, and reusability between modules, the system is divided into modules according to the business boundary. The architecture of English grammar error correction system is shown in Figure 2.

According to Figure 2, the English grammar error correction system includes English grammar error correction module, service access module, and feedback filtering module.

2.3. System Function Design

2.3.1. English Grammar Error Correction Module

The syntax error correction module is the core module of the system. It has three main functions: data processing, model training, and model error correction, of which model error correction is the core function [15]. Data processing cleans and screens the original corpus, extracts effective text, and structures it to obtain regular text for later use. Model training implements the grammar error correction algorithm, trains the model and evaluates its error correction effect on the error correction corpus, and saves the trained model for testing and formal use. Model error correction uses the trained error correction model to correct sentences and outputs sentences that do not contain grammatical errors. The module provides two Thrift service interfaces: model training and model error correction. The former receives requests for model training and retrains and evaluates the algorithm model. The latter corrects syntax errors in sentences and returns the error correction results.

The syntax error correction workflow is as follows. When an interface request is received, the parameters are verified first. If the parameters are invalid, the call ends directly; if they are valid, processing continues. The module then checks whether the request contains multiple sentences. If so, the text is split into sentences, and each sentence is corrected using the previously trained error correction model. When the last sentence has been corrected, the error correction results are combined and returned, and the call ends. If the request contains a single sentence, the error correction model is applied directly without sentence splitting. The whole workflow is shown in Figure 3.
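The workflow above can be sketched in a few lines. Everything here is illustrative: the regular-expression sentence splitter is deliberately naive, `toy_model` is a hypothetical stand-in for the trained error correction model, and returning `None` for invalid parameters is an assumed convention.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def correct_text(text, correct_sentence):
    """Validate the request, split it if needed, correct each sentence, recombine."""
    # Parameter verification: invalid input ends the call directly.
    if not isinstance(text, str) or not text.strip():
        return None
    # A single sentence yields a one-element list, so no special case is needed.
    sentences = split_sentences(text)
    corrected = [correct_sentence(s) for s in sentences]
    return " ".join(corrected)


def toy_model(sentence):
    """Hypothetical stand-in for the trained error correction model."""
    return sentence.replace("She have", "She has").replace("He go ", "He goes ")


result = correct_text("She have a cat. He go home.", toy_model)
single = correct_text("She have a cat.", toy_model)
invalid = correct_text("   ", toy_model)
```

Passing the model in as a function keeps the request handling independent of how the model is trained or served.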

2.3.2. Service Access Module

The service access module is the back-end management component of the system and consists of the error correction service, feedback mechanism, user management, text management, and system upgrade. The error correction service accepts the user’s syntax error correction request, calls the model error correction interface of the syntax error correction module to perform the actual correction, and returns the results to the user, completing the main function of the system. The feedback mechanism accepts the modification suggestions submitted by users and filters them through the service interface provided by the feedback filtering module; according to the filtering result, the user is told whether the suggestion has been adopted. User management implements user login, category differentiation, and permission verification: it verifies the user name and password entered by the user, returns the login result, and controls the permission to submit modification suggestions, which only advanced users have. Text management provides storage, status modification, batch export, and other functions for suggestion texts. System upgrade: when the cumulative number of suggestion texts reaches the set threshold, retraining of the syntax error correction model is triggered, which improves the self-learning ability of the system.

2.3.3. Feedback Filtering Module

The module is responsible for implementing the feedback filtering algorithm and providing feedback filtering services, including corpus processing, model training, and feedback filtering. Corpus processing extracts text from the corpus and performs sentence segmentation and word segmentation. Model training trains the feedback filtering model and evaluates its effectiveness. Feedback filtering provides a Thrift service interface that filters the suggested text submitted by users and returns whether or not it is adopted. The feedback filtering process is shown in Figure 4.

First, the request parameters are verified. If the parameters are invalid, the call ends directly. If the parameters are valid, the feedback filtering model is loaded to calculate the probability of the suggested sentence and of the corrected sentence originally produced by the system. If the probability of the suggested sentence is greater than that of the system’s sentence, the suggested sentence is more likely to be correct, so it is adopted; otherwise it is not adopted.
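The adoption decision reduces to comparing two sentence scores. The sketch below assumes a scoring function that returns a log-probability; `toy_log_prob`, its hand-written list of unlikely bigrams, and the probability values 0.01 and 0.2 are invented for illustration and are not the trained feedback filtering model.

```python
import math

# Hypothetical list of bigrams that the toy scorer treats as unlikely.
UNLIKELY_BIGRAMS = {("she", "have"), ("he", "go")}


def toy_log_prob(sentence):
    """Toy sentence score: sum of log-probabilities over adjacent word pairs."""
    words = sentence.lower().rstrip(".").split()
    score = 0.0
    for prev, cur in zip(words, words[1:]):
        p = 0.01 if (prev, cur) in UNLIKELY_BIGRAMS else 0.2
        score += math.log(p)
    return score


def filter_suggestion(system_output, suggestion, sentence_log_prob):
    """Adopt the suggestion only if it scores higher than the system's sentence."""
    # Parameter verification: invalid input ends the call directly.
    if not system_output or not suggestion:
        return None
    return sentence_log_prob(suggestion) > sentence_log_prob(system_output)


adopted = filter_suggestion("She have a cat.", "She has a cat.", toy_log_prob)
rejected = filter_suggestion("She has a cat.", "She have a cat.", toy_log_prob)
```

Working in log-probabilities avoids numerical underflow on long sentences. Because longer sentences accumulate more terms, a practical filter might also normalize the score by sentence length; that refinement is an assumption and is not described in the source.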

2.4. Language Model Based on Deep Learning

Deep learning is a multilevel method of automatic feature learning in which the input of each layer comes from the previous layer and each layer applies a nonlinear transformation. Starting from the first layer, features at different levels are learned from the input data through successive nonlinear transformations. Unlike traditional machine learning, deep learning can automatically extract very complex features. It is a data-driven learning method.

The multilayer feedforward neural network is the simplest network model in deep learning. Each layer contains several neurons. Each neuron is connected only to the neurons of the previous layer, receives the output of the previous layer, and passes its own output to the next layer. The training of a feedforward neural network model includes two main steps: (1) The input data is passed through forward propagation to produce an output, and the loss value is calculated by comparing this output with the true output. (2) Using the stochastic gradient descent algorithm, the gradient is back-propagated layer by layer, and the parameters are updated. The forward propagation formula is as follows:

$$y = f(Wx + b) \quad (1)$$

where the input of the current layer is $x$, the output is $y$, and the parameters are the weight matrix $W$ and the offset vector $b$. The function $f$ in (1) is generally a nonlinear activation function.

The essence of back propagation is the repeated application of the chain rule. Starting from the last layer, the derivatives of the error given by the loss function with respect to the parameters are computed layer by layer, and the gradient is propagated backward. Stochastic gradient descent (SGD), Adam, and other optimization methods are used to update the parameters of the neural network.
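The two training steps (forward propagation with a loss, then back propagation with a gradient update) can be sketched in plain Python. The network size, the tanh activation, the squared-error loss, the learning rate, and the single training example are all assumptions chosen to keep the example small; they are not the configuration used in the paper.

```python
import math
import random


def forward(x, W1, b1, W2, b2):
    """Forward propagation: hidden layer h = tanh(W1 x + b1), output y = W2 . h + b2."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sum(w * hi for w, hi in zip(W2, h)) + b2
    return h, y


def sgd_step(x, target, W1, b1, W2, b2, lr=0.1):
    """One SGD step on a single example with squared-error loss."""
    h, y = forward(x, W1, b1, W2, b2)
    loss = 0.5 * (y - target) ** 2
    dy = y - target                                   # dL/dy
    dh = [dy * w for w in W2]                         # dL/dh, using the old W2
    # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
    dz = [dhi * (1.0 - hi * hi) for dhi, hi in zip(dh, h)]
    new_W2 = [w - lr * dy * hi for w, hi in zip(W2, h)]
    new_b2 = b2 - lr * dy
    new_W1 = [[w - lr * dzi * xi for w, xi in zip(row, x)]
              for row, dzi in zip(W1, dz)]
    new_b1 = [b - lr * dzi for b, dzi in zip(b1, dz)]
    return loss, new_W1, new_b1, new_W2, new_b2


random.seed(0)
W1 = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(3)]
b1 = [0.0, 0.0, 0.0]
W2 = [random.uniform(-0.5, 0.5) for _ in range(3)]
b2 = 0.0

x, target = [1.0, -1.0], 2.0
losses = []
for _ in range(50):
    loss, W1, b1, W2, b2 = sgd_step(x, target, W1, b1, W2, b2)
    losses.append(loss)
```

Each hidden unit applies the layer formula from the text, and the backward pass applies the chain rule from the output layer back to the first layer. The recorded losses shrink as the updates are repeated.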

In this paper, the language model is used to assess the correctness of a sentence; that is, a statistical language model is obtained through corpus training, and the sentence is then “scored” to judge whether a language sequence is a normal sentence. A multilayer feedforward neural network is used to construct the language model. Statistical language models are usually used to estimate the probability of sentences. Suppose a sentence $S = w_1 w_2 \cdots w_n$; the probability of the sentence can be calculated by the following formula:

$$P(S) = P(w_1)\,P(w_2 \mid w_1) \cdots P(w_n \mid w_1, \ldots, w_{n-1}) = \prod_{i=1}^{n} P(w_i \mid w_1, \ldots, w_{i-1}) \quad (2)$$

To calculate the probability of a sentence, a total of n conditional probabilities need to be estimated. As the sentence length increases, the history on which each subsequent word is conditioned grows, and more and more parameters are needed. In practice, there is no effective way to calculate the probability of long sentences accurately, because a great deal of training data is required to estimate these parameters reasonably. Moreover, as the sequence length increases, data sparsity arises, and many sequences do not appear in the training data at all. By introducing independence assumptions, this probability can be calculated more tractably. If the probability of a word is assumed to depend only on a finite number of preceding words, for example only the previous word, the result is the bigram language model. If it is assumed to depend only on the previous two words, the trigram language model is obtained. To unify the calculation, start and end markers are added at the beginning and end of the sentence, and the probability of the sentence is calculated as follows:

$$P(S) = \prod_{i=1}^{n+1} P(w_i \mid w_{i-2}, w_{i-1}), \qquad P(w_i \mid w_{i-2}, w_{i-1}) = \frac{\mathrm{count}(w_{i-2}, w_{i-1}, w_i)}{\mathrm{count}(w_{i-2}, w_{i-1})} \quad (3)$$

where $\mathrm{count}(w_{i-2}, w_{i-1}, w_i)$ represents the number of times the triple appears in the training corpus, and $\mathrm{count}(w_{i-2}, w_{i-1})$ represents the number of times the pair appears; $w_{-1}$ and $w_0$ denote the start markers and $w_{n+1}$ the end marker. Because some tuples never appear in the training corpus, which makes the estimates generalize poorly, smoothing is also needed.
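A count-based trigram model with start and end markers can be sketched as follows. The two-sentence corpus is invented, and add-one (Laplace) smoothing is used as one simple choice; the source says smoothing is needed but does not name a method.

```python
from collections import Counter

START, END = "<s>", "</s>"


def pad(tokens):
    """Add two start markers and one end marker around the sentence."""
    return [START, START] + tokens + [END]


def train_trigram(corpus):
    """Count trigrams and their two-word histories over a tokenized corpus."""
    tri, hist, vocab = Counter(), Counter(), set()
    for tokens in corpus:
        padded = pad(tokens)
        vocab.update(tokens)
        vocab.add(END)  # the end marker is also a predictable symbol
        for i in range(2, len(padded)):
            tri[tuple(padded[i - 2:i + 1])] += 1
            hist[tuple(padded[i - 2:i])] += 1
    return tri, hist, vocab


def sentence_prob(tokens, tri, hist, vocab):
    """Trigram sentence probability with add-one smoothing."""
    padded = pad(tokens)
    prob = 1.0
    for i in range(2, len(padded)):
        numerator = tri[tuple(padded[i - 2:i + 1])] + 1
        denominator = hist[tuple(padded[i - 2:i])] + len(vocab)
        prob *= numerator / denominator
    return prob


corpus = [["she", "has", "a", "cat"], ["he", "has", "a", "dog"]]
tri, hist, vocab = train_trigram(corpus)
p_good = sentence_prob(["she", "has", "a", "cat"], tri, hist, vocab)
p_bad = sentence_prob(["she", "have", "a", "cat"], tri, hist, vocab)
```

The sentence containing the unseen trigrams still receives a nonzero probability because of smoothing, but a lower one than the sentence seen in training, which is the property the scoring step relies on.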

Language model based on multilayer feedforward neural network is as follows:where represents the embedding matrix and represents the one hot vector.
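One common way to build a feedforward neural language model from an embedding matrix and one-hot word vectors is sketched below, in the style of a fixed-window model. The vocabulary, the two-word context, the layer sizes, the tanh hidden layer, the softmax output, and the random untrained weights are all assumptions for illustration and should not be read as the paper's model.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def one_hot(index, size):
    """One-hot vector with a 1.0 at the given index."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def next_word_distribution(context_ids, E, W1, b1, W2, b2):
    """Distribution over the next word given the ids of the context words."""
    vocab_size = len(E[0])
    x = []
    for idx in context_ids:
        # Multiplying E by a one-hot vector selects that word's embedding column.
        x.extend(matvec(E, one_hot(idx, vocab_size)))
    h = [math.tanh(a + b) for a, b in zip(matvec(W1, x), b1)]
    logits = [a + b for a, b in zip(matvec(W2, h), b2)]
    return softmax(logits)


def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
vocab = ["<s>", "she", "has", "a", "cat", "</s>"]
V, d, hidden, context = len(vocab), 3, 4, 2

E = rand_matrix(d, V)                  # embedding matrix, one column per word
W1 = rand_matrix(hidden, context * d)  # hidden layer weights
b1 = [0.0] * hidden
W2 = rand_matrix(V, hidden)            # output layer weights
b2 = [0.0] * V

probs = next_word_distribution([vocab.index("she"), vocab.index("has")],
                               E, W1, b1, W2, b2)
embedding_of_has = matvec(E, one_hot(vocab.index("has"), V))
```

With trained weights, multiplying the predicted probabilities along a sentence gives the sentence score that the language model uses to judge whether a sequence is a normal sentence.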

3. Simulation Experiment Analysis

In order to verify the effectiveness of the English grammar error correction system based on deep learning in practical application, a simulation experiment is carried out. The English grammar error correction interface is shown in Figure 5.

The experimental environment configuration is shown in Table 1.

This study selects the Spoken Arabic Digital Data Set as the experimental data set, which contains a large amount of English grammar data. In order to reduce the difficulty of the experiment, 13500 data instances are randomly selected from the Spoken Arabic Digital Data Set with a sampling rate of 16 KHz. The specific experimental data information is shown in Table 2.

At the same time, the DSP chip of the design system is set as 16 bit coding, and the English grammar pre-emphasis filter function is .

In order to objectively evaluate the application performance of the designed system, the experimental evaluation indexes are English grammar error detection accuracy and English grammar correction quality score. The calculation formula of English grammar error detection accuracy isIn (5), represents the correct number of English grammatical errors, represents the correct number of correct English grammar judgments, represents the number of English grammatical errors judged to be correct, represents the number of English grammatical error types and judgment errors, represents the number of correct English grammar instances judged as English grammar errors, indicates the number of correct English grammar type judgment errors.

The corrected English grammar quality score is calculated as follows:In (6), represents the score result of corrected English grammar quality, and , , , and represent the results of grammar evaluation.

Based on the above prepared experimental data and the determined experimental evaluation indicators, the English grammar error correction system based on deep learning designed in this paper, the English grammar error correction system based on data expansion and replication, and the English grammar error correction system based on N-gram model are used to correct English grammar errors and compare the accuracy of English grammar error detection. The analysis process of specific experimental results is shown in Figure 6.

According to Figure 6, the deep-learning-based English grammar error correction system designed in this paper reaches a maximum English grammar error detection accuracy of 100%, whereas the system based on data expansion and replication reaches a maximum of only 70% and the system based on the N-gram model a maximum of only 90%. The designed system therefore corrects English grammatical errors more accurately than both comparison systems.

In order to further verify the effectiveness of this method, the English grammar error correction system based on deep learning designed in this paper, the English grammar error correction system based on data expansion and replication, and the English grammar error correction system based on N-gram model are used to correct English grammar errors, and the English grammar quality scores are compared and corrected. The comparison results are shown in Table 3.

As shown in Table 3, the corrected English grammar quality score is 9.00–9.88 for the deep-learning-based system designed in this paper, 8.00–8.65 for the system based on data expansion and replication, and 7.05–7.85 for the system based on the N-gram model. The designed system thus achieves the highest corrected English grammar quality score, which indicates a better error correction effect.

In order to further verify the effectiveness of this method, the English grammar error correction system based on deep learning designed in this paper, the English grammar error correction system based on data expansion and replication, and the English grammar error correction system based on N-gram model are used to compare and analyze the English grammar error correction time. The comparison results are shown in Figure 7.

Figure 7 compares the English grammar error correction time of the systems based on deep learning, the N-gram model, and data expansion and replication. The correction time of the deep-learning-based system is around 20 s, which is shorter than that of both comparison systems. In terms of correction time, the deep learning system therefore performs better than the other two methods.

In order to verify the effectiveness of this method, the English grammar error correction system based on deep learning designed in this paper and the English grammar error correction system based on data expansion and replication are used to correct English grammar errors, and the correction results are compared with the actual test results. The comparison results are shown in Figure 8.

Figure 8 compares the English grammar error correction results of the data expansion and replication method and the deep learning method with the actual test results. The results of the deep-learning-based system designed in this paper are consistent with the actual test results, while the results of the system based on data expansion and replication differ considerably from them. This shows that the designed system has a good effect on English grammar error correction.

4. Conclusion

Grammar learning and practice are a very important part of English learning. English writing is an effective way to test and improve English grammar, so English learners do a great deal of writing practice to raise their level. Teachers, however, find it difficult to devote enough energy to marking all of these compositions. At present, there is a relative shortage of English teachers in China, so the marking of English examination papers needs to become more efficient and to be automated to some degree. The most common problem in composition correction is grammatical errors. If teachers can be shown the possible grammatical errors in advance, their efficiency in correcting compositions will improve greatly. Whether for the daily study of English learners, the teaching work of teachers, or correction tasks, there is an urgent need for an intelligent system that corrects English grammar errors. On this basis, this paper designs an English grammar error correction system based on deep learning, which can both reduce the burden on teachers and give English learners fast feedback, encouraging students’ autonomous learning.

In this paper, we proposed a new strategy of English grammar error correction system based on deep learning, and the experimental results show that the proposed method is effective. It makes full use of the advantages of deep learning, so as to improve the efficiency of the algorithm proposed in this paper. However, due to the high complexity of deep learning methods, further research is needed in the future to reduce the complexity of the algorithm and further improve the performance of the algorithm.

Data Availability

The data used to support the findings of this study are available from the author upon request.

Conflicts of Interest

The author declares that he has no conflicts of interest.

References

[1] C. Park, Y. Yang, C. Lee, and H. Lim, “Comparison of the evaluation metrics for neural grammatical error correction with overcorrection,” IEEE Access, vol. 8, no. 8, pp. 106264–106272, 2020.
[2] S. Li, J. Zhao, G. Shi et al., “Chinese grammatical error correction based on convolutional sequence to sequence model,” IEEE Access, vol. 7, no. 99, pp. 72905–72913, 2019.
[3] Z. Qiu and Y. Qu, “A two-stage model for Chinese grammatical error correction,” IEEE Access, vol. 7, no. 7, pp. 146772–146777, 2019.
[4] Y. Tan, Y. Yang, and L. Yang, “Automatic correction of syntax errors in ESL articles based on LSTM and n-gram,” Chinese Journal of information, vol. 32, no. 6, pp. 24–32, 2018.
[5] Q. Wang and Y. Tan, “Chinese grammar error correction method based on data expansion and replication,” Journal of Intelligent Systems, vol. 15, no. 1, pp. 99–106, 2020.
[6] H. Wang and J. Zhou, “Research and implementation of Chinese grammar automatic error correction system,” Enterprise technology and development, vol. 460, no. 2, pp. 89–92+95, 2020.
[7] L. Liu and M. Liang, “A survey of research on automatic detection of English learners’ written grammatical errors,” Chinese Journal of information, vol. 32, no. 1, pp. 1–8, 2018.
[8] Y. Jing, “Construction and analysis of syntax error correction algorithm model based on deep learning technology,” Information Technology, vol. 44, no. 9, pp. 151–155+160, 2020.
[9] S. W. Cho, H.-s. Kwon, H.-y. Jung, and J.-H. Lee, “Adoption of a neural language model in an encoder for encoder-decoder based Korean grammatical error correction,” KIISE Transactions on Computing Practices, vol. 24, no. 6, pp. 301–306, 2018.
[10] J.-H. Lee, M. Kim, and H.-C. Kwon, “Deep learning-based context-sensitive spelling typing error correction,” IEEE Access, vol. 8, no. 8, pp. 152565–152578, 2020.
[11] Y. Gao, “Discussion about the side effect of error corrective feedback and possible alternatives to error correction,” Overseas English, vol. 384, no. 20, pp. 254-255, 2018.
[12] Li Guo, “Design of mobile terminal grammar learning system based on Android platform,” Electronic Design Engineering, vol. 28, no. 22, pp. 31–34+39, 2020.
[13] W. Yinxia and L. Yang, “A systemic functional grammar study on the embedding of prominent tendency features in evaluative “V de C” clauses,” Foreign Language, vol. 43, no. 1, pp. 23–33, 2020.
[14] X. Pang, “Design and research of English grammar mobile learning system based on Android platform,” Electronic design engineering, vol. 26, no. 15, pp. 46–49, 2018.
[15] C. Gong and J. Wang, “Review of automatic grammar checking methods,” Computer science and Applications, vol. 8, no. 9, p. 10, 2018.

Copyright

Copyright © 2021 Chen Hongli. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
