Special Issue: Advanced Pattern Recognition Systems for Multimedia Data
English-Assisted Teaching Evaluation System Based on Artificial Intelligence and Rasch Model
To improve the effect of English teaching in colleges and universities, this paper analyzes the English teaching process and combines the English teaching evaluation system with the Rasch model to construct an English-assisted teaching evaluation system. It analyzes the needs of English teaching evaluation and uses the mixed Rasch model from item response theory to study the latent classification problem in teaching. The optimal latent class proportions are found together with the optimal parameter estimates and are used as the basis for classifying English teaching grades. In addition, the paper analyzes the framework and workflow of the system. The experimental results show that the proposed Rasch-model-based evaluation system for the college English teaching process performs well.
1. Introduction
With the in-depth development of the new curriculum reform, all sectors of society are paying more attention to classroom efficiency. The concept of the “efficient classroom” is an innovation of the new curriculum and has had a profound impact on the development of education in our country. Only high-quality English teaching can cultivate the talent that social development needs. Since the new curriculum reform, English teaching concepts, teaching methods, and the content and framework of teaching materials have all undergone major changes. During the implementation of the new curriculum, textbook content has increased while English teaching hours have been reduced, so an efficient classroom has become a necessity. Within limited class time, how teachers can lead students to learn independently and achieve an efficient classroom is therefore an urgent problem in basic education, and how to evaluate the effectiveness of English classroom teaching reasonably has become a focus of education researchers. However, current educational measurement is inevitably subjective. Different evaluators understand and apply the scoring standards differently, and even the same evaluator may not be equally lenient or strict with different classes. Courses also differ in difficulty, and all of these factors affect the evaluation. Therefore, how to exclude the influence of raters’ subjective factors from the scoring process is a question that education researchers need to address.
Higher education English teaching activities focus on students’ role as the subject of learning and on their learning needs. Therefore, those who manage English teaching in colleges and universities have been considering how to organize teaching to meet students’ expectations. To train students well, teachers need to work closely with them, teach according to their needs, and keep adjusting their methods. Authentic student evaluations of teaching help teachers understand students’ current learning status and adjust their teaching methods accordingly, thereby improving their teaching ability. They also help schools understand the current state of English teaching, formulate reasonable teaching management plans, and make sound educational decisions. Therefore, this research has both theoretical and practical significance.
Using the same data, the study compares the Rasch model, which does not consider the rater effect under repeated sampling conditions, with the many-facet Rasch model, which does. The study then selects the appropriate measurement model to estimate the parameters of the assessment test.
For teachers, the newly established teaching evaluation indicators feed back more authentic evaluation information, which improves the validity of students’ evaluations of teaching and helps teachers raise their own English teaching level. Students are the direct experiencers of English teaching and can reflect teachers’ teaching more accurately and effectively to English teaching managers. For English teaching managers, the evaluation results fed back by the new indicators are more authentic and reliable and provide a better reference for teachers’ performance evaluation. If the new teaching evaluation index system proves stable and effective at one university and is recognized by teachers and students, the relevant experience and practices can be shared with more universities for reference.
This paper studies the English-assisted teaching evaluation system based on the Rasch model, evaluates and analyzes English teaching to improve the effect of English teaching, which provides a reference for subsequent improvement of English teaching.
The paper is organized as follows. The first part analyzes the current situation of English teaching activities in higher education. The second part reviews the literature and describes the shortcomings of current English teaching evaluation, which leads to the research content of this paper. The third part applies the Rasch model to English-assisted teaching evaluation. When the parameters of the mixed Rasch model are estimated, the number of latent classes is unknown; one Bayesian approach for mixture models is to treat the number of latent classes C as an unknown parameter with a prior distribution, which improves data processing and analysis for English-assisted teaching evaluation. The fourth part combines the Rasch model to construct the English-assisted teaching evaluation system, analyzes the system framework and workflow, and shows that the evaluation system performs well and meets the actual needs of current college English teaching evaluation. Finally, the paper is summarized and future work is discussed.
2. Related Work
Whether the Rasch model can yield objective measurement depends on whether the data are collected in accordance with the prior requirements of the model. When verifying the fit between empirical data and the model, the fit indices provided by the Rasch model can be used to test whether the measurement requirements are met. Normally, the Rasch model is first used to obtain the residuals, and the fit between the data and the model is then estimated from them. The degree to which a participant’s responses match the model is inversely related to the size of the residuals. For a given item, if students of low ability answer it correctly, the item does not fit the model well; checking this is the so-called item fit analysis. If a respondent answers easy items incorrectly more often than expected, the respondent’s responses do not match the model’s expectations; checking this is person fit analysis. Generally speaking, the outfit and infit mean squares and their standardized forms, also called t statistics, are the common fit statistics. The mean square fit statistic ranges from 0 to positive infinity, has an ideal value of 1, and is based on a chi-square statistic. According to the literature, a mean square below 1 means that the data fit better than the model expects, which is overfit; a value above 1 means that the fit between the two is not ideal.
One study found that 0.5–1.5 is the best range for outfit MNSQ and infit MNSQ. There are other views, however: another study suggests that narrowing the range to 0.8–1.4 is most appropriate.
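To make the fit statistics discussed above concrete, the following minimal sketch computes the outfit and infit mean squares of a single item under the dichotomous Rasch model and checks them against the 0.5–1.5 range. The function names are hypothetical, and the ability and difficulty values are assumed to be known; in practice they are estimated by dedicated software such as FACETS.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_fit(responses, thetas, b):
    """Return (outfit_mnsq, infit_mnsq) for one item.

    responses: 0/1 scores of each person on the item
    thetas:    ability of each person (assumed known for this sketch)
    b:         difficulty of the item (assumed known for this sketch)
    """
    sq_residuals = []  # squared raw residuals (x - p)^2
    variances = []     # model variances p * (1 - p)
    for x, theta in zip(responses, thetas):
        p = rasch_prob(theta, b)
        sq_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: unweighted mean of the squared standardized residuals.
    outfit = sum(r / v for r, v in zip(sq_residuals, variances)) / len(responses)
    # Infit: information-weighted mean square.
    infit = sum(sq_residuals) / sum(variances)
    return outfit, infit

def within_range(mnsq, low=0.5, high=1.5):
    """Check a mean square against an acceptance range (0.5-1.5 by default)."""
    return low <= mnsq <= high
```

Outfit is sensitive to unexpected responses from persons far from the item difficulty, whereas infit weights each response by its model variance, so a single surprising answer from a very able person inflates outfit much more than infit.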
One study examines the results of the Knox Cube Test, which measures short-term memory, using 0/1 scoring to examine two facets, the test takers and the test items; another uses the two-facet Rasch model to examine children’s attitudes toward science-related things (two facets: test takers and test questions).
One study showed that differences among items and among raters may bias the scores that examinees obtain, and that the rating scale also affects raters’ subjective scores. Another uses the many-facet Rasch model (MFRM) to test the oral proficiency of candidates who hold discussions in a second language. The results show that the MFRM can distinguish candidates’ ability levels, but only 2 to 3 levels effectively. A further study uses the MFRM to investigate rater effects in the scoring process, covering leniency/severity, central tendency, randomness, halo, and differential leniency/severity. By expanding rater behavior in subjective scoring into these five effects, it surveys raters in more detail and more comprehensively. Another study uses the MFRM to analyze the scoring results of an English test. It examined four facets (examinees, raters, items, and rating scales), and the results showed that the MFRM can statistically distinguish examinees’ English proficiency. In addition, the study tried to adjust candidates’ scores using the Facets software. Other researchers believe that the MFRM provides statistical evidence for the construct validity of a test. Another study uses the MFRM to examine raters’ scoring in the writing and speaking sections of a German test from four facets: rater, examinee, scoring criteria, and communication tasks. It also tested whether raters scored candidates of different genders differently.
The results show that although raters differ markedly in leniency and severity during subjective scoring, each rater is highly self-consistent. One study holds that the many-facet Rasch model can make test takers’ scores more objective and fair by identifying raters’ subjective scoring errors. Another uses the many-facet Rasch model to investigate scoring errors caused by the halo effect in the subjective scoring of three types of raters. Using a six-level rating scale and FACETS version 3.68.0, it analyzed the scores given by 194 raters in an English test. The results showed no sign of a halo effect at the group level, but individual raters of each type showed halo-related scoring errors that affected test takers’ scores to some degree. A further study uses FACETS version 3.68.1 to study the central tendency effect across three assessment types: self-assessment, peer assessment, and teacher assessment. The results showed that, at both the group and the individual level, none of the three assessment types showed signs of scores converging.
3. Evaluation of English-Aided Teaching Based on the Rasch Model
As a latent trait model, the Rasch model uses the scores that subjects obtain on test items to measure unobservable, latent traits, such as learning attitude, learning ability, personal values, or interests. According to the basic principle of the Rasch model, the probability that a particular subject gives a particular response to a particular item can be expressed by a simple function of the subject’s ability and the item’s difficulty:
$$P\left(X_{ji}=1\mid\theta_j,b_i\right)=\frac{\exp\left(\theta_j-b_i\right)}{1+\exp\left(\theta_j-b_i\right)} \tag{1}$$
Here, $X_{ji}$ is the score of subject $j$ on item $i$, $\theta_j$ is the ability of subject $j$, and $b_i$ is the difficulty of item $i$.
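As a minimal sketch of the Rasch response function described above, the code below evaluates the probability of a correct answer from an ability and a difficulty value and simulates a small 0/1 response matrix. The names `theta`, `b`, `rasch_prob`, and `simulate_responses` are assumptions made for illustration and do not come from the paper.

```python
import math
import random

def rasch_prob(theta, b):
    """P(X = 1) for a subject with ability theta on an item with difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def simulate_responses(thetas, difficulties, seed=0):
    """Simulate a subjects-by-items matrix of 0/1 scores under the Rasch model."""
    rng = random.Random(seed)
    return [
        [1 if rng.random() < rasch_prob(theta, b) else 0 for b in difficulties]
        for theta in thetas
    ]

if __name__ == "__main__":
    # Ability equal to difficulty gives a 50% chance of a correct answer.
    print(rasch_prob(0.0, 0.0))
    print(simulate_responses([-1.0, 0.0, 1.0], [-0.5, 0.0, 0.5]))
```

The probability depends only on the difference between ability and difficulty, which is what allows persons and items to be placed on one common scale.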
The mixed Rasch model (MRM), which combines the Rasch model with latent class analysis (LCA), is one of the most widely studied and applied unidimensional mixture IRT models. The probability of a correct answer is
$$P\left(X_{ji}=1\right)=\sum_{c=1}^{C}\pi_c\,P_{jic},\qquad P_{jic}=\frac{\exp\left(\theta_{jc}-b_{ic}\right)}{1+\exp\left(\theta_{jc}-b_{ic}\right)} \tag{2}$$
Here, $c$ indexes the latent class to which the subject belongs, and $\pi_c$ is the size of the $c$-th latent class, also called the mixing proportion, which satisfies $0<\pi_c<1$ and $\sum_{c=1}^{C}\pi_c=1$. $\theta_{jc}$ is the ability of subject $j$ in the $c$-th latent class, $b_{ic}$ is the difficulty of item $i$ in the $c$-th latent class, and $P_{jic}$ is the probability that subject $j$ in class $c$ scores 1 on item $i$.
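The mixture structure described above can be sketched as follows: the marginal probability of a correct answer is a weighted sum over latent classes, and Bayes' rule gives the posterior probability that a subject belongs to each class given a response pattern. All function and parameter names here are hypothetical, and the class-specific abilities and difficulties are assumed to be known.

```python
import math

def rasch_prob(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def mixed_rasch_prob(pis, thetas_by_class, b_by_class):
    """Marginal P(X = 1) on one item: sum over classes of pi_c * P(X = 1 | c)."""
    if abs(sum(pis) - 1.0) > 1e-9:
        raise ValueError("mixing proportions must sum to 1")
    return sum(
        pi * rasch_prob(theta, b)
        for pi, theta, b in zip(pis, thetas_by_class, b_by_class)
    )

def class_posterior(pis, thetas_by_class, difficulties_by_class, responses):
    """Posterior P(class c | responses) for one subject.

    difficulties_by_class[c][i] is the difficulty of item i in class c.
    """
    weights = []
    for pi, theta, difficulties in zip(pis, thetas_by_class, difficulties_by_class):
        likelihood = 1.0
        for x, b in zip(responses, difficulties):
            p = rasch_prob(theta, b)
            likelihood *= p if x == 1 else 1.0 - p
        weights.append(pi * likelihood)
    total = sum(weights)
    return [w / total for w in weights]
```

For example, with two equally sized classes in which the item difficulties are reversed, a subject who answers the first item correctly and the second incorrectly is assigned mostly to the class for which the first item is easy.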
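This paper uses both the AIC and BIC criteria to compare candidate models and select the optimal number of classes. The likelihood function of the parameters to be estimated is
$$L\left(\Omega^{(t)}\right)=\prod_{j=1}^{n}\prod_{c=1}^{C}\left[\prod_{i=1}^{I}P_{jic}^{\,x_{ji}}\left(1-P_{jic}\right)^{1-x_{ji}}\right]^{z_{jc}^{(t)}} \tag{3}$$
In the formula, $\Omega^{(t)}$ is the vector of parameters at the $t$-th iteration, $x_{ji}$ is the score of subject $j$ on item $i$, $P_{jic}$ is the probability that subject $j$ in class $c$ scores 1 on item $i$, and $z_{jc}^{(t)}=1$ means that subject $j$ belongs to class $c$ in the $t$-th iteration; otherwise, $z_{jc}^{(t)}=0$.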
Because the value of the likelihood may differ from one sampling iteration to the next, AIC and BIC are defined at each iteration as follows:
$$\mathrm{AIC}^{(t)}=-2\ln L^{(t)}+2p,\qquad \mathrm{BIC}^{(t)}=-2\ln L^{(t)}+p\ln n \tag{4}$$
In the formula, $L^{(t)}$ is the likelihood at the $t$-th iteration, $p$ is the number of parameters to be estimated, and $n$ is the number of subjects. The model selection strategy in this paper is to run candidate models with different numbers of classes in parallel and then accumulate information over the iterations to provide a probability, from which AIC and BIC can select a specific model.
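The comparison of candidate models by AIC and BIC can be sketched as below, using the standard definitions AIC = -2 ln L + 2p and BIC = -2 ln L + p ln n. The function names and the log-likelihood values in the usage example are made up for illustration; they are not results from the paper.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 ln L + 2p."""
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_subjects):
    """Bayesian information criterion: -2 ln L + p ln n."""
    return -2.0 * log_likelihood + n_params * math.log(n_subjects)

def select_model(candidates, n_subjects):
    """Pick the number of classes preferred by each criterion.

    candidates maps a number of latent classes to a tuple
    (log_likelihood, n_params); smaller criterion values are better.
    Returns (best_by_aic, best_by_bic).
    """
    aic_scores = {k: aic(ll, p) for k, (ll, p) in candidates.items()}
    bic_scores = {k: bic(ll, p, n_subjects) for k, (ll, p) in candidates.items()}
    return min(aic_scores, key=aic_scores.get), min(bic_scores, key=bic_scores.get)

if __name__ == "__main__":
    # Hypothetical log-likelihoods for 1-, 2-, and 3-class models.
    candidates = {1: (-520.0, 10), 2: (-480.0, 21), 3: (-478.0, 32)}
    print(select_model(candidates, n_subjects=200))
```

BIC penalizes additional parameters more heavily than AIC once the sample has more than about seven subjects (ln n > 2), so the two criteria can disagree, with BIC favoring fewer classes.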
To estimate $E_\pi[f(X)]$ when $\pi(x)$ is a relatively simple distribution, an independent random sample can be generated by the static Monte Carlo method, and the expectation can be estimated by applying the law of large numbers. When $\pi(x)$ is a more complex distribution, samples $X_0, X_1, \ldots, X_n$ are generated from a Markov chain with stationary distribution $\pi(x)$. Then, by the ergodic theorem, the estimate of $E_\pi[f(X)]$ is
$$\hat{f}_n=\frac{1}{n}\sum_{t=1}^{n}f\left(X_t\right) \tag{5}$$
In practice, the first $m$ iterations are discarded and only the last $n-m$ iterations are used for estimation, that is,
$$\hat{f}_{m,n}=\frac{1}{n-m}\sum_{t=m+1}^{n}f\left(X_t\right) \tag{6}$$
In general, the method has the following steps:
(1) Construct a Markov chain whose stationary distribution is $\pi(x)$.
(2) Starting from some point $X_0$ in the state space, generate a sequence of points $X_0, X_1, \ldots, X_n$ from the Markov chain constructed in step (1).
(3) For some $m$ and a large $n$, estimate the expectation of any function $f(x)$ as
$$\hat{E}_\pi[f]=\frac{1}{n-m}\sum_{t=m+1}^{n}f\left(X_t\right) \tag{7}$$
The main ideas of Metropolis–Hastings algorithm are as follows:
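The algorithm arbitrarily selects an irreducible transition probability $q(x,x')$ and a function $\alpha(x,x')$ with $0<\alpha(x,x')\le 1$. For any pair of states $(x,x')$ with $x\ne x'$, it defines
$$p(x,x')=q(x,x')\,\alpha(x,x') \tag{8}$$
Then $p(x,x')$ forms a transition kernel. Equation (8) indicates that if the chain is in state $x$ at time $t$, a potential transition $x\to x'$ is generated from $q(\cdot\mid x)$ (usually called the proposal distribution), and whether to make the transition is then decided according to the probability $\alpha(x,x')$. That is, after the potential transition point $x'$ is found, the algorithm accepts $x'$ as the next state of the chain with probability $\alpha(x,x')$ and rejects the transition with probability $1-\alpha(x,x')$, so that the chain remains in state $x$ at the next time step. Given $q$ and $\alpha$, the transition kernel is therefore
$$p(x,x')=\begin{cases}q(x,x')\,\alpha(x,x'), & x'\ne x\\[4pt] 1-\sum_{x''\ne x}q(x,x'')\,\alpha(x,x''), & x'=x\end{cases} \tag{9}$$
The function $\alpha$ is chosen so that the corresponding $p(x,x')$ has stationary distribution $\pi(x)$.
The usual choice is
$$\alpha(x,x')=\min\left\{1,\frac{\pi(x')\,q(x',x)}{\pi(x)\,q(x,x')}\right\} \tag{10}$$
In this case, for $x'\ne x$,
$$p(x,x')=\begin{cases}q(x,x'), & \pi(x')\,q(x',x)\ge\pi(x)\,q(x,x')\\[4pt] q(x',x)\,\dfrac{\pi(x')}{\pi(x)}, & \pi(x')\,q(x',x)<\pi(x)\,q(x,x')\end{cases} \tag{11}$$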
We assume that $\{X_n\}$ is a reversible, irreducible Markov chain with state space $S=\{1,2,\ldots,m\}$ and stationary distribution $\pi$. The Metropolis–Hastings algorithm that produces this chain is as follows.
Step 1: Choose an irreducible Markov chain transition probability $q(i,j)$, and select an integer $k$ from $S$.
Step 2: Set $n=0$ and $X_0=k$.
Step 3: Generate a random variable $X$ such that $P(X=j)=q(X_n,j)$, and then generate a random number $U$ uniformly distributed on $(0,1)$.
Step 4: If $U<\dfrac{\pi(X)\,q(X,X_n)}{\pi(X_n)\,q(X_n,X)}$, then $X_{n+1}=X$; otherwise, $X_{n+1}=X_n$.
Step 5: Set $n=n+1$.
Step 6: Return to Step 3.
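The steps above can be sketched for a small discrete state space as follows. This is an illustrative sketch, not the authors' implementation: the uniform proposal, the function names, and the example target weights are all assumptions. The target only needs to be known up to a normalizing constant, because it enters the algorithm through a ratio.

```python
import random

def uniform_sample(m):
    """Proposal sampler: draw a candidate uniformly from {0, ..., m-1}."""
    return lambda current, rng: rng.randrange(m)

def uniform_prob(m):
    """Proposal probability q(x, y) = 1/m for every pair of states."""
    return lambda x, y: 1.0 / m

def metropolis_hastings(target, q_sample, q_prob, start, n_steps, seed=0):
    """Run a Metropolis-Hastings chain and return all visited states.

    target(x):        unnormalized stationary probability of state x (> 0 at start)
    q_sample(x, rng): draws a candidate state given the current state x
    q_prob(x, y):     proposal probability of moving from x to y
    """
    rng = random.Random(seed)
    x = start
    chain = [x]
    for _ in range(n_steps):
        candidate = q_sample(x, rng)
        ratio = (target(candidate) * q_prob(candidate, x)) / (
            target(x) * q_prob(x, candidate)
        )
        # Accept with probability min(1, ratio); otherwise stay in place.
        if rng.random() < ratio:
            x = candidate
        chain.append(x)
    return chain

def ergodic_mean(chain, f, burn_in):
    """Average f over the chain after discarding the first burn_in states."""
    kept = chain[burn_in:]
    if not kept:
        raise ValueError("burn_in must be smaller than the chain length")
    return sum(f(x) for x in kept) / len(kept)

if __name__ == "__main__":
    weights = [1.0, 2.0, 3.0, 4.0]  # hypothetical unnormalized target
    chain = metropolis_hastings(
        lambda i: weights[i], uniform_sample(4), uniform_prob(4),
        start=0, n_steps=50000, seed=42,
    )
    # The exact mean state under the normalized target is 2.0.
    print(ergodic_mean(chain, lambda x: x, burn_in=1000))
```

With a symmetric proposal such as the uniform one used here, the proposal terms cancel and the acceptance ratio reduces to the ratio of target probabilities.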
In MRM, the difficulty parameter of each potential category is measured and identified. The usual practice is to impose a restriction: , using this method can effectively identify the category. The algorithm first re-parameterizes to :
obeys a normal distribution with mean and standard deviation , that is, ;
obeys a normal distribution with mean and standard deviation 1, that is, .
obeys a normal distribution with mean and standard deviation 1, that is, .
Among them, . At the same time, it can be determined by selecting a certain potential category as the reference category, and setting the mean value of and of this category to 0. Then, the algorithm sets the mean value of the remaining potential categories to , so that the ability parameters of the remaining potential categories relative to the reference category can be estimated. Therefore, the mean and variance of other types of ability distribution are estimated based on the scale of the reference group. In addition to metric determination, it is also necessary to target items (i.e., constant items between different potential categories) to achieve comparability of metrics.
4. English-Aided Teaching Evaluation System Based on the Rasch Model
This article will use the following prior and super prior information to estimate the parameters of MRM:
Among them, indicates that the observation values of sampling are all greater than zero.
For , the relationship between the joint posterior distribution of the model parameter vector and the likelihood function of the item answer and the Dirichlet prior probability of the parameter is as follows:
A potential problem when estimating the parameters of the hybrid IRT model is label switching. The nonidentification of the model means that different parameter estimates produce the same log-likelihood value.
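One common remedy for label switching, shown here only as an illustration and not necessarily the approach taken in this paper, is to relabel each MCMC draw after sampling so that the classes follow a fixed order, for example ascending class mean ability. The function names and the structure of a draw (mixing proportions plus class means) are assumptions.

```python
def relabel_draw(pis, mus):
    """Reorder one draw so that classes are sorted by ascending mean ability.

    pis: mixing proportions of the classes in this draw
    mus: mean abilities of the classes in this draw
    """
    order = sorted(range(len(mus)), key=lambda c: mus[c])
    return [pis[c] for c in order], [mus[c] for c in order]

def relabel_chain(draws):
    """Apply the same ordering rule to every draw of a chain."""
    return [relabel_draw(pis, mus) for pis, mus in draws]

if __name__ == "__main__":
    # Two draws that describe the same solution with swapped labels.
    draws = [([0.3, 0.7], [1.0, -1.0]), ([0.7, 0.3], [-1.0, 1.0])]
    print(relabel_chain(draws))
```

After relabeling, both draws in the example carry the same labels, so posterior means of class-specific parameters can be computed without averaging across swapped classes. This ordering rule works only when the class means are well separated.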
When designing the form, the administrator uses the form designer to design the form template. After the release, the user obtains the evaluation form and fills in the data, and performs statistical processing on the data through the evaluation system. The specific business process is shown in Figure 1.
According to the business process, a form template created with the form designer is composed of many form components, each with its own content and style. Created templates can be managed and used for measurement and evaluation. To make the system design clearer, the relationships between the modules are determined first. We extract the functional elements above and group them into four parts: components, forms, templates, and evaluation. Accordingly, the system is divided into the following modules: system functional components, custom form design, template management, and the evaluation system application.
As shown in Figure 2, the system function component modules are divided into the component library structure, component attribute configuration, and encapsulation of business function components. The custom form design part includes interface design, drag-and-drop rule design, drag-and-drop component generation, and component preview functions. The form designer tool includes system function components and customization of custom forms, and the form interface also contains a large number of components. The template management module includes operations on unpublished templates and published templates. The unpublished templates can only perform template modification and template deletion operations. The published template involves template deactivation and template data operations. The application of the English evaluation system is a specific back-end business system, including login, assignment of evaluation roles, acquisition of evaluation forms, and data-related operations.
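The template management rules described above can be sketched as a small state model. This is a hypothetical illustration in Python rather than the system's actual Vue and Spring code; the class, field, and operation names are assumptions, and the explicit `publish` transition is inferred from the release step in the business process.

```python
from dataclasses import dataclass, field
from enum import Enum

class TemplateStatus(Enum):
    UNPUBLISHED = "unpublished"
    PUBLISHED = "published"

# Operations allowed in each state, following the module description:
# unpublished templates can only be modified or deleted, while published
# templates support deactivation and template data operations.
ALLOWED_OPERATIONS = {
    TemplateStatus.UNPUBLISHED: {"modify", "delete"},
    TemplateStatus.PUBLISHED: {"deactivate", "data"},
}

@dataclass
class FormTemplate:
    name: str
    components: list = field(default_factory=list)
    status: TemplateStatus = TemplateStatus.UNPUBLISHED

    def can(self, operation):
        """Return True if the operation is allowed in the current state."""
        return operation in ALLOWED_OPERATIONS[self.status]

    def publish(self):
        """Release an unpublished template so users can obtain the form."""
        if self.status is not TemplateStatus.UNPUBLISHED:
            raise ValueError("only unpublished templates can be published")
        self.status = TemplateStatus.PUBLISHED
```

Keeping the allowed operations in one table makes the rule for each template state explicit and easy to check on both the front end and the back end.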
On the basis of the overall module, in order to better develop and design, while ensuring the portability of the system, we adopt the development model of front-end and back-end separate development, and the front-end and back-end are independent and interconnected. Since one of the biggest characteristics of front-end Vue is componentization, and the form designer is composed of a large number of components, it is most appropriate to use the Vue framework to develop the front-end. The back-end uses Spring MVC, Spring, and MyBatis frameworks to connect with specific business systems, and combines the custom form system with the evaluation system to achieve business continuity. Figure 3 shows the overall technical architecture design of the system.
The operation area is divided into three areas, the leftmost is the toolbar, the middle is the design area, and the right is the attribute setting area. This system fully combines this interface construction method to design a custom form interface. The interface diagram is shown in Figure 4.
After the form designer interface is formed, to design the form, you first need to understand the process of form design. Since this system ultimately serves the English evaluation system, the function of the form design will be designed and used by the administrator. After entering the form designer interface, the administrator sets the form attributes, then drags and drops the components in the component library to design, and fills in the attribute configuration of the components. After the settings are completed, the form can be submitted. Figure 5 shows the process of an administrator designing a form through the form designer.
The main function description of the English teaching evaluation system based on the Rasch model is shown in Figure 6:
The overall software architecture based on the Rasch model is designed as follows, as shown in Figure 7.
The college English teaching process evaluation system is a teaching process evaluation system based on the campus network. It is a diversified and open evaluation system. The overall structure design of the system based on the Rasch model is shown in Figure 8.
The functional module design of the college English teaching process evaluation system based on the Rasch model is shown in Figure 9:
After constructing the above system model, the effect of the system model is verified, and the evaluation system of the college English teaching process based on the Rasch model proposed in this paper is compared with the traditional English teaching evaluation system. The results obtained are shown in Figure 10.
From Figure 10, we can see that the system model proposed in this paper has greater advantages compared with the traditional English teaching evaluation system. On this basis, the evaluation system of the college English teaching process based on the Rasch model is evaluated, and the results shown in Table 1 and Figure 11 are obtained.
From the above research, it can be seen that the evaluation system of college English teaching process based on the Rasch model proposed in this paper has good results and meets the actual needs of current college English teaching evaluation.
5. Conclusion
Regarding the assessment of undergraduate English teaching in colleges and universities, the Ministry of Education has set stricter requirements for the quality of college English teaching: the quality of information collection must be ensured when English teaching is reviewed, and the school’s quality assurance system for English teaching must be improved. The student evaluation system is the main means of gathering students’ views on English teaching and promoting its improvement, so it is an important part of the system for monitoring college English teaching quality. Strengthening colleges’ awareness of their responsibility for quality, understanding students’ opinions in a timely manner, improving the existing student evaluation system, and further enhancing its authenticity and effectiveness provide a lasting driving force for the continuous improvement of English teaching. This paper studies the English-assisted teaching evaluation system based on the Rasch model and evaluates and analyzes English teaching in order to improve its effect. The experimental results show that the proposed Rasch-model-based evaluation system for the college English teaching process performs well and meets the actual needs of current college English teaching evaluation.
In this paper, the MCMC algorithm is used to estimate the parameters of the mixed Rasch model. Factors such as algorithm convergence need to be considered. Because the volume of English teaching data is large, future work can consider optimizing the algorithm to reduce computing time and hardware requirements, so as to better serve English teaching data analysis.
Data Availability
The labeled dataset used to support the findings of this study is available from the corresponding author upon request.
Conflicts of Interest
The authors declare no conflicts of interest.
L. Susanty, Z. Hartati, R. Sholihin, A. Syahid, and F. Y. Liriwati, “Why English teaching truth on digital trends as an effort for effective learning and evaluation: opportunities and challenges: analysis of teaching English,” Linguistics and Culture Review, vol. 5, no. S1, pp. 303–316, 2021.