Abstract

This paper presents an in-depth study and analysis of the assessment of English teaching ability using a fuzzy mean-shift clustering algorithm. The paper proposes an automatic scoring model for Chinese-English sentence-level interpretation based on semantic scoring. The model calculates candidates' Chinese-English sentence interpretation scores by fusing feature parameters at both the phonological and content levels. Adaptive weights are introduced to fuse the current pixel with its neighborhood mean, and a weighted entropy constraint term is embedded in the clustering objective function to solve the selection problem of the weighting parameters. The picture fuzzy partition information of the neighboring pixels is then used to construct a local spatial information constraint term for the current pixel, and the picture fuzzy partition term of the current pixel is adjusted to correct the cluster centers obtained in each iteration. Fluency is selected as the feature scoring parameter at the speech level, and the automatic scoring model scores the fluency features of the candidates' recordings directly; two feature scoring parameters, keywords and sentence semantics, are selected at the content level, and the content features are scored after the candidates' recordings are manually converted to text. The zero-energy product method is used to extract speech features and calculate the fluency feature score; the semantic scoring model introduced in this paper is used to calculate the keyword and sentence semantic feature scores; finally, the random forest algorithm fuses the above three feature scoring parameters into the total quality score of the Chinese-English interpretation. Considering the correlation of neighborhood pixel affiliations, KL divergence and affiliation spatial information are used to supervise the affiliation of the current pixel and further improve the segmentation accuracy of the algorithm. Finally, segmentation tests are conducted on synthetic, medical, and remote sensing images. The results show that the proposed algorithm has stronger noise suppression ability and obtains more satisfactory segmentation results than other robust fuzzy clustering algorithms.

1. Introduction

With the advent of the information age, society's requirements for talent cultivation have changed, demanding a higher level of comprehensive literacy. Education is the root of talent cultivation, and schools, as the birthplace of talent, are charged with delivering high-level talent of all kinds for the construction and development of the country. Likewise, teachers, as the practitioners of education, are the key to the cultivation of high-quality talent [1]. At this stage, teachers should promote the application of technology-related products in education and teaching, not only by using such products to build various forms of learning communities such as “workshops” and “virtual communities,” but also by updating their professional knowledge and discarding outdated concepts. They also need to actively integrate technology into their daily teaching practices to improve their professional abilities and keep pace with the times. At the same time, teachers must be able to innovate and to learn throughout their lives to meet the demands of the information age for training high-quality human resources [2]. The public is highly concerned with teachers' educational and teaching abilities, which are of value not only for academic discussion but also for practical research. Teacher trainees are the reserve force of China's teaching profession, and teacher education is the stage at which prospective teachers' educational and teaching abilities are trained; the learning outcomes of this stage will, to a certain extent, affect the teaching effectiveness of teacher trainees after they enter the profession. For teacher trainees who will become primary and secondary school teachers, the essential task is likewise to improve their educational and teaching abilities [3]. However, many problems remain in the educational teaching ability of current teacher trainees, so that it cannot yet meet the needs of teaching in primary and secondary schools.

Unlike classification, clustering does not use the labels of sample points but uses the similarity and difference between sample points to divide data clusters. An ideal data cluster structure satisfies the following: the similarity between sample points located in the same data cluster is high, while the difference between sample points located in different data clusters is large. Measuring the similarity and difference between sample points is therefore important [4]. In cluster analysis, the distance between sample points is usually used to measure their similarity: the closer two sample points are, the greater their similarity and the smaller their difference, and vice versa. Educational administrations and teaching departments are also paying increased attention to the development of English-speaking teaching and learning by adding speaking tests to large-scale English examinations, trying to use the tests to draw students' attention to the importance of learning spoken English. However, most current English-speaking examinations use a holistic rubric that does not quantify the grading criteria, resulting in unclear scoring points [5]. Such scoring criteria are not conducive to scorers' grasp of scoring details. Speaking examinations are mainly scored using the manual holistic scoring method. This traditional scoring method is easily influenced by subjective factors of the scorer, such as the scorer's state, educational background, and teaching experience. To reduce the influence of scorers' subjective factors, scholars have begun to refine the outlined scoring criteria and develop automatic scoring systems for scoring large numbers of speech samples.

In addition, partition-based clustering algorithms are distance-based clustering algorithms that aim to achieve closer distances between sample points in the same data cluster and larger distances between sample points in different data clusters. Partition-based clustering algorithms are often characterized by a simple model, low computation, and strong robustness. However, because these algorithms divide data clusters around cluster centers, they tend to form spherical data clusters, so their ability to recognize clusters of other shapes is weak. At the same time, partition-based clustering mainly considers maximizing the global cohesion of each data cluster when dividing the clusters, so the separability among clusters is relatively weak. There is, however, little research on scoring methods at the content level, such as whether candidates express the content completely, whether the wording is accurate, and whether the sentences make sense. Using machines to evaluate translation quality should involve two main steps: first, accurate speech recognition to obtain the translated text of the answer speech; second, semantic analysis of that text to compare its differences from the standard answer and score accordingly. However, the recognition results of current speech recognition systems still fall short of what is required for scoring purposes.

For a fuzzy clustering algorithm, the outlier sample points of outlying clusters exert a large pull on the center of each data cluster, shifting the centers and reducing the algorithm's ability to extract the essential structure of the data clusters, while nonoutlier noisy sample points make the data cluster structure fuzzier, increasing the difficulty of extracting that essential structure. Existing research suggests that noisy sample points should be assigned a low affiliation in each data cluster; in this case, the pull of outlier sample points is greatly reduced or even eliminated [6]. In the existing research on fuzzy clustering, many algorithms achieve this goal by relaxing the affiliation constraints. Like the NC model above, PCM with a relaxation term not only reduces the influence of outlier sample points to some extent but also greatly reduces the influence of normal sample points [7]. In addition, the lack of interaction between data clusters in the PCM model makes the algorithm unusually sensitive to initial values, and it easily divides samples of the same class into different data clusters. In other words, the lack of interaction between data clusters makes PCM too weak in extracting the essential structure of data clusters, so the algorithm lacks robustness [8].

An automatic scoring system is essentially a computer simulation of a scorer grading an answer sheet, and the magnitude of the difference between the system's scores and the manual scores reflects the performance of the automatic scoring system [9]. To build an automatic scoring system, a standard set of manually scored data is needed first. These data are also referred to as the raw scores of the test takers' answers. The goal of the study is to make the machine scoring results as close as possible to the test takers' raw scores. We can evaluate the performance of an automated scoring system based on both the correlation and the consistency of its results with the raw scores [10]. Based on the relationship between students' academic assessment and teachers' teaching ability, an evaluation model of teachers' teaching ability development has been proposed [11]. The components of teacher trainees' professional skills and their evaluation criteria have been analyzed; teaching skills are divided into three areas: instructional design, teaching practice, and teaching evaluation; educational skills include ideological education skills and classroom management skills [12]. A structural model of educational teaching ability has been constructed, in which the core competence group splits teachers' ability into teaching skill level, teaching design, teaching practice, and academic inspection, and an attempt has been made to construct an evaluation index system for the classroom teaching ability of mathematics teacher trainees [13]. An evaluation index system for microteaching training of physical education teaching skills was constructed using the analytic hierarchy process (AHP), and an evaluation index system for the teaching ability of physical education students at Beijing University of Physical Education was constructed using AHP together with the fuzzy rating method.

This paper defines teaching ability as the various kinds of knowledge and abilities needed to complete education and teaching activities, which specifically include design ability before class, practical ability in class, and reflection ability after class. Precourse design ability includes the ability to design teaching and learning and the ability to clarify teaching concepts; classroom practice ability includes the ability to express oneself orally and in writing, the ability to organize and implement teaching, and the ability to teach with information technology; postcourse reflection ability includes the ability to evaluate education and teaching, the ability to conduct research, and the ability to communicate and cooperate with students, colleagues, and parents. Based on an analysis of the relevant background, this study focuses on the question of “how to construct an evaluation index system for the educational teaching ability of teacher trainees.” Based on this question, the study analyzes the main problems of the current evaluation index system of teacher trainees' educational teaching ability using questionnaires and interviews and highlights the necessity of constructing such an evaluation index system.

3. Design of Fuzzy Mean-Shift Clustering Algorithm

To improve the ability of fuzzy clustering algorithms to extract the essential structure of data clusters, this paper constructs synergistic information by introducing a Gaussian mixture model (GMM) and then adopts a cooperative clustering technique to feed the synergistic information back into the fuzzy clustering model, changing the distribution function of the affiliations in the fuzzy clustering model and thereby improving the algorithm's ability to extract the essential structure of data clusters [14]. Experimental results show that the Gaussian cooperative fuzzy c-means clustering algorithm (GCFCM) proposed in this paper not only is robust in dealing with noisy, outlier-contaminated, and imbalanced datasets but also shows excellent clustering performance on real datasets. A conventional fuzzy set, however, cannot describe the degree to which a sample does not belong to a certain class; fuzzy sets are therefore extended to intuitionistic fuzzy sets.

The computational complexity of the neutrality degree obtained from the symmetric regular term is significantly reduced. Therefore, the RPFCM algorithm with symmetric regular terms has potential advantages over the FC-PFS algorithm with exponential regular terms. However, none of the above algorithms consider pixel neighborhood spatial information, which leaves the clustering algorithm without sufficient robustness to noise and makes it difficult to obtain satisfactory segmentation results. It is therefore necessary to introduce neighborhood spatial information into the picture fuzzy clustering segmentation algorithm.

As can be seen from (2), the adaptive weight coefficient is closely related to the compactness of the clustering objective function. The larger the value of the sub-objective corresponding to the original image or of the sub-objective corresponding to the filtered image, the less that term contributes to the required clustering compactness, so its weight should be smaller in order to obtain better overall compactness. The sub-objective corresponding to the original image uses the original image information directly and contains more detail, while the sub-objective corresponding to the filtered image reflects an image containing less noise. If the sub-objective term corresponding to the original image is smaller, its weight is larger, which favors the retention of image detail; conversely, if the value of the sub-objective corresponding to the filtered image is smaller, its weight is larger, noise reduction dominates, and the algorithm's ability to suppress noise is reflected. Using the Lagrange multiplier method, an iterative solution of model (3) can be obtained.
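
The closed-form expression of model (3) is not reproduced above. As an illustration only, if one assumes that the weighted sum of the two sub-objectives is regularized by a weighted entropy term with parameter \(\lambda\) and that the weights sum to one (an assumption consistent with the description above, not necessarily the paper's exact formulation), the Lagrange multiplier method gives

```latex
\min_{w}\ \sum_{k} w_k J_k + \lambda \sum_{k} w_k \ln w_k
\quad \text{s.t.}\ \sum_{k} w_k = 1,\ w_k \ge 0
\;\Longrightarrow\;
w_k = \frac{\exp\!\left(-J_k/\lambda\right)}{\sum_{l}\exp\!\left(-J_l/\lambda\right)},
```

where \(J_1\) and \(J_2\) denote the sub-objective values for the original image and the filtered image, respectively. Under this illustrative form, a smaller sub-objective value automatically receives a larger weight, which matches the behavior described above.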

Using the calculated intuitionistic affiliation information, the affiliation is corrected by (4) and (5), so that the pixel affiliation information incorporates the neighborhood spatial information into the clustering process; the corrected affiliation is the new affiliation obtained by normalizing over the neighborhood spatial information and the original affiliation information. The parameter p denotes the weight factor acting on a neighboring pixel and is determined by the maximum of the vertical and horizontal distances between the center pixel and that neighboring pixel.

Setting the first-order partial derivative of the Lagrange function with respect to the cluster center equal to zero yields the update expression for the cluster center.

The GCFCM uses the EM algorithm to solve the optimization problem; the specific steps are shown in Table 1.
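
Since the steps of Table 1 are not reproduced here, the following minimal sketch shows only the standard alternating (EM-style) update scheme of plain FCM that GCFCM builds on; the function name, parameter values, and omission of the Gaussian cooperative term are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Minimal fuzzy c-means: alternate affiliation (membership) and center updates."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)                    # initial fuzzy partition
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]   # weighted cluster centers
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)     # standard FCM affiliation update
        if np.abs(U_new - U).max() < tol:
            return centers, U_new
        U = U_new
    return centers, U
```

GCFCM would additionally re-estimate the GMM parameters in each iteration and feed the resulting cooperative information back into the affiliation update; that step is omitted in this sketch.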

IIC is a half-duplex serial communication bus, used mainly between the master controller and the sensor control modules. As a master-slave structure, the IIC bus allows multiple sensing and control peripherals to be connected and uses different IIC addresses to distinguish them, giving strong reusability. To ensure reliable data transmission, IIC does not allow multiple peripherals to drive the data line simultaneously, and a peripheral may issue a start condition only when the data line is idle, as shown in Figure 1.
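
As a hedged, host-side illustration of the master-slave read pattern described above (the actual firmware in this system uses the Mbed library and is not shown here; the device address, register, and bus number below are hypothetical):

```python
from smbus2 import SMBus   # third-party I2C (IIC) library for Linux hosts

SENSOR_ADDR = 0x48   # hypothetical 7-bit peripheral address used to distinguish devices on the bus
TEMP_REG = 0x00      # hypothetical register inside the sensor control module

with SMBus(1) as bus:                                    # open bus 1 (board-dependent)
    value = bus.read_word_data(SENSOR_ADDR, TEMP_REG)    # master issues start, address, then reads
    print(f"raw sensor reading: {value:#06x}")
```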

The hardware system is responsible for sensing the physical environment, interacting with the software platform, and responding to the commands issued by the platform in the whole learning system; it must ensure that the master control program runs stably and that the communication process completes smoothly [15]. The main control chip in the hardware system should be compatible with the various sensing and control modules and with the various ways of interacting with the software platform. Therefore, the main control chip uses the Mbed software library to build its code functions, uses IIC, UART, and other communication protocols to read from and write to the sensing and control modules, and uses BLE technology and the DAPLink component to interact with the software platform.

UART is a full-duplex serial communication bus, used mainly between the master controller and the sub-controller. UART consists of four signal lines: a power (high-level) line, a ground line, a transmit data line, and a receive data line. When using UART serial communication, the two master devices should first share a common level reference, that is, the corresponding power and ground lines are connected; the data lines are then connected. For two communicating masters A and B, the transmit data line of master A must be connected to the receive data line of master B, and the receive data line of master A must be connected to the transmit data line of master B. Finally, at the software level, the transmission rate and data bits must be agreed upon to complete the data communication.
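
The agreement on baud rate and frame format described above can be expressed on a host with pySerial; the port name, settings, and message below are placeholder assumptions rather than this system's configuration.

```python
import serial  # pySerial

ser = serial.Serial(
    port="/dev/ttyUSB0",              # placeholder device node
    baudrate=115200,                  # both masters must agree on the transmission rate
    bytesize=serial.EIGHTBITS,        # ...and on the data bits
    parity=serial.PARITY_NONE,
    stopbits=serial.STOPBITS_ONE,
    timeout=1.0,
)
ser.write(b"PING\n")                  # this side's TX line feeds the other side's RX line
reply = ser.readline()                # and vice versa for receiving
ser.close()
print(reply)
```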

One family of improved FCM algorithms describes the influence of samples on the formation of affiliations and cluster centers during clustering by adding additional constraints to the samples in the data set; this construction idea is called conditional fuzzy clustering. The conditional fuzzy clustering method describes the degree of influence of a sample on the construction of different classes by the difference between the distance from that sample to a given cluster center and its distances to all cluster centers.

Neighborhood information: the main reason these fuzzy clustering algorithms fall short when applied to image segmentation is that they are proposed on the basis of the FCM model. The FCM clustering algorithm has insufficient ability to extract the essential structure of data clusters, which makes it difficult to obtain a suitable division of image regions during segmentation [16]. When image regions are segmented incorrectly, smoothing the image tends to lose important information and, at the same time, blurs the boundaries between different regions. Therefore, the key to improving segmentation quality is to improve the ability of fuzzy clustering algorithms to extract the essential structure of data clusters.

Adjusting the distribution of interactions between data clusters is the key to improving the ability of fuzzy clustering algorithms to extract the essential structure of data clusters. The introduction of affiliations creates interaction forces between data clusters, so that each sample point exerts a force on every cluster center. From K-Means to FCM, the existence of intercluster forces transforms the data cluster boundaries from exact boundary lines into fuzzy boundary regions, and the robustness of the clustering algorithm improves. However, in FCM, sample points exert forces on data clusters to which they do not belong no matter how far away they are, and the force of such a sample point on each data cluster is nearly the same. Distant sample points (e.g., outliers) therefore exert a large pull on the center of each data cluster, shifting all the cluster centers to some extent; the data cluster structure becomes more blurred, and extracting the essential structure of the data clusters becomes more difficult for the fuzzy clustering algorithm, as shown in Figure 2.

Otherwise, the differentiability between clusters decreases while the influence of outlier sample points on the clustering results increases. Using the L1 or L2,p norm as the distance measure in a fuzzy clustering algorithm provides a sparser affiliation description for the sample points in the central region of a data cluster, which increases the pull of these sample points on the data cluster to which they belong and improves cluster cohesion. In addition, using a distance function whose value grows faster can reduce the force of middle- and long-distance sample points on data clusters to which they do not belong and increase the fuzziness of the affiliation representation of sample points located at the junction of data clusters, thereby increasing the differentiability among the data clusters. In summary, norms with different growth rates have different advantages and disadvantages in fuzzy clustering models. To make better use of these advantages, an adaptive elastic distance is constructed in this paper. The adaptive elastic distance combines the advantages of the different types of norms in fuzzy clustering while avoiding their disadvantages and is thus better suited to fuzzy clustering models such as FCM.

4. ELT Proficiency Assessment Model Design

The use of deep learning networks requires manual tuning of several important parameters: the number of hidden units, the learning rate, dropout, the number of epochs, the batch size, and the optimization algorithm. The essence of building hidden units is to create a new space that represents the elements of the input sample space with a new architecture; the hidden units are constructed to better explain the output layer variables. The number of hidden units affects both the amount of computation and the accuracy of the model: underfitting can occur if too few hidden units are set, while overfitting can occur if too many are set. Fitting data with a model is the basic problem that machine learning needs to solve [17]. The goal of machine learning is to make accurate predictions on new data outside a limited data set by training on that data set, and researchers use generalization ability to refer to this predictive ability on new data. Both underfitting and overfitting reduce the generalization ability of a model; both are manifestations of a mismatch between the model's learning capacity and the complexity of the data. Underfitting occurs because the model does not have enough learning capacity to learn the common patterns in the dataset, which is reflected in poor performance on both the training and test sets. Overfitting occurs because the model learns too much and treats individual patterns captured in the data as common patterns, which is reflected in good performance on the training set and poor performance on the test set.
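
A minimal PyTorch sketch of the tunable quantities listed above (hidden units, learning rate, dropout, epochs, batch size, optimizer); the architecture, data, and all values are illustrative assumptions rather than the network used in this study.

```python
import torch
import torch.nn as nn

hidden_units, dropout_p, lr = 128, 0.5, 1e-3       # manually tuned hyperparameters
epochs, batch_size = 20, 32

model = nn.Sequential(
    nn.Linear(64, hidden_units),                   # hidden layer re-represents the input space
    nn.ReLU(),
    nn.Dropout(dropout_p),                         # regularization against overfitting
    nn.Linear(hidden_units, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
loss_fn = nn.MSELoss()

X, y = torch.randn(640, 64), torch.randn(640, 1)   # placeholder training data
for _ in range(epochs):
    for i in range(0, len(X), batch_size):
        xb, yb = X[i:i + batch_size], y[i:i + batch_size]
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
```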

In terms of instructional design, teacher trainees need to focus on how to flexibly combine teaching methods with student characteristics, how to develop teaching strategies based on student and subject characteristics, how to use multimedia reasonably in the classroom, how to make teaching aids, and how to evaluate teaching after class. After completing the instructional design in the first stage, the implementation of teaching begins. In classroom teaching, from introduction to demonstration lecture to conclusion, teacher trainees need to transfer the instructional design into teaching practice and carry out every step of board work and classroom interaction. When designing and organizing extracurricular activities, they should ensure that the content of the activities is related to the subject, that all students can participate, that the content is exciting and interesting, and that the forms of the activities are rich and diverse. After conducting all the teaching activities, teacher trainees should be able to apply their knowledge of pedagogy and psychology to reflect on the shortcomings of the teaching activities, drill down on the areas that need improvement, and improve the quality of teaching, as shown in Figure 3.

Educational teaching ability is mainly reflected in the teacher-student relationship. Teacher trainees are both ordinary college students and future people's teachers; they have dual identities, so the development of the evaluation index system should take the special character of these identities into account. On the one hand, as students, teacher trainees' main responsibility is to learn the professional skills that prepare them for future educational teaching activities; on the other hand, as future people's teachers, their main responsibility is to transfer the skills learned in school to actual teaching activities. Therefore, the indicators should be developed so as to promote the development of teacher trainees' own abilities and to provide a foundation for their future teaching.

The researcher interpreted and analyzed the competency structure of information-based teaching from several perspectives, mainly technical support, pedagogy, and knowledge structure. From these scholars' perspectives and from the analysis of the related literature, the four competencies of designing, implementing, evaluating, and monitoring information-based teaching are mentioned most frequently [18]. The interpretation from the perspective of pedagogy involves the various practical abilities related to teaching before, during, and after class. Second, the concepts and skills necessary to carry out information-based classroom teaching are addressed. In addition, the attitudes and ethics of using technology products appear several times in related studies. These are the prerequisites and preparation for information-based teaching; they have an important impact on the quality and results of teaching and constitute teachers' native competencies. Therefore, the study of competency structure should consider not only the various practical competencies related to the teaching process but also these native competencies, as shown in Table 2.

Teaching operation ability is the concrete implementation of the planning of the whole teaching process and corresponds to teaching implementation ability; teaching monitoring ability includes the teacher's regulation and control of all aspects of the teacher's teaching and the students' learning, as well as real-time assessment of that teaching and learning with timely feedback, summary, adjustment, and correction, and corresponds to teaching evaluation ability [19]. In this study, teaching evaluation competencies were extracted separately, planning and preparation were subsumed into instructional design, and control, regulation, reflection, and correction were subsumed into instructional implementation. Discipline-specific teaching competencies are more specific competencies demonstrated within a particular subject. At this stage, technology and curriculum are more closely related to each other. Compared with traditional subject competencies, teachers' ability to adopt various technologies to carry out teaching research is more in line with the requirements for teachers' competencies in the information age, and this ability differs from that of teachers in higher education institutions, who focus more on academic research, so it is categorized as information-based teaching research competency.

The initially extracted indexes are numerous and similar in meaning, with overlap and duplication, so they still need to be scientifically categorized and screened [20]. At present, the expert consultation (Delphi) method is mostly used: a consultation questionnaire is issued anonymously to respondents by letter or e-mail, the suggestions of each expert are collected, collated, and fed back to the experts again, and the process stops when the differences among the experts on the observed indicators fall within the permissible range. The indicators initially extracted from literature research and theoretical analysis still need to be modified and improved with the help of experts' experience to obtain good validity.

5. Analysis of the Results

5.1. Analysis of the Performance Results of the Fuzzy Mean-Shift Clustering Algorithm

Different algorithms often make different model assumptions about the data; such algorithms work well on datasets that match their assumptions but often perform poorly on datasets that do not. Therefore, to measure the comprehensive performance of clustering algorithms, real data sets are needed to test their effectiveness and practicality. The UCI repository provides the real datasets most commonly used to measure clustering performance. In this paper, experiments were conducted on 25 UCI datasets to compare FCM, PCM, possibilistic fuzzy c-means (PFCM), interval possibilistic fuzzy c-means (IPFCM), relative entropy fuzzy c-means (REFCM), and the proposed EFCM. Among them, FCM, PCM, PFCM, IPFCM, and REFCM all adopt the Euclidean distance as the distance metric of the clustering algorithm, while EFCM adopts the affiliation-based adaptive elastic distance proposed in this paper as its distance metric.

In this experiment, the fuzzy coefficient m = 2 is set for FCM, PCM, PFCM, REFCM, and EFCM, and the interval value of m is adjusted for IPFCM. To eliminate the influence of random initial values, each algorithm was run 10 times on each data set, and the mean RI (%) over those runs was recorded, as shown in Figure 4.
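
As an illustration of this evaluation protocol only (not of the compared algorithms themselves), the sketch below averages the Rand Index over repeated runs with different random initializations, using scikit-learn's KMeans on the Iris data purely as a stand-in clusterer.

```python
import numpy as np
from itertools import combinations
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

def rand_index(a, b):
    """Fraction of sample pairs on which two labelings agree (same/different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

X, y = load_iris(return_X_y=True)
scores = []
for seed in range(10):                                   # 10 runs to wash out random initialization
    labels = KMeans(n_clusters=3, n_init=1, random_state=seed).fit_predict(X)
    scores.append(rand_index(y, labels))
print(f"mean RI: {100 * np.mean(scores):.2f}%")
```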

It is obvious from these experimental results that the SA achieved by the ARFCM algorithm is significantly higher than that of the other algorithms, indicating that ARFCM has better noise immunity in image segmentation. In addition, ARFCM is significantly more robust than the other algorithms when dealing with images corrupted by salt-and-pepper noise. The experimental results show that ARFCM is extremely robust to salt-and-pepper noise.

Reliable regions consisting of reliable sample points can be determined adaptively for each data cluster in the dataset.

When the compactness within a sample point's neighborhood is high, the possibility that the sample point is noise is low and its reliability is high; when the compactness within the neighborhood is low, the possibility that the sample point belongs to the data set is lower and the possibility that it is noise is higher. In addition, for a given k in kNN, the neighborhood variance of each sample point in this model can be obtained by preprocessing and does not need to be solved iteratively with the clustering algorithm, which avoids the excessive computational complexity otherwise introduced by the kNN method.
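
A minimal sketch of this preprocessing idea: the variance inside each sample's kNN neighborhood is computed once, before clustering, and can then be mapped to a reliability (compactness) proxy. Function and variable names, and the mapping to reliability, are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def neighborhood_variance(X, k=5):
    """Per-sample variance over the k nearest neighbors, computed once before clustering."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)   # +1 because each point is its own neighbor
    _, idx = nn.kneighbors(X)
    neigh = X[idx[:, 1:]]                             # shape (n, k, d), the point itself excluded
    return neigh.var(axis=(1, 2))                     # low variance: compact neighborhood, reliable point

X = np.random.rand(200, 2)                            # placeholder data
reliability = 1.0 / (1.0 + neighborhood_variance(X))  # one possible monotone mapping to reliability
```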

To fully demonstrate the differences among the affiliation distribution functions of the three clustering models, FCM, K-Means, and AR-RFCM, Figure 5 shows their affiliation distribution functions on a one-dimensional data set containing two data clusters. Here, a large value is given to the β parameter in the AR-RFCM model so that the dummy class has no effect, avoiding the situation in which the affiliation values of an individual sample point do not sum to one. In this case, the AR-RFCM clustering model is approximately a weighted combination of the FCM and K-Means models. In the FCM model, no matter how far a sample point is from the center of a data cluster, its affiliation to data clusters to which it does not belong is nonzero, and the intercluster force is provided not only by the sample points located at the junction of the data clusters but also by sample points far from the cluster centers. In the AR-RFCM model, for sample points located in the intersection region of data clusters, the intercluster forces on nonbelonging data clusters are preserved, thus retaining the intercluster forces of the fuzzy clustering model and improving intercluster separability; for sample points far from the intersection region, an affiliation value of 1 for the cluster to which they belong removes unnecessary intercluster forces and further improves separability between data clusters.

In general, the AR-RFCM model uses the sparse representation of sample point affiliation from the K-Means model to reduce the unnecessary force of reliable sample points on data clusters to which they do not belong, increase the cohesion of the data clusters, and indirectly reduce the influence of noisy sample points. At the same time, AR-RFCM retains the characteristics of the FCM model: the intercluster force is preserved and concentrated on the edge sample points at the junction of data clusters, highlighting the force of the edge sample points and improving the generalization ability of the data cluster structure. In addition, AR-RFCM assigns very low affiliation values to outlier sample points, which reduces the influence of outlier samples on the clustering results.

5.2. Analysis of ELT Proficiency Assessment Results

Figure 6 shows the correlation between the scoring results of the two models and the original scores. Both models' scores are positively correlated with the original scores. The model whose word embedding layer uses BERT maintains a moderate correlation with the original scores, with an average similarity of 0.49 and a maximum similarity of 0.61, while the model whose word embedding layer uses Word2Vec maintains only a low correlation with the original scores for half of the questions, with an average similarity of 0.38. The correlation between the BERT-based model and the original scores is consistently higher than that of the Word2Vec-based model.
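
A hedged sketch of how such a correlation can be computed: sentence embeddings from a BERT-family encoder are compared with a reference answer by cosine similarity, and the resulting machine scores are correlated with the raw human scores. The checkpoint name, example sentences, and scores below are placeholder assumptions, not the paper's exact pipeline.

```python
import numpy as np
from scipy.stats import pearsonr
from sentence_transformers import SentenceTransformer   # BERT-family sentence encoder

encoder = SentenceTransformer("all-MiniLM-L6-v2")        # placeholder checkpoint

reference = "The meeting was postponed because of the storm."
answers = ["The meeting was delayed due to the storm.",
           "The meeting happened later because of bad weather.",
           "They cancelled everything yesterday."]
raw_scores = np.array([9.0, 6.5, 3.0])                   # placeholder human scores

emb = encoder.encode([reference] + answers)
ref, ans = emb[0], emb[1:]
cos = ans @ ref / (np.linalg.norm(ans, axis=1) * np.linalg.norm(ref))  # machine similarity scores
r, _ = pearsonr(cos, raw_scores)                         # correlation with the raw scores
print(cos, r)
```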

Both models have an adjacent agreement rate of 90%, indicating that the overall scoring trends of the two models are similar to those of the raw scores. However, when only the keyword score is considered, the agreement rate between the models' scores and the original scores is not very high. The semantic scoring model that uses BERT to generate word vectors has a maximum agreement rate of 74% with the original scores, with three questions above 60% and an average agreement rate of 56%; the semantic scoring model that uses Word2Vec to generate word vectors has a maximum agreement rate of 59% with the original scores and an average agreement rate of 48%.

Because the BERT-based semantic scoring model has both a higher agreement rate and a higher adjacent agreement rate with the original scores, the semantic scoring model that uses BERT in the word embedding layer is in better agreement with the original scores than the one that uses Word2Vec.

The former compresses the features of the input data by encoding to obtain another representation of the input; the latter restores the original input by decoding. If the result restored by the decoding layer is very close to the input features, the encoded features are considered an approximate representation of the input. The URAE-based sentence semantic scoring model mines the semantic features of a sentence through the hidden layer neurons of the encoder. The URAE, which combines a recursive neural network with an autoencoder, reconstructs the features compressed into a parent node back into child nodes and measures the error between the original child nodes and the reconstructed child nodes to judge whether the feature extraction is effective.
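
A minimal PyTorch sketch of the encode-compress-decode-reconstruct idea described above (dimensions and data are placeholders; the recursive application over a parse tree used by the URAE is not shown):

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=300, hid_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.Tanh())  # compress the input
        self.decoder = nn.Linear(hid_dim, in_dim)                            # restore the input

    def forward(self, x):
        z = self.encoder(x)             # hidden code: approximate representation of the input
        return self.decoder(z), z

model = AutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(32, 300)                # placeholder batch of input features
recon, z = model(x)
loss = nn.functional.mse_loss(recon, x) # reconstruction error judges how faithful z is
loss.backward()
optimizer.step()
```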

The random forest algorithm was then replaced with a linear regression prediction method for score fusion, while the speech-rate (fluency), keyword, and sentence semantic scoring methods were kept unchanged. First, the grade distributions of the scoring results of the two models were compared with the grade distribution of the original scores; the specific experimental results are shown in Figure 7. Regardless of whether the random forest algorithm or the linear regression prediction algorithm is used for score fusion, the final scores of both models approximately follow a normal distribution. In terms of the distribution over the four grades A, B, C, and D, the automatic scoring model that uses the random forest algorithm for score fusion is closer to the grade distribution of the original scores.
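
A hedged sketch of this fusion comparison: the three feature scores (fluency, keyword, sentence semantics) are fused into a total score once with a random forest and once with linear regression. The data below are synthetic placeholders, not the examination data used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((300, 3))                            # columns: fluency, keyword, sentence-semantic scores
y = 40 * X[:, 0] + 30 * X[:, 1] + 30 * X[:, 2] + rng.normal(0, 3, 300)   # placeholder raw scores

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
lr = LinearRegression().fit(X_tr, y_tr)
print("RF total-score predictions:", rf.predict(X_te[:3]))
print("LR total-score predictions:", lr.predict(X_te[:3]))
```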

The automatic scoring model that uses the random forest algorithm for score fusion reaches a maximum similarity of 0.94 with the original scores, showing a very high correlation that is much higher than the similarity between the scores of the two human raters. As for the automatic scoring model that uses the linear regression prediction method for score fusion, although it outperformed manual scoring on some questions, its performance fluctuated considerably from question to question and was not stable enough; the average similarity between its scores and the original scores was 0.57, a moderate correlation. In terms of similarity, the random forest algorithm performed more consistently than the linear regression prediction method, and its scoring results were more strongly correlated with the original scores.

6. Conclusion

In this paper, we tried to transcribe the candidates' speech with high accuracy using current general-purpose speech recognition engines, but the recognition results of current speech recognition systems are still far from what is required for scoring purposes. At the content level, two feature scoring parameters, keywords and sentence semantics, are selected, and the content features are scored after the candidates' recordings are converted into text by manual transcription. Therefore, when experimenting with the content-level feature scoring methods, this study used a manual approach to convert candidates' speech data into text for scoring. However, a complete speaking scoring system requires a combination of speech recognition and automatic scoring that can extract both speech features and content features, with targeted scoring models built from appropriately chosen feature scoring parameters. In the study of automatic scoring of speaking, different question types have different examination focuses, so feature scoring parameters need to be selected according to the scoring rules and examination objectives, and further research should explore scoring features such as grammar and intonation to build a more comprehensive scoring mechanism. A variety of evaluation methods should also be used in teaching: each evaluation method has its limitations, so they should be combined to complement one another's strengths and weaknesses. Attention should be paid to the diversity and flexibility of evaluation methods, the focus should be on actual effect, assessment tools should be used reasonably and correctly, and evaluation should make a qualitative leap from quantity to quality, covering knowledge and skills as well as attitudes, emotions, and values.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the Department of Foreign Language, Nanchong Vocational and Technical College.