Abstract

Data play a substantial role in keeping the daily activities of business and industry running smoothly. Data volumes are growing with the passage of time and the rise of information technology. Using data mining techniques for quality evaluation in business English teaching has become essential. These technologies have been introduced in the classroom, especially in online classes during the COVID-19 pandemic. To analyze the quality of business English teaching, this paper uses multimedia and data mining technologies. First, multimedia data are collected during classes, and an association rule recommendation algorithm from data mining is applied. Based on collaborative filtering algorithms combined with association rules, indicators for teaching quality evaluation in colleges and universities are set up. Next, actual teaching data from a university are used. Taking business English as an example, the constructed algorithm is tested, and the teaching process of College Business English is evaluated. Finally, the conclusion is drawn that data mining technology can describe and evaluate teaching behavior well and has the potential to be popularized.

1. Introduction

In the modern era, technology has reached into many aspects of life, and it is particularly popular in education. Thanks to the Internet, many obstacles have been removed. The Internet has changed our daily life, and online education has taken off as attention has turned to online courses. In recent years, the Internet industry has risen and developed rapidly, and mobile terminals such as smartphones are being upgraded quickly. The mobile Internet has penetrated all aspects of daily life, from social networking to e-commerce and from taxi software to takeaway platforms. It can be said that every person with a smartphone is connected to the mobile Internet [1]. However, putting technology and the Internet to educational use is a challenging task. The main problem is introducing technology into an organization: a methodology has to be developed, introduced in that organization, and finally evaluated. An evaluation methodology based on the characteristics of teaching data has been proposed, drawing on the products of traditional electronic platforms [2]. Institutes' teaching evaluation methods vary, but they fall roughly into two groups. The first is a single qualitative evaluation method.

This approach is far removed from the current criteria for measuring courses in teaching activities. It is too general and coarse to represent actual teaching critically, reliably, and in depth. The second is quantitative evaluation, which primarily represents the efficiency of courses. The existing analysis methods, however, have numerous flaws. For example, the evaluation contents are simple, and the subjects and methods of evaluation are narrow. Therefore, the way teaching evaluation results are expressed, analyzed, and interpreted should engage the interest of teachers and students, so that evaluation can play a more significant role in the wide range of teaching-level evaluation activities in colleges [3].

The literature contains many discussions and empirical analyses of this problem. First, the mainstream teaching evaluation methods consist of background assessment, association measurement, and collaborative filtering metrics [3]. The core recommendation algorithms used in these evaluation methods have also been presented. In terms of research methods, data mining technology is generally used to mine the correlations between individual goods and between commodity categories. Based on the classical Apriori algorithm for association rules, attempts have been made to improve the judgment of product information and user location. Teaching and evaluation effects are then considered according to actual user behavior data [4]. This paper combines several lines of work and addresses the following problems:

(1) Technologies from multiple disciplines, in most cases computer technology combined with Internet technology, are integrated into the online teaching environment.

(2) The association rule recommendation algorithm in data mining technology is elaborated in detail, and a list of indicators for the automatic evaluation of business English courses is set up.

(3) Users' chains of operations are analyzed, and the algorithm constructs an evaluation of the curriculum.

The rest of this paper is organized as follows. Section 2 reviews related methods from recent years, and Section 3 introduces our proposal. Section 4 discusses the experimental results and performance analysis, and Section 5 presents the conclusion and future work.

2. Related Work

Although data mining has a relatively short history, it has developed rapidly since the 1990s. It is a multidisciplinary product, and a variety of definitions have been proposed. For example, the SAS Institute describes it as an "advanced method of data exploration and building-related models based on various related instances" [5]. Some scholars describe it as the process of using pattern recognition technology, statistics, and mathematical techniques to find useful connections, patterns, and trends [6]. Others state that "data mining is a procedure for finding meaningful and valuable information in large databases" [7]. Although research on data mining technology in commodity recommendation systems and related domestic studies of recommendation have developed in recent years, China still lags behind other countries. Existing recommendation systems in China still need further improvement in recommendation depth, quality, scale, and personalization. The main reasons are the late start and the lagging research on the relevant theories [8]. Corresponding theoretical research and specific system applications started in China only after 2014 [9]. Educational Data Mining (EDM) and Learning Analytics (LA) are two active research fields that seek to improve course outcomes by helping people (teachers, students, and staff) make data-driven decisions. Growth in computer storage and computing power, which makes it possible to handle massive quantities of data, together with machine learning and data mining methods and techniques, has aided their development [10]. Using teaching-related data analysis tools, we try to identify the impact factors hidden in the teaching process, circumstances, and climate, and we apply different methods to improve the quality of courses.

In teaching-oriented data warehouses, data mining algorithms are used to evaluate teaching quality variables and discover information. The evaluation of English teaching quality relies on algorithms such as the Apriori algorithm. Such algorithms uncover hidden variables, such as the college English learning environment, which are deeply hidden but have significant research value. Experimental analysis indicates that this approach improves the standard of college English teaching and offers effective methods for weighing the benefits and drawbacks of various teaching interventions and their applicability to different student groups. A study of college English teaching quality provides a foundation for this work [11]. Other studies have presented a comprehensive analysis of big data in the field of healthcare [12], applied deep learning algorithms and multicriteria decision-making to big data [13], and shown how big data and its insights facilitate healthcare [14].

3. Methodology

This section discusses the recommendation algorithm of data mining technology, the evaluation index based on data mining technology, and the data sources. The details are given in the following subsections.

3.1. The Recommendation Algorithm of Data Mining Technology

Data mining is also called data exploration. It generally refers to searching for information hidden in a large amount of data by means of algorithms or rules. To achieve this objective, it draws on many disciplines, such as computer science, statistics, and online analytical processing. The association algorithm is essential in data mining. It was first proposed in 1993 by Geng et al.; its core is a recursive algorithm based on the idea of two-stage frequent sets [8]. These rules have been successfully applied in physical retailing, for example at Walmart, and they are also widely used online [15]. Association rules are the correlations among various commodities. The application scenario of this kind of recommendation is that many people who buy a product "A" go on to buy products "B" and "C"; so once a user finds "A," the system recommends "B" and "C" to the user [15]. It is a good shopping model because we often buy things according to our current needs; i.e., the user's shopping interests change over time. Association rules can recommend items of an entirely different kind: for a user buying a digital camera, they suggest memory cards, batteries, and so on [16]. In recommender systems, the professional term for this property is the diversity of recommendation results. The essence of association rule mining is discovering co-occurrence rules (frequent item mining) [17]. The classical algorithms used are the Apriori algorithm, the FP-Growth algorithm, etc. The measures used by such rule-based algorithms are introduced next [18].

The first measure is support, which is defined for an item set. The support of an item set is the number of times the item set appears divided by the total number of records; it measures how frequently the item set occurs in the entire transaction set. When looking for rules, we want to focus on frequently appearing items [19].

Next is confidence, which is defined for association rules. For the rule $\{a, b\} \Rightarrow \{c\}$, the confidence is the ratio of the number of occurrences of {a, b, c} to the number of occurrences of {a, b}. Given that {a, b} holds, the probability that {c} occurs can be written as [20]

\[ \mathrm{confidence}(\{a, b\} \Rightarrow \{c\}) = \frac{\mathrm{support}(\{a, b, c\})}{\mathrm{support}(\{a, b\})}. \]
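The two measures defined so far can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this paper; the function names `support` and `confidence` and the five toy transactions are assumptions made for the example.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    target = frozenset(itemset)
    hits = sum(1 for t in transactions if target <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    lhs = frozenset(antecedent)
    sup_lhs = support(lhs, transactions)
    if sup_lhs == 0.0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(lhs | frozenset(consequent), transactions) / sup_lhs


# Hypothetical toy data: five records over the items a, b, c.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
    frozenset({"a", "b", "c"}),
]

sup_ab = support({"a", "b"}, transactions)              # 3 of 5 records
conf_ab_c = confidence({"a", "b"}, {"c"}, transactions)  # 2 of those 3 records
```

In this toy data, {a, b} appears in three of the five records, and two of those three also contain c, so the rule {a, b} → {c} has a confidence of two thirds.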

As for lift, there is a problem to note when looking for useful association rules. When the support of a rule's right-hand side (RHS) is very high, the rule may be invalid even if its confidence is high [21]. For example, suppose that among 10000 transactions analyzed, 6000 contain X, 7500 contain Y, and 4000 contain both. The support of the association rule $X \Rightarrow Y$ is 0.4, which looks fairly high, but this is misleading. After purchasing X, a user purchases Y with probability 4000/6000 ≈ 0.67. Without any precondition, however, the user purchases Y with probability 7500/10000 = 0.75. That is, purchasing X reduces the likelihood that the user buys Y, so X and Y are negatively correlated. The concept of lift is therefore introduced to measure how far the sets {a, b} and {c} of a rule depart from independence:

\[ \mathrm{lift}(\{a, b\} \Rightarrow \{c\}) = \frac{\mathrm{confidence}(\{a, b\} \Rightarrow \{c\})}{\mathrm{support}(\{c\})}. \]

If the lift equals 1, the two sets are independent. If the value is less than 1, the condition and the consequent tend to exclude each other; in the example above, the lift is 0.67/0.75 ≈ 0.89. Generally speaking, association rules with a lift greater than 3 are recognized as valuable in data mining [22].
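The arithmetic of the example can be checked with a short sketch. The helper name `rule_measures` is hypothetical; the counts are the ones given in the text (10000 transactions, 6000 with X, 7500 with Y, 4000 with both).

```python
def rule_measures(n_total: int, n_x: int, n_y: int, n_xy: int) -> dict:
    """Support, confidence, and lift of the rule X -> Y from raw counts."""
    support_xy = n_xy / n_total   # share of transactions containing both X and Y
    confidence = n_xy / n_x       # P(Y | X)
    baseline = n_y / n_total      # P(Y) without any precondition
    return {
        "support": support_xy,
        "confidence": confidence,
        "baseline": baseline,
        "lift": confidence / baseline,
    }


measures = rule_measures(n_total=10000, n_x=6000, n_y=7500, n_xy=4000)
# The lift is below 1 because P(Y | X) = 4000/6000 is smaller than P(Y) = 0.75.
```

A lift below 1, as here, says that seeing X makes Y less likely than it is on average, which is why a high support alone is not enough to accept a rule.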

Mining association rules consists of two steps. First, the frequent item sets are found level by level, with the frequent item sets of n items used to generate the candidate item sets (itemsets) of n + 1 items; a minimum support must therefore be specified to remove the nonfrequent items. Second, rules are derived from the frequent item sets; a minimum confidence must therefore be specified to discard all weak rules [23].
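The two steps can be sketched as a compact Apriori-style routine. This is a simplified illustration under stated assumptions, not the implementation evaluated in this paper: the function names `frequent_itemsets` and `generate_rules`, the thresholds, and the toy transactions (built on the camera example above) are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Step 1: level-wise search for item sets whose support >= min_support."""
    n = len(transactions)
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count each candidate and keep those that reach the minimum support.
        level = {}
        for cand in current:
            sup = sum(1 for t in transactions if cand <= t) / n
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        k += 1
        # Join: merge frequent (k-1)-item sets into candidate k-item sets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


def generate_rules(frequent, min_confidence):
    """Step 2: derive rules lhs -> rhs whose confidence >= min_confidence."""
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, size)):
                conf = sup / frequent[lhs]  # subsets of frequent sets are frequent
                if conf >= min_confidence:
                    rules.append((lhs, itemset - lhs, sup, conf))
    return rules


# Hypothetical toy transactions.
transactions = [
    frozenset({"camera", "memory card", "battery"}),
    frozenset({"camera", "memory card"}),
    frozenset({"camera", "battery"}),
    frozenset({"memory card", "battery"}),
    frozenset({"camera", "memory card", "battery"}),
]
freq = frequent_itemsets(transactions, min_support=0.5)
rules = generate_rules(freq, min_confidence=0.7)
```

With these thresholds the three single items and the three pairs are frequent, while the full triple (support 0.4) is dropped in step 1, so only pair rules such as camera → memory card survive step 2.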

3.2. Evaluation Index Based on Data Mining Technology

Collaborative filtering [24] is one of the most widely used methods in course recommendation systems. Suppose we want to find the type of course that user A is going to take or tends to take. We first find a user B whose interests are similar to those of user A and then recommend to user A the courses that user B has taken. The usual practice is to calculate the distance between users from their course records and then to predict user A's preference for course a from the weighted course evaluations of user A's nearest users, recommending courses to user A according to the predicted degree of preference. This is easy to understand: in daily life, we often choose a type of course by referring to the views of good friends or to the courses they have taken, and the friend here is user B, a user with interests similar to ours [25]. Collaborative filtering is divided into item-based and user-based variants; the difference lies in whether distances are computed between courses or between users. The advantage of collaborative filtering is that it can discover a user's potential needs and preferences rather than relying only on the user's course history. Course characteristics are not needed, so it performs well for courses whose characteristics are difficult to label and describe. However, these approaches are not perfect. Because collaborative filtering relies on users' teaching scores, prediction accuracy is the main way to evaluate it, and this measure is mainly used for systems that need to present predicted scores to the user [26].
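A minimal sketch of the user-based variant described above follows. It assumes cosine similarity as the closeness measure between users and a similarity-weighted average of the nearest users' scores as the prediction; the paper does not specify either choice, and the names `cosine_similarity`, `predict_score`, and the sample ratings are hypothetical.

```python
import math


def cosine_similarity(u: dict, v: dict) -> float:
    """Cosine similarity between two users' course-score dictionaries."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[c] * v[c] for c in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict_score(ratings: dict, user: str, course: str, k: int = 2):
    """Predict `user`'s score for `course` from the k most similar users."""
    neighbours = [
        (cosine_similarity(ratings[user], scores), scores[course])
        for other, scores in ratings.items()
        if other != user and course in scores
    ]
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = [(sim, score) for sim, score in neighbours[:k] if sim > 0]
    if not top:
        return None  # no similar user has evaluated this course
    return sum(sim * score for sim, score in top) / sum(sim for sim, _ in top)


# Hypothetical scores on a 1-5 scale; user A has not yet evaluated course c3.
ratings = {
    "A": {"c1": 5, "c2": 3},
    "B": {"c1": 5, "c2": 3, "c3": 4},
    "C": {"c1": 1, "c2": 5, "c3": 2},
}
predicted = predict_score(ratings, "A", "c3")
```

Because user B's records are closer to user A's than user C's are, the prediction for course c3 is pulled toward B's score of 4 rather than C's score of 2.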

Many indicators assess the difference between the predicted score and the real one, and this difference can be measured by several error metrics. Let $T$ be the test set, $r_{Aa}$ the score given by user A to course a, $\hat{r}_{Aa}$ the score predicted by the system for user A on course a, and $r_{\max}$ and $r_{\min}$ the highest and lowest scores. The formulas for these indicators are as follows:
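As an illustration of such indicators, the sketch below computes three error metrics that are commonly used for score prediction: the mean absolute error (MAE), the root mean square error (RMSE), and the MAE normalized by the score range (NMAE). That the indicators intended here are exactly these three is an assumption, as are the function names and the sample test set.

```python
import math


def mae(pairs):
    """Mean absolute error over (real score, predicted score) pairs."""
    return sum(abs(real - pred) for real, pred in pairs) / len(pairs)


def rmse(pairs):
    """Root mean square error; penalizes large deviations more than MAE."""
    return math.sqrt(sum((real - pred) ** 2 for real, pred in pairs) / len(pairs))


def nmae(pairs, r_max, r_min):
    """MAE divided by the score range, comparable across scoring scales."""
    return mae(pairs) / (r_max - r_min)


# Hypothetical test set T of (real, predicted) scores on a 1-5 scale.
test_set = [(4.0, 3.5), (2.0, 3.0), (5.0, 4.5)]

mae_value = mae(test_set)
rmse_value = rmse(test_set)
nmae_value = nmae(test_set, r_max=5.0, r_min=1.0)
```

The normalized form is the reason the highest and lowest scores are needed: it lets errors on a 5-point scale be compared with errors on a 100-point scale.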

After the samples are calculated by formula (6), a soundness identification model must be established to identify them. H(j) denotes the set of sample-specific pattern recognition types, and I(P) denotes the characteristic identification information of the sample data signal. The pattern recognition process is shown in formula (4).

This formula’s variable represents the characteristic information of the curve of different data signals’ peak values. The variable represents the power series of the characteristic data signal. The variable represents the data cycle regression point value between frames.

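Let $\hat{a} = [a, u]^{T}$, where $\hat{a}$ is the parameter vector to be evaluated.

With the data matrices denoted as $B$ and $Y$, the least-squares estimate is $\hat{a} = (B^{T}B)^{-1}B^{T}Y$. The grey-predicted discrete-time response function can be obtained by solving a differential equation:

\[ \hat{x}^{(1)}(k+1) = \left( x^{(0)}(1) - \frac{u}{a} \right) e^{-ak} + \frac{u}{a}, \]

where $\hat{x}^{(1)}(k+1)$ is the accumulated predicted value, which is restored as

\[ \hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k). \]

The original series $x^{(0)}$ is set as follows:

\[ x^{(0)} = \left( x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n) \right). \]

The first-order accumulation generation module is generated by using the accumulation generation algorithm:

\[ x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i). \]

For this equation, $k = 1, 2, \ldots, n$.

The differential equation formed by the first-order grey module is

\[ \frac{dx^{(1)}}{dt} + a x^{(1)} = u. \]

According to the definition of derivatives, there is

\[ \frac{dx^{(1)}}{dt} = \lim_{\Delta t \to 0} \frac{x^{(1)}(t + \Delta t) - x^{(1)}(t)}{\Delta t}. \]

If expressed in discrete form with $\Delta t = 1$, the derivative term can be written as

\[ \frac{\Delta x^{(1)}}{\Delta t} = x^{(1)}(k+1) - x^{(1)}(k) = x^{(0)}(k+1). \]

Among them, the value of $x^{(1)}$ can only take the average of times $k$ and $k+1$; that is,

\[ z^{(1)}(k+1) = \frac{1}{2} \left[ x^{(1)}(k) + x^{(1)}(k+1) \right]. \]

Substituting these two expressions into the differential equation gives $x^{(0)}(k+1) + a z^{(1)}(k+1) = u$ for $k = 1, \ldots, n-1$, so the $k$-th row of $B$ is $[-z^{(1)}(k+1), 1]$ and the $k$-th entry of $Y$ is $x^{(0)}(k+1)$.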
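The grey prediction steps named in this subsection (first-order accumulation, averaging of neighboring accumulated values, estimation of the parameter vector, the discrete-time response, and restoration of the predicted value) follow the standard GM(1,1) model, which can be sketched as below. The symbols a and u for the two parameters, the function names, and the sample series are assumptions made for illustration; the series is synthetic and is not data from this study.

```python
import math


def gm11_fit(x0):
    """Estimate the GM(1,1) parameters (a, u) by least squares."""
    # First-order accumulated generating series x1(k) = x0(1) + ... + x0(k).
    x1, running = [], 0.0
    for value in x0:
        running += value
        x1.append(running)
    # Background values: average of neighboring accumulated values.
    z = [0.5 * (x1[k - 1] + x1[k]) for k in range(1, len(x0))]
    y = x0[1:]
    # Fit y = slope * z + u with the 2x2 normal equations; slope equals -a.
    m = len(z)
    s_z, s_y = sum(z), sum(y)
    s_zz = sum(v * v for v in z)
    s_zy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * s_zy - s_z * s_y) / (m * s_zz - s_z ** 2)
    u = (s_y - slope * s_z) / m
    return -slope, u


def gm11_forecast(x0, steps):
    """Fitted values for the known points followed by `steps` forecasts."""
    a, u = gm11_fit(x0)  # assumes a != 0, i.e. the series is not constant

    def x1_hat(k):
        # Time response of dx1/dt + a*x1 = u with x1_hat(0) = x0[0].
        return (x0[0] - u / a) * math.exp(-a * k) + u / a

    # Restore predictions of the original series by differencing x1_hat.
    restored = [x1_hat(k) - x1_hat(k - 1) for k in range(1, len(x0) + steps)]
    return [x0[0]] + restored


# Synthetic series growing by 10% per step (illustration only).
series = [100 * 1.1 ** k for k in range(6)]
a_hat, u_hat = gm11_fit(series)
next_value = gm11_forecast(series, steps=1)[-1]
```

On a series with a steady growth rate such as this one, the one-step forecast lands very close to the true next value, which is the situation in which grey prediction is usually considered appropriate.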

However, subjective factors weigh heavily in a user's score for a product. Even if the system can recommend the product to the user, its prediction of the user's score often deviates, which is why most websites offer recommendations of goods rather than predicted evaluations of them.

3.3. Data Sources

The data used here are real teaching data from a university covering one month (November 18 to December 18). The users' behavior is simulated and recorded, and the whole course is evaluated. In this paper, the data are divided into two groups. The first is the mobile user behavior data (tianchi_mobile_recommend_train_user). The second is a subset of all courses (tianchi_mobile_recommend_train_item). First, the user_id field identifies each user. However, because one person can register many accounts on the system for course selection, user_id does not correspond one-to-one to real users. In addition, to protect users' privacy, the field has been desensitized and is not the real user ID. Similarly, item_id corresponds to the course mode, while item_category is the encoding of the teaching mode, which is also desensitized. The behavior_type field corresponds to the users' different behavior types.

4. Results Analysis and Discussion

The algorithm is tested here. Taking physical education courses as an example, the algorithm first obtains the course data in order to understand the users' various behaviors, mainly clicks and collection. A click is the behavior of a user choosing a learning mode. Adding to the collection is the behavior of a user adding a teaching mode to his or her collection. The favorite items are limited to 50, and users can select modes directly from them or delete their contents. The data do not record the behavior of deleting courses from the favorites. The collection has no capacity constraint. Users can select courses directly from the collection or delete them from it, but the data do not record these deletions. The tianchi_mobile_recommend_train_user table is shown in Table 1.

As shown in Table 1, the user's spatial location is represented by a hash value in the field user_geohash. These values are generated from the location with a spatial hash algorithm. To protect the privacy of users, the specific calculation process is kept confidential. The hash algorithm draws on the GeoHash algorithm, and the accuracy of a full hash value is a rectangle of about 150 m × 150 m. A prefix of the hash value represents a larger rectangular grid; for example, a prefix of length 6 corresponds to a rectangle of 1200 m × 600 m. Thus 99suwas corresponds to a 150 m × 150 m rectangle, and 99suwa corresponds to a 1200 m × 600 m rectangle that contains 99suwas. The time field gives the time of the user's operation, accurate to the hour; for example, 2014112517 indicates that a user performed a certain operation between 17:00 and 18:00 on November 25, 2014. In the tianchi_mobile_recommend_train_item table, item_id and item_category have the same meaning as in tianchi_mobile_recommend_train_user, and item_geohash is the hash value of the geographic location of the course. The tianchi_mobile_recommend_train_item table is shown in Table 2. The hash values and their spatial ranges are shown in Table 3.
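The two encodings described above can be handled with a few lines of Python. This is a small sketch: the helper names `parse_hour_stamp` and `shares_cell` are hypothetical, and the hash value 99suwax is an invented neighbor of the 99suwas example used only to show the prefix comparison.

```python
from datetime import datetime


def parse_hour_stamp(stamp: str) -> datetime:
    """Parse an hour-level time field such as '2014112517' (YYYYMMDDHH)."""
    return datetime.strptime(stamp, "%Y%m%d%H")


def shares_cell(hash_a: str, hash_b: str, prefix_len: int) -> bool:
    """True if both location hashes fall in the grid cell named by the prefix."""
    if len(hash_a) < prefix_len or len(hash_b) < prefix_len:
        return False
    return hash_a[:prefix_len] == hash_b[:prefix_len]


moment = parse_hour_stamp("2014112517")   # the hour starting at 17:00 on 2014-11-25

# A full 7-character hash names a small cell; its 6-character prefix names the
# larger cell that contains it.
same_large_cell = shares_cell("99suwas", "99suwax", prefix_len=6)
same_small_cell = shares_cell("99suwas", "99suwax", prefix_len=7)
```

Comparing prefixes in this way gives a cheap test of whether a user and a course are in the same neighborhood without decoding the desensitized hash.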

The following is a preliminary exploration of the users' operation behavior. The table tianchi_mobile_recommend_train_user contains 12256906 records, corresponding to 2876947 courses grouped into 8916 categories, taking item_id and item_category as the counting standard. There are 10000 different user IDs. Within the month, all users browsed courses, 6730 users added courses to their collections, and 8886 users took physical education courses. However, it is worth noting that most users operate on only one course within a month. At the same time, there are many courses that only one user operates on in a month. For example, for 94.99% of the courses, the only behavior_type is 4; i.e., these courses are only carried out. This sparsity of the data itself poses many problems for rule mining. The distribution of the users' operating time is shown in Figure 1.

As shown in Figure 1, user behavior in class has obvious time characteristics. The users' operations are concentrated between 7:00 and 22:00, which indicates that user behavior on mobile terminals occurs mainly during school learning time. Apart from the periods from 1 am to 9 am and from 4 pm to 6 pm, the activity of users is almost the same throughout the day. The explanation is that from 1 am to 9 am most people are resting or going to school, and 4 pm to 6 pm is the time after school. In these two periods, most users do not take physical education courses because of their rest time and environment. To sum up, users usually browse and select courses in their spare time, while the nonrest time is the main time for physical education. The distribution of the users' purchase time is shown in Figure 2.
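A distribution of operations over the hours of the day, of the kind plotted in Figure 1, can be tabulated from the hour-level time field as sketched below. The function name `hourly_counts` and the four sample time stamps are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def hourly_counts(stamps):
    """Count operations per hour of day from YYYYMMDDHH time stamps."""
    counts = Counter(int(stamp[-2:]) for stamp in stamps)
    return [counts.get(hour, 0) for hour in range(24)]


# Hypothetical time stamps in the format of the time field.
sample_stamps = ["2014112517", "2014112517", "2014112609", "2014120122"]
per_hour = hourly_counts(sample_stamps)
busiest_hour = max(range(24), key=lambda hour: per_hour[hour])
```

Running the same count separately for each behavior_type would give per-behavior curves that can be compared against the overall distribution.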

Figure 2 shows the temporal distributions of browsing, collecting, and the other behaviors, together with the overall distribution of the users' operation time. There are clear differences between users in different periods, and among the various types of user behavior, browsing and collecting do not always lead to a physical education course being taken. A comparison between the teachers' scores and the results of the algorithm in this paper is shown in Figure 3.

Then, the algorithm built here is used to evaluate the curriculum. Ten types of physical education classes are selected, and teachers score each class on a 100-point scale; the average score is then calculated. The scores given by professional PE teachers are compared with the evaluation produced by the algorithm to investigate the algorithm's accuracy. The results show that the difference between the algorithm and manual scoring is not significant; the biggest difference occurs in the tenth lesson, with a 5% error. The convergence of the algorithm is also investigated, and the results in Figure 4 show that the algorithm has good convergence and strong stability.

To evaluate the effectiveness of English teaching more objectively, this article provides a detailed shot-level view obtained through manual shot segmentation of the videos. The students' comprehensive performance is then studied and analyzed in depth. The specific data are shown in Figure 5.

The GM(1,1) grey prediction algorithm is used to predict the effect of English teaching. After the average accuracy rate is calculated, with the influence of the English teaching results on the prediction results optimized, the comparison line chart in Figure 6 is obtained, which shows the effect of the prediction results on the English teaching results. The line chart shows that the college English teaching method based on industry needs plays a significant role in promoting the effectiveness of English textbooks.

5. Conclusion

At present, education in China is booming, and teaching in colleges and universities has made rapid progress. Integrating information from multiple disciplines into teaching has promoted teaching in China. Against this background, business English teaching and its quality evaluation mode are studied based on data mining technology. First, the association rule recommendation algorithm in data mining technology is elaborated in detail, and a collaborative filtering algorithm is built on association rules. On this basis, indicators for the automatic evaluation of business English courses are set up. Second, all datasets used in this paper are collected from real teaching. Data mining is carried out with the association rule algorithm, users' series of operations are analyzed, and the algorithm constructs an evaluation of the curriculum. The comparison between the teachers' scores and the algorithm's evaluation indicates that the difference is not significant, with a maximum error of 5%. The convergence of the algorithm is also tested; it converges well and is stable, so the approach has the potential to be popularized.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The author declares no conflicts of interest.

Acknowledgments

This work was supported in part by the Key Project of Humanities and Social Sciences of Anhui Provincial Department of Education in 2020 (SK2020A0677), the 2019 Anhui School-Enterprise Cooperation Demonstration Training Center (2019xqsxzx04), the Anhui Provincial Excellent Offline Open Course in 2019 “English interpretation and translation” (2019kfkc184), Key Projects of Teaching and Research in 2020 (2020xjzdjy03), and the Research Project of Excellent Young Backbone Talents in Colleges and Universities of Anhui Province at Home and Abroad in 2021.