Abstract

To improve the effect of online English teaching, this paper applies sensors and human-computer interaction to English teaching. The paper improves the sensor information with a Kalman filter, combines it with a sensor positioning algorithm to track students during online English teaching, and converts the key points extracted by the skeleton algorithm into coordinates in a spatial rectangular coordinate system with the waist as the origin, yielding a human-computer interaction skeleton model in virtual reality. According to the actual needs of human-computer interaction in English teaching, the paper builds a new English teaching system based on sensors and human-computer interaction and tests its performance. The experiments suggest that the proposed system can effectively improve the effect of English teaching.

1. Introduction

The traditional English teaching mode can no longer meet the needs of English teaching in the information age, especially during the COVID-19 period, and online tutoring has become a new trend. It is therefore necessary to reform English teaching through information technology and smart technology [1]. With the widespread application of computers, the user base has gradually grown beyond the early computer professionals. Generally speaking, different users have different needs, different views of the data, and different ways of solving problems, and even the same user may change over time [2]. A traditional fixed user interface is not targeted: it cannot satisfy the specific needs of a user, let alone needs that change over time. The traditional human-computer interaction system lacks any “flexibility,” and users have to follow the interaction style that the system prescribes [3]. Sometimes the user performs a simple operation whose purpose would be obvious from the user’s previous operations, but the system cannot recognize the real intention and treats it as an ordinary operation. As a result, information does not flow smoothly between the system and the user: the user cannot interact freely with the computer, and the computer cannot handle the user’s requests the way people understand their friends. User-centered human-computer interaction technology is the most effective way to solve these problems. Such a system allows users, to a certain degree, to communicate with the computer in the way most familiar to them; the computer has a certain ability to understand the user and can give specific feedback intelligently even when the intention is not clear enough [4].

A current research focus of human-computer interaction is improving the performance of user interaction when users complete tasks with computers. Improving this performance involves a variety of interface components, such as display media, display content, interface structure and style, and various input mechanisms. These elements are closely related to the user’s perception of, processing of, and reaction to the computer. To achieve the best interaction performance, the interactive system must be built on a full understanding of users. The first thing to study is how people communicate in daily life, including how they express intentions to others, how they receive outside information, and how they respond when asked questions. These expression and response models depend on the type of user and on the user’s experience, abilities, skills, and preferences in the field. At the same time, different users have different computer knowledge and comprehensive capabilities, and various other factors affect their interactions. It is therefore necessary to understand the characteristics of all aspects of user behavior (from behavior to primary perception), establish the corresponding user model, then establish the user model of the corresponding application field, and improve the efficiency of interaction by improving the system’s understanding of people.

Based on the above studies, the paper applies sensors and virtual reality to English teaching and builds a smart system to improve the effect of English teaching.

2. Related Work

The intelligent tutoring system (ITS) is a multidisciplinary field that combines traditional computer-aided instruction (CAI), artificial intelligence (AI), and cognitive science; internationally, it has become a major research area and has achieved convincing results [5]. In the last 10 years in particular, ITS has grown fast abroad and moved from the research stage to the commercial stage [6]. An ITS can establish and maintain a student model. On the one hand, the system can simulate teaching decision-making and provide students with real-time, tailored tutoring; on the other hand, it can automatically analyze the historical data of a group of students in the student model and help teachers objectively understand the overall condition of the students, so that they can adjust their teaching plans and content in time. The student model describes the knowledge level, learning ability, and cognitive features of the students; it is essentially an algorithm-based program that solves actual problems in the way that students do [7].

Human-computer interactive learning systems using multimedia technology are widely used at home and abroad. Human-computer interaction learning can provide students with a safe, predictable, repeatable, relaxed, and lively interactive learning environment. Such a system follows the basic principles of human-computer interaction, scientifically designs learning goals based on related theories, and establishes corresponding human-computer interaction learning activities [8]. At the same time, the human-computer interactive learning environment pays attention to the design of the interactive experience, is committed to letting students participate in interactive operations in an entertaining way, and focuses on the friendliness and fun of the game interface. The color matching and screen display design can attract the attention of different student groups, so that the system mobilizes students’ interest in learning and guides them to learn actively [9]. Literature [10] designed and implemented a three-dimensional drawing model tool, which can outline and model the Lanwei model. Users can interact with gestures and pens. Although its accuracy is not as good as that of professional modeling software, it can satisfy the basic, nonprofessional needs of students. Literature [11] uses an infrared camera to obtain the facial feature data of children, uses a pressure seat to obtain the child’s body posture, and combines these with the child’s learning behavior on the computer to identify the child’s rational state. Literature [12] uses computer vision technology to study joint attention (that is, following and guiding the attention of others). The experiment uses a web camera to detect the learner’s head posture in real time and thereby estimate the learner’s attention. Literature [13] establishes a multichannel learning environment by using video and audio fusion technology. In this learning environment, the video signal of the lips is obtained through the camera and merged with the voice signal obtained by the microphone. The speech recognition result and the handwriting recognition result are combined for human-computer interaction.

The rapid development of multimedia technology has also greatly promoted the progress of the human-machine interface. Human-computer interaction increasingly uses multimedia input and output devices such as microphones and cameras [14]. With the rise and development of emerging disciplines such as cognitive science, artificial intelligence, graphics, and image processing, multimedia-based multichannel human-computer interaction has gradually become a research hotspot. Virtual reality technology is a computer system that can create a virtual world and let users experience it. It uses computer technology to generate a realistic virtual environment with multiple perceptions such as sight, hearing, and touch. The user uses various interactive devices to interact with the entities in the virtual environment, which produces an immersive feeling. This interactive visual simulation and information exchange is an advanced digital human-computer interface technology, and its appearance will change the way people live [15]. In research on virtual reality technology, literature [16] searched the WPI database and counted the country and regional distribution of related patents. From the perspective of the development of human-computer interaction, the overall trend is towards more natural and more intelligent interaction. It is believed that future human-computer interaction design will bring people a more relaxed and comfortable life [17].

3. Sensor Intelligent Fusion Algorithm

The extended Kalman filter (EKF) is a suboptimal filter that performs Kalman filtering after linearizing a nonlinear system.

Assuming that the system equation is represented by a discrete equation that remains constant under stochastic nonlinear conditions, the equation of state described above and the sensor measurement equation of the moving student can be expressed as follows [18]: in which and represent the Gaussian white noise sequence.
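As a point of reference, a conventional discrete-time form of such a state equation and measurement equation, together with the first-order Taylor linearization that the EKF relies on, can be sketched as below. The symbols ($x_k$, $u_k$, $z_k$, $f$, $h$, $w_k$, $v_k$, $Q_k$, $R_k$, $F_k$, $H_k$) follow common textbook usage and are assumptions; they are not necessarily the notation of the numbered formulas in this paper.

```latex
\begin{aligned}
x_{k+1} &= f(x_k, u_k) + w_k, & w_k &\sim \mathcal{N}(0, Q_k), \\
z_k &= h(x_k) + v_k, & v_k &\sim \mathcal{N}(0, R_k), \\
f(x_k, u_k) &\approx f(\hat{x}_k, u_k) + F_k\,(x_k - \hat{x}_k), &
F_k &= \left.\frac{\partial f}{\partial x}\right|_{x=\hat{x}_k}, \\
h(x_k) &\approx h(\hat{x}_{k|k-1}) + H_k\,(x_k - \hat{x}_{k|k-1}), &
H_k &= \left.\frac{\partial h}{\partial x}\right|_{x=\hat{x}_{k|k-1}}.
\end{aligned}
```

Here $w_k$ and $v_k$ play the role of the two Gaussian white noise sequences, and $F_k$ and $H_k$ are the Jacobian matrices of the state and measurement functions.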

And , and , assuming that can be expanded by the Taylor formula as follows.

is the Jacobi matrix of function on , then [19]:

Similarly,

If the Kalman filter incorporates several readings of the above ranging sensors, then can be expressed as in which is the distance from moving student sensor to a point on the wall in the environment; thus, the line segment characteristics of the global coordinates in the environment can be expressed as [20]:

The Kalman algorithm is as follows.

(1) Prediction (testing) phase

In the above formula, is the variance matrix of , and is the Jacobian matrix of .

(2) Observation phase

From formula (7) and formula (10), the predicted measurement equation can be obtained:

We use the predicted error of the sensor measurements to correct the predicted equation of state , and we can obtain the predicted error of the measurements , which can be expressed as

The measurement error variance is

is the Jacobian matrix of in the measurement Equation (7) [21]:

(3) Estimation phase

It is seen from the above two steps that is the predicted value of the state that is realized by extending the Kalman filter to obtain the prediction error information based on the measured value. Among them, the measured variance represents the corrected weight factor.

If the extended Kalman filter gain matrix is , state is estimated as , and its covariance matrix is , and the estimation value can be expressed in the following formula:

In conclusion, the recursive formula of the EKF comprises the prediction (test), observation, and estimation phases. To obtain the optimal estimate for the system, we assume that the initial input noise of the system and the measurement noise generated during measurement are uncorrelated with the system, so that both can be treated as uncorrelated Gaussian white noise.
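The three phases can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the state is taken to be a planar pose (x, y, theta), the control input is an odometry increment (d, dtheta), the measurement is a single scalar range to a wall given in normal form (rho, alpha), and the sonar is assumed to sit at the body centre. The function names `ekf_predict` and `ekf_update` and all parameter names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def ekf_predict(x, P, u, Q):
    """Prediction phase: propagate the pose and its covariance.

    x = [px, py, theta], u = (d, dtheta) is an assumed odometry increment,
    Q is the assumed 3x3 process noise covariance.
    """
    px, py, th = x
    d, dth = u
    x_pred = [px + d * math.cos(th), py + d * math.sin(th), th + dth]
    # Jacobian of the motion model with respect to the state
    F = [[1.0, 0.0, -d * math.sin(th)],
         [0.0, 1.0, d * math.cos(th)],
         [0.0, 0.0, 1.0]]
    FPFt = matmul(matmul(F, P), transpose(F))
    P_pred = [[FPFt[i][j] + Q[i][j] for j in range(3)] for i in range(3)]
    return x_pred, P_pred


def ekf_update(x_pred, P_pred, z, wall, r_var):
    """Observation and estimation phases for one scalar range reading z.

    wall = (rho, alpha) describes the line x*cos(alpha) + y*sin(alpha) = rho;
    r_var is the assumed variance of the range measurement.
    """
    rho, alpha = wall
    px, py, _ = x_pred
    # Observation phase: predicted measurement and innovation
    z_pred = rho - px * math.cos(alpha) - py * math.sin(alpha)
    H = [-math.cos(alpha), -math.sin(alpha), 0.0]  # measurement Jacobian (one row)
    nu = z - z_pred
    PHt = [sum(P_pred[i][j] * H[j] for j in range(3)) for i in range(3)]
    S = sum(H[i] * PHt[i] for i in range(3)) + r_var  # innovation variance
    # Estimation phase: gain, corrected state, corrected covariance
    K = [PHt[i] / S for i in range(3)]
    x_est = [x_pred[i] + K[i] * nu for i in range(3)]
    P_est = [[P_pred[i][j] - K[i] * S * K[j] for j in range(3)] for i in range(3)]
    return x_est, P_est
```

Because the measurement is scalar, the innovation variance `S` is a plain number and no matrix inversion is needed; with several simultaneous sonar readings, `S` would become a matrix and the gain would require its inverse.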

The sources of error in information fusion are summarized as follows:

(1) Odometer system

An odometer is a sensor that uses data from motion sensors to estimate the change in an object’s position over time; it is usually installed inside the body of the moving student. Its working principle is to first use the pattern or coding information on the code disk to obtain the rotation angle (in radians) of the student’s left and right wheels, from which the forward direction and speed changes of the student can be calculated.

According to the above, the odometer can provide the linear speed and angular speed of the moving student; the odometer movement model is shown in Figure 1. We treat the student as a mobile robot. On this basis, assume that the student radius of movement is , the photoelectric code disk has line/turn, and, in a cycle time , the movement distance of the left wheel is , the movement distance of the right wheel is , the movement distance of the car body is , and the number of output pulses of the student’s photoelectric code disk is ; the movement distance of the car body is then

By the nonlinear regression analysis, it can be assumed that in a sampling cycle time , the movement track of the moving student is almost a straight line. If the distance between the left and right wheels is , and then, the moving student is moved from the position to the position , the distance that the moving student moved is , the angle information that the moving student rotated is , and the odometer model input is in the formula obtained by the control command (). During a sampling cycle , the motion path of the moving student is , and the pose relative angle is .

The odometer has two models: circular arc and straight line. Since the linear model is a simplification of the arc model, the arc model, which is more accurate, is generally used in practice.

(2) Arc model

in formula (17) indicates the rotation angle difference between the end point and the beginning point of the moving student, in which is the movement displacement of the moving student during the sampling period ΔT.

(3) Line model

Assuming that the moving student makes a small displacement in a very small period of time, the limit method of nonlinear regression analysis can be used for processing, and the model becomes a linear model, which can be represented by simplified lines, namely, ; the model equation can then be expressed as

To calculate the position of the moving student, this paper uses both the linear model and the arc model. Specifically, the displacement calculation adopts the straight-line model, and the arc model is used to calculate the change in direction angle. The two models are then represented as

At this moment, if the linear speed and angular speed of the moving student are known, formula (19) can be transformed as
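The combined odometer model described above (straight-line displacement, arc-based heading change) can be sketched as follows. This is an illustrative sketch under assumptions, not the numbered formulas of the paper: the "radius of movement" is interpreted as the wheel radius of a differential-drive body, and the names `wheel_distance`, `odometry_step`, `odometry_step_from_speeds`, `wheel_base`, and `lines_per_turn` are hypothetical.

```python
import math


def wheel_distance(pulses, wheel_radius, lines_per_turn):
    """Distance rolled by one wheel, from encoder pulses counted in one sampling period."""
    return 2.0 * math.pi * wheel_radius * pulses / lines_per_turn


def odometry_step(pose, d_left, d_right, wheel_base):
    """Advance pose (x, y, theta) by one sampling period.

    The displacement uses the straight-line model; the heading change
    uses the arc model, (d_right - d_left) / wheel_base.
    """
    x, y, theta = pose
    d = 0.5 * (d_left + d_right)               # displacement of the body centre
    d_theta = (d_right - d_left) / wheel_base  # change of direction angle
    return (x + d * math.cos(theta), y + d * math.sin(theta), theta + d_theta)


def odometry_step_from_speeds(pose, v, omega, dt):
    """Same update when the linear speed v and angular speed omega are known."""
    x, y, theta = pose
    return (x + v * dt * math.cos(theta),
            y + v * dt * math.sin(theta),
            theta + omega * dt)
```

For example, `odometry_step((0.0, 0.0, 0.0), 0.9, 1.1, 0.5)` moves the body about one unit forward along its current heading and turns it by (1.1 - 0.9) / 0.5 radians.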

The errors of the moving student odometer are of two types, namely, nonsystematic and systematic errors (as shown in Figure 2). The systematic error always accumulates, while the accumulation of the nonsystematic error is random and indeterminable.

(4) Sonar calculation method

Generally, most sonar applications adopt the inexpensive Polaroid 6500 sonar module. The sonar sensor scans the environment for landmark features, extracts useful environmental feature information, and calculates the specific location of the environmental features. Assuming that the coordinate position of the line segment in the local coordinate system is , the predicted coordinate position for observation can be estimated. The width of the sonar beam and the reflection of the sonar signal are the important factors affecting the sonar sensor. Figure 3 shows the coordinate system transformation diagram.

In Figure 3, the point is the origin in this local coordinate system that represents the moving student, the point represents the wall in the moving student walking area, is the sonar sensor, and represents the offset angle of the on-board coordinate relative to the coordinate OXY. The degree of the sonar is represented by the point , and the distance to the point is , and the relative position of each of the sonar is fixed.

Figure 4 shows the interrelationship between the global and local coordinate systems for moving students. The width of the sonar beam and the reflection of the sonar signal are two factors affecting the principle of the sonar sensor.

Assuming that in the moving student environment, the reflected wall is represented by , the distance from the origin to the plane and the distance between the plane and the sonar sensor in the environment are represented by . Since the relative position information of the sonar in the local coordinate system is known, which can be represented by , formula (22) and formula (23) can be used to calculate and coordinate position of and point on the wall in the environment.

In the figure above, is the number of planes, is the number of sonar sensors, and the distance from the obstacle plane to the sonar at the moment is , which can be simplified as

Set , , in , is the sonar beam width, and may be represented based on formula (21) above as

According to Figure 5, assuming that we sample at the time, the position of the moving student is . After a rotating coordinate transformation, we can convert the coordinates of in the ’ system to in the coordinate system; the model of the sonar sensor is then expressed by the following formula:

As the point on the wall in the sonar coordinate system in Figure 5, the point , , is on the same line, and the position detected by moving the student odometer and the sonar sensor can be represented as
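The coordinate transformation and range prediction described in this subsection can be illustrated with a short sketch. It is written under assumptions rather than taken from the paper: the wall is described in normal form (rho, alpha), meaning the line x*cos(alpha) + y*sin(alpha) = rho in global coordinates, each sonar has a fixed offset in the on-board frame, and the function and parameter names are hypothetical.

```python
import math


def sonar_global_position(pose, sonar_offset):
    """Rotate and translate a sonar's fixed on-board offset into global coordinates.

    pose = (x, y, theta) is the body pose in the global frame;
    sonar_offset = (sx, sy) is the sonar position in the on-board (local) frame.
    """
    x, y, theta = pose
    sx, sy = sonar_offset
    gx = x + sx * math.cos(theta) - sy * math.sin(theta)
    gy = y + sx * math.sin(theta) + sy * math.cos(theta)
    return gx, gy


def predicted_range(pose, sonar_offset, wall):
    """Expected perpendicular distance from the sonar to the wall (rho, alpha)."""
    rho, alpha = wall
    gx, gy = sonar_global_position(pose, sonar_offset)
    return rho - gx * math.cos(alpha) - gy * math.sin(alpha)
```

The predicted range can then be compared with the measured range to form the measurement prediction error used in the observation phase of the filter. Beam width and specular reflection, which the text names as the main factors affecting the sonar, are deliberately ignored in this sketch.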

The mobile student must rely on collected external environment information to achieve accurate navigation and positioning. In a real environment, however, the environment itself contains various intertwined uncertainties, so theoretically reliable environmental information is not available. In addition, the data measured by the mobile student’s own sensors are disturbed by the environment. As a result, the measured environmental information is not ideal and often contains extremely complex uncertain factors. To reduce or even eliminate this interference in the observations of the mobile student’s sensors, we could establish a reliable noise simulation system to express these uncertain factors. However, this is only an ideal. In practice, because the moving student is a nonlinear system with nonholonomic constraints, it is not easy to establish an efficient and accurate mathematical model. In this case, we approximate it with an error model that introduces Gaussian-distributed noise.

In this paper, the sonar sensor is used to detect landmarks (road signs) in the environment. Owing to the characteristics of the sonar sensor, if an observation point is not selected correctly, the corresponding environmental feature information cannot be associated correctly with the data, which causes the filter to diverge. Once the filter diverges, all the predicted observations are wrong. To make the landmark data scanned by the sonar sensor match correctly, a data association model can be introduced.

The analysis shows that the position inferred by the mobile student from the odometer and sonar sensor alone is inaccurate, so a correct position estimate cannot be obtained directly. In this paper, a two-dimensional landmark vector is selected as the observation of position in the global coordinate system. The environment is variable and complex, but the environmental characteristics are correlated to some degree; the problem to solve is how to eliminate the interference caused by the correlation between these variables. A large number of studies have verified that the Mahalanobis distance eliminates this interference well, so it is selected as the measure for data association in this article. In addition, the correlation gate used in this article is an elliptical gate.

The Mahalanobis distance is defined as , where represents the mean distance of the measured value within group , the vector in the formula represents the mean of the variable in group , represents the value of the variable in the environment observed by the sensor model, and represents the covariance matrix within the group.
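Data association with a Mahalanobis distance and an elliptical gate can be sketched as follows for two-dimensional landmark observations. The sketch uses the standard squared form d^2 = (z - z_pred)^T S^{-1} (z - z_pred); the gate value 5.991 (the 95% chi-square quantile for two degrees of freedom) is an assumed choice, not a value stated in the paper, and the function names are hypothetical.

```python
GATE_95_2DOF = 5.991  # assumed gate: 95% chi-square quantile, 2 degrees of freedom


def mahalanobis_sq_2d(z, z_pred, S):
    """Squared Mahalanobis distance between 2-D vectors z and z_pred.

    S is the 2x2 covariance matrix given as a list of rows.
    """
    dx, dy = z[0] - z_pred[0], z[1] - z_pred[1]
    a, b, c, d = S[0][0], S[0][1], S[1][0], S[1][1]
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    # S^{-1} = (1 / det) * [[d, -b], [-c, a]]
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def associate(z, landmarks, S, gate=GATE_95_2DOF):
    """Return the index of the closest landmark inside the elliptical gate, or None."""
    best, best_d2 = None, gate
    for i, landmark in enumerate(landmarks):
        d2 = mahalanobis_sq_2d(z, landmark, S)
        if d2 <= best_d2:
            best, best_d2 = i, d2
    return best
```

An observation that falls outside the gate of every known landmark returns `None`, so it can be discarded instead of being fed to the filter, which is one way to avoid the divergence caused by wrong associations.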

4. English Teaching System Based on the Sensor and Human-Computer Interaction

The key points produced by the skeleton extraction algorithm are uniformly converted into coordinates in a spatial rectangular coordinate system with the waist as the origin. The direction to the right of the body is the x-axis, the vertical upward direction is the y-axis, and the direction in front of and perpendicular to the body is the z-axis, as shown in Figure 6.
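Re-expressing the skeleton in a waist-centred frame amounts to subtracting the waist position from every joint. The sketch below assumes that the joints are already given in a frame whose axes point to the right of the body, vertically up, and forward (labelled x, y, z here as an assumption), and that the waist joint is stored under the hypothetical key "waist".

```python
def to_waist_frame(joints, waist_name="waist"):
    """Re-express joint positions relative to the waist joint.

    joints maps a joint name to an (x, y, z) tuple; the returned dict has
    the same keys, with the waist at the origin (0, 0, 0).
    """
    wx, wy, wz = joints[waist_name]
    return {name: (x - wx, y - wy, z - wz)
            for name, (x, y, z) in joints.items()}
```

For instance, a head joint located 0.5 units directly above the waist maps to (0.0, 0.5, 0.0) regardless of where the person stands in the camera frame.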

The paper proposes a human-computer interaction method based on a monocular camera. The human skeleton is first extracted from the video taken by the monocular camera, and the interaction with the virtual environment is then completed through the human body pose. The process is shown in Figure 7.

After the current frame is extracted, the picture is first preprocessed. Human profiles in the video vary in size, and if the body occupies only a small part of the frame, the precision of the subsequent key point detection is greatly reduced. At the same time, the large amount of irrelevant information in the frame introduces useless information into network training and affects the precision of the recognition result. Therefore, we need to find the boundary of the human profile and crop it from the frame before recognition, in order to improve precision in the following stages. There are many ways to extract the profile from RGB images. Because the video in this paper is shot by a fixed camera with only one person in the frame, the background is fixed, and we use background subtraction to extract the human profile from the video; the principle is shown in Figure 8.
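The preprocessing idea (subtract a fixed background, find the boundary of the human profile, crop it) can be sketched on grey-level images stored as nested lists. This is a simplified illustration, not the pipeline used in the paper: the threshold of 30 grey levels, the use of a single static background image, and the function names are all assumptions, and a practical system would typically work on camera frames with an image-processing library.

```python
def foreground_mask(frame, background, threshold=30):
    """Mark pixels whose grey level differs from the background by more than threshold."""
    return [[abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]


def bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, of the foreground, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for row in mask for c, is_fg in enumerate(row) if is_fg]
    return rows[0], min(cols), rows[-1], max(cols)


def crop(frame, box):
    """Cut the bounding box out of the frame."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in frame[top:bottom + 1]]
```

Cropping to the bounding box before key point detection keeps the person at a comparable scale across frames, which is the motivation given in the text for this step.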

The paper builds an English teaching system based on sensor and human-computer interaction technology and tests its performance. According to actual needs, the test is based on real-time information transmission in English teaching through the sensor and human-computer interaction. A simulation experiment is used to test the performance of the system, namely the transmission effect of the sensor and the human-computer interaction effect (by expert appraisal); the statistics are shown in Table 1 and Figure 9.

The above experiment shows that the proposed English teaching system based on sensors and human-computer interaction can effectively improve the effect of English teaching and can have a positive influence on subsequent English teaching reform.

5. Conclusion

In the teacher-centered link, teachers play the roles of traditional instructor, teaching monitor, after-school tutor, homework assigner and corrector, and document provider. With the further development of teaching-related theory and interaction design theory, human-computer interaction between teachers and students has attracted more and more attention, from the theoretical feasibility of human-computer interaction in English teaching to the theoretical foundation for creating a learning environment. To improve the effect of human-computer interaction in English teaching, the paper studies sensor technology and proposes a Kalman filter algorithm that emphasizes precision to address the problems in human-computer interaction information through positioning based on sensor information. Virtual reality is used to build an online interactive English teaching system. The system undergoes a performance test, and the study suggests that the proposed system is effective to a certain extent.

Data Availability

The labeled dataset used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest

The authors declare no competing interests.

Authors’ Contributions

Yuanyuan Dong completed the analysis and thesis writing as the main writer. Shuai Jiang and Lei Wang completed the collection and sorting of experimental data and jointly participated in the revision, polishing, and typesetting of the article, so I decided to add them to the list of authors of the article. Shuai Jiang and Lei Wang assisted in the analysis through constructive discussions based on their professional expertise.

Acknowledgments

This study is sponsored by “The 2020 project of the 13th Five-Year Plan of Educational Science in Shaanxi Province, China (grant no. SGH20Y1131)”.