Abstract

To improve the recognition rate of human action in dance video images, shorten the recognition time, and ensure reliable recognition of dance motion, this study proposes a human motion recognition method for dance video images. The method uses neural network theory to transform the human action posture in the dance video image, constructs a hybrid model of human motion feature pixels from the feature points of the human action in the image coordinate system, and extracts the human motion features. A background probability model of the human action image is used to sum the variance of the human action feature function and to update that function, and a Kalman filter is used to detect human action in the dance video image. Multiposture action image features are then obtained from a linear combination of human action features. Combined with the feature distribution matrix, the human action features are processed through pose transformation to obtain the human action feature model, which is used to identify the human action in the dance video image. In experiments on the AIST++ dance video dataset, the proposed method achieves an average recognition rate of 88.7% across five dance action types and 90.4% over 400 MB of data, and it recognizes the 400 MB of data in 40 s.

1. Introduction

The advent of the intelligent era has promoted the rapid development of computer image processing technology, and human action recognition has long been a hot topic in computer vision [1]. The purpose of human action recognition is to preprocess video images, extract the human motion features they contain, classify those features, and thereby recognize the action. This technology has been widely used [2, 3]. Applying computer vision to dance video images makes it possible to recognize the dancer's actions and to correct wrong actions in time, which is of great significance for improving the quality of dance training.

Scholars in related fields have studied video image recognition technology and achieved some results. Guo et al. [4] proposed a human action feature recognition technology based on image feature similarity. They first use an image analysis method to recognize actions in the video image and then apply dimension reduction to obtain a new group of human action representation models. They next use image feature similarity for a secondary recognition analysis, calculate the similarity of two images through adaptive image analysis, and finally obtain the recognition results of human action features by weighted processing. This method can effectively improve the recognition accuracy of similar actions in human action video. Yu and Min [5] proposed a human action recognition algorithm based on an improved time network. They first extract human action features with the improved time network, construct a human recognition model through a neural network, use a CNN framework for network fusion, and analyze the characteristics of the neural network. They then use the same structure as the spatial network for weighted summation, obtain a new set of feature vectors, and iterate on the processing results. Finally, they obtain a new set of human action features and derive the classification result from the two groups of human motion features. This method has a good recognition effect. Based on the above analysis, this paper proposes a method of human action recognition in dance video images, which provides a reference for further improving the recognition rate and shortening the recognition time of human action in dance video images.

2. Design of Human Motion Recognition Technology in Dance Video Image

2.1. Extraction of Human Action Features from Dance Video Image

In the design of this method, the human action features in the dance video image are extracted using neural network theory, and the feature points of the extracted feature image are classified. On this basis, the posture transformation of the human action target in the dance video image can be expressed as

In the above formula, represents the filtering result of the human action image in the dance video image after processing, represents the human dance target action image, and represents the filtering combination result. By processing the above results, the definition of order matrix human action image in dance video image is obtained, which is expressed as

In formula (2), the center coordinate of the dance video image is .

Through the iterative processing of human action image in dance video image [6–8], the feature point in the coordinate system of human dance action image is obtained, and the pixel probability within the moment is

In this formula, represents the pixel value of the human action in the dance video image at time . It is assumed that the actual probability of the pixel of the human motion image is at any time . The mixed model of human action feature pixel in the dance video image at time is as follows:

In the above formula, represents the number of recognition models of human action image in dance video image. The weight of feature vector of human action image in dance video image is at time . When , the characteristic vector value of human action image is , and the variance matrix of human action features in dance video image is at time [9–11].
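The mixed pixel model described above, with a number of component models, per-component weights, characteristic values, and variances, reads like a weighted mixture of Gaussian densities. The following minimal sketch assumes exactly that form for a single grayscale pixel value. The function names, the scalar (rather than matrix) variance, and the numeric values are illustrative assumptions, not the paper's notation or data.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance at x."""
    if var <= 0.0:
        raise ValueError("variance must be positive")
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_pixel_probability(x, weights, means, variances):
    """Weighted sum of the component Gaussian densities for one pixel value."""
    if not (len(weights) == len(means) == len(variances)):
        raise ValueError("weights, means and variances must have equal length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("component weights must sum to 1")
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


# Hypothetical two-component model for one pixel location (values are made up).
weights = [0.7, 0.3]
means = [100.0, 180.0]
variances = [25.0, 100.0]

p_typical = mixture_pixel_probability(100.0, weights, means, variances)
p_unusual = mixture_pixel_probability(20.0, weights, means, variances)
print(p_typical > p_unusual)
```

A pixel value close to one of the component means receives a much higher probability than a value far from all of them, which is what makes such a model usable for separating background from moving foreground.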

Through weight analysis, the fitness value of individual action image in the dance video image can be obtained, and suitable human action feature distribution model can be found in the dance video image. At any time , in the dance video image, the probability value of the action image pixel of the human action is expressed as follows:

In this formula, and represent the feature vectors of the coordinate point of the human action image in the dance video image, in which the probability value of the coordinate point is , and represents the variance matrix of the human action image in the dance video image.

According to the pixel information of the human action image in the obtained dance video image, the judgment pixel can be obtained, which is as follows:

In formula (6), represents the characteristic distribution function of the pixel value HB of the human action image, represents the characteristic distribution function of the pixel value HF of the body action image. When , the human action image feature can be used as foreground pixel. When , the human action image can be used as the background pixel [12–14].

After obtaining the background of the human action image in the dance video image, comparing the human action image in the dance video image with the standard motion image, the features of the human action in the dance video image can be obtained, which can be expressed as

In the above formula, represents the normalization processing result of the human action image in the dance video image, represents the characteristic value of the human action image, represents the center pixel of the human action image in the dance video image. The feature of the human action in the dance video image can be obtained by formula (7), so as to realize the extraction of the human action features.

2.2. Detection of Human Action in Dance Video Image

Assuming that the variance of the gray value distribution of the human movements in the dance video image is , all the gray values meet the expected value of the human action image and the Gaussian distribution of the human action in dance image at this time [15–17]. Then, the background probability model of human action image in dance video image can be obtained by using the following formula:

In formula (8), represents the normal distribution of human action image features, and represents the feature vector of gray value of human action. Assuming that , the variance summation of the human action feature function in the dance video image can be expressed as

Then, in frame , the updated variance of the human action feature function in the dance video can be expressed as

In the above formula, represents the statistic mean value of human action image in the dance video image of frame , represents the variance of human motion feature vector, represents the Gaussian model feature function, represents the Gaussian model variance of human action, represents the sample number of human action, represents the gray value of human action feature points, and represents the feature update rate.
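The quantities listed above (a per-frame mean, a Gaussian variance, a gray value, and a feature update rate) match the ingredients of a running single-Gaussian background model. The sketch below assumes one common form of that update, an exponentially weighted mean and variance, for a single pixel. The class name, the parameters `update_rate` and `k`, and the k-sigma foreground test are illustrative assumptions and are not taken from formulas (9) and (10).

```python
import math


class RunningGaussianBackground:
    """Running single-Gaussian background model for one grayscale pixel."""

    def __init__(self, initial_mean, initial_var, update_rate=0.05, k=2.5):
        if initial_var <= 0.0:
            raise ValueError("initial_var must be positive")
        if not 0.0 < update_rate <= 1.0:
            raise ValueError("update_rate must be in (0, 1]")
        self.mean = initial_mean
        self.var = initial_var
        self.update_rate = update_rate
        self.k = k

    def is_foreground(self, gray_value):
        """Flag the pixel as foreground if it lies more than k sigmas from the mean."""
        return abs(gray_value - self.mean) > self.k * math.sqrt(self.var)

    def update(self, gray_value):
        """Blend the new gray value into the mean, then into the variance."""
        a = self.update_rate
        self.mean = (1.0 - a) * self.mean + a * gray_value
        # The variance update uses the deviation from the already updated mean.
        self.var = (1.0 - a) * self.var + a * (gray_value - self.mean) ** 2


# Hypothetical pixel: background gray level near 100 (values are made up).
model = RunningGaussianBackground(initial_mean=100.0, initial_var=16.0, update_rate=0.1)
for gray in [101.0, 99.0, 200.0, 100.0]:
    moving = model.is_foreground(gray)
    if not moving:
        model.update(gray)  # only background-like values refresh the model
    print(gray, moving)
```

Updating only on background-like values is a design choice in this sketch: it keeps a dancer who passes over the pixel from being absorbed into the background statistics.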

In the analysis of human action in dance video image, in order to detect the relationship between human action features and basic feature points, this study carries out the target comparison of human action image in dance video image by using Kalman filter [18]. Assuming that the action amplitude of the human action image in the first dance video image is , then the feature of detecting the human action in the dance video image can be expressed as

In formula (11), represents the structure parameter in the dance video image, and represents the detection variance value of human action in the -th dance video image. Combined with the above formula, the detection of human action in dance video image can be completed.
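To make the role of the Kalman filter in the detection step concrete, the following sketch shows a scalar Kalman filter that smooths a noisy sequence of measurements, for example one coordinate of a tracked feature point across frames. It assumes a random-walk state model. The class name, the noise variances, and the measurement values are hypothetical and do not reproduce formula (11).

```python
class ScalarKalmanFilter:
    """1-D Kalman filter with a random-walk state model."""

    def __init__(self, initial_estimate, initial_variance,
                 process_variance, measurement_variance):
        if measurement_variance <= 0.0:
            raise ValueError("measurement_variance must be positive")
        self.estimate = initial_estimate
        self.variance = initial_variance
        self.process_variance = process_variance
        self.measurement_variance = measurement_variance

    def step(self, measurement):
        """Run one predict/update cycle and return the new state estimate."""
        # Predict: the state is assumed unchanged, but uncertainty grows.
        predicted_estimate = self.estimate
        predicted_variance = self.variance + self.process_variance
        # Update: blend prediction and measurement using the Kalman gain.
        gain = predicted_variance / (predicted_variance + self.measurement_variance)
        self.estimate = predicted_estimate + gain * (measurement - predicted_estimate)
        self.variance = (1.0 - gain) * predicted_variance
        return self.estimate


# Hypothetical noisy measurements of one feature-point coordinate (made up).
measurements = [10.2, 9.8, 10.1, 9.9, 10.0]
kalman = ScalarKalmanFilter(initial_estimate=0.0, initial_variance=1.0,
                            process_variance=1e-4, measurement_variance=0.25)
smoothed = [kalman.step(z) for z in measurements]
print(smoothed)
```

Each step moves the estimate toward the measurement by a fraction given by the gain, and the estimate variance shrinks as more frames are observed, which is how the filter suppresses frame-to-frame noise.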

2.3. Recognition of Human Action in Dance Video Image

In order to recognize human action in dance video image, the mathematical model is established as follows:

By connecting the human action samples in the dance video image, the recognition model of the human action in the dance video image can be obtained. The coordinate point of this model is , and . The number of human action samples in this model is , and the human action component in the dance video image is , and then . represents the behavior classification of human action. It can be obtained through normalization processing of all human actions in dance video image [19, 20] and also can be used to describe the linear combination of human action features, which is as follows:

Using the above analysis method, both the human action matrix of the dance video image and the human action image features can be obtained:

In the above formula, represents the mean value for classification of human action behavior in dance video image, and represents the human action matrix in dance video image. According to the human action feature vector [21], the multiposture features of the human body can be obtained:
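As an illustration of how a per-class mean and a matrix of action features can be used, the sketch below computes the mean feature vector of each action class, centers the samples on that mean, and assigns a new feature vector to the class with the nearest mean. Nearest-mean assignment is a stand-in chosen for this example and is not stated to be the paper's classifier. All names and the two-dimensional feature values are hypothetical.

```python
import math


def class_means(samples_by_class):
    """Mean feature vector of each action class."""
    means = {}
    for label, samples in samples_by_class.items():
        dim = len(samples[0])
        means[label] = [sum(s[i] for s in samples) / len(samples) for i in range(dim)]
    return means


def center(samples, mean):
    """Subtract the class mean from every sample (rows of a centered matrix)."""
    return [[value - m for value, m in zip(sample, mean)] for sample in samples]


def nearest_class(feature, means):
    """Label of the class whose mean is closest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(feature, means[label]))


# Hypothetical 2-D action features for two dance classes (values are made up).
samples_by_class = {
    "ballet": [[0.0, 0.0], [2.0, 0.0]],
    "latin": [[10.0, 10.0], [12.0, 10.0]],
}
means = class_means(samples_by_class)
centered_ballet = center(samples_by_class["ballet"], means["ballet"])
print(nearest_class([1.5, 0.5], means))
```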

In formula (14), represents the offset matrix of human action in the dance video image; and are the state variables in the changing process of human dance posture. Based on the feature distribution matrix , the human action feature model of dance video image can be obtained by carrying out pose transformation of human dance action features. The formula is as follows:

In the above formula, represents the feature points of human action in dance video image [22]. By extracting the state variables of human dance action image, the feature vector of human dance action and the feature components and of human dance action can be obtained. Through the above steps, the recognition of human action in dance video image can be realized.

3. Experimental Analysis

In order to verify the effectiveness of human action recognition technology in dance video image in practical application, this experiment chooses the AIST++ dance video dataset and uses MATLAB simulation software as the experimental platform to carry out simulation verification.

In the AIST++ dance video dataset, the most common dance actions include belly dance action, Latin dance action, ballet dance action, national dance action, and folk dance action. In this experiment, the method proposed in this paper and the methods in literature [4] and literature [5] are each used for recognition, and the recognition effects of the different methods on dance action are compared, as shown in Figure 1.

According to Figure 1, for the belly dance action, the methods in literature [4] and literature [5] cannot accurately identify the image details, while the method proposed in this paper accurately identifies the image contour. For the Latin dance action, the contour edges recognized by the methods in literature [4] and literature [5] are incomplete, while those recognized by the proposed method are complete. For the ballet dance action, the hand and foot areas recognized by the methods in literature [4] and literature [5] are deformed, while the proposed method restores the original image shape. For the national dance action, the methods in literature [4] and literature [5] still cannot identify the details of the hand area, while the proposed method accurately restores the original shape of the hand area. For the folk dance action, the results of both the methods in literature [4] and literature [5] deviate from the original image, while the proposed method avoids these deviations and accurately identifies the image edges. Based on the above analysis, the recognition effect of the proposed method is better. This is attributed to the use of the Kalman filter to detect human action in the dance video image, which removes image noise and thereby supports the recognition of dance action.

During the experiment, the neural network method extracts the dance video images in the AIST++ dance video dataset frame by frame, and the descriptor matrix is taken as the training sample. Using the dance action samples collected from these images, the experiment tests the recognition rate of the proposed method for the five dance action types in the AIST++ dance video dataset. The results are shown in Figure 2.

According to Figure 2, the proposed method achieves high recognition rates for all five dance action types in the AIST++ dance video dataset. The recognition rate is highest for belly dance action, at 92%, and lowest for folk dance action, at 85%. The average recognition rate over the five dance action types is 88.7%. Therefore, the recognition rate of dance action types by the proposed method is high.
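For reference, a per-class recognition rate and its average over classes can be computed as in the following sketch. It assumes that the recognition rate of a class is the fraction of its samples that are labeled correctly and that the average is the unweighted mean over classes. The function names and the toy labels are illustrative only and are not the experimental data.

```python
from collections import defaultdict


def per_class_recognition_rate(true_labels, predicted_labels):
    """Fraction of correctly recognized samples for each true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have equal length")
    totals = defaultdict(int)
    correct = defaultdict(int)
    for truth, prediction in zip(true_labels, predicted_labels):
        totals[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    return {label: correct[label] / totals[label] for label in totals}


def average_recognition_rate(rates):
    """Unweighted mean of the per-class recognition rates."""
    return sum(rates.values()) / len(rates)


# Toy labels for illustration only (not results from the experiment).
true_labels = ["belly", "belly", "latin", "latin", "ballet", "ballet"]
predicted_labels = ["belly", "belly", "latin", "ballet", "ballet", "ballet"]

rates = per_class_recognition_rate(true_labels, predicted_labels)
print(rates, average_recognition_rate(rates))
```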

To further verify the recognition effect of the proposed method on human action in dance video image, this experiment selects 400 MB of data from the AIST++ dance video dataset, applies the proposed method, and obtains the recognition rate of human action in dance video image, as shown in Figure 3.

As can be seen from Figure 3, the proposed method achieves a high recognition rate for human action in the AIST++ dance video dataset. The recognition rate is highest, at 95.5%, when the data amount is 240 MB and lowest, at 86%, when the data amount is 300 MB. Over the full 400 MB of data, the average recognition rate of human action in dance video image by the proposed method is 90.4%. Therefore, the proposed method can effectively improve the recognition rate of human action in dance video image.

On the above basis, the recognition time of human movements in dance video image is further verified. The 400 MB AIST++ dance video data are selected and the recognition time of human movements in dance video image is obtained by using the method proposed in this paper, as shown in Figure 4.

According to the data in Figure 4, the recognition time of human action in the dance video image by the proposed method also increases with the increase of the data amount in the AIST++ dance video dataset. When the data amount in the AIST++ dance video dataset is 400 MB, the recognition time of human action in dance video image by the proposed method is only 40 s. Therefore, the proposed method can effectively shorten the recognition time of human action in dance video image.

4. Conclusion

This paper studies human action recognition technology for dance video images. The method extracts the features of human action, detects the human action in the dance video image, and finally recognizes the human action in the dance video image. The experimental results show that the human action recognition technology designed in this paper has a high recognition rate and can effectively shorten the recognition time, which further verifies the practicality of this technology [23–26].

Data Availability

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding this work.