Abstract

To address the serious loss of detail and low detection definition of traditional human motion posture detection algorithms, a human motion posture detection algorithm using deep reinforcement learning is proposed. First, the perception ability of deep learning is used to match human motion feature points and obtain human motion posture features. Second, the human motion image is normalized, the color histogram distribution of the human motion posture is taken as the antigen, the image is searched for regions close to the motion posture, and each candidate region is taken as an antibody. Human motion posture features are then extracted by calculating the affinity between the antigen and the antibody. Finally, the training characteristics of the deep learning network and the reinforcement learning network are used to obtain the change information of the human motion posture, which completes the design of the detection algorithm. The results show that when the image resolution is 384 × 256 px, the motion posture contour detection accuracy of the algorithm is 87%. When the image size is 30 MB, the recognition time is only 0.8 s. When the number of iterations is 500, the capture rate of human motion posture details reaches 98.5%. The proposed algorithm therefore improves the definition of the human motion posture contour, raises the posture detailed capture rate, and reduces the loss of detail, giving better overall detection performance.

1. Introduction

With the widespread adoption of artificial intelligence (AI) in various fields, surveillance systems have emerged, which has gradually expanded the advantage of deep learning in computer vision and clarified the development direction of image processing technology [1–3]. Optimized deep learning algorithms and models have been applied to identify, track, and detect human motion postures, with excellent results [4, 5]. There is a large body of research on the basic theory of detecting and tracking human motion postures with deep reinforcement learning. In surveillance and security, traditional surveillance technology is widely used in the military field, for example in customs and border defense. Detection and tracking technology based on deep reinforcement learning can assist operators in completing designated tasks [6, 7]. Applied to sports, human motion posture detection based on deep reinforcement learning can support sports training and viewing [8]. For skiing and short track speed skating events, it can reproduce the three-dimensional posture of the human body and help athletes standardize their movements. In the competitive arena, human motion posture detection can restore live scenes and reconstruct human motion targets, enhancing the audience’s visual experience [9, 10].

In view of the serious loss of detail and low detection clarity of existing methods, this paper proposes a human motion posture detection algorithm based on deep reinforcement learning. The perception ability of deep learning is used to match the feature points of human motion; by locating these feature points, the position and direction of the human motion posture features are determined, and the features are obtained. The method then analyzes the contour of the human motion posture and uses the training characteristics of the deep learning network and the reinforcement learning network to obtain the posture change information and the general direction of the posture, which completes the design of the detection algorithm. The contributions of this paper are as follows: (1) The algorithm uses deep reinforcement learning to detect human motion posture, determining the position and direction of the posture features and obtaining the features themselves, which improves the accuracy of posture detail feature extraction. (2) The paper proposes an antigen-antibody binding method to detect human motion posture. This method uses the color histogram distribution of the human motion posture as the antigen, searches the human motion image for areas similar to the posture, and uses each candidate region as an antibody. Posture features are extracted by calculating the affinity between the antigen and the antibody. (3) The deep learning network is used to obtain the posture change information, which improves the clarity of the posture contour and yields the general direction of the posture, effectively reducing the loss of detail.

Human motion posture detection algorithms have been widely studied at home and abroad because of their considerable economic and social value. To address problems in the recognition of human posture data, Cai et al. [11] extracted a small number of key sequence frames from human motion images and designed a recognition method based on them. After preselecting the original motion posture sequence, they constructed the initial key frame sequence of the human body posture. This sequence is combined with a frame elimination algorithm to obtain the key frame sequence of the posture, and the posture model is trained using the Baum-Welch method to recognize the posture. The method performs well in the accuracy of human body posture recognition. Considering that the sliding window algorithm often fails to detect postures when collecting human body posture data, Zheng et al. [12] produced a human body posture data set by collecting time series data. In addition, they predicted and classified persistent and sudden behavior postures by comparing long short-term memory (LSTM) with other networks and algorithms. The method improves the accuracy of human posture detection by 4.49%. Gao et al. [13] designed a recognition algorithm using the posture signal sequence and used a three-layer recognition algorithm to recognize different postures. The results show that the recognition algorithm fulfills the requirements. Ren [14] designed a monitoring instrument for human hand movement that uses nanogenerators to monitor hand movement posture. Liu et al. [15] combined human prior knowledge with the performance capture of a human motion capture system. To improve the usability of the system while maintaining high fidelity, the model was updated while capturing motion. Experiments show that this method offers a certain degree of convenience. Zhao and Chen [16] proposed a sensor-based method for detecting and recognizing the posture of a moving human body, which uses inertial sensors to recognize shooting, passing, dribbling, and catching. A shaft inertial sensor is placed on the arm to collect the experimental data. After smoothing and normalization, frequency domain features are extracted, and 30 feature vectors are obtained by principal component analysis. The method is improved after dimensionality reduction. However, the details of the human posture are severely lost. Lin et al. [17] proposed a posture analysis method for fall detection based on a convolutional recurrent neural network (RNN), together with a continuous deep learning model. The model receives a set of continuous images and classifies posture types, using Microsoft Kinect as a nonwearable sensor. In addition, the RNN is used to build an LSTM architecture, and the detection model is constructed by recognizing human postures in fall detection. This method extracts features from preprocessed high-resolution RGB images and obtains the body shape with real motion and depth information, but the definition of human body posture detection is low.

3. Posture Detection Algorithm Design of Human Motion

3.1. Recognition of Human Motion Posture Features

When recognizing the posture features, the method uses the perception ability of deep learning to match human motion features. The specific steps are as follows:

Step 1: Establish the posture scale space using the deep learning network, and define the human body motion posture image as the product of deep learning network algorithm and the original image . Namely,

Step 2: Compare the posture feature in the scale space with the neighboring feature points to obtain the specific position and locate the human motion posture feature points. Namely, where means space scale, and refers to new feature point.

Step 3: To recognize the human body motion posture, Equation (3) is used to determine the position and direction of the posture feature. Namely,

Step 4: Suppose is the characteristics of human body motion posture represented by , and is the characteristics of human body motion posture represented by . The Euclidean distance between the two features is calculated as

The perception ability of deep learning is used to match human motion features. By locating the posture feature points, the position and direction of the human motion posture features are determined, and the features are recognized.
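The matching in Step 4 can be illustrated with a minimal sketch. The symbols of the distance equation are not reproduced here, so the sketch assumes that each posture feature is a fixed-length numeric vector and that the best match is the candidate with the smallest Euclidean distance. The function names and the example descriptors are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_feature(query, candidates):
    """Return (index, distance) of the candidate closest to the query.

    Nearest-neighbor matching is an assumption made for illustration;
    the paper only states that the Euclidean distance is computed.
    """
    distances = [euclidean_distance(query, c) for c in candidates]
    best = min(range(len(candidates)), key=distances.__getitem__)
    return best, distances[best]


# Hypothetical 3-dimensional descriptors, for illustration only.
query = [1.0, 2.0, 2.0]
candidates = [[4.0, 6.0, 2.0], [1.0, 2.0, 5.0], [7.0, 10.0, 2.0]]
best_index, best_distance = match_feature(query, candidates)
```

In this example the second candidate is selected, since it differs from the query in a single component only.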

3.2. Analysis of the Human Motion Posture Contour

Assuming that the camera device is stationary when detecting the posture, the differential image of posture is defined as , and is any pixel in human motion image. If it is within , the accumulated difference image is . Conduct pixel normalization processing of human motion images, namely,

If the pixel is static in , then the normalized differences of all pixels are , all of which follow a normal distribution.

Choose an appropriate confidence value , and according to the distribution, get the threshold of human motion posture analysis , in which . Then, it can be deemed that all the pixels that meet have correlated movements.

According to the above process, the contour of the human body movement posture is analyzed.
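The contour analysis above can be sketched as follows. Because the original equations are not reproduced here, the sketch makes several assumptions: the accumulated difference image is the per-pixel sum of absolute differences between consecutive frames, normalization is a z-score over the whole image, and the threshold is the one-sided quantile of the standard normal distribution at the chosen confidence value. All function names are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def accumulated_difference(frames):
    """Sum the absolute differences between consecutive frames, per pixel."""
    height, width = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for r in range(height):
            for c in range(width):
                acc[r][c] += abs(curr[r][c] - prev[r][c])
    return acc


def moving_pixel_mask(frames, confidence=0.95):
    """Mark pixels whose normalized accumulated difference exceeds the
    normal-distribution threshold for the given confidence value."""
    acc = accumulated_difference(frames)
    values = [v for row in acc for v in row]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # No variation at all: every pixel is treated as static.
        return [[False] * len(row) for row in acc]
    threshold = NormalDist().inv_cdf(confidence)
    return [[(v - mu) / sigma > threshold for v in row] for row in acc]


# Hypothetical 4 x 4 sequence of three frames with one moving pixel.
frames = [[[0] * 4 for _ in range(4)] for _ in range(3)]
frames[1][1][2] = 50
frames[2][1][2] = 100
mask = moving_pixel_mask(frames)
```

A stationary camera is assumed, as in the text, so that static background pixels accumulate only small differences.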

3.3. Extraction of Human Motion Posture Features

Suppose , means the color histogram distribution of the human body motion area, which is defined as an antigen, and is the center of the area of the human body movement posture [18, 19]. The human body motion posture image to be solved is . There are pixels in , the gray quantization level of each pixel is , and is the characteristic value of human body motion postures. Construct the probability density estimation model of human motion posture: where means the quantified eigenvalue of human body motion posture image pixels at the characteristic space . It is usually used to analyze whether the posture pixel value in the human motion area is in the feature space. means the estimated coefficient of probability density, is the probability function, and is the pixel value [20, 21].

Assume that the shape of the kernel function extracted from the posture feature is . Since the human body motion posture image is disturbed by the environment, the pixels close to the center are more stable than the distant pixels. assigns larger weights to pixels near the center, while pixels at a long distance receive only small weights [22, 23].

Assume , means the color histogram distribution of human body motion posture image in candidate area. Define as the antibody, and is the posture of current frame. Then, at the feature value in human body motion posture , build a probability density estimation model [24], namely, where means the normalized coefficient of human body movement posture characteristics, which should meet the condition that [25].

Assume that, at moment , the affinity of the antibody and antigen can be described with the following equation:

When extracting the features of posture, Equation (8) is used to measure the binding strength between antigen and antibody [25], and the equation is where means the similarity coefficient, which is between 0 and 1. The larger the value of , the higher the similarity between the human motion posture feature and candidate area posture feature [26, 27].

When reaches the maximum value, the candidate area of the human motion posture will become the human motion posture feature to be solved in this frame of image , where means the feature area of human body movement posture, and means the candidate region for posture feature extraction.

When the posture feature is extracted, equation (11) is used to calculate the position of the posture, where means the weight of posture features, and is the specific posture.

In summary, the color histogram distribution of the posture is used as the antigen, and the image is searched for areas similar to the human posture. Each candidate area of the human motion posture is used as an antibody, and the posture features are extracted by calculating the affinity between the antigen and the antibody.
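The antigen-antibody matching can be sketched as below. The original equations are not reproduced here, so the sketch rests on assumptions: the histogram is built over quantized gray levels with an Epanechnikov-style kernel that weights central pixels more heavily, and the affinity is taken to be the Bhattacharyya coefficient, a common similarity measure that lies between 0 and 1. The paper's own similarity coefficient may differ, and all names are hypothetical.

```python
import math


def kernel_weighted_histogram(pixels, center, radius, bins=8, levels=256):
    """Normalized gray-level histogram of a region.

    pixels: iterable of (x, y, gray) tuples with integer gray values.
    Pixels close to `center` receive larger weights than distant ones.
    """
    hist = [0.0] * bins
    cx, cy = center
    for x, y, gray in pixels:
        d2 = ((x - cx) ** 2 + (y - cy) ** 2) / radius ** 2
        weight = max(0.0, 1.0 - d2)  # assumed kernel profile
        hist[gray * bins // levels] += weight
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def affinity(antigen, antibody):
    """Bhattacharyya coefficient between two normalized histograms."""
    return sum(math.sqrt(p * q) for p, q in zip(antigen, antibody))


def best_candidate(antigen, antibodies):
    """Return (index, affinity) of the antibody with maximum affinity."""
    scores = [affinity(antigen, a) for a in antibodies]
    best = max(range(len(antibodies)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical 2 x 2 regions given as (x, y, gray) tuples.
target = [(0, 0, 10), (1, 0, 20), (0, 1, 200), (1, 1, 210)]
similar = [(0, 0, 12), (1, 0, 25), (0, 1, 205), (1, 1, 215)]
different = [(0, 0, 100), (1, 0, 100), (0, 1, 100), (1, 1, 100)]

antigen = kernel_weighted_histogram(target, (0.5, 0.5), 2.0)
antibodies = [kernel_weighted_histogram(r, (0.5, 0.5), 2.0)
              for r in (different, similar)]
index, score = best_candidate(antigen, antibodies)
```

The candidate with the highest affinity plays the role of the posture feature region for the current frame.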

3.4. The Proposed Algorithm

The deep reinforcement learning network has both the perception ability of deep learning and the decision-making ability of the reinforcement learning network. The two networks are combined to analyze the change information of the human body’s movement posture and obtain the general direction of the human motion posture. The global threshold segmentation method is used to retain the pixel points of the human body motion posture greater than the threshold, namely, where refers to threshold value, and means human body motion posture pixels.

According to the result of equation (12), the gray value of the human motion posture image is retained to facilitate the detection of the image edge [28].

Assume that, in the posture, the gray level of the original image is . When the gray value of the image is , the number of pixels in the human motion posture image is . is the total number of pixels, and then the segmentation calculation process of the human motion posture image is as follows:

Step 1: By judging the gray level of the human posture image, the normalized histogram of the posture image is calculated, which is expressed as , and then there is

Step 2: In the original image of the human motion posture, calculate the average gray value of all pixels.

Step 3: Calculate the image edge detection threshold
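The segmentation steps can be sketched as follows. The normalized histogram is taken to be the fraction of pixels at each gray level, and the average gray value is its first moment. The threshold formula of Step 3 is not reproduced here, so the sketch simply uses the average gray value as the threshold; this choice and all function names are assumptions made for illustration.

```python
def normalized_histogram(image, levels=256):
    """Fraction of pixels at each gray level (Step 1)."""
    counts = [0] * levels
    total = 0
    for row in image:
        for gray in row:
            counts[gray] += 1
            total += 1
    return [n / total for n in counts]


def average_gray(hist):
    """Average gray value computed from the normalized histogram (Step 2)."""
    return sum(level * p for level, p in enumerate(hist))


def global_threshold_segment(image, threshold):
    """Retain pixels greater than the threshold and set the rest to 0."""
    return [[g if g > threshold else 0 for g in row] for row in image]


# Hypothetical 2 x 2 gray image.
image = [[10, 10], [200, 220]]
hist = normalized_histogram(image)
threshold = average_gray(hist)  # assumed stand-in for the Step 3 threshold
segmented = global_threshold_segment(image, threshold)
```

Retained pixels keep their original gray values, which matches the statement that the gray value is preserved to facilitate edge detection.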

According to the actual situation of the human body motion posture in the detection process, the edge detection algorithm is designed as shown in Figure 1.

According to the human motion posture information obtained from the analysis, the deep learning and reinforcement learning are combined, and the detection process of human motion posture is designed, as shown in Figure 2.

To sum up, according to the edge detection algorithm process, the edge contour of the human motion posture is roughly outlined, and deep learning and reinforcement learning are combined to complete the design of the human motion posture detection algorithm and realize the detection of human motion posture.

4. Experimental Analysis and Results

4.1. Experimental Environment and Data Set

Producing a posture data set is extremely complicated: multiple cameras must be used in the collection process, supplemented by depth cameras and sensors. The data sets selected in this paper are as follows:

Human 3.6M data set: this data set is composed of 3.6 million human motion images provided by 11 experimenters, that is, eleven different subjects. The images cover 17 different postures. Subjects S1, S5, S6, S7, and S8 are chosen as the training set, and S9 and S11 as the test set.

MPI-INF-3DHP data set: this data set is a classic human motion posture data set, including 6 human subjects performing 7 motion postures, with the posture data obtained by a motion capture system.

CMU Panoptic data set: this data set contains human posture data collected by 480 VGA cameras.

4.2. Experimental Standards

To evaluate the proposed algorithm, the following experimental indicators are designed.

(1) Human motion posture contour detection effect. The posture contour of the human motion image is analyzed, and the detection effect is compared through the gray level of the background area and the contour integrity of the human motion posture. The higher the contour integrity, the better the detection effect; the lower the contour integrity, the worse the detection effect.

(2) The effective area ratio of the human motion posture image. The effective area ratio is expressed by the ratio of the number of frame subblocks of the detected human motion image to the number of accurately identified frame subblocks. The higher the effective area ratio, the worse the detection effect; the lower the ratio, the better the detection effect.

(3) Accuracy AP value of human motion posture detection. The accuracy of key point detection is shown in the following equation: where is the correctly detected target number, and is the falsely detected target number. The similarity index of posture key points is shown in the following equation: where is the Euclidean distance between the detected key point and the actual key point, is the mark indicating whether the key point is occluded, is the target size, and is the product constant for each key point. Then, is obtained through the joint calculation of equations (16) and (17), i.e., the average value of all the when the value of is 0.55, 0.6, …, 0.9, and 0.95.

(4) Human motion posture recognition time. In this paper, a time interval measuring instrument is used to obtain the recognition time of human motion posture. The shorter the recognition time, the higher the recognition efficiency; the longer the recognition time, the lower the recognition efficiency.

(5) Human motion posture detailed capture rate. The posture detailed capture rate is the ratio of the number of captured frame subblocks to the number of all frame subblocks. The higher the capture rate of human motion posture details, the better the capture effect; the lower the capture rate, the worse the capture effect.
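The indicators above can be sketched in code. The original equations are not reproduced here, so the forms below are assumptions: key point accuracy is taken as the share of correct detections among all detections, the key point similarity follows the widely used COCO-style object keypoint similarity, and the AP value is the mean accuracy over the listed similarity thresholds. All function and parameter names are hypothetical.

```python
import math


def keypoint_accuracy(correct, false):
    """Share of correctly detected targets among all detected targets."""
    detected = correct + false
    return correct / detected if detected else 0.0


def keypoint_similarity(distances, visibility, scale, constants):
    """COCO-style similarity averaged over labeled key points.

    distances: Euclidean distance between detected and actual key points.
    visibility: mark per key point; values above 0 count as labeled.
    scale: target size. constants: per-key-point constant.
    """
    labeled = [i for i, v in enumerate(visibility) if v > 0]
    if not labeled:
        return 0.0
    total = sum(
        math.exp(-distances[i] ** 2 / (2 * scale ** 2 * constants[i] ** 2))
        for i in labeled
    )
    return total / len(labeled)


def mean_accuracy(accuracy_at, thresholds):
    """Average of accuracy_at(t) over the given similarity thresholds."""
    return sum(accuracy_at(t) for t in thresholds) / len(thresholds)


def detail_capture_rate(captured_subblocks, all_subblocks):
    """Ratio of captured frame subblocks to all frame subblocks."""
    return captured_subblocks / all_subblocks


# Thresholds 0.55, 0.60, ..., 0.95 as listed in indicator (3).
thresholds = [round(0.55 + 0.05 * i, 2) for i in range(9)]
```

A detection would typically count as correct at a given threshold when its key point similarity is at least that threshold, although the paper does not state this rule explicitly.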

4.3. Results and Discussion

To verify the performance of the deep reinforcement learning algorithm in practical applications, the proposed algorithm is compared with the algorithms in literature [12] and literature [13–15] in the following experiment. The results are shown in Figure 3.

In the three experimental data sets, using human body posture images as research data, four posture detection algorithms are used to detect the posture contour, and the detection comparison results are obtained.

According to Figure 3, in the posture contour of the human motion image, the gray level of the background area has relatively low contrast, which causes the contour of the human motion posture to appear incomplete, with relatively obvious voids. The comparison shows that the contours obtained by the posture detection algorithms of literature [12] and literature [13] have relatively poor edge integrity, while the posture detection algorithm of literature [14] has a better contour detection effect. The posture detection algorithm of literature [15] and the posture detection algorithm using deep reinforcement learning obtain more complete human motion posture contours. The proposed algorithm produces human motion posture images of higher quality with smoother edges, which further shows that it achieves better results.

To further compare the application performance of the proposed algorithm and the comparison algorithms, the effective area of the posture contour and the number of subblocks are selected as the measurement standards. The results are shown in Table 1.

According to Table 1, a human motion image sequence is selected and divided into 32 frames, 46 frames, and 60 frames. After processing by the posture detection algorithms, when the human motion posture image is in a 32-frame sequence, the number of subblocks of the proposed algorithm is smaller than that of the comparison algorithms. As the number of sequence frames increases, the number of subblocks of the proposed algorithm is always the smallest. The comparison also shows that the effective area detected by the algorithm based on deep reinforcement learning is higher than that of the comparison algorithms. This shows that the proposed algorithm can reduce the amount of information loss in the posture image.

To verify the effect of the proposed algorithm, it is compared with the algorithms in literature [12] and literature [13–15] in terms of the accuracy of motion posture contour detection under different input resolutions. The results are shown in Figure 4.

According to Figure 4, the accuracy AP of motion posture contour detection differs across image resolutions. When the image resolution is 128 × 96 px, the contour detection accuracy is 52% for the algorithm in literature [12], 53.5% for the algorithm in literature [13], 56% for the algorithm in literature [14], 57.5% for the algorithm in literature [15], and 67% for the algorithm in this paper. When the image resolution is 384 × 256 px, the contour detection accuracy is 58% for the algorithm in literature [12], 59% for the algorithm in literature [13], 64% for the algorithm in literature [14], 72% for the algorithm in literature [15], and 87% for the proposed algorithm. The method in this paper retains the original input resolution information during human motion posture detection, thereby enriching the local features and improving the detection accuracy.

To further test the recognition efficiency of this method, the recognition time of motion posture recognition is measured. On the basis of the above experiment, a time interval measuring instrument is used to obtain the recognition time, and the experimental data are input into the SigmaPlot 14.0 software to obtain the result shown in Figure 5.

According to Figure 5, the time needed to recognize the human motion posture varies with the size of the human motion image. When the image size is 10 MB, the recognition time of the method in this paper is only 0.3 s, while the recognition times of the algorithms in literature [12], [13], [14], and [15] are 6 s, 5 s, 9 s, and 12 s, respectively. When the image size is 30 MB, the recognition time of the method in this paper is only 0.8 s, while the recognition times of the algorithms in literature [12], [13], [14], and [15] are 16 s, 11 s, 21 s, and 23 s, respectively. The method in this paper takes significantly less time, which shows that its recognition efficiency is higher. This is because the method uses the perception ability of deep learning to match the feature points of human motion. By locating the feature points, the position and direction of the human motion posture features are determined and the features are obtained, which effectively improves the efficiency of motion posture recognition.

To verify the human motion posture detailed capture effect of the method in this paper, the detailed capture rate is calculated for the algorithms in literature [12] to literature [15] and for the method in this paper. The experimental results are shown in Figure 6.

According to Figure 6, the capture rate of human motion posture details differs across methods. When the number of iterations is 100, the detailed capture rate is 85% for the algorithm in literature [12], 74% for the algorithm in literature [13], 77% for the algorithm in literature [14], 71% for the algorithm in literature [15], and 95% for the algorithm in this paper. When the number of iterations is 500, the detailed capture rate is 83% for the algorithm in literature [12], 81% for the algorithm in literature [13], 80% for the algorithm in literature [14], 75.5% for the algorithm in literature [15], and 98.5% for the algorithm in this paper. This shows that the detailed capture effect of this method is better. This is because the method takes the candidate region as the antibody and calculates the affinity between the antigen and the antibody to capture human motion posture detail features, which effectively improves the detailed capture rate.

5. Conclusions

This paper uses the advantages of the deep reinforcement learning network to design a detection algorithm for human motion posture. The algorithm analyzes and extracts the characteristics of human motion posture and combines the perception ability of deep learning with the decision-making ability of reinforcement learning. The results show that the proposed algorithm performs well. When the human motion posture image is in a 32-frame sequence, the number of subblocks is smaller than that of the comparison algorithms, and as the number of frames in the sequence increases, the number of subblocks of this algorithm is always the smallest. The proposed algorithm can therefore reduce the loss of detail in the human motion posture image. When the image resolution is 384 × 256 px, the motion posture contour detection accuracy of this algorithm is 87%, which shows that the method enriches the local features and improves the detection accuracy. When the image size is 30 MB, the recognition time of the method is only 0.8 s, which shows that the method effectively improves the efficiency of motion posture recognition. However, collecting the images of human motion posture used in this paper takes a lot of time, so the overall detection time of this method needs to be further optimized.

Data Availability

The data used to support the findings of this study are included within the article. Readers can access the data supporting the conclusions of the study from the Human 3.6M and MPI-INF-3DHP data sets.

Conflicts of Interest

The authors declare that there are no conflicts of interest with any financial organizations regarding the material reported in this manuscript.

Acknowledgments

This work was supported by the Guangdong Philosophy and Social Science Planning Project under Grant no. GD20CTS02.