Abstract

To improve the recognition accuracy of human falling actions and reduce the impact of action randomness, this paper proposes an automatic recognition method for falls of physical fitness personnel based on pose data sequences. A color camera is used to collect the fall motion images of fitness personnel, and image preprocessing is completed by extracting the fall motion features of the human body and by tracking and adjusting the fall motion. A model for extracting displacement features of human fall movement from the posture data sequence is then constructed. By tracking the displacement feature points, automatic recognition of physical fitness falls based on the posture data sequence is realized. The experimental results confirm that the proposed method can effectively capture the details of fall motion images. Even when the number of human falling actions reaches 500, the action recognition accuracy remains as high as 77%, improving the recognition of human fall actions.

1. Introduction

Physical fitness is an important part of modern social life, with strong natural and social functions. As the main participants in fitness activities, residents' views on, support for, and participation in decision-making about physical fitness directly affect its development [1, 2]. At the same time, with growing public attention to healthy living, more and more people take part in physical fitness. However, because many people lack sufficient fitness knowledge or professional guidance, accidental falls often occur [3, 4]. Fall warning in video surveillance [5] is of great significance in many settings. For example, it can be applied to the supervision of the elderly living alone, as well as to accidental falls of fitness participants. If a video monitoring system can intelligently judge whether the monitored subject has fallen [6], it can alert relevant personnel in time so that remedial measures can be taken, reducing unnecessary loss of life and property.

A fall action recognition method based on a BP neural network was proposed in [7]. It distinguishes falls from daily actions using the attitude angle and triaxial acceleration data provided by an attitude heading reference system (AHRS) fixed at the waist. Experiments on different samples of falls and daily behaviors show that this method has a high recognition rate, good stability, and strong practicability, but its accuracy in complex environments needs improvement. Reference [8] proposed a new fall recognition method: based on the OpenPose deep convolutional network, key points of the human posture are extracted from the image to obtain the dynamic features of body tilt, but the accuracy of key point recognition for fall actions needs further improvement. Reference [9] presents a human fall detection method based on intelligent vision. Using intelligent vision analysis, the method collects human fall inertia data with an acceleration sensor and establishes a three-axis acceleration coordinate system. This method classifies fall motion accurately, but its recognition in complex environments needs improvement. In [10], a VD-ZSAR method was proposed for extracting nonredundant visual features, alleviating the relation ambiguity caused by redundant visual features. By combining the nonredundant visual space with the semantic space, a joint visual-semantic embedding space can be learned, and the fuzzy relations caused by redundant visual features can be eliminated when the method is applied to fall action recognition.

Current methods for recognizing human fall actions from pose data sequences all suffer from low recognition accuracy. Therefore, this paper designs an automatic recognition method for falls of physical fitness personnel based on posture data sequences. The method uses a color camera to capture the fall motion images of fitness personnel. On this basis, a model for extracting displacement features of human falling actions from the pose data sequence is constructed. By tracking the displacement feature points, automatic fall recognition based on the pose data sequence is realized, further optimizing recognition performance and compensating for the effect of action randomness on recognition accuracy.

2. Design of Fall Recognition Method for Physical Fitness

2.1. Obtain the Image of Human Body Falling Action in Sports Fitness

A CMOS (complementary metal-oxide-semiconductor) motion sensor and a CCD (charge-coupled device) motion sensor are selected to collect the fall images. The camera can simultaneously acquire the color image, depth image, and skeleton image of fitness personnel [11]. However, radial distortion arises during image acquisition [12]. Accordingly, the camera image acquisition model is established as shown in Figure 1.

Figure 1 shows the world coordinate system of the color camera; the image point of the falling human body; the image coordinates of the falling action; the optical center of the color camera, whose optical axis coincides with one coordinate axis; the ideal coordinate of the fall image and the actual coordinate that deviates from it due to distortion; and the coordinate axes parallel to the image axes at the intersection U of the optical axis and the image plane.

The fall action image model is obtained from the color camera shown in Figure 1, in which the color image and depth image are transmitted as data streams. The color image is transmitted at a fixed resolution with a frame rate of 15 fps in Bayer format, and its color data can be encoded as RGB 64-bit. The depth image acquisition process is consistent with that of the color image: the high 13 bits carry the effective position information and the low 3 bits carry the user ID. The skeleton image of fitness personnel is derived from the depth image data and includes the three-dimensional coordinates of 20 joint points, from which the skeleton map is displayed visually [13].

In order to facilitate the recognition of fall movement of sports fitness human body, the spatial coordinate relationship between color image, depth image, and bone image of sports fitness personnel is analyzed. The coordinate system of color space, depth space, and skeleton space of sports fitness personnel [14] is shown in Figure 2.

Pixel coordinates are defined in the color space, the depth space, and the skeleton space of fitness personnel, respectively.

The transformation between the skeleton space and depth space coordinate systems of fitness personnel depends on the camera's horizontal viewing angle of 57° and vertical viewing angle of 43° [14].
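Since the paper's conversion formula is not reproduced here, a minimal pinhole-style sketch of the depth-to-skeleton mapping can be written from the two viewing angles alone; the 640×480 depth resolution and millimeter depth units are assumptions, not values from the paper:

```python
import math

H_FOV = math.radians(57.0)  # horizontal viewing angle of the camera
V_FOV = math.radians(43.0)  # vertical viewing angle of the camera
W, H = 640, 480             # assumed depth-image resolution

def depth_to_skeleton(u, v, depth_mm):
    """Map a depth-image pixel (u, v) with depth in mm to 3D skeleton coords.

    x grows to the right and y grows upward from the optical axis; the
    half-angle tangent converts the normalized pixel offset to metric units.
    """
    x = (u / W - 0.5) * depth_mm * 2.0 * math.tan(H_FOV / 2.0)
    y = (0.5 - v / H) * depth_mm * 2.0 * math.tan(V_FOV / 2.0)
    return (x, y, depth_mm)
```

A pixel on the optical axis maps to (0, 0, depth), and pixels toward the image border pick up proportionally larger lateral offsets.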

The conversion between the depth space and color space coordinate systems additionally involves the displacement of the color camera.

Through the above process, the fall action color image, depth image, and skeleton image of fitness personnel are transformed into the same coordinate system. Because the depth direction carries relatively little information, it is ignored to facilitate image processing, and the resulting fall action image provides the image data for the subsequent automatic recognition of fall actions.

3. Image Preprocessing of Human Fall

3.1. Extracting Human Fall Motion Features

The human body can be divided into five parts: trunk, left arm, right arm, left leg, and right leg. Among them, the trunk is the main part supporting the human body. The joints of the waist reflect the trunk's motion characteristics, while the motion features of the hands and feet are shown by the joints of the limbs. The division of the human body into five parts is shown in Figure 3.

Some basic movement classification methods for fitness personnel adopt a hierarchical strategy. At the first level, actions are grouped by which of the five parts are involved; for example, movements involving only the two arms combine the second and third parts. This yields a rough classification of movements. At the second level, actions within the same combination mode are reclassified to determine the detailed action category. A joint angle feature vector, formed by projecting 17 joint angles of the human body onto a two-dimensional plane, is used as the first rough classification feature of human motion. According to kinematic principles, features of the same body-part combination are then extracted. A complete action of a fitness person can be divided into a main action and auxiliary actions: the main action reflects the overall state of the motion mode, and the auxiliary actions reflect its local state. Only by combining the features of main and auxiliary actions can an action be expressed accurately. For the trunk, left arm, right arm, left leg, and right leg, limb vectors are established in three-dimensional space coordinates as functions of the time of limb movement during the fall. According to the different contributions to the expression of human motion in physical fitness, two joint angles are selected from each part as active joint angles. The size of each joint angle in three-dimensional space is calculated from the corresponding limb vectors, and the angular velocity of each joint angle is calculated from the change of the joint angle over time.
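The joint-angle and angular-velocity computation described above can be sketched as follows; the vector values and the 15 fps frame interval in the example are illustrative, not taken from the paper:

```python
import math

def joint_angle(v1, v2):
    """Angle (radians) between two limb vectors in 3D, via the dot product."""
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, dot / (n1 * n2))))

def angular_velocity(theta_prev, theta_curr, dt):
    """Finite-difference angular velocity of a joint angle between two frames."""
    return (theta_curr - theta_prev) / dt

# Example: upper arm along x, forearm along y -> a 90 degree elbow angle.
theta = joint_angle((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
omega = angular_velocity(theta, theta + 0.1, dt=1.0 / 15)  # 15 fps sequence
```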

The motion sequence of the human body is continuous and changes with time; the angular velocity arises from the change of the joint angle between consecutive frames. The limb vectors and angular velocities of the active joint angles describe the overall movement of the trunk and limbs, while the change of the distance between joint points reflects the bending of the limbs and trunk. The human body is also projected onto the xoy side plane from the left view direction. The distances from the five body parts to the joint points are measured as Euclidean distances between pairs of joints [15, 16]. To eliminate differences between individuals, each term in formula (3) and formula (5) is normalized by the width of the human shoulder and by the mean Euclidean distance between the joints of the five major body parts.
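The shoulder-width normalization described above can be sketched as follows (the function names and the 0.4 m shoulder width are illustrative, not from the paper):

```python
import math

def euclidean(p, q):
    """Euclidean distance between two 3D joint positions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def normalize_by_shoulder(distances, shoulder_width):
    """Divide joint distances by shoulder width to remove body-size effects."""
    return [d / shoulder_width for d in distances]

# Example: two joint distances normalized by an assumed 0.4 m shoulder width.
d = [euclidean((0, 0, 0), (0.3, 0, 0)), euclidean((0, 0, 0), (0, 0.6, 0))]
scaled = normalize_by_shoulder(d, shoulder_width=0.4)
```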

According to the above process, the limb vectors of the human body’s trunk, left arm, right arm, left leg, and right leg in three-dimensional space coordinates are established, and the human fall motion features are extracted by using the distance between bones and joints.

3.2. Track and Adjust Human Body Falling Movements

Within the expanded tracking target area, symmetrical vertical and horizontal tracking is carried out to generate and adjust the image of the moving target. The "centroid" coordinate of the target area is calculated through formula (7), the centroid is moved according to the central area, and the target tracking image of the fitness person is then generated.
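A minimal sketch of the centroid computation on a binary target mask; formula (7) is not reproduced in the paper, so a plain unweighted pixel mean is assumed here:

```python
def region_centroid(mask):
    """'Centroid' of a binary target region: mean of foreground pixel coords.

    mask is a list of rows; nonzero entries mark the tracked target.
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    n = len(pts)
    cx = sum(p[0] for p in pts) / n
    cy = sum(p[1] for p in pts) / n
    return cx, cy

# Example: a 3x3 mask with a plus-shaped target centered at (1, 1).
mask = [[0, 1, 0],
        [1, 1, 1],
        [0, 1, 0]]
```

Shifting the crop window so this centroid stays at the image center is what keeps the fitness person framed as they move.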

After the above processing, the tracked and adjusted image sequence compensates for camera motion that follows the fitness person. The adjusted sequence contains only the person's limb movement and the motion caused by the fall, and no longer reflects the camera motion present in the original video.

By tracking the moving target of sports fitness, the activity area of sports fitness personnel is obtained. By calculating the centroid coordinates of the human fall action recognition area, the human fall action tracking image is generated to realize the tracking and adjustment of human fall action.

4. The Displacement Feature Extraction Model of Human Fall Motion in Posture Data Sequence Is Constructed

After applying formula (7), the original fall action image in the posture data sequence is obtained. Because the posture data sequence evolves over time, the fall action feature points will be displaced accordingly. This study therefore constructs a displacement feature extraction model of human fall movement from the posture data sequence to improve the feature extraction accuracy of fall movements.

Let the action information of a given frame in the posture data sequence be the input to the human fall action displacement feature extraction model at the current time step. The model combines this input, scaled by an input weight, with the model's output at the previous time step, scaled by a feature output weight, and passes the sum through an activation function to produce the output at the current time. To obtain displacement data, the historical outputs are accumulated over the sequence.

The corresponding displacement data is obtained from this recurrence, and the feature information of the fall action is then derived from the displacement data.

The displacement feature data in the falling action image can be obtained from the above formula and can be saved, read, and reset according to the feature data processing procedure of the current method. If the displacement feature data does not meet the requirements, new feature data can be obtained by updating the long-range dependencies. During feature data processing, the displacement data at different times are not equally important. Therefore, formula (9) is applied repeatedly to the fall action images of the posture data sequence, a corresponding loss function is set [17, 18], and difference information is eliminated from the data to obtain the final fall action feature data, which is stored in the form of action labels to support subsequent action recognition.
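The recurrent update described in this section resembles a simple recurrent cell; a minimal sketch follows, in which the scalar weights and the tanh activation are assumptions, since the paper does not reproduce its parameter values:

```python
import math

def rnn_step(x_t, h_prev, w_in, w_rec):
    """One recurrent update: weighted current frame input plus weighted
    previous output, passed through an activation function."""
    return math.tanh(w_in * x_t + w_rec * h_prev)

def extract_displacement_features(frames, w_in=0.8, w_rec=0.5):
    """Run the recurrence over a posture data sequence of scalar frame
    values, returning the output at every time step."""
    h, outputs = 0.0, []
    for x in frames:
        h = rnn_step(x, h, w_in, w_rec)
        outputs.append(h)
    return outputs

features = extract_displacement_features([0.1, 0.4, 0.9, 0.2])
```

Each output depends on the whole history of earlier frames through the recurrent weight, which is how time-series context enters the displacement features.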

5. Realize Human Fall Motion Recognition Based on Pose Data Sequence

The design above extracts the fall movement displacement data from the posture data sequence, and the extracted features are used in this part: by tracking the displacement feature points, human fall recognition is completed. A moving object forms a three-dimensional motion field in space, and the projection of this field onto the imaging plane forms a two-dimensional field, the optical flow field. The optical flow field carries the three-dimensional information and motion parameters of the moving object, so computing it indirectly yields the object's three-dimensional motion. Because the images in this study are dynamic, the optical flow is first calculated at the original input position of the displacement feature; the collected action displacement is then taken as the initial value for the next position's feature output, and the calculation is repeated to obtain a progressively more accurate optical flow estimate.

In this study, the optical flow estimate is obtained according to the relevant principles of a personalized learning algorithm [19, 20]. For each displacement feature point at a given pyramid layer of the unrecognized fall action image in the posture data sequence, the corresponding feature point at the same layer of the preset action image is identified, and the value range of the optical flow vector is determined [21]. The position of the action anchor point in the image to be detected is then determined accordingly.

According to the obtained anchor point information, the gradient matrix of the fall action image is computed by repeated application of the L-K iterative algorithm [22, 23], and the optical flow in the fall action image of the posture data sequence is determined. L-K (Lucas-Kanade) optical flow was originally proposed by Lucas and Kanade in 1981. The algorithm assumes that the motion vector remains constant in a small spatial neighborhood and estimates optical flow by weighted least squares. Because it is convenient to apply to a set of points in the input image, it is widely used for sparse optical flow. A mismatch vector in the image is computed, and the final optical flow estimate is obtained by repeatedly applying formula (11) and formula (12). This calculation is applied to the collected displacement feature points [24, 25] to determine the dynamic characteristics of the fitness action and complete the identification of the human fall. The automatic recognition process is summarized in Figure 4.
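The least-squares step of Lucas-Kanade for a single window can be sketched as follows (unweighted for simplicity; in practice the gradients would come from the image pyramid described above):

```python
def lucas_kanade_flow(ix, iy, it):
    """Solve the 2x2 LK normal equations for one window's flow (u, v).

    ix, iy, it: spatial and temporal image gradients at each pixel of the
    window, under the assumption that the flow is constant over the window.
    """
    sxx = sum(a * a for a in ix)
    syy = sum(b * b for b in iy)
    sxy = sum(a * b for a, b in zip(ix, iy))
    sxt = sum(a * t for a, t in zip(ix, it))
    syt = sum(b * t for b, t in zip(iy, it))
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None  # aperture problem: flow not recoverable in this window
    # Cramer's rule on [sxx sxy; sxy syy] [u, v]^T = [-sxt, -syt]^T
    u = (-sxt * syy + sxy * syt) / det
    v = (-sxx * syt + sxy * sxt) / det
    return u, v
```

Iterating this solve while warping the second image by the current estimate, layer by layer through the pyramid, is what refines the flow at each displacement feature point.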

According to the model in Figure 4, the automatic recognition of falls of physical fitness personnel can be realized.

6. Experimental Demonstration and Analysis

6.1. Experimental Preparation

In recent years, many action databases have been built at home and abroad, providing data sources for human fall action recognition. To verify that the automatic fall recognition method based on posture data sequences proposed in this study has a strong recognition effect, two existing methods are selected as the control group: the back propagation (BP) neural network method of reference [7] and the OpenPose deep convolutional network method of reference [8], marked as current recognition method 1 and current recognition method 2, respectively. The experimental analysis is completed using the action database.

The fitness actions used in the experiments were obtained through video acquisition, with images captured simultaneously by two groups of corresponding cameras, yielding 600 groups of experimental samples. Lighting conditions are not varied in the action library; only the fitness actions themselves are analyzed. The main equipment used in the experiment is image acquisition equipment capable of coping with the transience and rapidity of human falls. The experiment adopts a short-time acquisition, storage, and recording system, composed of a camera, acquisition card, cables, a computer, and acquisition software, to capture and store the experimental images. The camera parameters are shown in Table 1.

The overall experiment is completed in the platform built by MATLAB software under the configuration of Table 2.

The original target image obtained according to Table 1 before the experiment is shown in Figure 5.

For data processing, each video segment is converted into a color image sequence and then into a gray image sequence, after which the different recognition methods are applied to recognize the falling actions.

7. Analysis of Experimental Results

7.1. Experiment on the Division Accuracy of Physical Fitness Movements

In this experiment, the division accuracy of sports fitness action is taken as the first group of indicators to determine the image category division ability of the method in the experiment. The results are shown in Table 3.

According to the experimental results in Table 3, the classification performance of the proposed method is significantly better than that of the current recognition methods. In each experiment, the five preset action types are divided. The proposed method classifies each action with high precision, with a division accuracy above 90%, ensuring the reliability of subsequent recognition results. Current recognition methods 1 and 2 cannot achieve high-precision classification for some preset actions: their accuracy is below 87%, and some action images cannot be classified at all, which adversely affects subsequent processing. These results show that the proposed method has a stronger ability to divide action categories.

7.2. Action Detail Key Point Recognition Quantity Experiment

Taking Figure 5 as the search object, the number of fall detail key points identified by the different methods is tested. The results are shown in Table 4.

The experimental results show that the proposed method makes only one error or omission in identifying the key points of action details, whereas the two comparison methods each make more than 20 errors and omissions. The proposed method's recognition of action key points is therefore significantly better than that of the current methods. Since, according to the literature, the recognition of fall action key points directly affects the recognition of the fall action itself, the application effect of the proposed method can be expected to surpass the current methods.

7.3. Recall and Precision Experiments for Action Recognition

To evaluate the proposed method quantitatively, recall and precision indices are introduced to determine the recognition ability for each fall action. Recall is the proportion of actual fall actions that are correctly identified (finding them all), and precision (accuracy rate) is the proportion of the method's positive predictions that are correct. Let TP denote the number of correctly recognized human falls, FN the number of unrecognized human falls, and FP the number of incorrectly recognized human falls; then recall = TP / (TP + FN) and precision = TP / (TP + FP).
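With TP, FN, and FP as defined above, the two indices reduce to a few lines (the example counts are hypothetical):

```python
def recall(tp, fn):
    """Share of actual fall actions that were correctly recognized."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Share of the method's fall predictions that were correct."""
    return tp / (tp + fp)

# Example: 90 falls recognized, 10 missed, 30 false alarms.
r = recall(90, 10)      # 0.9
p = precision(90, 30)   # 0.75
```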

For the experimental object shown in Figure 5, taking the number of human falls as the independent variable, after several iterations, the recall rates of the three action recognition methods are tested, and the results are shown in Figure 6.

As Figure 6 shows, the recall rates of the three methods are basically the same when the number of human fall actions is below 250. Beyond 250 actions, the recall rate of the proposed method keeps rising, peaking at 90%. For current recognition method 1, owing to the low quality of the recognized fall videos, the recall curve turns at 300 actions and then gradually decreases. Current recognition method 2 is affected by camera motion; although its recall rate trends upward, the rise is relatively slow. The proposed method therefore performs well in the recall test of human fall action recognition.

The accuracy test results of human fall action recognition are shown in Figure 7.

As Figure 7 shows, as the number of human fall movements increases, current recognition methods 1 and 2 cannot effectively distinguish falls when the movement direction of the fitness person changes, so their recognition accuracy gradually decreases. With the proposed method, the accuracy decreases only slowly; even when the number of falls reaches 500, the recognition accuracy remains as high as 77%.

In this experiment, the recognition accuracy and recognition ability of the proposed method are analyzed from multiple angles, confirming that the method is effective in application and that its effect is better than that of the current methods.

8. Conclusions

Human fall recognition is an important application of motion recognition technology in physical fitness and an important research topic in artificial intelligence interaction. This study takes current motion recognition methods for posture data sequences as a blueprint and optimizes the design using displacement features and a personalized learning method, improving recognition accuracy and effect over the existing basis. A color camera collects the fall motion images of fitness personnel, and the fall motion features of the human body are extracted. A model for extracting displacement features of human falling actions from the posture data sequence is constructed, and automatic recognition of physical fitness falls based on the posture data sequence is realized. The experimental results confirm that the proposed method classifies fitness actions with an accuracy above 90% and achieves a fall recognition recall of up to 90%, effectively improving the recognition accuracy of human falling actions. Future research will incorporate information fusion technology to improve the proposed method: recognition of voice and other information will be added on top of actions, so that multiple information sources can more accurately describe fitness participants and capture their action types.

Data Availability

The dataset can be accessed upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.