Abstract

With the rapid development of computer vision technology, human action recognition has come to occupy an important position in the field. A basic human action recognition system consists of three parts: moving target detection, feature extraction, and action recognition. To understand the action signs of gymnastics, this article uses network communication and contour feature extraction to extract different morphological features during gymnastics. The finite difference algorithm of edge curvature is then used to classify different gymnastic actions, and the Gaussian background model is analyzed and discussed. An improved mixed Gaussian modeling method is proposed that adaptively selects the number of Gaussian distributions. The results show that, compared with traditional contour extraction, the gymnastic motion features extracted through network communication and body contour features are clearer, with an improvement of more than 30%. Moreover, the proposed method effectively removes noise during image extraction, and the athlete’s action marks are very clear, which achieves the research goal.

1. Introduction

Contour extraction refers to the process of accurately marking the contour of a target object in an image. It is an important foundation for many computer vision and image processing algorithms. The extracted object contour information is widely used in image segmentation, object detection, occlusion and depth reasoning, three-dimensional reconstruction, and other fields. The existing contour extraction methods require a lot of user interaction, or the extraction effect is difficult to meet the requirements. With the rapid development of computer vision technology, human action recognition technology has occupied an important position in this field. It has important practical value and research value in security protection, advanced human-computer interaction, video search analysis, and sports analysis. Due to the nonrigid characteristics of the human body, the change of illumination, and the influence of the changeable surrounding environment, the recognition of human actions is more challenging.

Gymnastics is a form of physical training that builds a beautiful body posture and cultivates an elegant temperament. On the basis of natural gymnastics, it incorporates music, dance, light equipment, and other elements and combines sports with art. Within a given space and time, it is a sports event that expresses the physical and mental beauty and temperament of women. It is also a highly difficult, high-level competitive sport with broad room for innovation. The complete set of exercises in individual gymnastics is composed of difficulty and completion. The athlete follows the stress and phrasing of the music to complete a number of body difficulties (BD), apparatus difficulties (AD), dance step combinations (S), and dynamic elements with rotation (R) within the specified time and shows what the music reflects. At the same time, attention must be paid to the connection between actions, the fit of the actions with the characteristics of the music, and the artistry reflected in the completion of the actions. Therefore, the recognition of gymnastics movements is particularly important.

Experts at home and abroad have also studied the extraction of sports contour features. In the research of Genc H, height, body weight (BW), body mass index (BMI), body fat percentage (BFP), skinfold thickness (SFT), diameter measurement (DM), circumference measurement (CM), sitting height (SH), arm span (AS), vertical jump (VJ), standing long jump (SLJ), and flexibility were measured in a test group (EG) and a control group (CG), with a pretest and a posttest. The data were analyzed in the SPSS 22 software package. Between-group, within-group, and training effects were analyzed by repeated-measures multivariate analysis of variance (MANOVA), and post hoc comparisons of significant values were made with the Bonferroni test [1]. Kyselovičová et al. believe that there are few studies on the anthropometric and physiological characteristics or physical benefits of exercise. They compared the anthropometric and cardiovascular characteristics of a control group, CO (n = 10), with those of competitive synchronized swimmers, SS (n = 11), and aerobic gymnasts, AG (n = 10), aged between 13 and 25 years. According to the protocol, the physical measurements were evaluated in the following order: height (BH), weight (BW), body mass index (BMI), and body fat percentage (% BF). Maximum oxygen consumption (VO2max) and maximum heart rate (HRmax) were measured by spirometry with the COSMED K4b2 [2]. Katarzyna and Emanuela believe that research on the anthropometric characteristics of athletes has a long history, but that there are no reviews of the body shape of rhythmic gymnasts that would help practitioners and professional coaches better understand the ideal physique and the impact of high-intensity training from the preteen years. Their review aims to provide this information. All studies on the body composition of male rhythmic gymnasts were included, and 19 studies that evaluated the body shape of male gymnasts were identified. Young gymnasts (≤18 years old) and older gymnasts (>18 years old) showed a high degree of heterogeneity in body composition (1.6 ± 0.3, 5.4 ± 0.8, and 3.0 ± 0.6 vs. 1.8 ± 0.4, 5.9 ± 0.6, and 2.2 ± 0.4) [3]. These studies provide useful reference for this article, but because of insufficient sample data, their experimental results show some deviation.

Feature selection and extraction is the intermediate-to-high-level stage of human action recognition and is both a hot and a difficult research topic. Here, human action recognition is the process of selecting suitable features from relevant videos or images to effectively describe human behavior. Feature extraction is an indispensable part of a complete human action recognition system. It affects not only the accuracy and speed of subsequent action recognition but also the performance of action classification. Methods that combine network communication with body contour feature extraction have been applied to image summary generation, image content question answering, and video-oriented behavior recognition models, and, owing to their superior image feature extraction, they have also achieved very good segmentation results in human body contour extraction.

2. Gymnastic Movement Signs

2.1. Shape Contour Extraction

Human body contour extraction occupies an important position in the field of computer vision and is a core technology of human body detection and human behavior recognition. It is currently widely used in intelligent monitoring, medical treatment, and other fields [4, 5]. The virtual reconstruction of the human body model is a key technology in modern medical visualization systems, and the accurate collection of human contour information supports a reasonable medical analysis of the patient’s symptoms. On the other hand, as modern society’s requirements for personal and public property safety grow, the use of intelligent monitoring systems is gradually increasing. The core task of intelligent video surveillance is to process the information provided by the surveillance video through computer vision, recognize and understand the behavior in the frame, and automatically track and evaluate the target, so as to predict what may happen [6]. As a key supporting technology of intelligent surveillance systems, human contour extraction can accurately provide the location and contour of the human body in the frame, so that human tracking and behavior identification can proceed without human intervention and video surveillance becomes intelligent.

The problem of human contour extraction is different from that of human body detection. Contour extraction requires an understanding of image features on the basis of recognizing the human body, so as to achieve accurate segmentation of human contours. Human body contour extraction is therefore a more difficult image segmentation problem derived from the human body detection problem. In previous studies, results in practice have often been affected by several factors and have fallen short of expectations. These factors include the following:

(1) Diversity of Human Appearance Information. The most intuitive point that distinguishes humans from ordinary creatures is the diversity of their appearance, which is reflected in changes in posture and clothing. Different people have different heights and body types, and their behaviors and postures in different environments are also very different [7]. As shown in Figure 1, the appearance of an underage boy in sportswear playing football and that of a woman walking with a stroller seem to have nothing in common. It is difficult to extract common features when modeling such targets, so the diversity of appearance information increases the difficulty of extracting human contours to a certain extent.

(2) Background Interference. The human body is usually in a complicated environment, so background interference cannot be ignored. When the color of the human body is close to the background color, the contour boundary may be unclear during extraction, or the human body may be judged as background, causing missed detection. At the same time, objects in the background whose shape is close to that of the human body, such as tree stumps and mailboxes, are often mistakenly judged as a human body, which greatly increases the possibility of false detection [8]. The background interference is shown in Figure 2.

(3) Self-Occlusion and Mutual Occlusion. In daily life, the position and posture of the human body are constantly moving and changing, so self-occlusion and mutual occlusion are not rare [9]. For example, making a phone call may cause the arms to cover part of the head and the upper part of the torso, which is self-occlusion. Mutual occlusion is common in crowds: one person’s torso may be blocked by another person’s arms, or one person’s shoulders may cover another person’s head. In addition, the human body may be partially or completely blocked by objects such as vehicles, telephone poles, and trees. Many scholars have studied the problem of occlusion, but no very effective solution has yet been found.

(4) Viewing Angle and Illumination Changes. The study of human body contour extraction depends on observation of the human body, and differences in the observation angle directly affect the recognition of human body characteristics. The human body structure seen from the front, back, and side is quite different, so the extracted features are also different [10]. Different lighting conditions also affect feature extraction. The features extracted from the same person under different lighting conditions will differ, and in most cases there will be shadows. These problems affect the extracted contours of the human body.

Human contour extraction methods are divided into static extraction and dynamic extraction. Contour extraction of moving objects is achieved mainly through background models: the silhouette of a moving object is obtained by background subtraction, and methods such as template matching are then used to select the part that belongs to the outline of the human body [11, 12]. Feature-learning-based methods have been the most widely used for image target detection in recent years. The overall idea is to first extract the specified features of the target and then iteratively train on the extracted features to produce a classifier; passing the image under test through the classifier yields the detection result for the target.

2.2. Feature Extraction Method

In recent years, machine learning has become a core technology in artificial intelligence research. Human beings acquire knowledge through continuous learning, so learning is a key skill for the development of human society [13]. Machine learning allows machines (computers) to simulate this human learning function. It is a discipline that studies how machines can simulate or implement human learning activities and acquire knowledge and skills to improve system performance. It draws on multiple subjects such as probability, statistics, and calculus, so it can consider more situations and perform more complex calculations. In terms of subject area, machine learning belongs to the same category as pattern recognition, mathematical statistics, and data analysis. In addition, advanced disciplines such as computer vision, speech recognition, and image processing have emerged from the fusion of basic machine learning with advanced technologies from other fields [14]. The process of machine learning and human brain learning is shown in Figure 3.

Machine learning follows a process similar to human thinking, and the process as a whole can be roughly divided into two stages. The first stage is to generalize experience through training on data [15, 16]. Historical experience occupies a very important position in the long-term life of mankind: through it, laws can be summarized to guide mankind forward and predict the future. The historical data in machine learning plays the role of human historical experience. As a data set, it is used to train, repeatedly and iteratively, a data model that has accumulated a lot of “historical experience” [17]. The second stage is the judgment of new problems. When humans face a new problem, they analyze and compare it with the rules they have already summarized, predict its possible results, and decide how to act [18].

Human action recognition is a typical application based on statistical pattern recognition technology [19, 20]. Figure 4 is a structure diagram of a vision-based human action recognition system. First, the original video sequence is preprocessed to detect a binary sequence containing a moving human body, and then feature selection and extraction are performed. Finally, a trained classifier is selected to perform feature vector classification and recognition and get the recognition result.

2.3. Image Edge Feature Extraction

The contour shape of the edges is one of the most important features, and it supports classification and recognition well. An edge is essentially a region where the gray level of the pixels in the image changes, and these points give the location of the image contours. These salient regions are the feature data we need for edges [21, 22]. We therefore locate these features, extract them, analyze and count them, and then perform identification and judgment.

The edge curvature is calculated with the finite difference algorithm. For an edge curve $y = f(x)$, the curvature $k$ can be written as

$$k = \frac{y''}{\left(1 + y'^{2}\right)^{3/2}},$$

where the derivatives $y'$ and $y''$ are approximated by finite differences of neighboring edge points.
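As a rough illustration of the finite-difference idea, the following sketch estimates the curvature of an edge sampled as a function y = f(x) on a uniform grid, using the standard expression k = y'' / (1 + y'^2)^(3/2). The function name `edge_curvature`, the uniform-spacing assumption, and the use of `numpy.gradient` for the differences are illustrative choices, not details taken from the paper.

```python
import numpy as np

def edge_curvature(y, dx=1.0):
    """Estimate k = y'' / (1 + y'^2)^(3/2) for samples y taken every dx."""
    y = np.asarray(y, dtype=float)
    dy = np.gradient(y, dx)     # first derivative (central differences inside)
    d2y = np.gradient(dy, dx)   # second derivative by differencing again
    return d2y / (1.0 + dy ** 2) ** 1.5

# Toy edge (assumption): the parabola y = x^2 / 2, whose curvature at x = 0 is 1.
x = np.linspace(-1.0, 1.0, 201)
k = edge_curvature(0.5 * x ** 2, dx=x[1] - x[0])
print(k[100])  # curvature estimate at the vertex
```

The estimates near the two ends of the array are less accurate because one-sided differences are used there; for a closed contour, a periodic difference scheme would be a natural alternative.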

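For the preprocessed target image, the center of mass coordinates $(\bar{x}, \bar{y})$ are first expressed in the rectangular coordinate system:

$$\bar{x} = \frac{\sum_{x=1}^{W}\sum_{y=1}^{H} x\, f(x, y)}{\sum_{x=1}^{W}\sum_{y=1}^{H} f(x, y)}, \qquad \bar{y} = \frac{\sum_{x=1}^{W}\sum_{y=1}^{H} y\, f(x, y)}{\sum_{x=1}^{W}\sum_{y=1}^{H} f(x, y)}.$$

In the above two formulas, the height of the image is $H$ and the width is $W$. When the pixel is on the target image, $f(x, y) = 1$, and when the pixel is outside it, $f(x, y) = 0$.

In the rectangular coordinate system, $(x, y)$ denotes a pixel point, and in the polar coordinate system, $(r, \theta)$ denotes a pixel point. Taking the centroid of the target image as the origin, the pixel point coordinates in the polar coordinate system satisfy

$$x - \bar{x} = r\cos\theta, \qquad y - \bar{y} = r\sin\theta.$$

In the above formula, $r$ is the distance from the center of mass to the point, and $\theta$ is the angle between the point and the center of mass; the conversion formula about the center of mass is then

$$r = \sqrt{(x - \bar{x})^{2} + (y - \bar{y})^{2}}, \qquad \theta = \arctan\frac{y - \bar{y}}{x - \bar{x}}.$$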
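The centroid and polar conversion described here can be sketched in a few lines of NumPy. The sketch assumes the target is given as a binary mask (1 on the target, 0 elsewhere); the function names `centroid` and `to_polar` are hypothetical, and `arctan2` is used in place of a plain arctangent so that the quadrant of the angle is resolved.

```python
import numpy as np

def centroid(mask):
    """Centroid (x_bar, y_bar) of a binary mask whose target pixels equal 1."""
    ys, xs = np.nonzero(mask)
    return xs.mean(), ys.mean()

def to_polar(mask):
    """Polar coordinates (r, theta in degrees) of target pixels about the centroid."""
    ys, xs = np.nonzero(mask)
    x_bar, y_bar = xs.mean(), ys.mean()
    dx, dy = xs - x_bar, ys - y_bar
    r = np.hypot(dx, dy)
    theta = np.degrees(np.arctan2(dy, dx)) % 360.0  # map angles into [0, 360)
    return r, theta

# Toy mask (assumption): a 3 x 3 block centred at x = 3, y = 2.
mask = np.zeros((5, 7), dtype=np.uint8)
mask[1:4, 2:5] = 1
r, theta = to_polar(mask)
print(centroid(mask), r.max())
```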

In the polar coordinate system, the minimum value of r is 0 and the maximum value is 360. In order to facilitate the following calculations, the original image is set to 1/360, and the value of is

where represents the angle of a certain pixel. The surface area in polar coordinates is the area between the highest peak and the lowest valley of the edge curve.where S is the area of the entire rectangular area, expressed as

Among them, the height of the curve at the lowest valley is ; then the surface area ratio is

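Because moment features such as the Hu moments are invariant to rotation, translation, and scale, they are also called invariant moments. They can accurately describe movement characteristics of the moving target such as its center of mass and symmetry. Suppose $f(x, y)$ is a two-dimensional continuous function; the origin moment of order $(p + q)$ is defined as

$$m_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^{p} y^{q} f(x, y)\, dx\, dy, \qquad p, q = 0, 1, 2, \ldots$$

The central moment of order $(p + q)$ is defined as

$$\mu_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \bar{x})^{p} (y - \bar{y})^{q} f(x, y)\, dx\, dy,$$

where $\bar{x} = m_{10}/m_{00}$ and $\bar{y} = m_{01}/m_{00}$ are the coordinates of the center of mass.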
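For a digital image the integrals become sums over pixels. The sketch below computes discrete origin and central moments with NumPy and checks, on a toy silhouette, that a central moment is unchanged when the target is translated. The function names and the toy images are assumptions for illustration; rotation and scale invariance would additionally require normalized combinations such as the Hu moments, which are not shown.

```python
import numpy as np

def raw_moment(img, p, q):
    """Discrete origin moment m_pq = sum over pixels of x^p * y^q * f(x, y)."""
    ys, xs = np.mgrid[: img.shape[0], : img.shape[1]]
    return float(np.sum((xs ** p) * (ys ** q) * img))

def central_moment(img, p, q):
    """Discrete central moment mu_pq taken about the centroid (x_bar, y_bar)."""
    m00 = raw_moment(img, 0, 0)
    x_bar = raw_moment(img, 1, 0) / m00
    y_bar = raw_moment(img, 0, 1) / m00
    ys, xs = np.mgrid[: img.shape[0], : img.shape[1]]
    return float(np.sum(((xs - x_bar) ** p) * ((ys - y_bar) ** q) * img))

# Toy silhouettes (assumption): the same 3 x 3 block at two positions.
a = np.zeros((8, 8))
a[1:4, 2:5] = 1
b = np.zeros((8, 8))
b[4:7, 3:6] = 1
print(central_moment(a, 2, 0), central_moment(b, 2, 0))
```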

The use of moments for feature recognition has a fast speed and a higher recognition rate for larger images [23].

3. Experiments and Results

3.1. Silhouette of a Sports Human Body

Let us take the following picture as an example to extract the contours of the characters in the picture. The original picture is shown in Figure 5.

First, we perform denoising processing on the original image, and then perform binarization to obtain Figure 6.
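The paper does not state which denoising filter or threshold was used, so the following is only a minimal sketch of the two steps under stated assumptions: a 3 x 3 median filter for denoising and a fixed gray-level threshold of 128 for binarization. The function names `median_filter3` and `binarize` are hypothetical.

```python
import numpy as np

def median_filter3(img):
    """3 x 3 median filter; borders are handled by replicating edge pixels."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    windows = [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return np.median(np.stack(windows), axis=0)

def binarize(img, threshold=128):
    """Return 1 where the gray level reaches the threshold, 0 elsewhere."""
    return (img >= threshold).astype(np.uint8)

# Toy image (assumption): a bright 3 x 3 target plus one isolated noise pixel.
img = np.zeros((6, 6), dtype=float)
img[2:5, 2:5] = 200.0
img[0, 0] = 255.0
clean = binarize(median_filter3(img))
print(clean)
```

A median filter suppresses isolated outliers but also rounds the corners of small targets, so the window size would need tuning on real frames.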

The feature of human movement is extracted through the binarized image to obtain the contour of human movement, as shown in Figure 7.

The same action is performed differently by different people, so a speed threshold needs to be set to detect the type of action. The recognition results of running actions based on speed characteristics are shown in Table 1.

The experiment selects the video clips of jumping and bending in place of Julie, Monica, Bread, and William in the video database as the research objects. The characteristic value of the center of mass of the moving target is found. Figure 8 shows the changes in William’s centroid feature.

It can be seen from Table 2 that the centroid thresholds are set differently for different people, so the bending action can be better recognized. However, when only a single feature is used to recognize the action, the recognition rate is not particularly high.

3.2. Gymnastics Characteristics

In this gymnastic exercise, the primary movements include jumping forward with one foot, jumping forward with both feet, jumping in place, hand clapping, etc. Because the jump in place and the forward jump with both feet take off in the same manner, they can be classified by whether the width of the circumscribed rectangle changes: if the width does not change, the action is a jump in place. Since jumping forward with one foot and jumping forward with both feet are both forward jumps, they can be classified by whether the height of the circumscribed rectangle increases: if the height does not increase, the action is a forward jump with one foot. However, such single-feature methods cannot achieve good results, and multiple features need to be fused for recognition. The contour extraction of the jump in place is shown in Figure 9.
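The circumscribed-rectangle rules can be written as a small decision procedure. The sketch below takes the per-frame width and height of the rectangle and applies the two rules in sequence; the function name `classify_jump`, the pixel tolerance `tol`, and the example box sizes are assumptions, not values from the experiments.

```python
def classify_jump(boxes, tol=1):
    """Classify a jump from per-frame (width, height) of the circumscribed rectangle.

    tol is a hypothetical tolerance (in pixels) for treating a size as unchanged.
    """
    widths = [w for w, _ in boxes]
    heights = [h for _, h in boxes]
    if max(widths) - min(widths) <= tol:
        return "jump in place"            # width does not change
    if max(heights) - heights[0] > tol:
        return "two-foot forward jump"    # height increases during the jump
    return "one-foot forward jump"        # width changes, height does not increase

print(classify_jump([(20, 60), (20, 61), (21, 60)]))
print(classify_jump([(20, 60), (28, 66), (22, 60)]))
print(classify_jump([(20, 60), (30, 60), (24, 59)]))
```

As the text notes, a single feature of this kind is fragile, which is why the later experiments fuse several features.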

We will extract the speed characteristics of walking and running, respectively, and the changes are shown in Figure 10.

It can be seen that the speed change range of the walking motion is relatively gentle, and the speed change range of the running motion is relatively large. The change range of the mean speed is relatively stable. At the same time, the speed of the same action of different people is also different, so it is necessary to detect the type of action by setting a speed threshold.
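A speed threshold of this kind can be sketched from the centroid track alone. The example below computes the per-frame centroid speed and compares the mean speed with a threshold; the frame rate, the threshold value, and the function names are hypothetical, and in practice the threshold would be set per person as described above.

```python
import math

def centroid_speeds(centroids, fps=25.0):
    """Speed (pixels per second) between consecutive centroid positions."""
    return [
        math.hypot(x1 - x0, y1 - y0) * fps
        for (x0, y0), (x1, y1) in zip(centroids, centroids[1:])
    ]

def walk_or_run(centroids, threshold=100.0, fps=25.0):
    """Label a track as 'run' when its mean centroid speed exceeds the threshold."""
    speeds = centroid_speeds(centroids, fps)
    mean_speed = sum(speeds) / len(speeds)
    return "run" if mean_speed > threshold else "walk"

# Toy tracks (assumption): 2 px/frame for walking, 6 px/frame for running.
walk_track = [(2 * i, 50) for i in range(10)]
run_track = [(6 * i, 50) for i in range(10)]
print(walk_or_run(walk_track), walk_or_run(run_track))
```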

We select the action videos of four people as the training set, with 12 video frames for each action, so the number of input samples is 432 and each sample has twelve dimensions; the remaining 216 samples are used as the test set. The experimental steps are as follows. First, a two-dimensional image of the moving human body is extracted from each training sample. The feature extraction method is then used to extract the twelve-dimensional feature vector that fuses the motion features and the shape features, and a support vector machine classifier is constructed between the various actions. Next, the same feature extraction method is applied to the moving target in each test sample to obtain its twelve-dimensional fusion feature vector, which is input into the trained classifier for classification. The recognition results are shown in Table 3.
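The train-and-classify step can be sketched with scikit-learn. Since no data accompany the paper, the example uses synthetic twelve-dimensional vectors in place of the fused features: nine well-separated clusters stand in for the action classes, split into 432 training and 216 test samples. The radial basis kernel, the `gamma` and `C` settings, and the nine-class layout are assumptions for illustration, and the accuracy on this toy data says nothing about the accuracy on real video.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, n_per_class, dim = 9, 72, 12   # 9 x 72 = 648 synthetic samples

# Synthetic stand-in for the fused feature vectors (assumption, not real data).
centers = rng.normal(scale=5.0, size=(n_classes, dim))
X = np.vstack([c + rng.normal(size=(n_per_class, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

order = rng.permutation(len(y))
train, test = order[:432], order[432:]    # 432 training / 216 test samples

clf = SVC(kernel="rbf", gamma="scale", C=1.0)
clf.fit(X[train], y[train])
accuracy = clf.score(X[test], y[test])
print(f"test accuracy on synthetic data: {accuracy:.3f}")
```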

Among all the actions, the lowest recognition rate is for jumping forward with both feet. The action with the best recognition effect is bending over, with a recognition rate of 100%. Because of intraclass and interclass variation, similar key frames are misrecognized during recognition. The speed and center of mass characteristics of running and skipping are very similar to those of walking, so 5.21% of walking samples are recognized as running and 8.43% as skipping. The recognition rate of running is 83.86%, and 10.43% of running samples are recognized as jumping forward with one foot, because the same action differs between people and the speed characteristics in particular have a great influence on running. The recognition effect of jumping forward with both feet is the worst: 9.41% of its samples are recognized as walking, 6.75% as running, and 5.56% as jumping forward with one foot. The main source of error is that these actions are very similar in certain key frames.

Because of the periodicity of the actions, different actions overlap in some key frames. The waving action is recognized correctly in all cases because it shows low similarity to other actions. The recognition rates of one-handed flapping and two-handed flapping do not differ much, and the two actions are very similar in some feature segments. The average recognition rate reached 83.25%, which needs to be further improved.
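Per-action recognition rates and their average can be computed directly from the true and predicted labels. The sketch below assumes the average is the unweighted mean of the per-action rates, which the paper does not specify; the function names and the toy labels are illustrative only.

```python
from collections import Counter

def recognition_rates(true_labels, predicted_labels):
    """Fraction of samples of each action that were recognized correctly."""
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: correct[label] / totals[label] for label in totals}

def average_rate(rates):
    """Unweighted mean of the per-action recognition rates."""
    return sum(rates.values()) / len(rates)

# Toy labels (assumption): one walking sample is mistaken for running.
true_labels = ["walk", "walk", "run", "run", "bend", "bend"]
predicted = ["walk", "run", "run", "run", "bend", "bend"]
rates = recognition_rates(true_labels, predicted)
print(rates, average_rate(rates))
```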

4. Discussion

4.1. Moving Target Detection

This article gives an overview of human action recognition, analyzes the status of relevant research at home and abroad, and studies moving target detection, human action feature selection and extraction, and human action classification and recognition. It first introduces the background and significance of recognizing the motion of a moving human body and the development status at home and abroad, then summarizes the general process of a human motion recognition system, and finally outlines the research content of this article and the arrangement of each section. Video image preprocessing is the basis of the human action recognition system. This part mainly covers grayscale conversion, binarization, noise removal, and mathematical morphology operations [24].

In the moving target detection part, basic target detection algorithms such as the optical flow method, the interframe difference method, and the background difference method are introduced and compared [25]. The single Gaussian model and the mixed Gaussian model are then summarized, and an improved algorithm based on the mixed Gaussian background model is proposed. This algorithm is combined with the background difference method to extract moving foreground targets. The feature extraction part mainly extracts the movement features and shape features of the moving human body. Since different actions have their own characteristics, motion features such as Hu moments, center of mass, and speed, together with shape features such as compactness and changes in the width and height of the circumscribed rectangle, are extracted as the feature vector that describes the moving target. Experiments show that the extracted features can effectively describe human actions.
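To make the background-difference idea concrete, the sketch below implements a per-pixel single-Gaussian background model, which is a deliberate simplification: it is not the improved mixed Gaussian algorithm proposed in the paper and does not select the number of distributions adaptively. The class name and the parameters `alpha` (learning rate), `k` (match threshold in standard deviations), `init_var`, and `min_var` are all assumed values.

```python
import numpy as np

class RunningGaussianBackground:
    """Per-pixel single-Gaussian background model (simplified illustration)."""

    def __init__(self, first_frame, alpha=0.05, k=2.5, init_var=225.0, min_var=4.0):
        self.mean = first_frame.astype(float)
        self.var = np.full(first_frame.shape, init_var, dtype=float)
        self.alpha, self.k, self.min_var = alpha, k, min_var

    def apply(self, frame):
        """Return a 0/1 foreground mask and update the background statistics."""
        frame = frame.astype(float)
        diff = frame - self.mean
        foreground = diff ** 2 > (self.k ** 2) * self.var
        background = ~foreground
        # Update mean and variance only where the pixel still matches the background.
        self.mean[background] += self.alpha * diff[background]
        self.var[background] += self.alpha * (diff[background] ** 2 - self.var[background])
        np.maximum(self.var, self.min_var, out=self.var)  # keep variance from collapsing
        return foreground.astype(np.uint8)

# Toy frames (assumption): a flat background and a bright 2 x 2 moving target.
background_frame = np.full((6, 6), 100, dtype=np.uint8)
model = RunningGaussianBackground(background_frame)
frame = background_frame.copy()
frame[2:4, 2:4] = 200
mask = model.apply(frame)
print(mask)
```

A mixture model extends this by keeping several weighted Gaussians per pixel, which helps with backgrounds that alternate between states, such as swaying leaves.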

In the motion recognition part, an SVM classifier based on a radial basis kernel function is used to test nine kinds of motions: clap (jack), jump in place (pjump), one-hand wave (wave1), two-hand wave (wave2), bend (bend), walk (walk), run (run), one-foot forward jump (skip), and two-leg parallel forward jump (jump). The results show that the algorithm can effectively identify these actions, but further research is still needed. A simple human motion recognition system is implemented, and its modules are introduced and tested to verify the effectiveness of the system. The results show that the system can realize the functions of moving human detection, feature selection and extraction, and action classification and recognition.

4.2. Gymnastics

Body weight explains, to a certain extent, the body’s constitution, nutritional status, and potential for strength. It also indirectly reflects the development of the athlete’s thickness, girth, length, and width. From the perspective of exercise physiology, strength is closely related to weight, and the physiological cross-section of muscle is directly proportional to weight. The requirements on gymnasts’ competitive ability are no lower than those on other athletes. High-level gymnastics competitions are intense and time-consuming and place higher demands on athletes’ physical fitness. Proper body weight preserves athletes’ physical strength and helps them achieve good results. The speed of contemporary gymnastics is constantly increasing, and a larger body weight will also affect the athlete’s reaction speed.

Artistic qualities are divided into general artistic qualities and special artistic qualities. General artistic quality is the ability to understand and perform various art forms, formed by cultivating and training the individual’s innate basic abilities. This ability is the level of artistic knowledge and inner self-cultivation displayed on the basis of aesthetics. Special artistic quality is one of the basic qualities that people engaged in art or other industries possess because their profession demands it. Its content varies with the artistic requirements of the profession.

Artistic quality is contained in quality. It is an education higher than other qualities, and it is also a kind of ability that everyone should have. Artistic quality can be expressed in the creation of a certain art form, and it can also express one’s own artistic form and ability. We can also improve our own artistic quality through a variety of artistic activities. The atmosphere of life is also a key factor affecting artistic quality. Artistic quality includes human’s own artistic experience, life experience, and a creative understanding of social things and nature.

Rhythmic gymnastics is a combination of art and sports. It is a project that straddles the two major fields of art and sports. It is also the sports project that can best show the beauty of female characteristics. The vitality of artistic gymnastics lies in the pursuit of beauty. The artistic expression of rhythmic gymnastics is mainly embodied in the way of expression of the body. In the complete set of movements, in addition to completing the difficult body movements and equipment movements, the athletes must also express the connotation of the whole set of movements. The complete set of rhythmic gymnastics is like a work of art, which requires careful crafting by artists. In the process of carving, the quality of every detail may affect the quality of the work.

5. Conclusions

Due to the strongly professional nature of gymnastics, the coaches are mostly retired professional athletes. Most coaches lack training in artistic quality, so when they teach rhythmic gymnastics they tend to neglect the cultivation of athletes’ artistic quality. Human action recognition, as an important research topic in the field of computer vision, is very challenging. In this paper, three aspects have been studied: moving target detection, feature selection and extraction, and action classification and recognition. Although many human motion recognition algorithms have been proposed, they need to be further improved before being applied to real life. Feature extraction is the most important step in human motion recognition. This paper mainly extracts motion features such as moment features, center of mass, and speed, and shape features such as compactness, dispersion, and changes in the width and height of the circumscribed rectangle, from the moving target as feature vectors to describe the movement of the human body. Since the human body is a nonrigid body, its movements are variable and uncertain. The features extracted in this paper still cannot describe human movements perfectly, so feature selection and extraction remain a focus of future research. At present, domestic research mostly addresses the recognition of a single person’s actions. However, in actual scenes, the motions of moving targets are diverse and difficult to distinguish, and accurately recognizing the complex actions of multiple people is still a huge challenge.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors state that there are no conflicts of interest regarding the publication of this article.