Video event detection and annotation are an important part of video analysis and the basis of video content retrieval. Basketball is one of the most popular sports, and event detection and labeling of basketball videos can help viewers quickly locate events of interest and meet retrieval needs. This paper studies the application of anisotropic diffusion to video image smoothing, denoising, and enhancement, and analyzes an improved form of anisotropic diffusion that can be used for video image enhancement. It studies the anisotropic diffusion method for coherent speckle noise removal and proposes a video image denoising method that combines anisotropic diffusion with the stationary wavelet transform. It also proposes an anisotropic diffusion method based on visual characteristics, which adds a video image detail factor during smoothing and improves the visual effect of diffusion. The paper then discusses how to apply anisotropic diffusion methods and ideas to video image segmentation. We introduce the classic watershed segmentation algorithm and use forward-backward diffusion to process video images and reduce oversegmentation; we also introduce the active contour model and its improved variant, the GVF Snake, and analyze how anisotropic diffusion can be used to improve the GVF Snake model into a new GGVF Snake model. For basketball segmentation in close-up shots, we propose an improved Hough transform method based on a variable direction filter, which effectively extracts the center and radius of the basketball and is robust to partial occlusion and motion blur. For basketball segmentation in long-range (perspective) shots, the commonly used object segmentation method based on change area detection is very sensitive to noise and requires that the object not move too fast. To correct the segmentation deviation caused by video noise and fast basketball movement, we make corrections based on the peak characteristics of the edge gradient. At the same time, the internal and external energy calculations of the traditional active contour model are improved, and criteria for the regional optimal solution and for segmentation validity are established. For basketball tracking, an improved block matching method is proposed. On the one hand, to overcome the influence of the basketball’s own rotation, we establish a matching criterion that is independent of position within the region. On the other hand, we improve the diamond motion search path based on the basketball’s motion correlation and center offset characteristics, which reduces the number of searches and improves the tracking speed.

1. Introduction

The rapid growth of new data has forced changes in information processing methods. Early work dealt with text data and focused on data extraction, analysis, and mining [1]. Later, researchers were no longer satisfied with processing simple data and turned to graphics and video images, with work concentrated mainly on graphics and video image recognition. In recent years, interest has shifted to video data, a more complex kind of data, and applications have gradually expanded to highlight detection in sports videos, behavior recognition, and other fields [2, 3]. Video data is rich and continuous: it appeals to the audience visually and, at the same time, offers auditory enjoyment, so video is more likely to win viewers’ favor [4]. However, in recent years there has been a flood of video data, and people are often at a loss when faced with it. On the one hand, they do not know where to start. On the other hand, videos are long, and it is difficult to find the time to watch an entire video. How to dig out key information from massive data, and how to build different interest models for different users’ needs, have therefore become unavoidable problems in video analysis [5, 6].

Video information accounts for a large proportion of Internet information. With the rapid development of computer technology, network technology, multimedia technology, and mobile terminals, video data is also growing and spreading rapidly [7]. Quickly retrieving the clips that people are most interested in from this huge volume of video data is an urgent need. Traditional video retrieval builds an index from text descriptions external to the video, but because such text lacks accuracy and objectivity, the retrieval results are inaccurate and cannot meet users’ needs [8]. Video retrieval technology has therefore gradually moved toward the analysis of video semantic content. Video event detection and annotation are an important part of video analysis and the basis of research on video content retrieval and video abstract generation [9]. Basketball is popular all over the world. Videos of popular basketball events have a very large and stable audience worldwide, and they are watched, reproduced, and disseminated in large numbers. Viewers hope to quickly locate the events they are most interested in within the complete video; professional coaches and athletes hope to focus on specific events and clips to help formulate training plans and competition strategies; and sports video editors hope to quickly extract highlights from the video and automatically generate a video summary [10]. These needs are urgent and intuitive, so event detection and labeling of basketball videos have long been a research hotspot in the multimedia field [11].

This paper takes basketball video analysis as an example and focuses on the segmentation and real-time tracking of the key object of the game, the basketball. First, algorithms are designed to segment the basketball according to its characteristics in close-range and long-range shots; then, the basketball in subsequent frames is segmented using an interframe tracking method. This paper introduces the classic watershed segmentation algorithm, analyzes its main shortcoming, oversegmentation, and uses forward-backward diffusion to process video images. In the resulting nonlinear scale space, weak edges are smoothed, which effectively alleviates oversegmentation and improves the segmentation result. We introduce a PDE-based segmentation algorithm, the active contour model, and its improved variant, the GVF Snake, and analyze how the idea of anisotropic diffusion can be used to improve the GVF Snake model, so as to obtain a new GGVF Snake model with a better segmentation effect. For basketball segmentation in close-range shots, color segmentation is first performed according to the consistency of the basketball color distribution; then, a variable direction filter is used to detect the edges and edge directions in the video image; finally, the Hough transform is improved using the edge points and their directions, and the center and radius of the basketball are extracted. For basketball segmentation in long-range (perspective) shots, a Gaussian filter is first used to smooth the video image noise; next, the background area is separated using the difference between frames, and the connected areas are labeled; then, a “basketball recognition strategy” is formulated according to the characteristics of the basketball, and the candidate basketball areas are retained. Finally, the edge deviation is corrected, and the improved active contour model is used to establish the optimal solution strategy for candidate region extraction. Based on the temporal correlation of basketball movement, a basketball tracking method based on frame intervals is proposed. We use the basic characteristics of the segmented basketball to establish an object template and an adaptive object model, and then use the object model to improve the block matching method for tracking the basketball. Finally, the tracking deviation is corrected, and a mechanism for checking the validity of the tracking result is established.

2. Related Work

Relevant scholars proposed a difference scheme with fourth-order accuracy in space and first-order accuracy in time, and analyzed the stability and convergence of the scheme by Fourier analysis [12]. Other researchers raised the accuracy in time to second order and applied energy inequalities to analyze the stability and convergence of the scheme. Compared with the one-dimensional case, the numerical solution of the two-dimensional problem is much more difficult. On the one hand, the fractional derivative is a nonlocal operator with history dependence, which means that the solution at the current time layer requires the solutions at all previous time layers; on the other hand, solving with the existing implicit schemes incurs a high computational cost and long running time. This problem is especially pronounced for high-dimensional problems [13, 14].

Related scholars proposed introducing the theory of fractional calculus into the PDE video image denoising model and building a video image denoising model based on fractional PDEs [15]. The main idea is to replace the integer-order gradient operator with a fractional gradient operator when constructing the energy functional of the denoising model. Because of its nonlocal nature, the fractional derivative can compensate for the shortcomings of integer-order models in video image processing (the staircase effect and contrast loss). Preliminary research results have been obtained so far. Related scholars have also studied fractional-order spaces and fractional high-order linear filtering methods, which provide a theoretical basis for applying fractional-order models to video image processing [16, 17].

Some scholars regard video as a series of related video images together with audio [18]. Early research on video analysis focused mainly on feature extraction and its improvement. Related scholars analyzed the image information of video frames and detected the dominant color regions [19]. They statistically established color distribution models for different highlight events, such as the goal area and the color of the referee’s clothes, and finally detected goals and referees through learning and summarization [20]. However, simple video image features are not enough to model the entire video. Since video is composed of two parts, video images and audio, some scholars argue that both should be considered together and that auditory features should be fully used alongside visual features [21].

Related scholars analyzed basketball sports videos from the perspective of video streams [22]. The video is divided into a video image stream and an audio stream, which serve as the inputs of two channels. For the video image stream, simple image features are extracted to label shots. For the audio stream, event keywords are first established (for example, keywords related to goal events include “three-point goal,” “goal scored,” and “score”), and speech recognition is then performed on the input audio. When a keyword is matched, the highlight event is considered to have occurred. Finally, basketball goals, fouls, and shots are detected by combining audio features and video image features [20]. However, audio and video features are not very stable, and their ability to adapt to complex sports videos is poor. Some scholars have therefore begun to introduce the concept of video semantics into video analysis [23]. Foreign institutions, such as Carnegie Mellon University (CMU), IBM Watson Research Center, and Microsoft Research, have successively carried out research on video content analysis and understanding [24]. In recent years, well-known domestic video websites have introduced semantic intelligent video search, which achieves fast video retrieval through semantic analysis and understanding of the video content.

Unsupervised learning does not require training samples; it models and classifies the data directly. Its typical representative is the clustering algorithm, which compares the similarity between targets by extracting features and groups similar targets into one category. However, video data is highly diverse, so unsupervised learning struggles to reach its full potential, and the clustering results are difficult to interpret. At present, the most successful highlight event detection models all use supervised learning. A supervised learning method learns from existing samples, builds an optimal model from the known inputs and outputs of the samples, and then uses that model to classify new input samples and output the recognition results.

Related scholars have studied the latent relationship between video data and feature tags based on the Bayesian Network (BN) [25]. First, they extracted six basic kinds of information from football game videos: the football goal, player faces, audio information from the live broadcast, text in the video, the title, and the texture information of the video image. Then, they used domain knowledge of football games to learn the probability distribution of known videos, modeled the goal event in the football game, and achieved goal detection.

3. Method

3.1. Watershed Algorithm for Video Image Segmentation

By studying the gray-level changes of a video image, the image can be regarded as a terrain with varying heights, in which the gray value at a point represents the terrain height at that point. If precipitation falls on this terrain, multiple “reservoirs” form in the video image, one for each local minimum. To keep different reservoirs separate, a dam must be built wherever adjacent reservoirs would merge. When the precipitation has completely flooded the terrain, the completed dams become the “watersheds” of the area, and these watersheds divide the video image into different regions. The main advantage of the watershed algorithm is that it yields closed, single-pixel-wide boundaries and can detect weak edges, which is difficult for other video image segmentation algorithms. It has these advantages because its theoretical basis is mathematical morphology, which has a solid theoretical foundation, unified basic ideas, and important practical value. The schematic diagram of the video data structure is shown in Figure 1.

The watershed algorithm is a video image segmentation method of the “split-merge” type. In the merging process, the location of the “dams” determines the segmentation result, and different choices of the “dam” threshold have a large impact on it. Too small a threshold produces too many “dams,” which is the most common problem, oversegmentation. Too large a threshold produces too few “dams” to meet the requirements of video image segmentation. Moreover, the watershed algorithm is very sensitive to noise in the video image, and noise is one of the main causes of its oversegmentation problem.
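The reservoir picture and its sensitivity to noise can be illustrated with a minimal one-dimensional sketch. It assumes a plain list of gray values as the terrain profile and assigns each sample to the local minimum it drains to by steepest descent; the function names `watershed_1d` and `count_basins` are hypothetical and are not taken from the paper, and this is not a full morphological watershed.

```python
def watershed_1d(profile):
    """Label every sample with the index of the local minimum it drains to."""
    n = len(profile)
    labels = []
    for start in range(n):
        i = start
        while True:
            # Step to the strictly lower neighbour with the smallest value.
            best = i
            for j in (i - 1, i + 1):
                if 0 <= j < n and profile[j] < profile[best]:
                    best = j
            if best == i:  # no lower neighbour: i is a local minimum
                break
            i = best
        labels.append(i)
    return labels


def count_basins(profile):
    """Number of distinct reservoirs (catchment basins) in the profile."""
    return len(set(watershed_1d(profile)))


# Two broad valleys separated by one ridge.
smooth = [5, 3, 1, 3, 5, 8, 5, 3, 1, 3, 5]
# Similar terrain with small ripples (noise) added.
noisy = [5, 3, 4, 1, 3, 2, 5, 8, 5, 4, 5, 1, 3, 2, 5]

print(count_basins(smooth))  # 2
print(count_basins(noisy))   # 6
```

The noisy profile has the same two large valleys, but every ripple creates its own small reservoir, which is the oversegmentation effect described above.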

3.2. Forward-Backward Diffusion Reduces the Oversegmentation of the Watershed Algorithm

Multiresolution video image analysis examines a video image at different resolutions and extracts the characteristics present at each resolution. It provides a new way of understanding video images: the image information is gradually simplified across the scale space while the important feature information is retained, and the segmentation is obtained from this simplified representation. The Gaussian filter generates a linear scale space of the video image and is a powerful multiscale analysis tool. Unlike Gaussian filtering, anisotropic diffusion filtering uses a nonlinear diffusion process that preserves edge information while simplifying the video image, so the scale space it generates is nonlinear. We have discussed the anisotropic diffusion filtering model, its improved variants, and their properties. Anisotropic diffusion is widely used in video image processing because it smooths while preserving edges. After a video image is filtered by anisotropic diffusion, the parts with smaller gradients are smoothed, the edge features with larger gradients are generally retained, and the diffusion does not create new edge features. Because it simplifies the details of the video image while maintaining the edge features, it is well suited as a preprocessing step for video image segmentation. The semantic hierarchy of the video is shown in Figure 2.

Oversegmentation means that the video image is divided into a large number of small enclosed regions, which makes the segmentation result meaningless and unusable. The corresponding solution is to preprocess the video image and smooth the parts with smaller gradients so that they no longer affect the watershed algorithm. Oversegmentation is thereby weakened, and the number of segmented regions is greatly reduced, so that the segmentation result matches the original content of the video image. When the watershed algorithm is used to segment a video image, it is generally not applied to the original image directly; instead, watershed segmentation is performed on the gradient modulus (magnitude) of the video image, because the gradient modulus captures the edges of the video image more effectively and conveniently. Anisotropic diffusion can generate video images at different scales, and applying the watershed algorithm at different scales yields different results.

However, when the P-M anisotropic diffusion algorithm is used to smooth the video image, the gradients of the video image are weakened. Edges with medium gradients gradually fade as the iteration progresses and can no longer be detected by the watershed algorithm. Therefore, to maintain the sharp edges of the video image and improve segmentation accuracy, we use forward-backward diffusion to process the original video image: it not only smooths the areas with small gradients effectively but also increases the contrast in areas with medium gradients.
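The contrast between P-M diffusion and forward-backward diffusion can be seen in a small one-dimensional sketch. The P-M conductance below is the familiar rational form; the forward-backward conductance is an illustrative one that is positive for small gradients and negative for medium gradients. Its formula and all parameter values (`k`, `kf`, `kb`, `w`, `alpha`, `lam`) are assumptions chosen for the example, not values from the paper, and the explicit scheme is only meant for a few steps because backward diffusion is not stable in the long run.

```python
def pm_conductance(s, k=10.0):
    """Perona-Malik style conductance: always positive, small at strong edges."""
    return 1.0 / (1.0 + (s / k) ** 2)


def fb_conductance(s, kf=5.0, kb=30.0, w=10.0, alpha=0.5):
    """Illustrative forward-backward conductance (assumed form):
    positive (smoothing) near zero gradient, negative (sharpening) near kb."""
    s = abs(s)
    forward = 1.0 / (1.0 + (s / kf) ** 4)
    backward = alpha / (1.0 + ((s - kb) / w) ** 2)
    return forward - backward


def diffuse(signal, conductance, steps=1, lam=0.2):
    """Explicit 1D nonlinear diffusion; the two end samples are held fixed."""
    u = list(signal)
    for _ in range(steps):
        new = u[:]
        for i in range(1, len(u) - 1):
            d_right = u[i + 1] - u[i]
            d_left = u[i] - u[i - 1]
            flux = conductance(d_right) * d_right - conductance(d_left) * d_left
            new[i] = u[i] + lam * flux
        u = new
    return u


# Small ripples (noise) on the left, one medium edge of height 30 on the right.
signal = [10, 11, 10, 11, 10, 11, 10, 40, 40, 40, 40, 40]

pm = diffuse(signal, pm_conductance)
fb = diffuse(signal, fb_conductance)

print("edge height after P-M step:", pm[7] - pm[6])
print("edge height after forward-backward step:", fb[7] - fb[6])
```

After one step, both filters flatten the ripples, but the P-M step lowers the height of the medium edge while the forward-backward step raises it, which is the behavior the text relies on to keep such edges visible to the watershed algorithm.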

As shown in Figure 3, we apply iterative forward-backward diffusion filtering to the original video image. As the number of iterations increases, the number of segmented regions gradually decreases, and the segmentation result becomes more and more accurate.

3.3. Snake Model

If the low-level problems cannot be solved, the high-level tasks cannot be carried out. Because the video image is affected by noise, deformation, and other conditions, or because of the limitations of existing techniques, many low-level problems cannot be solved. Errors at the low level propagate to the high level, which further hinders the solution of high-level problems and leads to difficulties at every stage.

To overcome the difficulties caused by this layered mechanism, computer vision must recognize that not only do high-level problems need support from the low level, but low-level problems also require high-level knowledge. Following this idea, an energy function can be designed whose local extreme points are placed according to high-level knowledge, while the value of the energy depends on the video image itself. Through the interaction between the upper and lower layers, this energy function can converge to an appropriate local extreme point and thereby extract the corresponding video image features. Based on this analysis, an active contour model (ACM) called the “snake” was proposed. It is driven by internal and external forces: the internal force comes from the energy of the curve itself and constrains the shape of the curve; the external force comes from the energy field generated by the target and guides the curve to the desired state. When the external and internal forces balance, the energy function reaches a minimum, and the curve converges to the target position or to a position consistent with the nature of the video image itself. Its energy expression is

$$E_{\text{snake}} = \int_0^1 \left[ E_{\text{int}}(\mathbf{x}(s)) + E_{\text{ext}}(\mathbf{x}(s)) \right] ds$$

Here $s \in [0, 1]$ is the parameter of the curve, the active contour is represented by the parameterized curve $\mathbf{x}(s) = (x(s), y(s))$, $E_{\text{int}}$ represents the energy of the active contour curve itself, and $E_{\text{ext}}$ represents the external energy acting on the active contour curve. The internal energy, that is, the energy possessed by the active contour curve itself, is defined as

$$E_{\text{int}} = \frac{1}{2} \left( \alpha \left| \mathbf{x}'(s) \right|^2 + \beta \left| \mathbf{x}''(s) \right|^2 \right)$$

where $\mathbf{x}'(s)$ is the first-order derivative of the parameterized active contour curve with respect to $s$ and $\mathbf{x}''(s)$ is its second-order derivative with respect to $s$. The coefficient $\alpha$ controls the contraction speed of the active contour curve, which behaves like an elastic rope: the larger $\alpha$ is, the faster the curve contracts, so $\alpha$ is also called the elastic coefficient. The coefficient $\beta$ controls how hard it is to deform the active contour curve: the larger $\beta$ is, the less easily the curve deforms, so $\beta$ is also called the stiffness coefficient. It follows that if $\alpha$ and $\beta$ are set to appropriate values, the active contour curve stays smooth as it deforms and changes at an appropriate speed.

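External energy is an energy term defined by external factors. Generally speaking, when the active contour curve reaches the desired target, the overall energy should reach a local minimum, so the external energy is defined to be negative there; in this way, the condition of minimum total energy can be met. In the calculation of external energy, we can regard a video image as a continuous function $I(x, y)$ of position $(x, y)$. To find contours in the video image, we can use the gradient of the video image to build the external energy of the active contour; typical definitions are

$$E_{\text{ext}}^{(1)}(x, y) = -\left| \nabla I(x, y) \right|^2, \qquad E_{\text{ext}}^{(2)}(x, y) = -\left| \nabla \left( G_\sigma(x, y) * I(x, y) \right) \right|^2$$

where $G_\sigma$ is a two-dimensional Gaussian with standard deviation $\sigma$ and $*$ denotes convolution.

Combining the internal and external energy analyzed above, the complete energy expression for a contour $\mathbf{x}(s)$, $s \in [0, 1]$, with elastic coefficient $\alpha$ and stiffness coefficient $\beta$ can be written as follows:

$$E = \int_0^1 \left[ \frac{1}{2} \left( \alpha \left| \mathbf{x}'(s) \right|^2 + \beta \left| \mathbf{x}''(s) \right|^2 \right) + E_{\text{ext}}(\mathbf{x}(s)) \right] ds$$

When the energy of the active contour curve is minimal, the Euler-Lagrange equation must be satisfied, so

$$\alpha\, \mathbf{x}''(s) - \beta\, \mathbf{x}''''(s) - \nabla E_{\text{ext}} = 0$$

The energy minimization problem is therefore equivalent to solving a partial differential equation. We treat $\mathbf{x}$ as a function of time $t$ and arc length $s$ and solve the above equation by finite-difference iteration, so

$$\mathbf{x}_t(s, t) = \alpha\, \mathbf{x}''(s, t) - \beta\, \mathbf{x}''''(s, t) - \nabla E_{\text{ext}}$$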

The energy of the traditional active contour model is composed of external energy and internal energy. The internal energy fully considers the characteristics of the curve, so the curve can be effectively controlled. However, in the external energy term, only the gradient energy information in the video image is considered, and the external energy is limited to a range very close to the target point, which results in that the active contour curve must be very close to the target when it is initialized. Moreover, due to the small range of the external energy field, the active contour curve cannot enter the recessed area. Therefore, the improvement of the active contour model must focus on expanding the effective range of external energy.
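The finite-difference iteration of the traditional model can be sketched as follows for a closed contour stored as a list of points. This is a minimal explicit scheme under stated assumptions: the function names, the coefficient values (`alpha`, `beta`, `tau`), and the radial force `ring_force`, which stands in for the image-derived external force, are all hypothetical and chosen only for illustration.

```python
import math


def snake_step(pts, ext_force, alpha=1.0, beta=0.1, tau=0.2):
    """One explicit update of a closed contour:
    x <- x + tau * (alpha * x'' - beta * x'''' + F_ext(x))."""
    n = len(pts)
    new_pts = []
    for i in range(n):
        # Five neighbouring points, wrapping around the closed contour.
        p = [pts[(i + k) % n] for k in (-2, -1, 0, 1, 2)]
        move = []
        for c in (0, 1):  # x and y coordinates
            d2 = p[1][c] - 2 * p[2][c] + p[3][c]
            d4 = p[0][c] - 4 * p[1][c] + 6 * p[2][c] - 4 * p[3][c] + p[4][c]
            move.append(alpha * d2 - beta * d4)
        fx, fy = ext_force(pts[i][0], pts[i][1])
        new_pts.append((pts[i][0] + tau * (move[0] + fx),
                        pts[i][1] + tau * (move[1] + fy)))
    return new_pts


def no_force(x, y):
    return (0.0, 0.0)


def ring_force(x, y, target_radius=5.0):
    """Hypothetical external force pulling points toward a circle of radius 5."""
    r = math.hypot(x, y)
    return ((target_radius - r) * x / r, (target_radius - r) * y / r)


def mean_radius(pts):
    return sum(math.hypot(x, y) for x, y in pts) / len(pts)


def run(pts, ext_force, steps=50):
    for _ in range(steps):
        pts = snake_step(pts, ext_force)
    return pts


# Initial contour: 20 points on a circle of radius 10 around the origin.
circle = [(10 * math.cos(2 * math.pi * i / 20), 10 * math.sin(2 * math.pi * i / 20))
          for i in range(20)]

print("internal forces only:", mean_radius(run(circle, no_force)))
print("with ring force:", mean_radius(run(circle, ring_force)))
```

With internal forces alone the contour simply contracts, which shows why an external force is needed to stop it at the object boundary; with the ring force it settles slightly inside the target radius, where the pull of the external force balances the contraction.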

3.4. Active Contour Model of Isotropic Diffusion and Gradient Vector Flow Field

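Using the idea of diffusion, a vector diffusion equation is applied to the edge map of the video image. Solving this equation yields a new external force field $\mathbf{V}$ that replaces the potential force based on gradient energy, which gives

$$\mathbf{x}_t(s, t) = \alpha\, \mathbf{x}''(s, t) - \beta\, \mathbf{x}''''(s, t) + \mathbf{V}$$

where $\mathbf{x}(s, t)$ is the active contour curve and $\alpha$ and $\beta$ are its elastic and stiffness coefficients.

This new external force field is called the gradient vector flow (GVF) field, and the model obtained by applying it to the active contour model is called the gradient vector flow active contour model.

The gradient vector flow field is a vector field:

$$\mathbf{V}(x, y) = \left( u(x, y),\ v(x, y) \right)$$

It is obtained by finding the extremum of the following functional:

$$\varepsilon = \iint \left[ \mu \left( u_x^2 + u_y^2 + v_x^2 + v_y^2 \right) + \left| \nabla f \right|^2 \left| \mathbf{V} - \nabla f \right|^2 \right] dx\, dy$$

where $f(x, y)$ is the edge map of the video image and $\mu$ is a regularization parameter that balances the first (smoothing) term against the second (data) term.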

The GVF external force field is obtained through repeated diffusion of the edge gradient map. Because the external force field is extended in this way, the GVF Snake model places much lower requirements on the initialization of the active contour curve. The traditional Snake model requires the contour to be initialized in the vicinity of the target contour, whereas the GVF Snake model allows the contour to be initialized much more freely: even if it starts far from the target, it can still converge to the correct position.
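How diffusion extends the capture range can be seen in a one-dimensional sketch. It assumes the usual GVF iteration, in which the field is smoothed by a Laplacian term and held close to the edge-map gradient wherever that gradient is strong; the function name `gvf_1d`, the value of `mu`, the unit time step, and the number of iterations are illustrative assumptions.

```python
def gvf_1d(edge_map, mu=0.2, iterations=200):
    """1D gradient vector flow: u_t = mu * u_xx - (u - f_x) * f_x**2."""
    n = len(edge_map)
    # Central-difference gradient of the edge map, clamped at the borders.
    fx = [(edge_map[min(i + 1, n - 1)] - edge_map[max(i - 1, 0)]) / 2.0
          for i in range(n)]
    u = fx[:]  # start from the raw gradient force
    for _ in range(iterations):
        new = []
        for i in range(n):
            # Discrete Laplacian with replicated border values.
            lap = u[max(i - 1, 0)] - 2.0 * u[i] + u[min(i + 1, n - 1)]
            new.append(u[i] + mu * lap - (u[i] - fx[i]) * fx[i] ** 2)
        u = new
    return fx, u


# Edge map with a single edge response at position 10.
edge_map = [0.0] * 21
edge_map[10] = 1.0

raw, field = gvf_1d(edge_map)
print("raw gradient far from the edge:", raw[3], raw[17])
print("GVF field far from the edge:  ", field[3], field[17])
```

The raw gradient is zero a few samples away from the edge, so a contour point placed there would feel no external force. After diffusion, the field is positive to the left of the edge and negative to the right, so points on either side are pushed toward position 10.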

3.5. Anisotropic Diffusion and Generalized GVF Snake Model

The GVF Snake model is a considerable improvement. It can enter recessed parts of the boundary, but it still has shortcomings: it cannot enter long, narrow recesses, and it cannot accurately converge to corner points. Therefore, the GVF Snake model needs further improvement.

The gradient vector flow field is essentially the solution of a diffusion equation. The first term in the functional is a smoothing term based on isotropic diffusion, which yields a smooth gradient vector flow field; the second term is a data term, which keeps the gradient vector flow field aligned with the gradient direction near edge features. However, because the first term is isotropic, it causes excessive homogenization and blur. The model cannot enter narrow recesses or converge accurately to corners because the geometric scale at these positions is relatively small: after the diffusion blur, the gradient vector flow field no longer points to the correct position. Therefore, it is proposed to improve the GVF Snake model with the idea of anisotropic diffusion:

The corresponding Euler equation is

Then, the expression of the vector field is

The first term of the above formula is an anisotropic diffusion term, so it avoids homogenization blur where the gradient is large. Following the P-M model, let

Then, the generalized gradient vector flow (GGVF) model is obtained. A comparison of the vector fields shows that it overcomes the shortcomings of the GVF Snake model, which cannot accurately track corner points and cannot enter long, narrow areas. The comparison of vector fields is shown in Figure 4.
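One commonly cited form of generalized GVF (due to Xu and Prince) weights the smoothing term by $g = e^{-|\nabla f| / K}$ and the data term by $h = 1 - g$. Whether this paper uses exactly this pair is an assumption; the short sketch below only illustrates how such spatially varying weights behave. The function names and the values of `k` and `dt` are hypothetical.

```python
import math


def ggvf_weights(grad_mag, k=0.1):
    """Spatially varying weights (assumed form): g for smoothing, h for data."""
    g = math.exp(-grad_mag / k)
    return g, 1.0 - g


def ggvf_update(u, laplacian, fx, grad_mag, k=0.1, dt=0.2):
    """One explicit update of a field value: u + dt * (g * lap - h * (u - fx))."""
    g, h = ggvf_weights(grad_mag, k)
    return u + dt * (g * laplacian - h * (u - fx))


for grad_mag in (0.0, 0.05, 0.2, 1.0):
    g, h = ggvf_weights(grad_mag)
    print(f"|grad f| = {grad_mag:4.2f}  smoothing g = {g:.4f}  data h = {h:.4f}")
```

In flat regions the smoothing weight is 1 and the field diffuses freely; near strong edges the smoothing weight drops toward 0 and the data weight toward 1, so the field stays close to the edge-map gradient instead of being averaged with its neighbors. This is the mechanism that lets the field keep pointing into narrow recesses and toward corners.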

4. Results and Discussion

4.1. Experimental Platform

We designed a basketball segmentation and tracking experiment platform based on Matlab. The current interface shows the analysis of single-frame video image segmentation results. The “object segmentation” item in the menu executes the video segmentation program; the “route extraction” item extracts the basketball movement route from the location information of the segmented basketball for simple analysis.

After entering the “object segmentation” processing unit, the user enters the starting frame number of the video to be segmented, and the “Program Execution” dialog box pops up. The dialog box shows the center and radius of the basketball circle in the currently processed video frame, the program running time, and the progress. After the program finishes, clicking the “Parameter Statistics” button opens the “Parameter Statistics” dialog box, which records the execution efficiency of the program. After entering the “Basketball Route Extraction” unit, the user enters the starting frame number from which basketball position information is to be extracted, and the basketball route extraction result then pops up.

4.2. Analysis of the Results of the Basketball Segmentation Experiment

Basketball video contains a lot of background information, and basketball has a strong structure. Therefore, this paper adopts a video segmentation method based on change area detection: by detecting the changed and unchanged areas in each frame of the video sequence, the basketball object is separated from the background area. First, a Gaussian filter is used to smooth the video image; then, the background area is separated using the difference between frames, and the connected object areas are labeled; next, a “basketball recognition strategy” is formulated according to the characteristics of the basketball area, and the possible basketball areas (i.e., candidate basketball areas) are retained. Finally, we use the improved active contour model to correct the edge deviation and establish the optimal solution strategy for checking the validity of the extracted area.
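The change-detection stage of this pipeline can be sketched as follows, assuming grayscale frames stored as lists of rows. Gaussian smoothing and the active contour correction are omitted. The rule in `is_ball_candidate` (area limits and a roughly square bounding box) is a hypothetical stand-in for the “basketball recognition strategy,” whose actual criteria are not reproduced here; the thresholds are illustrative only.

```python
from collections import deque


def change_mask(prev, curr, thresh=20):
    """Mark pixels whose gray value changed by more than thresh between frames."""
    return [[abs(c - p) > thresh for p, c in zip(row_p, row_c)]
            for row_p, row_c in zip(prev, curr)]


def label_regions(mask):
    """Return the 4-connected regions of True pixels as lists of (row, col)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                queue, pixels = deque([(y, x)]), []
                seen[y][x] = True
                while queue:
                    cy, cx = queue.popleft()
                    pixels.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                regions.append(pixels)
    return regions


def is_ball_candidate(pixels, min_area=4, max_area=400, max_aspect=1.5):
    """Hypothetical rule: plausible area and a roughly square bounding box."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    aspect = max(height, width) / min(height, width)
    return min_area <= len(pixels) <= max_area and aspect <= max_aspect


# Synthetic frames: a compact 3x3 blob and a thin 1x6 streak appear.
prev = [[0] * 10 for _ in range(8)]
curr = [row[:] for row in prev]
for y in range(1, 4):
    for x in range(1, 4):
        curr[y][x] = 100
for x in range(2, 8):
    curr[6][x] = 100

regions = label_regions(change_mask(prev, curr))
candidates = [r for r in regions if is_ball_candidate(r)]
print(len(regions), "changed regions,", len(candidates), "candidate")
```

Both changed regions are labeled, but only the compact blob passes the shape rule; the elongated streak is rejected, in the same spirit as retaining only the candidate basketball areas.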

We set , and decision threshold . The relevant parameters of the two candidate basketball areas are shown in Table 1. Figure 5 is the final segmented basketball result.

4.3. Results and Analysis of Basketball Tracking Experiment

The hardware and software platforms of the basketball tracking experiment are shown in Table 2.

The experiment selected video clips of two games (Asian Men’s Basketball Championship, China vs. South Korea and Japan vs. Philippines) as test data. The basketball tracking result of the video processing technology based on the diffusion equation model is shown in Figure 6. It can be seen from Figure 6 that when the basketball area is between 100 and 300 pixels, the sum of the tracking speeds of the four video clips is higher than 18 frames per second, which basically meets the real-time needs.

4.4. Basketball Route Description and Basketball Route Extraction Based on MPEG-7

We refer to the motion behavior descriptor of MPEG-7 to describe the basketball motion trajectory. MPEG-7 motion behavior descriptors are divided into five categories:

(a) Motion intensity. Motion intensity reflects how intense the motion is and is divided into 5 levels; the higher the level, the more intense the motion. In a basketball game, static shots represent low-intensity motion, while fast-break shots are high-intensity motion.

(b) Direction of motion. When there are several objects with different moving directions in the video shot, we use this attribute to define the main moving direction of the video.

(c) Spatial distribution of motion. The spatial distribution of motion characterizes the extent and number of the regions in which motion occurs. For example, when a news program host presents a program, the video motion fills the entire video image area, while in an aerial shot of a busy street, the motion is spread over many small areas.

(d) Spatial position of motion. The spatial position of motion is the spatial area in which the motion intensity is concentrated over the duration of the video shot. Using this information, we can easily retrieve and classify video clips whose motion intensity is concentrated in a certain spatial region.

(e) Temporal distribution of motion. The temporal distribution of motion represents the duration of the motion in the video; that is, it indicates whether the motion spans the entire video or lasts only for a period of time.

Using the MPEG-7 motion behavior descriptor, we describe the basketball trajectory in the game video, which prepares for the further extraction of basketball motion features.
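As a rough illustration of how descriptor-like attributes could be derived from a tracked trajectory, the following Python sketch summarizes a sequence of per-frame basketball centers by an intensity level, a dominant direction, and the share of frame intervals in which the ball moves. It is not the MPEG-7 extraction procedure and not the method used in this paper: the name `TrajectoryMotionSummary`, the thresholds in `INTENSITY_THRESHOLDS`, and the parameter `still_eps` are hypothetical values chosen only for illustration. The two spatial attributes are omitted because they require the motion field of the whole image rather than a single object track.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

# Hypothetical thresholds (pixels per frame) separating the five intensity levels.
INTENSITY_THRESHOLDS = (1.0, 4.0, 8.0, 16.0)


@dataclass
class TrajectoryMotionSummary:
    intensity_level: int           # 1 (calm) to 5 (very intense)
    dominant_direction_deg: float  # angle of the net displacement, in [0, 360)
    active_fraction: float         # share of frame intervals with motion


def summarize(centers: List[Point], still_eps: float = 0.5) -> TrajectoryMotionSummary:
    """Summarize a track of per-frame ball centers (image coordinates)."""
    if len(centers) < 2:
        raise ValueError("need at least two frames")
    # Displacement of the center between consecutive frames.
    steps = [(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(centers, centers[1:])]
    speeds = [math.hypot(dx, dy) for dx, dy in steps]
    mean_speed = sum(speeds) / len(speeds)
    # Level 1 plus one level for every threshold the mean speed exceeds.
    level = 1 + sum(mean_speed > th for th in INTENSITY_THRESHOLDS)
    # Direction of the net displacement over the whole track.
    net_dx = sum(dx for dx, _ in steps)
    net_dy = sum(dy for _, dy in steps)
    direction = math.degrees(math.atan2(net_dy, net_dx)) % 360.0
    # Fraction of intervals in which the ball moves more than still_eps pixels.
    active = sum(s > still_eps for s in speeds) / len(speeds)
    return TrajectoryMotionSummary(level, direction, active)


summary = summarize([(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (6.0, 8.0)])
```

Because image coordinates usually have the y axis pointing down, the reported angle is measured clockwise on screen; a real system would also need thresholds calibrated to the video resolution.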

To extract the basketball movement route, we use the position of the basketball in each video frame, which was obtained in the preceding steps. First, the basketball area is idealized: the area is replaced by its center point. The coordinates of the basketball center in each frame are then collected to obtain the basketball movement trajectory. The game video used in the experiment has a frame rate of 25 frames per second, so the basketball moves only a short distance between consecutive frames. Therefore, we use linear interpolation to extract the basketball trajectory between frames. Figure 7 shows the basketball trajectory extracted from this video based on the diffusion equation model; the red dots mark the coordinates of the basketball center in each frame.
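The center-point idealization and the linear interpolation between frames can be sketched as follows. This is a minimal Python illustration, assuming that each tracked basketball area is available as a bounding box `(x, y, width, height)`; the function names `box_center` and `position_at` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # assumed layout: (x, y, width, height)


def box_center(box: Box) -> Point:
    """Replace a basketball area by its center point."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def position_at(centers: List[Point], t: float) -> Point:
    """Linearly interpolate the ball position at fractional frame index t.

    centers[i] is the ball center in frame i; values of t outside the
    tracked range are clamped to the first or last center.
    """
    if not centers:
        raise ValueError("empty trajectory")
    if t <= 0:
        return centers[0]
    if t >= len(centers) - 1:
        return centers[-1]
    i = int(t)      # frame just before t
    a = t - i       # fractional position between frame i and frame i + 1
    (x0, y0), (x1, y1) = centers[i], centers[i + 1]
    return (x0 + a * (x1 - x0), y0 + a * (y1 - y0))


# Three tracked areas in consecutive frames.
boxes = [(0.0, 0.0, 10.0, 10.0), (10.0, 0.0, 10.0, 10.0), (10.0, 20.0, 10.0, 10.0)]
centers = [box_center(b) for b in boxes]
halfway = position_at(centers, 0.5)  # midpoint between frames 0 and 1
```

At 25 frames per second, a time of `s` seconds corresponds to the fractional frame index `t = 25 * s`. Straight segments are a reasonable approximation only while the displacement between consecutive frames stays small.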

5. Conclusion

This article reviews two algorithms commonly used in video image segmentation, the watershed segmentation algorithm and the active contour model, analyzes their principles, advantages, and disadvantages, and examines how the ideas and methods of anisotropic diffusion can be used to address their disadvantages. The two algorithms are improved separately and applied to the segmentation of video images. The experimental results show that the improved methods achieve better results and that the quality of image segmentation improves significantly.

The research topic of this paper is the segmentation and real-time tracking of basketball objects in video. Corresponding segmentation and recognition strategies are formulated according to the basic characteristics of the basketball in long-range and close-range shots. To improve the real-time performance of the algorithm, we make full use of the segmented basketball features and use an improved block matching method to track the position of the basketball in subsequent frames. Finally, the contour deviation caused by video image noise is corrected, and the validity of the tracking result is checked.

In the basketball segmentation of close-up shots, color segmentation is first performed to remove the background area; directional edge detection is then performed to determine the edge points and their normal directions, which are used to extract the center and radius of the basketball over multiple radius scales. In the basketball segmentation of perspective shots, a Gaussian filter first smooths the video image noise; the frame difference is then used to separate the background area; next, a “basketball recognition strategy” formulated according to the characteristics of the basketball area retains the candidate basketball areas; finally, the improved active contour model extracts the optimal solution from the candidate basketball area.

Using the tonal characteristics of the basketball object template, an adaptive object model is established for basketball tracking. This model can effectively eliminate nonbasketball areas, thereby reducing interference from areas unrelated to the basketball. The traditional block matching algorithm is improved in terms of both the matching criterion and the search path; the improved method effectively eliminates the interference caused by the rotation of the basketball while increasing the calculation speed. We use the peak characteristics of the edge gradient to correct the edge deviation caused by noise and basketball contour deformation. Based on the practical requirements of basketball tracking, the energy calculation of the active contour model is improved, and a detection index is designed to measure the validity of basketball extraction. Experimental results show that this method can effectively track basketball objects in complex backgrounds and is robust to difficult conditions such as partial occlusion, motion blur, and rotation of the basketball.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.


Acknowledgments

This work was supported by the National Social Science Fund Project (No. 19BTY087).