#### Abstract

The core issue of automatic manipulator tracking control is how to make the manipulator follow the expected trajectory of a given moving target while adapting to various uncertain factors. However, existing moving target trajectory prediction methods rely on highly complex and accurate models and lack the ability to generalize across different automatic manipulator tracking scenarios. Therefore, this study develops an approach to automatic manipulator tracking control based on moving target trajectory prediction. Specifically, a moving target trajectory prediction model was established, and its parameters were optimized. Next, a tracking-training-testing algorithm was proposed for the manipulator’s automatic moving target tracking, and the operating flows were detailed for the training module, target detection module, and target tracking module. The proposed model and algorithm were proved effective through experiments.

#### 1. Introduction

With the rapid development of industrial technology, manipulators have been successfully applied to tasks originally performed manually, becoming one of the most widely used tools in industrial production [1–6]. The application of manipulators makes production more efficient and flexible. The core issue of automatic manipulator tracking control is how to make the manipulator follow the expected trajectory of a given moving target while adapting to various uncertain factors [7–11]. It is of great practical significance to derive an automatic tracking control strategy for moving targets under uncertainties and external interference.

Target tracking is an important prerequisite for manipulator-assisted services. Zhu et al. [12] improved the near-field computer vision system for intelligent fire robots. The improved system can predict the falling jet path under the complex light environment and interference during firefighting, identify the jet path based on length and area ratio, and parametrize and extract the features of the jet path by superposing radial centroids. Wu et al. [13] adopted a human-following method suitable for a manipulator containing visual sensors with a limited perception range, integrated two physical motion models into an adaptive trajectory prediction algorithm, and improved the prediction accuracy by adaptive adjustment of model parameters. For the trajectory control of the Par4 parallel robot, Zhang and Ming [14] designed a type 2 fuzzy predictive compensation proportional-integral-derivative (PID) controller based on an improved dynamic grey wolf optimizer that incorporates a mutation operator and an eliminating-reconstructing mechanism (DMR-GWO2). The proposed controller speeds up the response of the parallel robot and improves the adaptability of the entire system.

In actual conditions, two manipulators are often needed to pick up and place moving objects through the planning and execution of collision-free trajectories. Tika et al. [15] put forward a layered control strategy for collaborative picking and placement tasks in a narrow, shared workspace and realized the synchronous execution of task scheduling in top-level design, path planning, and robot tasks. Xia et al. [16] proposed a visual prediction framework based on time granularity. The core of the framework is an integrated moving target prediction module based on multiple long short-term memory (LSTM) neural networks. Compared with the latest prediction algorithms, the framework excels in prediction accuracy, success rate, and robustness. Focusing on the action understanding of mirror neurons, Zhong et al. [17] simulated the walking mode of humanoid robots and predicted the moving direction according to the previous walking trajectory.

Trajectory prediction is the last step in the visual perception of the manipulator. After a series of segmentation, detection, and tracking operations, the algorithm can determine the type, bounding box, and other information of the object. However, the future movement trend and trajectory of the target must be predicted to realize automatic tracking control. Overall, traditional trajectory prediction methods for moving targets rely mainly on features such as color and contour, and their recognition performance is poor when the target has multiple features. Moreover, the existing moving target trajectory prediction methods rely on highly complex and accurate models, lacking the ability to generalize across different automatic manipulator tracking scenarios [18–22]. Therefore, this study develops an approach for automatic manipulator tracking control based on moving target trajectory prediction, aiming to improve the manipulator’s trajectory prediction accuracy and automatic tracking control effect. Section 2 establishes a moving target trajectory prediction model and optimizes its parameters. The established model can predict the position and pose of irregular moving objects at the same time and boasts a strong generalization ability. Section 3 details the principles of the tracking-training-testing algorithm for the manipulator’s automatic moving target tracking and explains the operating flows of its training module, target detection module, and target tracking module. The proposed model and algorithm were proved effective through experiments.

This study solves the problems of the manipulator in recognition, positioning, and trajectory prediction of moving objects, models the error in target tracking, and tests the feasibility of the proposed method through tracking experiments. The internal parameters of the proposed trajectory prediction network for moving objects were all trained on datasets; this training ensures the modularity and generalization ability of the network. However, the prediction precision of our network could be further improved by changing the network structure and modifying the network parameters when the network is applied to predict the position and pose of complex and irregular moving targets.

#### 2. Moving Target Trajectory Prediction Model

The precision of moving target trajectory prediction hinges on the accuracy of the motion model. This study establishes a moving target trajectory prediction model based on LSTM, which is known for its good accuracy and generalizability, and thereby enables the manipulator to recognize and automatically track the moving target.

##### 2.1. Model Construction

To accurately predict moving target trajectory, this study imports the three-dimensional (3D) spatial position of a moving target from time *h* to time *h* + *K* into the trajectory prediction model, which outputs the 3D spatial position of the moving target at time *h* + *K* + 1.

Figure 1 shows the overall structure of our moving target trajectory prediction model. The model consists of an input layer, a hidden layer, an output layer, and a training module. In the input layer, a complete sequence of moving target trajectories is subjected to *Z*-score normalization:
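As a minimal sketch of this input-layer step (the paper's display equation is not reproduced here, and the function name and symbols `mu`/`sigma` are illustrative assumptions), a per-coordinate *Z*-score normalization can be written as:

```python
import math

def z_score_normalize(seq):
    """Normalize a 1-D coordinate sequence to zero mean and unit variance."""
    mu = sum(seq) / len(seq)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in seq) / len(seq))
    return [(x - mu) / sigma for x in seq]
```

In practice each of the three spatial coordinates of the trajectory would be normalized independently, and the statistics would be kept so that predictions can be de-normalized afterwards.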

To satisfy the input requirements of the hidden layer, the input data were segmented. Let *K* be the prediction step length of the model. Then, the tensor of the input data after the segmentation can be expressed as follows:

Batch processing is applied to the input data to fully utilize computing resources and improve the training efficiency of the neural network. That is, *A* is treated as a tensor composed of a batch of 3D spatial coordinates [*r*, *K*, 3], where *r* is the number of batch processing samples. To maintain the training accuracy of the model, each batch of data must be a complete trajectory of the moving target; i.e., the batch size should be defined as . Then, we have the following equation:
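The segmentation described above can be sketched as a sliding window over a complete trajectory: each window of *K* consecutive 3D points forms one input sample, and the point immediately after it is the training label. This is an illustrative helper under those assumptions, not the paper's implementation:

```python
def make_windows(trajectory, K):
    """Segment a full trajectory (list of 3-D points) into overlapping
    windows of length K; window i is an input, point i+K its label.
    Stacking the inputs yields a tensor of shape [r, K, 3]."""
    inputs, labels = [], []
    for i in range(len(trajectory) - K):
        inputs.append(trajectory[i:i + K])
        labels.append(trajectory[i + K])
    return inputs, labels
```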

The theoretical output of the input layer can be expressed as follows:

The hidden layer in the trajectory prediction model contains *K* LSTM nodes, which are connected in chronological order. Each node has *F* LSTM units. The output of the hidden layer can be expressed as follows:

The dimensionality [*r*, *K*, *F*] of *O* should be consistent with that of model output. Let be the weight of a fully connected layer, and *t* be the output of the output layer. Before outputting the predicted position of the moving target, the data must be handled by a fully connected layer:

To test the prediction accuracy, the number *r* of batch processing samples is set to 1. The first *K* 3D spatial coordinates of a complete trajectory in the test set are imported:

Based on the input , the model outputs the predicted trajectory:

Let be the 3D spatial position predicted for the moving target at time *K* + 1. This position is merged with the last 3D spatial positions in to obtain the new input for the trajectory prediction model:

Then, is imported to the trajectory prediction model. The model will output the predicted 3D spatial position of the moving object at time . The above steps are iteratively executed, and the final prediction of the 3D spatial position of the moving object can be obtained as follows:
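The iterative prediction loop above (predict one step, append the prediction to the input window, drop the oldest point, and repeat) can be sketched as follows; `model` stands in for the trained LSTM predictor and is an assumption of this sketch:

```python
def rollout(model, window, steps):
    """Iterative multi-step prediction: feed each predicted 3-D point
    back into the input window and slide the window forward."""
    window = list(window)
    preds = []
    for _ in range(steps):
        p = model(window)          # predicted next 3-D position
        preds.append(p)
        window = window[1:] + [p]  # drop oldest point, append prediction
    return preds
```

Because each predicted point becomes part of the next input, prediction errors compound over the rollout, which is why the later experiments report error as a function of the number of predicted trajectory points.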

The fitting and prediction accuracy of the model can be quantified by the error between input *A* and output *t*.

Both the predicted value and theoretical output of the trajectory prediction model are 3D spatial coordinates. The loss of the model is calculated by the Euclidean loss function. Let b be the theoretical output of the model. The error between predicted value and theoretical output can be calculated by the following equation:
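A Euclidean loss over a batch of 3D coordinates, as described above, can be sketched like this (the averaging over the batch is an assumption of this sketch):

```python
import math

def euclidean_loss(pred, target):
    """Mean Euclidean distance between predicted and theoretical
    3-D coordinates over a batch of points."""
    dists = [math.sqrt(sum((p - t) ** 2 for p, t in zip(pp, tt)))
             for pp, tt in zip(pred, target)]
    return sum(dists) / len(dists)
```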

The model training aims to gradually reduce the value of the loss function. Based on the AdaGrad algorithm, the learning rate *δ* of our model is updated automatically. Let *ξ* be a small constant that prevents the denominator from being zero and *ω* be the weight parameter of the model. Then, the model can be updated by the following equation:
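A minimal sketch of the AdaGrad update follows, assuming the common variant that divides the base learning rate *δ* by the root of the accumulated squared gradients (the paper's exact placement of *ξ* may differ):

```python
import math

class AdaGrad:
    """Per-parameter AdaGrad: accumulate squared gradients and scale
    the base learning rate delta by the inverse root accumulator."""
    def __init__(self, n_params, delta=0.1, xi=1e-8):
        self.delta, self.xi = delta, xi
        self.acc = [0.0] * n_params   # running sum of squared gradients

    def step(self, params, grads):
        for i, g in enumerate(grads):
            self.acc[i] += g * g
            # xi keeps the denominator away from zero
            params[i] -= self.delta * g / (math.sqrt(self.acc[i]) + self.xi)
        return params
```

Parameters that receive consistently large gradients thus see their effective learning rate shrink fastest, which is what makes the schedule automatic.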

##### 2.2. Model Parameter Optimization

There are many parameters in our trajectory prediction model. The most critical ones are the prediction step length *K*, the number of hidden nodes *F*, and the learning rate *δ*. To mitigate their influence on the prediction of moving target trajectory, this study first evaluates the prediction accuracy on all test samples and then chooses the optimal combination of *K*, *F*, and *δ*, i.e., the one that yields the highest prediction accuracy. The objective function can be expressed as follows:

The multilayer grid search algorithm is adopted to process *K*, *F*, and *δ* to determine the best values of these crucial parameters. The grid search is carried out from inside to outside in three steps:

*Step 1. *Set the number of batch processing samples *r* and number of training steps *T*_{steps}, which are two key parameters, and preset the value ranges of *K*, *F*, and *δ* based on formula (13).

*Step 2. *Traverse *K*, *F*, and *δ* layer by layer, and implement model training and prediction in the innermost layer. After the training, record the three parameters together with the fitting and prediction accuracies of the model.

*Step 3. *Sort the parameter search results in descending order by the prediction accuracy, and select the *K*, *F*, and *δ* in the top-ranking combination for the optimal model.
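The three steps can be sketched as a nested traversal over the preset value ranges; `evaluate` is a hypothetical callback that trains the model once and returns its prediction error for one parameter combination:

```python
from itertools import product

def grid_search(evaluate, K_range, F_range, delta_range):
    """Layer-by-layer grid search over (K, F, delta). Training and
    prediction happen in the innermost layer via `evaluate`; results
    are sorted so the lowest-error (highest-accuracy) combination
    comes first."""
    results = []
    for K, F, delta in product(K_range, F_range, delta_range):
        results.append(((K, F, delta), evaluate(K, F, delta)))
    results.sort(key=lambda item: item[1])
    return results
```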

#### 3. Automatic Tracking Control Algorithm

##### 3.1. Algorithm Principles

Based on machine vision, manipulator moving target tracking might involve multiple moving targets at a time and needs to consider multiple motion states of each target. The moving targets face changes in moving direction, speed, color, and brightness, and could be occluded by obstacles. Therefore, the tracking technology should be able to detect the 3D spatial position of each moving target in real time and judge whether the target is missing or occluded. This study proposes a tracking-training-testing algorithm for manipulator’s automatic moving target tracking and combines the algorithm with moving target trajectory prediction to enable manipulators to grasp, as well as automatically track and control targets.

The automatic tracking algorithm can select the moving target from each frame image of the video stream. The architecture of the algorithm is shown in Figure 2. The training module processes the detection result of the target detection module and the tracking result of the target tracking module. The processing and feedback results from the training module are used to update the target detection module and the target tracking module. This cyclic optimization process can handle complex situations, such as the appearance changes in the moving target over time and the temporary disappearance of the moving target from the shooting range, thereby ensuring the target identification and tracking effects of the algorithm.

Let GYH be the normalized cross-correlation coefficient. To select the moving target from the video frame, the similarity between two adjacent frames and must be defined before analyzing the main modules:
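A normalized cross-correlation between two equal-sized gray-level patches, as used in this similarity definition, can be sketched as follows (patches are flattened pixel lists here; the function name is an assumption):

```python
import math

def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equal-sized gray patches.
    Returns 1.0 for identical patches up to brightness/contrast shifts
    and -1.0 for perfectly anti-correlated ones."""
    ma = sum(patch_a) / len(patch_a)
    mb = sum(patch_b) / len(patch_b)
    num = sum((a - ma) * (b - mb) for a, b in zip(patch_a, patch_b))
    den = math.sqrt(sum((a - ma) ** 2 for a in patch_a) *
                    sum((b - mb) ** 2 for b in patch_b))
    return num / den
```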

The matching image set *N* containing both positive samples and negative samples of moving targets can be expressed as follows:

Then, *n* positive samples and *n* negative samples are sorted in the order *i* = 1, 2, 3, …, *n* and added to the matching image set.

The similarity between a matching image *N*_{G} and each frame can be divided into the similarity with the nearest neighbor of and the similarity with the nearest neighbor of :

The similarity between the frame and the labeled first half of the positive samples can be calculated by the following equation:

The cross-correlation of can be calculated by the following equation:

Formula (18) shows that the value of RE^{s} falls in [0, 1]. The greater the RE^{s}, the more likely it is that the frame contains a moving target. The conservative similarity of can be calculated by the following equation:

The cross-correlation obtained by formula (18) serves as the threshold for the nearest neighbor classifier that determines the similarity (RE^{s}, RE^{d}) between frame and matching image *N*_{G}. If RE^{s} (, *N*) > *ρ*_{MM}, is a positive sample; if RE^{s} (, *N*) < *ρ*_{MM}, is a negative sample. Here, *ρ*_{MM} is the classification threshold that ensures the convergence of the classifier.

##### 3.2. Target Detection Module

The variance classifier is the first link of the cascade classifier in the target detection module. Let *Q* () be the expectation of solved by the integral image method. Then, the variance of any frame can be calculated by the following equation:

If the total variance of gray values for all pixels in the frame within the window is smaller than half of the total variance of gray values for all pixels in the moving target box, then the window is invalid and needs to be removed. In this way, half of the image contents, including ground and shadows, can be eliminated.
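The variance test above can be computed in constant time per window using integral images of the gray values and of their squares (var = E[g²] − E[g]²). A sketch under those assumptions:

```python
def integral(img):
    """Summed-area table: ii[y][x] = sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    ii = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = (img[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii

def window_variance(ii, ii2, x, y, w, h):
    """Gray-value variance of a window via E[g^2] - E[g]^2, using the
    integral image of the frame (ii) and of its squares (ii2)."""
    def box(t):
        return t[y + h][x + w] - t[y][x + w] - t[y + h][x] + t[y][x]
    n = w * h
    mean = box(ii) / n
    return box(ii2) / n - mean * mean
```

A scanning window would be rejected whenever its variance falls below half the variance of the moving target box, which is how uniform regions such as ground and shadows are discarded cheaply.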

The ensemble classifier is the second link of the cascade classifier in the target detection module. The frame outputted by the variance classifier is imported to the ensemble classifier composed of *m* basic classifiers. Here, each basic classifier is a decision tree (DT). The output of classifier *i* is a posterior probability vector composed of code a:

The *m* classifiers output *m* posterior probability vectors. The mean of all vectors can be calculated by the following equation:

If , the window is retained; if , the window is eliminated.

As the eigenvalue of the frame, the combined code vector is distributed to all the basic classifiers of the ensemble classifier. Each basic classifier corresponds to a posterior probability. The *i*th posterior probability is denoted as . If the posterior probability of each basic classifier is described by binary code a, then:

During initialization, , and the posterior probability corresponding to each basic classifier characterizes a negative sample. During later training, the ensemble classifier classifies the labeled frames and updates (*b|a*) (as shown in Figure 3).

Most unqualified contents are eliminated from the input frame through filtering by both the variance classifier and the ensemble classifier. The filtered results are further processed by the nearest neighbor classifier. If RE^{s} (, *N*) > *ρ*_{MM}, the frame content in the scanning window is a positive sample.

##### 3.3. Target Tracking Module

The target tracking module combines the Lucas–Kanade (LK) optical flow method with the forward and backward error tracking theory. The forward and backward directions refer to the positive and negative directions of the sequence of video frames, respectively. If there is a large error between the target tracking results in the two directions, then the predicted trajectory of the moving target must be incorrect and unreliable. The forward-backward error helps to judge whether the moving target is tracked successfully, but cannot identify unobvious errors in trajectory prediction. Therefore, this study designs an image frame difference comparison method for slow-moving target tracking points. The frame sequence of slow-moving target can be expressed as follows:

Let be the coordinates of the moving target at time *τ*; be the times of forward tracking of point . Then, the forward trajectory tracking sequence of the moving target can be given by the following equation:

The forward tracking and backward tracking are denoted by subscripts *x* and *y*, respectively. Then, the pixel coordinates are backward tracked to the previous frame. Then, the backward trajectory tracking sequence can be given by the following equation:

Combining formulae (26) and (27), the tracking error of the moving object can be obtained by the following equation:

To sum up, the forward and backward tracking errors can be obtained by formula (28), as long as a suitable threshold is determined for different image sequences. Then, it is possible to judge the success or failure of target tracking. Figure 4 illustrates the flow of tracking error calculation.
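The forward-backward check can be sketched as follows: track each point forward through the sequence, track the result backward to the starting frame, and compare where it lands with where it started. The optical-flow tracking itself (e.g., Lucas–Kanade) is abstracted away; only the error computation and the thresholded decision are shown, and the use of the median as the summary statistic is an assumption of this sketch:

```python
import math

def fb_error(start_pts, fb_pts):
    """Per-point forward-backward error: Euclidean distance between a
    point's original position and the position reached after tracking
    it forward and then backward. Large values flag unreliable tracks."""
    return [math.dist(s, f) for s, f in zip(start_pts, fb_pts)]

def tracking_ok(start_pts, fb_pts, threshold):
    """Declare tracking successful when the median forward-backward
    error stays below a sequence-specific threshold."""
    errs = sorted(fb_error(start_pts, fb_pts))
    return errs[len(errs) // 2] < threshold
```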

##### 3.4. Training Module

The training module contains the classifier to be trained, the labeled training set, a positive/negative training sample generator, etc. The classifier is trained on the training set to achieve comprehensive integrated learning. Figure 5 explains the flow of the training module. During classifier training, the training quality is closely associated with the absolute number of labeled positive and negative samples. Hence, the training module should be able to quantify the relationship between classifier performance and the absolute number of samples. The classifier performance can be characterized by the reliability of positive sample labels, the incorrect detection probability of negative samples, the reliability of negative sample labels, and the incorrect detection probability of positive samples.

The reliability of positive sample labels can be characterized by the ratio of the number of correctly detected positive samples to the sum of the number of correctly detected positive samples and the number of incorrectly detected positive samples :

The incorrect detection probability of negative samples can be characterized by the ratio of the number of correctly detected positive samples to the number of incorrectly detected negative samples Φ:

The reliability of negative sample labels can be characterized by the ratio of the number of correctly detected negative samples to the sum of the number of correctly detected negative samples and the number of incorrectly detected negative samples :

The incorrect detection probability of positive samples can be characterized by the ratio of the number of correctly detected negative samples to the number of incorrectly detected positive samples Ω:
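The four indicators can be sketched by mapping the paper's counts onto confusion-matrix terms; this mapping (`tp`/`fp` for correctly/incorrectly detected positives, `tn`/`fn` for correctly/incorrectly detected negatives, with `fn` playing the role of Φ and `fp` the role of Ω) is an assumption of the sketch, not the paper's notation:

```python
def classifier_indicators(tp, fp, tn, fn):
    """Four training-module indicators, sketched on confusion-matrix
    counts (this mapping onto tp/fp/tn/fn is an assumption)."""
    pos_label_reliability = tp / (tp + fp)  # eq. (29)-style ratio
    neg_detect_ratio = tp / fn              # eq. (30)-style ratio, fn ~ Phi
    neg_label_reliability = tn / (tn + fn)  # eq. (31)-style ratio
    pos_detect_ratio = tn / fp              # eq. (32)-style ratio, fp ~ Omega
    return (pos_label_reliability, neg_detect_ratio,
            neg_label_reliability, pos_detect_ratio)
```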

The classifier performance evaluation equations (29)–(32) must satisfy the following equation:

The number of incorrectly detected negative samples Φ and the number of incorrectly detected positive samples Ω can be, respectively, updated by the following equation:

Assume . Then, a matrix *Q* can be defined as follows:

After rewriting formulae (25) and (26) as matrices, the recursive formula of can be established as follows:

The above formula shows that the recursive system of the manipulator’s moving target tracking is both discrete and dynamic. Thus, the ultimate control goal of our algorithm is to gradually reduce the system error increment to zero, with the growing number of iterations.

#### 4. Experiments and Result Analysis

The multilayer grid search algorithm was introduced to optimize the three parameters *K*, *F*, and *δ* of the proposed moving target trajectory prediction model. Firstly, the number of training steps was set to 120, and the value ranges of the three parameters were preset as follows: *K*∈{15, 20, 25, 30}, *F*∈{60, 120, 180, 240}, and *δ*∈{0.01, 0.02, …, 0.1}. The objective function is to maximize the prediction accuracy of moving target trajectory, i.e., to minimize the prediction error. The candidate parameter combinations were sorted in ascending order of error. Table 1 lists the top five parameter combinations and their errors. It can be seen that the optimization of the three parameters greatly enhanced the accuracy of our moving target trajectory prediction model.

The three key parameters of the moving target trajectory prediction model were optimized as *K* = 30, *F* = 60, and *δ* = 0.08. Next, the hidden units in the hidden layer nodes were configured as a recurrent neural network (RNN) and a gated RNN (GRNN). The prediction results of these two models were compared with those of our model (Table 2). Our model achieved better training accuracy and test accuracy than both the RNN and the GRNN.

Figure 6 records the loss variations of the different prediction models during training. Overfitting occurs in the RNN when training lasts too long, i.e., when the number of iterations is too large. As shown in Figure 6, the loss of the RNN dropped the fastest, but the loss of our model gradually fell below that of the RNN and the GRNN as the number of iterations grew.

The prediction error was defined as the distance from the spatial coordinates on the predicted trajectory of the moving target to those on the actual trajectory. Table 3 compares the prediction errors of our model with those of the RNN and the GRNN. When many trajectory points needed to be predicted, the RNN had lower prediction accuracy than the GRNN and our model, because it cannot effectively exploit historical positions far back along the trajectory. Our model surpassed the GRNN by 56.7% in the prediction accuracy of the spatial coordinates on the trajectory of moving targets.

Figures 7 and 8 show the predicted trajectory of the moving targets and the predicted grasping position trajectory of the manipulator, respectively. Figure 9 presents the prediction error of the moving target trajectory, and Table 4 lists the corresponding error values. Most errors were within 0.2 cm, which verifies the generalizability of the proposed tracking control algorithm.

To verify the learning effect of our training module, the probability density of the classification error was calculated. The classification error of the classifier fell in (−0.9142, 0.8747) and basically obeys a normal distribution (as shown in Figure 10).

#### 5. Conclusions

This study explores how to realize automatic manipulator tracking control based on moving target trajectory prediction. Firstly, a moving target trajectory prediction model was established, and its parameters were optimized. Next, a tracking-training-testing algorithm was proposed for the manipulator’s automatic moving target tracking, and the operating flows were detailed for the training module, target detection module, and target tracking module. The experimental results show the effectiveness of the proposed model and algorithm. During the experiments, the parameter combinations were evaluated, the corresponding errors were obtained, and the values of the three key parameters *K*, *F*, and *δ* were determined. The prediction results and losses of different models were compared, revealing that our model is more accurate than the others. Finally, the moving target trajectory and the manipulator’s grasping position trajectory were predicted, and the prediction error of the moving target trajectory was used to confirm the generalizability of the proposed tracking control algorithm.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant No. 51902094).