#### Abstract

This paper presents a predictive control strategy for an image-based visual servoing scheme that employs evolutionary optimization. The visual control task is approached as a nonlinear optimization problem that naturally handles relevant visual servoing constraints such as workspace limitations and visibility restrictions. As the predictive scheme requires a reliable model, this paper uses a local model that is based on the visual interaction matrix and a global model that employs 3D trajectory data extracted from a quaternion-based interpolator. The approach is evaluated in simulation with a free-flying 6-DOF camera, and the results support the discussion of the constraint handling and the image prediction scheme.

#### 1. Introduction

Past decades have witnessed extensive development of visual servoing (VS) control. Three fundamental schemes account for most VS implementations [1, 2]. First, image-based visual servoing (IBVS), also known as 2DVS, computes an error between the visual features of a target object in the current image and the corresponding features in a target image. This error then guides the visual control algorithm, as detailed below. Second, position-based visual servoing (PBVS), also named 3DVS, works entirely on the visual computation of geometric poses, whose values are subsequently used to regulate the camera movement. Third, a wide range of hybrid VS approaches combine the advantages of 2DVS and 3DVS.

In particular, the classic IBVS control problem is defined as an exponential minimization of the aforementioned image plane error between the current and target image, $e(t) = s(t) - s^{*}$. In turn, such error can be subjected to a classic minimization procedure assuming a gradient-like approach such that $\dot{e} = \lambda e$. A well-known relationship between the object’s velocity and its corresponding image plane velocity can thus be defined by stacking each point velocity relationship into a single matrix known as the interaction matrix or the visual Jacobian. Mathematically, the overall velocity relationship can thus be defined (see [1]) as $\dot{s} = L_s v$, with $L_s$ being the interaction matrix and $v$ representing the velocity screw vector over time. A classical feedback control law can thus be defined as

$$v = \lambda \hat{L}_s^{+} e(t).$$

In this case, $\hat{L}_s^{+}$ is the pseudoinverse of the image Jacobian matrix and $\lambda$ a negative constant, with $v$ being the resulting control signal. Although such a VS scheme is fairly simple to implement, some important drawbacks have been highlighted by Chaumette in [3], with unstable behavior arising from the tracking of large displacements and complex rotations, or from the generation of infeasible motions. Therefore, the handling of both the 2D constraints and the 3D limitations, as well as the generation of feasible trajectories for a given visual task, must all be appropriately addressed.
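As a minimal sketch of the classic feedback law described above, the following Python fragment computes the velocity screw from a stacked interaction matrix and the image error. The function name `ibvs_velocity`, the toy matrix `L`, and the gain value are illustrative assumptions rather than part of the original implementation; the gain follows the convention of a negative constant used in the text.

```python
import numpy as np


def ibvs_velocity(L, s, s_star, lam=-0.5):
    """Return the screw v = lam * pinv(L) @ (s - s_star), with lam < 0."""
    e = np.asarray(s, dtype=float) - np.asarray(s_star, dtype=float)
    return lam * np.linalg.pinv(L) @ e


# Toy full-column-rank 8 x 6 matrix (hypothetical, not a real camera model).
L = np.vstack([np.eye(6), np.eye(6)[:2]])
s = np.ones(8)        # current features (four points, two coordinates each)
s_star = np.zeros(8)  # target features
v = ibvs_velocity(L, s, s_star)
e_dot = L @ v         # error rate induced by the commanded screw
```

For this toy case the induced error rate equals `lam * (s - s_star)`, which is the exponential decay the control law is designed to produce.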

Two constraints must be handled in order to assure appropriate visual control behavior. First, the well-known visibility constraint requires that the visual features always remain within the camera field of view. Second, the 3D constraint requires that the visual servoing scheme yield admissible camera motions within a valid workspace.

Optimal control has been identified as a convenient tool to build visual servoing schemes that explicitly consider the aforementioned visual constraints, and several applications have been reported in the literature over the last two decades. Among the first are the seminal works of Hashimoto and Kimura [4] and Schramm and Morel [5], which incorporated an LQ-based optimal control scheme and a Kalman filter-based algorithm, respectively, in order to guide the movements of a robotic manipulator.

Other approaches have capitalized on the advantages of the linear matrix inequality (LMI) approach to build predictive control schemes for visual servoing [6, 7]. While such works focus on designing an appropriate control law for the visual servoing scheme, other proposals have also included optimal schemes that combine path planning and trajectory tracking in order to assure the fulfilment of the visibility constraint and the generation of an optimal trajectory for the camera. Excellent examples of such a combination can be found in the works of Schramm and Morel [8] and in the use of LMI structures by Chesi [9]. In the particular case of path planning, it is important to consider the work of Mezouar and Chaumette [10] and the robust approach proposed later by Kazemi et al. [11]. In this case, an LMI-based algorithm is used to define an optimal path planning solution, acknowledging that the problem may not admit a unique solution, while the required camera tracking is supplied through a classic image-based visual controller [12].

Other optimal VS control implementations include the use of predictive control to compensate for errors in the tracking task of a visual feedback scheme in case of no prior information about the 3D model being supplied to the visual controller [13] or in the case of using active filtering through predictive control for biomedical applications that support robotized surgery [14].

Recently, the strategy for handling both visual constraints, that is, the visibility and the feasible motion constraint, within the visual control structure has focused on expressing the overall visual task from a nonlinear optimization perspective. Accordingly, this paper presents a novel optimization scheme that employs an evolutionary optimization method to handle both constraints through a visual predictive control scheme. Under such circumstances, 3D constraints can be treated as constraints on the state variables while the visibility constraint can be treated as a constraint within the output space, as has been done in [12]. In order to provide an appropriate prediction model, two options are considered following the proposal of Allibert and Courtial in [15]. First, a local model uses the classic image Jacobian matrix; second, a global model uses a quaternion-based 3D trajectory generator. As will be discussed, the optimization algorithm uses prediction to improve the overall visual servoing performance by means of a predictive control structure that has been specifically designed to fit within the visual control scheme.

As has been widely demonstrated, the use of optimization within the visual servoing control scheme has delivered relevant contributions, in particular for image-based schemes that naturally handle the most important visual constraints at the same time as control signals are generated; remarkable examples can be found in [12, 15–17]. However, all these solutions use classic optimization methods to minimize the objective function. Since the goal of an optimization scheme is to find an acceptable solution of a given objective function defined over a given search space, novel methods known as evolutionary methods have been proposed as a handy alternative.

In particular, evolutionary algorithms (EA), which are considered stochastic optimization methods, combine rules and randomness to mimic several natural phenomena. Methods inspired by evolutionary processes include the evolutionary algorithm (EA) proposed by Fogel et al. [18], De Jong [19], and Koza [20], the Genetic Algorithm (GA) proposed by Holland [21] and Goldberg [22], the Artificial Immune System proposed by de Castro and Von Zuben [23], and the Differential Evolution Algorithm (DE) proposed by Storn and Price [24]. Methods based on physical processes include Simulated Annealing proposed by Kirkpatrick et al. [25], the Electromagnetism-Like Algorithm proposed by Birbil and Fang [26], and the Gravitational Search Algorithm proposed by Rashedi et al. [27]. Finally, methods based on animal behavior include the Particle Swarm Optimization (PSO) algorithm proposed by Kennedy and Eberhart [28], the Ant Colony Optimization (ACO) algorithm proposed by Dorigo et al. [29], and the BAT algorithm proposed by Yang [30], which is of special importance for this paper.

In particular, this paper approaches the IBVS from an optimization-like perspective that naturally supports the inclusion of visual constraints in the implementation of the vision-based control scheme. As a result, the overall performance of the visual servoing scheme is improved at the same time that the aforementioned constraints are carefully taken into consideration.

The paper is organized as follows. Section 2 presents an overview of the overall optimization strategy, the control scheme, and its mathematical formulation, as well as the management of image-based constraints that support the optimal IBVS approach. Section 3 focuses on the BAT optimization algorithm and its basic operational principles. Section 4 discusses the local and global models that are required in the image-prediction scheme, which in turn are represented by the classic IBVS control algorithm and the quaternion-based guidance. Section 5 presents simulations of the free-flying 6-DOF camera in order to demonstrate the contribution of the algorithm to tracking performance and to discuss the differences between using the local or the global model for prediction. The last section draws some final conclusions.

#### 2. An Optimization Approach to IBVS

##### 2.1. Structure of the Control Scheme

One of the most successful strategies to incorporate optimization into a feedback control scheme is, beyond any doubt, predictive control. In turn, one of the best-known structures for predictive control is the internal model control approach [31], whose basic structure has been customized for image-based visual servoing in the work of Allibert [12]. The basic structure is reproduced in Figure 1, where the robot and its attached camera are modelled inside the plant block. The control input to the system is represented by $u$, while the output is denoted by $s$, which represents the image plane coordinates of four selected features to track in the image of the object of interest. As is typical in IBVS, the scheme requires the definition of desired (target) locations for the object features in the image, represented by $s^{*}$. By making use of the error model for IBVS from (1), the predictive control is based upon a generalized error, defined at time $k$ as the difference between the system’s output $s(k)$ and the predicted model output $s_m(k)$, yielding $\varepsilon(k) = s(k) - s_m(k)$. The algorithm should assure that the desired trajectory of visual features on the image plane follows an adequate sequence of points in order to guarantee the fulfillment of both visual constraints mentioned earlier. Therefore, the required trajectory $s_d(k)$ can be defined as the difference between the target feature locations $s^{*}(k)$ and the previously registered plant-model error $\varepsilon(k)$, which generates the following expression:

$$s_d(k) = s^{*}(k) - \varepsilon(k).$$

Including the plant-model difference at time $k$ explicitly yields

$$s_d(k) = s^{*}(k) - \bigl(s(k) - s_m(k)\bigr).$$

A very interesting fact emerges as the overall equation is rewritten as follows:

$$s_d(k) - s_m(k) = s^{*}(k) - s(k).$$

This last expression holds the key to the optimization approach to IBVS schemes. Minimizing the difference between the desired visual feature locations $s^{*}(k)$ and the system’s output $s(k)$ corresponds to minimizing the difference between the required visual trajectory $s_d(k)$ and the model output $s_m(k)$. This fact supports the operation of the optimization algorithm, which is completed once an objective function and some operative rules are defined, as discussed below.

##### 2.2. Building the Mathematical Framework

As explained above, the definition of the predictive control structure depends on drawing an appropriate objective function of the form $\min_{\tilde{u}} J(u)$, which will yield a control sequence of the form

$$\tilde{u} = \{u(k), u(k+1), \ldots, u(k+N_c), \ldots, u(k+N_p-1)\},$$

with $N_p$ and $N_c$ representing the prediction and control horizon, respectively. The prediction horizon represents the number of forecast terms to be calculated in advance from the model, while the control horizon holds the number of calculated terms that are actually applied to control the plant [32]. In the particular IBVS implementation, only the first term of the control horizon is actually applied to the system [12].

Considering that the overall problem is managed over the image plane and that the visibility constraint refers to the image plane, the objective function can be initially defined as follows:

$$J(u) = \sum_{j=k+1}^{k+N_p} \bigl[s_d(j) - s_m(j)\bigr]^{T} Q(j) \bigl[s_d(j) - s_m(j)\bigr],$$

with $Q(j)$ being a symmetric positive-definite weighting matrix and the dynamics of the system being described by the nonlinear model

$$x(j) = f\bigl(x(j-1), u(j-1)\bigr), \qquad s_m(j) = h\bigl(x(j)\bigr),$$

with $x(j)$ representing the predicted state at time $j$, $j \in [k+1, k+N_p]$. The state, the control signal, and the model output are denoted by $x$, $u$, and $s_m$, respectively. It is important to note that the state computation can vary depending on the particular prediction model that is employed. This issue is addressed in the following.
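The objective described above, a weighted sum of squared differences between the desired trajectory and the model output over the prediction horizon, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the weighting matrix is taken as diagonal (one weight per feature coordinate and step), and the names `vpc_cost`, `s_d_seq`, `s_m_seq`, and `weights` are hypothetical.

```python
def vpc_cost(s_d_seq, s_m_seq, weights=None):
    """Sum over the horizon of (s_d - s_m)^T Q (s_d - s_m), with diagonal Q.

    s_d_seq, s_m_seq: one feature vector per prediction step.
    weights: optional per-step list of diagonal entries of Q (defaults to ones).
    """
    total = 0.0
    for j, (s_d, s_m) in enumerate(zip(s_d_seq, s_m_seq)):
        q = weights[j] if weights is not None else [1.0] * len(s_d)
        total += sum(qi * (a - b) ** 2 for qi, a, b in zip(q, s_d, s_m))
    return total


# Two prediction steps, one point feature (two image coordinates).
desired = [[1.0, 1.0], [1.0, 1.0]]
predicted = [[0.0, 1.0], [1.0, 3.0]]
cost_identity = vpc_cost(desired, predicted)
# Doubling the weights of the last step emphasises errors near the horizon.
cost_weighted = vpc_cost(desired, predicted, weights=[[1.0, 1.0], [2.0, 2.0]])
```

With identity weights the example evaluates to 5.0; doubling the weights on the second step raises it to 9.0, which illustrates how a time-varying weighting shifts the emphasis towards the end of the horizon.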

##### 2.3. Constraints of the Predictive IBVS

Since one of the immediate advantages of the optimization-like approach of the IBVS control is the natural handling of inner constraints of the visual challenge, it is important to denote how such constraints are to be managed by the proposed structure.

The most important constraints have been previously identified as the visibility constraint and the 3D motion constraint. The visibility constraint is also known as the 2D condition. It aims to assure that the location of the object’s features of interest always remains within a valid region of the image plane. Conversely, the same mechanism can be used to mark inconvenient areas within the image. In terms of the optimization algorithm, the constraint is simply introduced as the limit $s_{\min} \le s_m(k) \le s_{\max}$, which includes both the lowest and the highest accepted location within the image space.

On the other hand, the generation of valid 3D trajectories can also be easily included in the optimization process. Since each robotic device must comply with mechanical and dynamic limitations due to workspace limits or actuator saturation, each kinematic pose can be defined in terms of the corresponding instantaneous generalized coordinates $q$. Likewise, the overall pose can also be geometrically constrained under well-known properties such as the full rank of the instantaneous Jacobian matrix [33]. Therefore, both considerations can be introduced under the following expressions:

$$p_{\min} \le p(k) \le p_{\max}, \qquad q_{\min} \le q(k) \le q_{\max},$$

with $p_{\min}$ and $p_{\max}$ being the minimum and maximum allowed pose while $q_{\min}$ and $q_{\max}$ represent the lowest and the highest generalized coordinate limits. In a similar fashion, it is even feasible to include other mechanical constraints such as actuator limits or torque or force constraints [12]. The optimization procedure commonly includes all aforementioned constraints in the form of nonlinear expressions that can be evaluated as the overall predictive algorithm evolves.
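One common way to let a population-based optimizer evaluate such box constraints is to add a penalty to the cost whenever a predicted feature or pose leaves its allowed interval. The sketch below illustrates that idea only; the penalty formulation, the weight `mu`, and the names `violation` and `penalized_cost` are assumptions made for illustration and are not claimed to be the mechanism used in the original implementation.

```python
def violation(values, lower, upper):
    """Total amount by which the components of `values` leave [lower, upper]."""
    return sum(max(lo - v, 0.0) + max(v - hi, 0.0)
               for v, lo, hi in zip(values, lower, upper))


def penalized_cost(cost, s_m, s_min, s_max, p, p_min, p_max, mu=1e3):
    """Add a penalty for visibility (2D) and pose (3D) bound violations."""
    return cost + mu * (violation(s_m, s_min, s_max) + violation(p, p_min, p_max))


# One predicted feature leaves the image bounds by 0.2; the pose is admissible.
total = penalized_cost(
    cost=1.0,
    s_m=[0.5, 1.2], s_min=[-1.0, -1.0], s_max=[1.0, 1.0],
    p=[0.0], p_min=[-0.5], p_max=[0.5],
)
```

A candidate that respects every bound keeps its original cost, while the candidate above is pushed to roughly 201, so the optimizer naturally prefers feasible solutions.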

##### 2.4. Optimal Approach to IBVS

This section discusses the step-by-step implementation of the control structure presented in Figure 1. Two parameters are of vital importance in the implementation: the prediction horizon $N_p$ and the control horizon $N_c$. Together they determine the extent of the optimization influence inside the predictive control scheme. The visual predictive controller and its corresponding optimization cycle are computed starting from the error calculation. Such error value is subsequently used to draw the desired trajectory over the number of steps defined by the prediction horizon $N_p$. A step-by-step description is presented below.

1. The current location of the visual features $s(k)$ is registered.
2. Calculate the value of the error $\varepsilon(k) = s(k) - s_m(k)$, assuming it is kept constant during the number of steps included in the prediction horizon $N_p$; that is, $\varepsilon(j) = \varepsilon(k)$ for all $j \in [k+1, k+N_p]$.
3. Compute the desired trajectory according to $s_d(j) = s^{*}(j) - \varepsilon(j)$, also for $j \in [k+1, k+N_p]$.
4. The measured current feature location in the image plane $s(k)$ is employed to initialize the model output $s_m(k)$, which in turn constitutes the feedback loop that is required by the internal model control structure [31].
5. The optimal control signal $u$ is defined by the optimization algorithm according to (5); its value is kept constant over the interval $k+N_c$ to $k+N_p$, with $N_c$ and $N_p$ being the control and the prediction horizon, respectively.
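The steps above can be sketched as a single predictive iteration. The fragment below is a simplified illustration and not the original implementation: the optimizer is replaced by an exhaustive search over a small hypothetical list of candidate controls, the control is held constant over the whole horizon, and the names `vpc_step`, `predict`, and `candidates` are assumptions.

```python
def vpc_step(s_meas, s_model, s_star, predict, candidates, n_p):
    """One predictive iteration; returns the best control and its cost."""
    # Step 2: plant-model error, held constant over the prediction horizon.
    eps = [a - b for a, b in zip(s_meas, s_model)]
    # Step 3: desired trajectory s_d = s* - eps (constant target assumed).
    s_d = [a - b for a, b in zip(s_star, eps)]
    best_u, best_cost = None, float("inf")
    for u in candidates:  # stand-in for the evolutionary optimizer
        s_m = list(s_meas)  # Step 4: model initialised with the measurement
        cost = 0.0
        for _ in range(n_p):  # Step 5: same control applied at every step
            s_m = predict(s_m, u)
            cost += sum((d - m) ** 2 for d, m in zip(s_d, s_m))
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u, best_cost


# Toy scalar model: each step shifts the feature by the control value.
def shift(s, u):
    return [si + u for si in s]


best_u, best_cost = vpc_step(
    s_meas=[0.0], s_model=[0.0], s_star=[1.0],
    predict=shift, candidates=[0.0, 0.25, 0.5, 1.0], n_p=2,
)
```

In this toy run the control 0.5 is selected because it reaches the target exactly at the end of the two-step horizon, whereas the more aggressive value 1.0 overshoots on the second step.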

Evidently, the two most important parameters in the optimization process are $N_p$ and $N_c$. The value of $N_p$ is vital to guarantee an adequate balance between the system stability and the computational feasibility of the overall implementation. A high value of $N_p$ implies the generation of softer control signals, while a small value allows a wider exploration of novel control values at the cost of weakening the overall system stability. On the other hand, the control horizon value $N_c$ regulates how many steps forward are required to reach the objective. A high value of $N_c$ accounts for a slower control behavior that is not feasible for visual control implementations. In practical terms, the value of $N_c$ is commonly set to 1, which corresponds to keeping the control signal constant over the number of steps previously defined by the prediction horizon $N_p$.

Finally, the weighting matrix $Q$ is the third element of the optimization configuration set. It is commonly set to an identity matrix whose dimension matches that of the feature vector, although some successful examples use time-varying matrix values in order to increase the sensitivity to the error value as the steps approach the horizon value [15].

With the overall mathematical design of the visual predictive control in place, the discussion turns to the feasibility of employing an evolutionary optimization algorithm to increase the performance of a predictive visual control structure.

#### 3. The BAT Evolutionary Algorithm

Approaching the IBVS from an optimization-like perspective naturally supports the inclusion of visual constraints in the implementation of the vision-based control scheme. In particular, the use of the BAT evolutionary algorithm as the main optimization procedure provides an easy implementation while still delivering an acceptable performance.

The analogy supporting the BAT algorithm is based on the echolocation ability that is exhibited by microbats in their quest for food. Bats generate an ultrasonic beam that can vary its pulse frequency or its intensity which is commonly known as the loudness in the algorithm. The ultrasonic signal is delivered in advance to their movements. By using loudness variations, intensity variations between both ears and the time delay in receiving the signals back, bats are able to reconstruct an overall scenario despite the fact that they may move through a varying context.

The BAT algorithm is therefore built on the assumption that bats fly at random while looking for food. Such movement is registered at position $x_i$ with velocity $v_i$, assuming the bat emits a fixed frequency $f_{\min}$ and a variable wavelength with loudness $A_0$. During the search, the pulse emission rate can vary in accordance with the proximity of the target. For simplicity the analogy considers frequency to fall in the interval $[f_{\min}, f_{\max}]$, assuming that higher frequencies imply a shorter travelling distance. Under the same simple assumption, the pulse rate is taken as $r \in [0, 1]$, with 1 representing the maximum rate of pulse emission [30].

The heart of the BAT algorithm is the location computation for virtual bats. The movement is carried out over the $d$-dimensional search space by updating positions $x_i^{t}$ and velocities $v_i^{t}$ as follows:

$$f_i = f_{\min} + (f_{\max} - f_{\min})\,\beta,$$

$$v_i^{t} = v_i^{t-1} + \bigl(x_i^{t} - x_{*}\bigr) f_i,$$

$$x_i^{t} = x_i^{t-1} + v_i^{t}.$$

In this case, $x_{*}$ represents the current global best solution after assessing all current available solutions. Frequency $f_i$ is computed through the difference between $f_{\min}$ and $f_{\max}$, with $\beta \in [0, 1]$ being a uniformly distributed random value. In practice, the value of $f_i$ is of vital importance because it controls the movement scope for each particle.

In a similar fashion, loudness and pulse rate are defined with regard to the bat analogy. Loudness should decrease as the bat approaches its prey. A range between 0 and 1 is used to implement this feature, with $A_i = 0$ being used when one bat (searching particle) has found a prey (minimum) and therefore is not emitting a sound signal; otherwise $A_i = 1$ signals that the bat is searching through the space and therefore producing its maximum sound. An easy implementation considers a variable $\alpha$, yielding

$$A_i^{t+1} = \alpha A_i^{t}.$$

The pulse emission rate is defined under a similar scheme, but with an exponential decay of its influence as time evolves, as follows:

$$r_i^{t+1} = r_i^{0}\,\bigl[1 - \exp(-\gamma t)\bigr],$$

with $t$ representing sampling time, $0 < \alpha < 1$, and $\gamma > 0$ being a tuning constant. Without loss of generality, $r_i^{0}$ accounts for the last updated value of pulse emission rate. It must be noticed that, according to the original BAT implementation [30], loudness and pulse emission rate are only updated if new solutions are improved, which means that searching particles are moving closer to an optimal solution.

The overall BAT algorithm can be summarized in the following pseudocode:

1. Define the initial population $x_i$ and the initial velocities $v_i$.
2. Select a pulse frequency $f_i$ at $x_i$, pulse rates $r_i$, and loudness values $A_i$.
3. Repeat until $t$ reaches the maximum number of iterations:
   1. Calculate new solutions through frequency $f_i$.
   2. Update velocity and location for each bat (particle), using (8) and (9).
   3. Generate a random value $\mathrm{rand}$ and compare it to $r_i$.
   4. If $\mathrm{rand} > r_i$, one solution among the best solutions must be chosen. A new local solution must be generated around that selected best solution.
   5. Using a random bat’s flight (particle movement), generate a new solution.
   6. If $\mathrm{rand} < A_i$ and $f(x_i) < f(x_{*})$, then accept the new solutions, increase $r_i$, and decrease $A_i$.
   7. Reevaluate all particles to find the current new best $x_{*}$.
4. Publish results.
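The pseudocode above can be turned into a compact Python sketch. It is a generic illustration of the bat algorithm and not the implementation used for the results in this paper: the parameter values (`alpha`, `gamma`, initial loudness and pulse rate, frequency range, search bounds), the clamping of candidates to the bounds, and the acceptance test against each bat's own previous fitness are all assumptions made for this example.

```python
import math
import random


def bat_minimize(cost, dim, n_bats=20, n_iter=200, f_min=0.0, f_max=2.0,
                 alpha=0.9, gamma=0.9, lower=-5.0, upper=5.0, seed=0):
    """Minimise `cost` over [lower, upper]^dim with a basic bat algorithm."""
    rng = random.Random(seed)
    # (1) Initial population and velocities.
    x = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_bats)]
    v = [[0.0] * dim for _ in range(n_bats)]
    # (2) Initial loudness A_i and pulse rates r_i (assumed starting values).
    loud = [1.0] * n_bats
    r0 = [0.5] * n_bats
    rate = list(r0)
    fit = [cost(xi) for xi in x]
    b = min(range(n_bats), key=lambda i: fit[i])
    best, best_fit = list(x[b]), fit[b]

    for t in range(1, n_iter + 1):
        mean_loud = sum(loud) / n_bats
        for i in range(n_bats):
            # (3.1)-(3.2) Frequency, velocity, and position updates.
            freq = f_min + (f_max - f_min) * rng.random()
            v[i] = [vi + (xi - bi) * freq for vi, xi, bi in zip(v[i], x[i], best)]
            cand = [xi + vi for xi, vi in zip(x[i], v[i])]
            # (3.3)-(3.4) Local random walk around the current best.
            if rng.random() > rate[i]:
                cand = [bi + rng.uniform(-1.0, 1.0) * mean_loud for bi in best]
            cand = [min(max(c, lower), upper) for c in cand]
            f_new = cost(cand)
            # (3.6) Accept improvements with a probability tied to loudness,
            # then decrease loudness and increase the pulse rate.
            if rng.random() < loud[i] and f_new < fit[i]:
                x[i], fit[i] = cand, f_new
                loud[i] *= alpha
                rate[i] = r0[i] * (1.0 - math.exp(-gamma * t))
            # (3.7) Track the global best solution.
            if f_new < best_fit:
                best, best_fit = list(cand), f_new
    return best, best_fit


def sphere(p):
    """Simple convex test function with its minimum at the origin."""
    return sum(c * c for c in p)


best, best_fit = bat_minimize(sphere, dim=3)
```

Because the global best is replaced only when a candidate improves on it, the returned fitness can never be worse than the best member of the initial population. In the visual predictive controller, `cost` would be the predictive objective evaluated for a candidate control signal.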

In practical terms, using a similar value for $\alpha$ and $\gamma$ yields a similar treatment for decreasing the loudness and for increasing the emitted pulse rate as a prey (minimum) is being located. Appropriate values for both parameters must be experimentally determined. In our implementation a simple selection of equal values has been used with good results. Several other BAT implementations with different parameter settings are reported in [32].

#### 4. The Local and Global Mathematical Models for Optimization

Once the BAT optimization algorithm has been presented, the discussion turns to the selection of the plant model that guides the predictive control strategy. The model is used to predict the movement of visual features with respect to the camera velocity over a finite prediction horizon [12]. The classic visual servoing scheme can be easily implemented to provide a simple model, denoted as SM in the following. Such model uses fundamental equations to define the movements on the image plane as a result of the camera movement. This paper also explores the use of a quaternion-based interpolation to generate the required location of the visual features over the image plane. The fundamentals of the SM model are sketched below.

##### 4.1. Local Model: The Classic Visual Servoing Scheme

The best-known visual servoing scheme employs point-like image features as its main reference. The location of such features in the image plane is denoted as $s_i = (x_i, y_i)$, with $i$ referring to the feature number and $x_i$ and $y_i$ representing the horizontal and vertical image plane coordinates, respectively. For a given 3D point in space, that is, $P = (X, Y, Z)$, defined with respect to the camera frame, its projection onto the image plane is easily defined in normalized coordinates assuming $x = X/Z$ and $y = Y/Z$. Likewise, the camera velocity is defined through the classic screw vector $v = (v_x, v_y, v_z, \omega_x, \omega_y, \omega_z)^{T}$. A careful review of the fundamentals of visual servoing in [1] explains the velocity relationship from the camera velocity to the image plane feature’s velocity through the following relationship:

$$\dot{s} = L_s v,$$

where $L_s$ represents the image Jacobian or interaction matrix, which holds the required velocity relationship. The overall matrix is built by literally stacking the following matrix for each feature that is being characterized:

$$L_{s_i} = \begin{bmatrix} -\dfrac{1}{Z} & 0 & \dfrac{x}{Z} & xy & -(1 + x^{2}) & y \\ 0 & -\dfrac{1}{Z} & \dfrac{y}{Z} & 1 + y^{2} & -xy & -x \end{bmatrix}.$$

The location of each feature can thus be defined through a simple integration method as follows:

$$s(k+1) = s(k) + T_e\,\hat{L}_s(k)\,v(k),$$

with $T_e$ representing the sample time, $v$ defining the screw vector, and $\hat{L}_s$ accounting for the best estimation of $L_s$, considering that the depth information $Z$ is required and must be either calculated through a 3D model of the object of interest or through a careful assessment of its approximate value. In this paper, we take advantage of holding a full 3D model of the image formation, and therefore its computation can be exactly defined. Following the procedure in [12], the optimization process can be accurately described by considering each feature as a state, which allows expressing the overall optimization function as follows:

$$x(k+1) = x(k) + T_e\,\hat{L}_s(k)\,u(k), \qquad s_m(k) = x(k).$$

By using such optimization function, 2D constraints are naturally handled within the SM visual servoing model.
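To make the local model concrete, the following sketch builds the standard 2 x 6 interaction matrix of a normalized image point and advances the feature by one Euler step, which is the kind of one-step prediction the SM model relies on. The function names `interaction_matrix` and `predict_feature`, the screw ordering (three translational components followed by three rotational ones), and the numerical values are illustrative assumptions.

```python
def interaction_matrix(x, y, Z):
    """Interaction matrix of a normalized image point (x, y) at depth Z."""
    return [
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ]


def predict_feature(x, y, Z, screw, dt):
    """One Euler step: s(k+1) = s(k) + dt * L(s(k), Z) * screw."""
    L = interaction_matrix(x, y, Z)
    x_dot = sum(l * v for l, v in zip(L[0], screw))
    y_dot = sum(l * v for l, v in zip(L[1], screw))
    return x + dt * x_dot, y + dt * y_dot


# A point on the optical axis, 1 m away, with the camera translating along +x
# at 0.1 m/s and a sample time of 1/30 s (hypothetical values).
x1, y1 = predict_feature(0.0, 0.0, 1.0, [0.1, 0.0, 0.0, 0.0, 0.0, 0.0], 1.0 / 30.0)
```

As expected for a camera moving along its positive x axis, the feature drifts towards negative x in the image while its y coordinate stays unchanged. Stacking one such matrix per feature yields the full model used over the prediction horizon.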

##### 4.2. Global Model: Spherical Interpolation

A global model is required to generate an alternative option in order to support the optimization contribution within the visual servoing scheme. In this case, the quaternion-based interpolation, also known as slerp [34], is a very useful tool considering its intrinsic advantages such as the smooth interpolation, fast concatenation and simple inversion of angular displacements, and a quick conversion to homogeneous transforms.

The classic visual servoing problem considers both a start and a target pose which, in turn, can be easily expressed as quaternions, here denoted $q_0$ and $q_f$. Once both are converted, the interpolation is easily computed by the following expression:

$$q(t) = q_0 \bigl(q_0^{-1} q_f\bigr)^{t}, \qquad t \in [0, 1],$$

with $t$ being the step index, whose value defines the interpolation step, and $t = 1$ signaling the last step in the sequence. The spherical interpolation is incorporated into the predictive control strategy as the model guiding the BAT optimization algorithm. It compares the fitness of the proposed interpolated pose, through its corresponding features, with that of the features generated by each proposed particle. The comparison can be explained by the following three cases, which are illustrated in Figure 3:

1. Particles generated by the BAT evolutionary algorithm are depicted as bold circles while the particle generated by the spherical linear interpolation is represented as a void circle. The algorithm compares the fitness value of each candidate solution. In this case, one of the BAT-generated particles is selected.
2. In the second case, the particle generated by the slerp interpolator is chosen as it has obtained the best fitness value.
3. The third case shows the situation in which the algorithm has reached the desired location and the visual servoing task is finished.
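Spherical linear interpolation between two unit quaternions can be sketched with its usual closed form based on sines. The fragment below is a generic illustration rather than the paper's code; the `(w, x, y, z)` component ordering, the function name `slerp`, and the small-angle fallback are assumptions.

```python
import math


def slerp(q0, q1, t):
    """Interpolate between unit quaternions q0 and q1 for t in [0, 1]."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:  # q and -q encode the same rotation: take the shorter arc
        q1 = [-c for c in q1]
        dot = -dot
    dot = min(dot, 1.0)  # guard against rounding slightly above 1
    theta = math.acos(dot)
    if theta < 1e-8:  # nearly identical rotations: nothing to interpolate
        return list(q0)
    w0 = math.sin((1.0 - t) * theta) / math.sin(theta)
    w1 = math.sin(t * theta) / math.sin(theta)
    return [w0 * a + w1 * b for a, b in zip(q0, q1)]


# Halfway between the identity and a 90 degree rotation about the z axis.
q_start = [1.0, 0.0, 0.0, 0.0]
q_goal = [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)]
q_half = slerp(q_start, q_goal, 0.5)
```

The halfway quaternion corresponds to a 45 degree rotation about the same axis, which shows the constant angular rate that makes slerp attractive as a global trajectory model.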

#### 5. Simulations

For all the simulations in this paper, the sample time is constrained by the number of frames per second provided by the camera. Since a 30 frames per second device is used, the sampling period is $1/30$ s, that is, approximately 33 ms. A free-flying camera is located at an initial pose defined by a six-component vector. The first three values are defined in meters while the last three components, the roll, pitch, and yaw angles, are given in radians. Figure 2 shows the initial and final pose for the camera. It also illustrates the four features that represent the object of interest. The simulation uses the classic SM model with the classic image Jacobian matrix.

Figure 4 shows the evolution of the visual servoing control, with the graphs showing, in clockwise order from the upper left corner, the resulting screw vector evolution, the camera pose error, considering location and orientation, and the image plane feature trajectories and the image errors in the horizontal and vertical direction, respectively.

The predictive visual servoing scheme is tested over the same initial conditions. 20 bat particles are initialized to random positions; the pulse emission rate and the loudness value are defined to , with the decay variable falling within the range of 0 to 1.

Figure 5 presents the results of the predictive visual servoing scheme with a BAT optimization algorithm in the control feedback. The graphs show the evolution of the visual servoing control featuring, in clockwise order from the upper left corner, the resulting screw vector evolution, the camera pose error, considering location and orientation, and the image plane feature trajectories and the image errors in the horizontal and vertical direction, respectively.

In practical terms, the classic visual servoing scheme required 39.62 seconds to complete the full servoing task, whereas the BAT-based predictive visual controller required only 13.28 seconds to accomplish the same assignment, roughly a threefold reduction.

A useful comparison concerns the execution time of the BAT visual predictive controller against that of a simple quaternion-based interpolator. Figure 6 shows a time comparison between the classic IBVS method, the pure-quaternion solution, and the BAT-based predictive visual control scheme. The BAT-based scheme shows a sharper trajectory in the image plane as a result of always using the solution that holds the best fitness value. It is important to recall that the scheme is also capable of naturally handling both of the required visual constraints.

A challenging second experiment is performed in order to demonstrate the handling of both aforementioned visual constraints. A typical problem in the classic visual servoing scheme emerges when the required movement is a rotation of *π* radians over the vertical axis of a given object; its visual feature tracking typically tends to fail in the control law, because such movement implies a sudden backwards movement of the camera, which in turn yields a tracking failure for the object of interest features.

The use of the predictive visual control scheme with the BAT optimization algorithm in the feedback loop yields a solution for the aforementioned problem. Thanks to the particle delivery and the fitness evaluation for each particle during the control phase, the controller manages to generate appropriate spherical trajectories that avoid any instability due to the feature trajectory crossing in the image plane [3]. Figure 7 shows an illustration of the optimal controller response to such circumstance.

Notice in Figure 7 that the generated screw vector requires velocity only in the rotational component about the vertical axis, as the controller aims to turn the overall scene by 180 degrees. A schematic view is given in Figure 8, where it is evident that the required movement is simply a rotation around the vertical axis. The time required to complete the turning of the image features is about 27 seconds. The feature trajectories in the image plane appear to describe a circular movement as a result of the contribution from the quaternion-based interpolator.

#### 6. Conclusions

This research has demonstrated the usefulness of a predictive control strategy for an image-based visual servoing scheme that employs an evolutionary optimization algorithm to improve the performance of the servoing task. The visual control task is approached as an optimization problem that naturally handles relevant visual servoing constraints such as workspace limitations and visibility restrictions. Two models have been used in the implementation: a simple model based on the classic visual servoing scheme and a quaternion-based slerp interpolator. The contribution of both models is controlled through an objective function that compares the fitness of the proposed interpolated pose, through its corresponding features, with that of the features generated by each proposed particle. In practical terms, the spherical interpolation is incorporated into the predictive control strategy as the model guiding the BAT optimization algorithm. The simulation results support the contribution of the proposed scheme regarding the handling of the required visual constraints.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.