Journal of Optimization

Journal of Optimization / 2018 / Article

Research Article | Open Access

Volume 2018 |Article ID 3213484 | https://doi.org/10.1155/2018/3213484

Gisela C. V. Ramadas, Edite M. G. P. Fernandes, António M. V. Ramadas, Ana Maria A. C. Rocha, M. Fernanda P. Costa, "On Metaheuristics for Solving the Parameter Estimation Problem in Dynamic Systems: A Comparative Study", Journal of Optimization, vol. 2018, Article ID 3213484, 21 pages, 2018. https://doi.org/10.1155/2018/3213484

On Metaheuristics for Solving the Parameter Estimation Problem in Dynamic Systems: A Comparative Study

Academic Editor: Liwei Zhang
Received16 Jul 2017
Revised11 Dec 2017
Accepted31 Dec 2017
Published29 Jan 2018

Abstract

This paper presents an experimental study that aims to compare the practical performance of well-known metaheuristics for solving the parameter estimation problem in a dynamic systems context. The metaheuristics produce good quality approximations to the global solution of a finite small-dimensional nonlinear programming problem that emerges from the application of the sequential numerical direct method to the parameter estimation problem. Using statistical hypotheses testing, significant differences in the performance of the metaheuristics, in terms of the average objective function values and average CPU time, are determined. Furthermore, the best obtained solutions are graphically compared in relative terms by means of the performance profiles. The numerical comparisons with other results in the literature show that the tested metaheuristics are effective in achieving good quality solutions with a reduced computational effort.

1. Introduction

The present paper addresses the problem of finding a set of parameter values in a dynamic system model that calibrates the model so that it can reproduce the existing experimental data in the best possible way [1]. This is performed by minimizing an objective function that measures the goodness of the fit. Thus, an objective function, which gives the sum of the squared errors between the model predicted state values and the observed values (at certain time instants of a fixed interval), is minimized, subject to a system of ordinary differential equations (ODE). The dynamic model defined by the system of ODE simulates the time varying processes that take place in a certain time interval. This problem is known in the literature as the dynamic model based parameter estimation (DMbPE) process. Solving this problem is crucial in systems biology and medicine, with a great impact on both pharmaceutical and biotechnological industries [2]. The problem is also very common in the chemical engineering area [3] and it has been extensively used to describe physical phenomena [4]. The DMbPE problem may involve nonlinear differential-algebraic equations and a multiplicity of local solutions may exist. Nonconvexity and ill-conditioning issues of the problem may be addressed with efficient global optimization (GO) methods and proper regularization techniques [5]. The problem is even more difficult than that of an algebraic model. The nonlinear dynamic behavior makes the analytical approach rather complicated, if not impossible, for most of the real phenomena. Numerical direct methods are therefore good alternatives to solve the DMbPE problem. Like any parameter estimation problem, the main advantage is that the intended global minimum is known in advance. The most well-known methods, like Levenberg-Marquardt, gradient descent, and the Nelder-Mead method, which have been extensively used in parameter estimation problems, are local optimization methods and cannot guarantee convergence to a global solution. They are able to exploit a specified search region around a given initial value, but the global solution is difficult to find when the initial value is far from the required solution. Algorithms for GO can be roughly divided into two categories: exact algorithms and heuristics [6].

Sörensen and Glover [7] have defined metaheuristic as follows: “A metaheuristic is a high-level problem-independent algorithmic framework that provides a set of guidelines or strategies to develop heuristic optimization algorithms.” Further, an implementation of a heuristic optimization algorithm that follows the strategies laid out in the metaheuristic framework is also referred to as a metaheuristic. This term appeared for the first time in a publication by Glover [8]. Metaheuristics are approximate methods or heuristics that are designed to search for good solutions with less computational effort and time than the more classical algorithms. While heuristics are designed to solve a specific problem, metaheuristics are general-purpose algorithms that can be applied to solve almost any optimization problem. The metaheuristics evaluate a set of potential solutions of an optimization problem and perform a series of operations on those solutions in order to find different and hopefully better solutions. Depending on the way these operations are carried out, a metaheuristic can be classified as a local search, constructive, or population-based metaheuristic. For details, the reader is referred to [7].

Most metaheuristics use random procedures that invoke artificial intelligence tools and simulate nature behaviors and their performance does not depend on the properties of the problem to be solved. They are alternative methods to find good approximations to optimal solutions of GO problems. In addition, they are derivative-free and easy to implement [9]. Metaheuristics can locate the vicinity of global solutions with relative efficiency although global optimality cannot be guaranteed in a finite number of iterations. On the other hand, exact methods for GO can guarantee global optimality for certain problems but the required computational effort increases often exponentially with the problem size [10, 11].

Previous use of metaheuristics, mainly those that use metaphors based on natural evolution and on behavior of animal swarms, to solve some DMbPE problems showed that they are able to provide good quality solutions when real experimental data and noisy artificial data are considered. In [15], a hybrid metaheuristic algorithm that introduces evolutionary operations, namely, mutation and crossover, into the firefly algorithm (FA), is presented. Two variants of the differential evolution (DE) method—a trigonometric and a modified version—have been implemented in [16, 17]. The authors in [2] developed a new procedure based on the scatter search (SS) methodology. A slightly different version of the SS, which uses the fmincon solver from Matlab (Matlab is a registered trademark of MathWorks, Inc.) as an improvement method, is tested in [12] to solve three well-known DMbPE problems, including the chemical isomerization of -pinene. This latter problem is analyzed and solved by the FA in [13]. Alternatively, exact methods for GO have also been used to solve DMbPE problems. These include the spatial branch and bound (BB) algorithm to compute a global solution of the partially discretized DMbPE problem [18], the -BB method to obtain a global solution within an -precision of the completely discretized DMbPE problem [10], and a method based on interval analysis and on the partially discretized DMbPE [19]. In [14], a deterministic outer approximation approach is applied to the reformulation of the DMbPE problem as a finite NLP by applying a complete discretization using orthogonal collocation on finite elements.

Since the problem of estimating the parameters of a dynamic model is important, the contribution of this study is concerned with the implementation and practical comparison of five very simple and easy to implement metaheuristics, hybridized with a local intensification phase, when solving the finite nonlinear programming (NLP) problem that arises when a numerical direct method of a sequential type is used to locate a global optimal solution to the DMbPE problem. The unknown parameters are the decision variables of the NLP problem.

The selected metaheuristics are very popular and have been used to solve a variety of real-world applications. The selection includes the FA [20], the Harmony Search (HS) algorithm [21], the DE algorithm [22], the Artificial Bee Colony (ABC) algorithm [23], and the Directed Tabu Search (DTS) algorithm [24]. During the local intensification phase, the well-known Hooke-and-Jeeves (HJ) local search [25] is implemented with all the metaheuristics. These local search enhancements lead to much faster convergence and provide good quality solutions. The practical comparison carried out in the present study aims to analyze the performance of each metaheuristic in terms of the quality of the obtained solutions. To test the five metaheuristics, nine DMbPE problems were selected from the literature [4, 10, 13, 19, 26], yielding 12 instances due to different experimental data. The problems are described in the Appendix.

The remainder of the paper proceeds as follows. Section 2 addresses the DMbPE problem and the sequential numerical direct method, and Section 3 introduces the selected metaheuristics. Section 4 contains the results of all the numerical experiments and the conclusions are summarized in Section 5.

2. Sequential Direct Method to Solve the DMbPE Problem

To solve the DMbPE problem, the sequential numerical direct method is used. For completeness, the DMbPE problem reads as follows. Find such thatwhere represents the objective function that depends on , a vector of the decision variables (the parameters that need to be estimated); is the vector (with length ) of the state variables that depend on (the independent variable); and represents the vector with the derivatives of with respect to . represents the number of experimental data available for each ; , , , are the experimentally observed values; is the system of first-order ODE; is the vector of initial values to the variables , ; and and represent the vectors with the lower and upper bounds on the parameters, respectively. and are the initial and final values, respectively, of the independent variable .

To solve problem (1), there are indirect methods and direct methods [27]. Indirect methods use the first-order necessary conditions from Pontryagin’s minimum principle to reformulate the problem as a two-point boundary value problem. The latter may become difficult to solve especially if the problem contains state variable constraints.

In direct methods, the optimization present in (1) is performed directly. This type of method discretizes the problem and applies NLP techniques to the resulting finite-dimensional optimization problem. There are two types of direct methods. In the sequential direct method, the DMbPE problem is transcribed into a small finite-dimensional optimization problem through the discretization of the decision variables only. Hence, the optimization is carried out in the space of the decision variables only. Given a set of values for the decision variables, the system of ODE is accurately integrated (over the entire time interval) using a specific numerical integration formula (with error control mechanisms to enforce the accuracy of the state variable values), so that the objective function value can be evaluated [3]. Thus, the ODE are satisfied at each iteration of the NLP solver. The method is called sequential, since the processes of minimizing the sum of the squared errors and solving the ODE are done in a sequential manner. This method may lead to a slow convergence process since the system of ODE is solved again and again each time an objective function value is required. In the simultaneous direct method, approximations to the solution of the system of ODE are computed instead. The optimization is carried out in the full space of the discretized decision and state variables, which may lead to very-large-dimensional finite NLP problem with equality constraints (emerging from the discretization of the system of ODE), in particular if a fine grid of points is required to obtain a high level of integration accuracy. Thus, very efficient NLP solvers are required when solving the finite problem.

We use the sequential direct method and the system of ODE is numerically integrated by the Matlab function ode15s. The default value for the scalar relative error tolerance is set to and all the components of the vector of absolute error tolerance are set to . The sequentially discretized NLP problem is then solved by a stochastic metaheuristic. This is further elaborated in the next section.

3. Stochastic Metaheuristics

To compute global optimal solutions to NLP problems, stochastic or deterministic methods are available. Stochastic GO methods are able to provide a near-optimal solution in a short CPU time, although it may not be globally optimal. In contrast, deterministic GO methods provide an interval within which the global optimal solution falls, although they require very large computational efforts [14]. Stochastic metaheuristics, such as FA, HS, DE, ABC, and DTS, include randomly generated numbers from probability distributions to define a set of solutions on the search region, as well as to move those solutions to hopefully better positions. They ensure convergence to a global solution in probability, while deterministic methods tend to guarantee asymptotic convergence. Global search techniques use exploration and exploitation search procedures that aim to diversify the search so that a global optimal solution is located and to intensify the search so that a good approximation is computed in a promising area of the search space.

Most stochastic metaheuristics are classified in terms of source of inspiration as nature-inspired algorithms. The well-known swarm-intelligence-based algorithms belong to a wider class called the bioinspired algorithms, and these are a subclass of nature-inspired algorithms [28]. The most popular are the algorithms based on swarm intelligence, with the FA and the ABC algorithm being two examples. On the other hand, the HS is not a bioinspired algorithm and has its inspiration from music. Although the DE algorithm is not based on any biological behavior, it can be classified as a bioinspired algorithm due to the keyword “evolution,” but it is not swarm-intelligence-based [28]. Finally, the DTS algorithm is not a nature-inspired algorithm, but it has the particularity of using past information of the search to record it in memory. The majority of the metaheuristics define a set of approximate solutions at each iteration, called population, and generate new trial solutions that may be accepted as the solutions for the next iteration, if some improvement is detected relative to the current set of solutions.

We use the notation to represent the th point of a population of points, possible solutions of problem (1). The set defines the bound constraints of the problem.

3.1. The FA

The FA is a bioinspired metaheuristic algorithm that is capable of converging to a global solution of an optimization problem. It is inspired by the flashing behavior of fireflies at night [20, 29]. The main rules used to construct the FA are as follows: (i) all fireflies are unisex, meaning that any firefly can be attracted to any other brighter one; (ii) the brightness of a firefly is determined from the encoded objective function; (iii) attractiveness is directly proportional to brightness but decreases with distance.

Table 1 lists the most relevant nomenclature used in the basic FA. At the beginning, the positions of the fireflies in the search space are randomly generated. Then, they are evaluated by computing the corresponding objective function value and the points are ranked in ascending order; that is, is the brightest firefly, is the second brightest, and so on. In the classical FA, a firefly () is moved towards the brighter fireflies as follows:where the second term on the right-hand side of (2) is due to the attraction while the third term is due to randomization with , is the attraction parameter when the distance is zero, is a number uniformly distributed in , and is a vector that aims to scale the movement to the set . The parameter is crucial to speed the convergence of the algorithm and it can take any value in the set . When , the attractiveness tends to the minimum constant value . To accelerate convergence and at the same time prevent premature convergence, the sequence of values is made to slowly decrease along the iterative process. If a component of the final position of a firefly falls outside the set , it is shifted onto the boundaries.


Number of fireflies in the population
Attractiveness of a firefly
Absorption coefficient/variation of attractiveness
Randomization parameter
The brightest firefly
The less brighter firefly

3.2. The HS Algorithm

The HS algorithm was developed to solve GO problems in an analogy with the music improvisation process where music players improvise the pitches of their instruments to obtain better harmony [21, 30]. At each iteration , the basic HS algorithm provides a set of solution vectors, available in the harmony memory (HM), from which the best and the worst solutions, in terms of objective function values, are identified. Table 2 lists the most relevant nomenclature used in the HS algorithm.


HMHarmony memory
Size of the HM
The best solution in HM
The worst solution in HM
Harmony memory considering rate
The pitch adjusting rate
Distance bandwidth

The HM contains solution vectors that are maintained in memory throughout the iterative process. First, the solutions in the HM are randomly generated in the search space , , . Then, they are evaluated and the best harmony, , and the worst, , in terms of objective function value are identified. Hereafter, at each iteration, a new harmony is improvised; that is, a new vector is generated, using three improvisation operators. When the HM operator is used, the component of is chosen from the HM with probability ; otherwise, the random selection operator is used and the component is randomly generated in :for , where is a random number uniformly distributed in . Finally, the pitch adjustment operator is subsequently applied with a probability , which varies with the iteration counter , to refine only the components produced by the HM operator, as follows:where is the distance bandwidth that depends also on [31]. Finally, the components of are checked against the bounds and projected onto the boundaries if they fall outside. In the final stage, if , the HM is updated since the new harmony is included in the HM, replacing the worst one.

3.3. The DE Algorithm

The DE is a bioinspired population-based algorithm that relies on three strategies—mutation, crossover, and selection—to define the solutions/points for the next iteration [22]. The most important nomenclature and parameters are shown in Table 3.


Number of points in the population
Amplification parameter
Mutant point
Trial point
Crossover parameter

The initial population of points, , , is randomly generated in the search space . The most commonly used mutation strategy defines the mutant point , , as follows:with uniformly chosen random indices , , and from the set , mutually different, and is a real parameter in which controls the amplification of the differential variation, . The indices , , and are also chosen to be different from the index . is called the base point.

The components of the mutant vector are then mixed with components of the vector to generate the trial point, . This strategy is referred to as crossover and can be described as follows:for , where denotes a random number uniformly generated within the interval and aims to perform the mixing of the component of the points, , and , an index randomly selected from , aims to ensure that gets at least one component from . Finally, the components of are projected onto the boundaries of if they fall outside. Then, a selection strategy compares each trial point with in terms of objective function values and the best one is selected for the population of the next iteration. The variant described above is referred to as DE/rand/1/bin, where “rand” specifies the vector that is mutated, “1” defines the number of difference vectors, and “bin” means that crossover is based on independent binomial experiments [22]. There are other frequently used DE variants, for instance, the DE/best/1/bin which uses the best point of the population as the base point to define the mutant point and the DE/rand-to-best/1/bin with two differential variations:where is the best point in the current population [32].

3.4. The ABC Algorithm

The ABC algorithm is an optimization algorithm based on the intelligent behavior of honeybee swarms [23]. The colony of artificial bees of size includes the employed bees, the onlooker bees, and the scout bees. The first half of the colony consists of employed bees and the second half constitutes the onlookers and scouts. The position of a food source represents a possible solution of the optimization problem and the amount of nectar in the food source gives the quality of that solution. The number of food sources, , is taken to be equal to the number of employed bees. The most important nomenclature and parameters in the ABC algorithm are shown in Table 4.


Number of bees in the colony
Number of food sources
Food source/solution
Mutant solution
“limit”Limit for abandonment

At the initial stage, a set of food source positions are randomly selected by the bees; that is, the positions are randomly generated in the search space , and their nectar amounts are determined in terms of fitness values. During the employed bee phase, new candidate food positions, called mutant solutions, are produced as follows:for , where (different from ) and are randomly chosen indexes from the sets and , respectively, and is a random number uniformly generated within the interval . If the mutant component falls outside , it is shifted to the boundaries. Afterwards, the mutant solution is compared with and a greedy selection is applied to choose the one with better nectar, that is, with better fitness, as described in (9). If the current solution is maintained (meaning that it has not been improved), its trial counter is increased.

On the other hand, during the onlooker bee phase, the food sources are randomly chosen according to probability values, (associated with the food sources), that depend on their fitness evaluated byThe mutant solution is produced from the chosen old one as previously shown in (8) and the greedy selection is applied between the current and its mutant. Finally, in the scout bee phase, if the trial counter of the solution that has not been improved mostly exceeds a predetermined value, called “limit,” that solution is abandoned and replaced by a position randomly generated in the search space [33].

3.5. The DTS Algorithm

The tabu search (TS) algorithm, introduced to continuous optimization in the paper [34], is capable of guiding the search out of local optima and exploring new regions for the global optimum. This is a point-by-point iterative procedure that maintains a list of the most recent movements, denoted by “tabu list” (TL), so that the movements that lead to solutions previously visited are avoided.

The DTS method developed in [24] is hybridization of the TS with a direct search method that aims to search the neighborhood of a local minimum. The method comprises three main search procedures: exploration, diversification, and intensification. The main loop consists of the exploration and the diversification search procedures. The exploration search aims to explore the search space and uses direct search methods to be able to stabilize the search, in particular in the vicinity of a local minimum. Cycling is prevented by the TL and by another four TS memory elements: the multiranked tabu list, the tabu region, the semitabu region, and the visited region list (VRL). The VRL works as a diversification tool and is used to direct the search towards regions that have not been visited in the search space. The most important nomenclature and parameters of the DTS algorithm are listed in Table 5.


TLTabu list
VRLVisited region list
Ratio of accepting diversification point
Edge length (with )
Region radius in VRL ()

During the exploration procedure, DTS uses an adaptive pattern search strategy to generate an approximate descent direction (ADD) for the objective function at the current approximation. First, based on an approximation , pattern directions parallel to the coordinate axes are constructed and trial points , , are generated, along these directions with a given step length. Thus, the ADD is obtained bySecond, two local trial points and are then generated along with two different step lengths aiming to explore the region along [24]. The diversification procedure aims to randomly generate a new trial point outside the previously visited regions. The VRL contains the centers of the visited regions and the frequency with which these regions are visited. If the shortest distance of to a region’s center, weighted by its frequency, exceeds the predefined region radius, given by , the point is accepted; otherwise, a new trial point is generated. When one of the best obtained trial solutions is sufficiently close to a global minimum, or its value has not been changed for a specified number of iterations, the DTS algorithm leaves the exploration and diversification search procedures and invokes an intensification procedure. Here, the HJ local search is used to compute a solution still closer to the global minimum (see the next section).

3.6. Hooke-and-Jeeves Local Search

A pattern search method directs the search towards a minimizer using a pattern of specific number points. At least points must be provided by the pattern, where is the number of variables. Based on a current approximation, the HJ method uses a pattern search strategy to define a trial solution that gives an improvement in the objective function value, when compared with the current point [25]. Let the current approximation be denoted by , where represents the iteration counter in this iterative process. Then, a trial point, , is generated along a search direction (starting from the current point) with a certain step size as follows:where is the search direction chosen from a finite set of positive spanning directions in . The most used set contains the coordinate directions, defined as the positive and negative unit coordinate vectors . The most important property of this set is that at least one of the coordinate directions is a descent direction for the objective , when is not a stationary point of . When the search fails to generate a trial point that is better than , the iteration is called unsuccessful, the step size is halved so that a refined search can be carried out, and . If falls below a given stopping tolerance, the algorithm terminates and the current is considered an approximate minimizer of . However, when at the end of each iteration the objective function value has reduced, then the iteration is called successful, is not changed, and .

For a fair comparison, and to improve the quality of the produced solutions, we propose the implementation of the HJ intensification phase with all the above-mentioned metaheuristics. Based on the final solution provided by the metaheuristic, the HJ local search algorithm is invoked and allowed to run for iterations. The number of function evaluations required by the HJ local search is added to the function evaluations of the metaheuristic, herein denoted by , to produce the total number of evaluations of the run.

4. Numerical Experiments

This section aims to present and analyze the statistical significance of the numerical results that were obtained when the metaheuristics are used to solve the DMbPE problems in the sequential direct method context. The dynamic system models were coded in the Matlab programming language and the computational application of the sequential direct method with the FA, HS, DE, ABC, and DTS codes was developed in Matlab programming environment. The computational tests were performed on a PC with a 2.2 GHz Core i7-2670QM and 8 GB of RAM. The parameter values for the tested algorithms are set as shown in Table 6. Other values have been tested, in particular for the parameter in the FA, DE, and ABC and the parameter in HS, but those used seem adequate with reduced computational effort.


FA

(initial)0.25
(constant)1
1
0.2

HS

0.95

DE

, 0.9
MutationDE/rand-to-best/1

ABC

“limit”100

DTS

1
0.1
2

Nine case study problems have been selected to analyze the performance of the described stochastic algorithms. A comparison is made considering the quality of the solutions and the time spent to reach the solution after a threshold number of function evaluations. The problems and the experimental data can be found in the Appendix.

Table 7 shows (the average of the obtained final solutions), St.D. (the standard deviations of the solutions), and (the average time in seconds) after running each instance (inst.) 10 times with each metaheuristic (MH). For these experiments, each algorithm is terminated solely with the condition , where we have tested three values of : , , and . The goal here is to analyze the quality of the obtained solutions. Since the target objective value is 0, the lower the value of the better.


Inst. MH
St.D.St.D.St.D.

α-PIN5FA
HS
DE
ABC
DTS

BEL_A2FA
HS
DE
ABC
DTS

BEL_B2FA
HS
DE
ABC
DTS

CAT3FA
HS
DE
ABC
DTS

IRR12FA