Abstract

Current methods to identify Wiener-Hammerstein systems using Best Linear Approximation (BLA) involve at least two steps. First, BLA is divided into obtaining front and back linear dynamics of the Wiener-Hammerstein model. Second, a refitting procedure of all parameters is carried out to reduce modelling errors. In this paper, a novel approach to identify Wiener-Hammerstein systems in a single step is proposed. This approach is based on a customized evolutionary algorithm (WH-EA) able to look for the best BLA split, capturing at the same time the process static nonlinearity with high precision. Furthermore, to correct possible errors in BLA estimation, the locations of poles and zeros are subtly modified within an adequate search space to allow a fine-tuning of the model. The performance of the proposed approach is analysed by using a demonstration example and a nonlinear system identification benchmark.

1. Introduction

Nonlinearities are present to a greater or lesser extent in all real processes. When nonlinearities are weak, linear models can be successfully used to forecast the evolution of variables or to design control schemes. Currently, a lot of methods to build linear models can be found in the literature [15]. However, when nonlinearities are hard, linear models just can be used in a specific operation range. If process operating range is large, cause-effect relationship should be represented by a nonlinear model. An alternative to nonlinear system modelling is to describe the process phenomena using rigorous first-principles formulation [68]; nevertheless, in most cases, it can be a very challenging task. Another alternative is the use of soft computing methods for process identification. In this framework, nonlinear system identification has attracted considerable interest of researchers over the past few years. Nowadays, nonlinear identification is an open research topic where some benchmark problems have been proposed [912] and real measurement data are available for testing and validate different nonlinear identification methods (e.g., DaISy database [13]).

One of the most challenging problems regarding nonlinear system identification is the selection of a good model structure. Currently there are several structures based on neural networks [14], block-oriented models [15, 16], Volterra series [17], NARMAX models [18], and fuzzy models [19], among others. A review of black box methods to nonlinear identification can be found in Suykens and Vandewalle [20].

In this paper, block-oriented models are considered, which are a class of nonlinear representations consisting of linear time-invariant (LTI) systems coupled with nonlinear static functions (NL) [21]. Within this class of models, the most popular ones are Wiener (LTI-NL), Hammerstein (NL-LTI), Wiener-Hammerstein (LTI-NL-LTI), and Hammerstein-Wiener (NL-LTI-NL) models [15]. Nowadays, several methods to identify these models can be found in the literature. An interesting classification of contributions that have been developed until the last decade can be found in Lopes dos Santos et al. [22].

Block-oriented models are attractive for their simplicity and great capabilities to model nonlinear dynamic systems [2328]. Specifically, Wiener-Hammerstein models have proved to be able describe several systems like a paralyzed skeletal muscle [29, 30], a limb reflex control system [31], a DC-DC converter [32], a heat exchanger system and a superheater-desuperheater in a boiler system [33], and a thermal process [34], among others [35]. Not only does the study of block-oriented models address parameters estimation, but also these structures are used to implement modern control strategies [3640].

In the context of block-oriented models, knowledge of process dynamics can be a good starting point for identification [52]. In this regard, the Best Linear Approximation or BLA of a nonlinear system [5255] can be used. For the specific case of Wiener-Hammerstein models, the BLA does not provide information about the dynamics of each LTI block. Therefore, all BLA-based Wiener-Hammerstein identification methods have concentrated their efforts on the BLA division to generate initial estimates and avoid suboptimal local minima in an optimization procedure. In Sjöberg et al. [45], both LTI subsystems are initialized with all possible BLA partitions and least squares optimization is applied to fit the nonlinearity. Although identification results are very good, the number of possible partitions (combinations) grows with the number of poles and zeros of the BLA and therefore the computational cost required for this method can be very high.

To avoid multiple BLA divisions, in Lauwers [44] and Sjöberg et al. [45] an “advanced” method is proposed where both LTI subsystems are overparameterized with all poles and zeros of the BLA. This method is formulated as a linear-in-the-parameters total-least-squares problem for which the back LTI subsystem is inverted and basis functions are used to represent both linear subsystems. To minimize the effect of overparameterization, an order reduction technique is applied. However, since the formulation is based on neglecting the effect of disturbances, the solution is in general not consistent if there is noise on the output. In addition, the BLA is required to be invertible.

Another approach to initialize Wiener-Hammerstein models is presented by Westwick and Schoukens [42], where the poles and/or zeros of the BLA are classified by using a nonlinear transformation of the input and the output residuals (quadratic/cubic BLA). On the same context of QBLA/CBLA and in line with “brute-force” method, Westwick and Schoukens [46] propose a scanning technique for a rapid evaluation of all possible BLA partitions between both LTI blocks of the Wiener-Hammerstein system. With this evaluation, the vast majority of possible partitions are discarded. Both proposals based on QBLA/CBLA show excellent results and overcome some disadvantages of “brute-force” and “advanced” methods; however, the QBLA/CBLA estimation can be difficult (high variance) since it is estimated from the BLA residuals.

In a more recent work, Schoukens et al. [41] propose a more robust method based on QBLA/CBLA. Unlike the two previous proposals, the BLA is split into a nonparametric framework. This avoids mainly the parameterization of the QBLA that can be tedious given that the number of poles and zeros tends to be high. Once the front and the back dynamics of the Wiener-Hammerstein model have been identified, a parameterization of both LTI blocks is required in an additional step. This step can be complicated because a linear phase shift can be present in the nonparametric estimate.

To avoid QBLA/CBLA estimation, in Vanbeylen [43] a fractional model parameterization based in multiplicities (powers) of poles and zeros is presented. The problem is formulated in the frequency domain and fractional exponents indicate which poles and zeros belong to each subsystem after an optimization problem is solved. Once the poles and zeros of the BLA have been classified, both LTI blocks must be parameterized in an additional step.

All methods mentioned here, each with its advantages and disadvantages, identify Wiener-Hammerstein models from the BLA. However, all require high user interaction to parameterize the LTI blocks and/or a final optimization to refit all parameters of the Wiener-Hammerstein model. In this work, we show that it is possible to obtain a good Wiener-Hammerstein model from the BLA by solving a single optimization problem, where user interaction is only required at the beginning, just configuring simple parameters of an evolutionary algorithm.

Hereinafter, this paper is organized as follows. In Section 2, Wiener-Hammerstein systems formulation is revisited together with some relevant information about the BLA. The proposed evolutionary algorithm, WH-EA, is presented and described in detail in Section 3, while its application and results on a numerical example and on the benchmark data SYSID’09 are presented in Section 4. Finally, in Section 5, some conclusions are reported.

2. Background

2.1. Wiener-Hammerstein Model

A Wiener-Hammerstein model consists of two LTI subsystems and surrounding a static nonlinear function (Figure 1). Both LTI subsystems can be represented in the discrete-time domain as rational transfer functions in factorized form:where is the discrete-time operator, , and represent the static gain, poles, and zeros of the front LTI block, respectively, and , and represent the static gain, poles, and zeros of the back LTI block, respectively.

The nonlinearity can be represented as a linear combination of a finite set of basis functions:where and are the input and output of the static nonlinearity, are weighting parameters to be estimated, and are basis functions.

From (1) and (2), the output of the Wiener-Hammerstein model is analytically related to the input through the following expression:where

The challenge is to find the best so that the predicted output is as close as possible to the measured output . Without prior knowledge of the system, this identification problem is not easy to solve, because there are some inconveniences that must be overcome:(i)Parameters , , , and are not known (i.e., the structure of the LTI blocks is unknown).(ii)The order and basis functions for nonlinearity are not known.(iii)Without adequate initial values of , it is quite possible that the optimization process, trying to find the best , gets stuck in a local minimum.(iv)Internal variables and are not measurable.

As a complement to the formulation presented, the following assumptions are made about the system.

Assumption 1. The nonlinear system to be identified can be described by (3).

Assumption 2. The Wiener-Hammerstein system will be identified from an input/output data set . The input signal is Gaussian or equivalent (see Section 2.2 for more details), while the measured output may be corrupted by stationary additive noise . It is further assumed that the noise is independent of the input excitation signal:

Assumption 3. There is no cancellation of poles and zeros and all poles of both LTI subsystems must be within the unit circle.

Assumption 4. Nonlinearity is static and its current output only depends on the current input (i.e., the nonlinearity has no memory).

2.2. The Best Linear Approximation (BLA) of a Wiener-Hammerstein System

The BLA of a nonlinear system for a given class of excitation signals is a linear model that minimizes the expected mean square error between the true output of the nonlinear system and the output of the linear model [56]:where is the input that excites the nonlinear system, is the measured output, and is the expectation operator. An alternative way to obtain the BLA of a nonlinear system is in a nonparametric framework:where is the cross-power spectrum between the output and the input and is the auto power spectral density of [54, 57].

The BLA depends on the excitation power spectrum (bandwidth and amplitude level) and excitation probability density function. Therefore, obtaining the BLA is restricted to the type of input signal that excites the process. Most estimation methods to obtaining the BLA use Gaussian noise signals or equivalent [58].

When a nonlinear system is excited with a Gaussian signal or equivalent, according to Bussgang’s theorem [59], the nonlinearity can be replaced by a constant (). Therefore, in the specific case of a Wiener-Hammerstein system, the BLA can be defined by the following expression:where represents the dynamics of the nonlinear system:

It is evident that and are the poles and zeros that must be assigned to and . Although the BLA does not provide information to distinguish the dynamics between both LTI subsystems, knowledge of the overall dynamics of a Wiener-Hammerstein system is a good starting point to identify such systems.

3. The Evolutionary Algorithm (WH-EA)

In this paper, the identification of a Wiener-Hammerstein system is addressed as an optimization problem, which is formulated considering the following issues:(i)The BLA is estimated in the first instance.(ii)The poles and zeros of the BLA must be classified to find the dynamics of the front and back of the Wiener-Hammerstein model.(iii)The pole-zero locations of the BLA can change moderately to improve modelling errors.(iv)Without loss of generality, it is possible to model a Wiener-Hammerstein system considering that both linear blocks have unit gain (gains from and are not part of the optimization problem).(v)The nonlinear static function is modelled as a piecewise function represented by a set of points in the , plane.

To explain in detail how WH-EA works, this section has been divided into three parts. First part explains how Wiener-Hammerstein model is coded for individuals in the population; in addition, the optimization problem statement is presented. Second part explains in detail the customized genetic operators developed. Finally, the third part explains the general procedure of WH-EA.

3.1. Optimization Problem Statement

Changing pole/zero locations of the BLA to improve modelling error implies new estimates around the known values. These locations for both linear subsystems are coded in a single vector as follows:where and contain the locations of the real zeros and poles, respectively, and and contain the real and imaginary parts of complex conjugate zeros, respectively, while and contain the real and imaginary parts of complex conjugate poles, respectively. The values of , , , and depend on the number of zeros and poles (real and/or complex conjugates) of the BLA.

Poles and zeros contained in (10) must be classified to obtain the dynamics of the front and back blocks of a Wiener-Hammerstein model. This classification is performed using a binary vector:

The first part of the vector, , is associated with , and indicates the zeros classification, while its second part, , is associated with , indicating the poles classification. Note that imaginary parts are not considered for classification since they are already associated with their corresponding real parts. It is assumed that if , the corresponding element of with (i.e., a real zero or a pair of complex conjugated zeros) will belong to the subsystem ; otherwise, it will belong to the subsystem . In the same way, this correspondence can be applied to classify the poles using .

For example, if a nonlinear system is approximated by a BLA with four poles, , , and , and three zeros, , , then , , , and , would be structured as , and vector should contain five elements whose values switch between zero and one as the algorithm evolves. By way of illustration if , then would have two zeros and two poles: , , , while would have a zero and two poles: , .

With respect to nonlinear static function, let us consider that it is represented by a set of points:where the pairs correspond to their coordinates in a two-dimensional - plane. The location of these points and the interpolation method used will determine the quality of the captured static nonlinearity.

The proposed evolutionary algorithm is based on stochastic population of candidate solutions (individuals). Each individual contains genetic information related to(i)the pole/zero locations in the Z-plane of the linear subsystems ,(ii)the point coordinates representing the nonlinear static function (),(iii)and the pole/zero classification for blocks and (),

such that any Wiener-Hammerstein model ((1) and (2)) can be easily described from this coded information. Recall that gains from linear blocks are assumed to be and that parameters , , , and will be implicitly optimized and they will depend on the structure of vector .

To find the best set of parameters, an optimization problem is stated based on a prediction-error method and the typical mean-squared error criterion (although any other criteria can be used in the proposed method, such as the mean absolute or maximum error criteria):where and the solution of the optimization problem is stated aswhere contains the genetic information from the best individual at the end of generations.

3.2. Genetic Operators

Customized mutation and crossover operators will be developed taking in mind the problem at hand: to identify all parameters of the Wiener-Hammerstein model in a single optimization trial. Figure 2 shows the structure of an individual as well as the genetic operators developed on each piece of genetic information. Note that and have been introduced into the formulation. Subscript represents an individual in the population, while the superscript indicates the current population.

The specific mutation and crossover operators designed are randomly selected to maintain a balance between exploration and exploitation of the search space. Mutation operations are used to maintain genetic diversity, while crossover operations allow genetic information from the best individuals to be combined and disseminated throughout the generations. Further details on how the algorithm works will be given in Section 3.3.

3.2.1. Location in the Z-Plane of Poles and Zeros

Theoretically in a Wiener-Hammerstein model, the pole-zero locations of and subsystems correspond to the pole-zero locations of the BLA; however, it is well known that once the BLA has been divided, a refit can be used to improve the modelling error. In this regard, the proposed algorithm considers that while the BLA is divided and nonlinearity is captured, the pole-zero locations can change subtly.

Both operations used on this portion of genetic information produce offspring vector , which directly inherits from its parent all the genetic information except in a gene. This gene will be selected using a random integer number and modified according to the corresponding genetic operator mutation M.1 or crossover C.1.

Mutation M.1. The selected gene is mutated to explore in an individualized way new pole-zero locations of the BLA. A new location is determined by a random number with Gaussian distribution:where . and represent the jth elements of vectors and , respectively.

The new locations for poles and zeros are explored within a search space defined by and ; therefore, , where and are the jth elements of vectors and , respectively (see Section 3.3 for more details on search space for poles and zeros).

Aggressiveness of mutations can be controlled through the standard deviation:where is the predefined number of algorithm generations; is the rate at which the standard deviation will decrease from to as the generations pass (see Figure 3); the parameter bounds the limits of the interval in which the selected gene can be moved. For this mutation, . Variation of will allow mutations to be more subtle in the last generations to achieve a fine-tuning of the corresponding parameters.

Crossover C.1. The selected gene is formed using genetic information from the parent, , combined with the corresponding genetic information from the best individual, , in the current population:where .

3.2.2. Nonlinear Static Function

As the algorithm evolves, points for nonlinear static function must be located adequately in the - plane. Here any type of interpolation can be used to capture the static nonlinearity. To achieve a good fit, mutations M.2 and M.3 plus a crossover operation are used. Both mutations used on this portion of genetic information produce offspring vector , which directly inherits from its parent all the genetic information except in two genes. This pair of genes represents the coordinates of the point that will be modified. Unlike mutation operations, the crossover operation generates offspring with a single modified gene which corresponds to the ordinate of a point. Given the correspondence between the abscissa and the ordinate of a point, for the three operations a single integer random number will allow us to select the gene(s) to be modified.

Mutation M.2. This genetic operation allows us to explore in the - plane new positions for the points. The mutation in both genes is handled by random numbers () with Gaussian distribution:with . and represent the elements of vectors and , respectively. To avoid overlapping points, bounds for mutations on the abscissa axis are set depending on the selected point to mutate () and the location of its neighbors according to(i); if ,(ii); if ,(iii); if ,

where is a user-defined parameter that indicates how close the points can be located. To achieve a good fit of the nonlinearity must be small, relative to the search space on the abscissa axis defined by and . Note that the horizontal boundaries for the endpoints are delimited by their position and their left or right neighbor, respectively. Bounds for mutations on ordinate axis are fixed and equal for all points. This allows each point to move freely throughout the search space on the ordinate axis defined by and . The vertical and horizontal bounds for a selected point are illustrated in Figure 4. When mutation M.2 is required, the selected point can be changed to a new position within the grey rectangle. Details on how to determine , , and will be given in Section 3.3.

To achieve a fine-tuning of all nonlinearity points, mutations’ aggressiveness can be controlled through standard deviation (17) as in mutation M.1. Note that is constant for all mutations over ordinate axis, while for abscissa axis mutations, can be calculated as

Mutation M.2 has great potential to explore the searching space. This genetic operation will locate points where there are slope changes. During first generations, it is useful to shape nonlinearity, while in last ones, it allows a refinement. However, when a point is located where there is a slope change, it could be kept in this location until the end of generations, especially when there are abrupt changes in the slope. Because jumps between points are not allowed with mutation M.2, when a point is kept in a place where there is a significant change of slope, one or more points would remain trapped to the left or right of it. This would lead to having redundant points in a segment that would not require so many or worse, to having a segment (curvature) that would not contain enough points. To avoid this drawback, the exploration in the search space is complemented with mutation M.3.

Mutation M.3. This genetic operation is designed to concentrate as many points as possible on the curvatures that nonlinearity can have. Therefore, it will be required that each point can be displaced on the abscissa axis by jumping one or more positions of the other points. Let us define a segment as the horizontal space between two consecutive points (so for points there will be segments); then a random integer number will indicate to which segment the selected point will move. The first half of the offspring vector is found using the following expression:with . Note that if or , the corresponding point will not make a jump but it will be located at the midpoint between its current position and the position of the point on the right or left, respectively. As in the mutation M.2 to prevent points from getting too close together, the parameter is also used in this mutation; therefore, a jump is conditioned to the space available in the selected segment to accommodate a new point. Minimum space should be . If this condition is not met, must be regenerated to randomly search for another segment.

To provide a smooth transition between adjacent segments, gene mutation corresponding to the position on the ordinates axis is performed using a quadratic interpolation. To do that, three neighboring points are required. The second half of the offspring vector is found using the following expression:with . is the -coordinate of the selected point to mute which can be found with (21); , , and are the coefficients of the quadratic polynomial defined by three adjacent points selected once the new -coordinate of the point that is mutating is known. The three adjacent points can be selected directly when a point has mutated to the first or last segment,while if the point has mutated to a nonextreme segment, the three adjacent points can be selected using the two points that define that segment plus one on its right or left. For more effective exploration, a random number is used for selection:

Figure 5 illustrates how a jump occurs with mutation M.3. Notice that the new ordinate is calculated according to the polynomial formed by the two points of the segment plus a point to the right: that is, . After a jump has occurred, an ascending reordering of the points with respect to the abscissa values is necessary.

Crossover C.2. This genetic operation works just like crossover C.1 and is applied only to vary the position of a point on the ordinate axis:with . is the element of vector , which corresponds to the individual in the current population with the best fitness value.

3.2.3. Pole-Zero Classification

Due to the stochastic nature of evolutionary algorithms, the binary values of (11) will change as the algorithm evolves, generating different structures of and . The evolution of this piece of genetic information is handled by a simple mutation operator.

Mutation M.4. Unlike the previous ones, this operator generates a new vector that depends entirely on the effects of mutation, meaning that for this piece of genetic information there is no information exchange between generations. This allows free testing of different structures for and to avoid premature convergence. When this operation is required, a random process will generate the mutation vector:with . is the element of vector . is a random number with standard uniform distribution on the open interval . Note that the structure of is built under two considerations: the LTI subsystems cannot be improper and the sum of zeros and the sum of the poles between both subsystems must be equal to the number of zeros and poles of the BLA, respectively.

3.3. WH-EA Description

WH-EA is an elitist evolutionary algorithm that evolves a population of individuals. Each individual contains three portions of genetic information related to the parameters of a Wiener-Hammerstein model . Like any other evolutionary algorithm, WH-EA is inspired by biological evolution over generations. Starting from an initial population, new generations are created using information of the current generation and performing crossover and/or mutation operations and selection based on the fitness of the new individuals. Algorithm 1 shows a pseudocode of main steps performed in WH-EA, whereas details of all the parameters that the algorithm uses are shown in WH-EA Parameters.

(1) Initialise the population;
(2) Evaluate fitness of all population;
(3) for    to MaxGen do
(4)Find  
(5)Random selection of a individual ();
(6)Compute ;
(7)if     then
(8)Compute using Algorithm 2;
(9)else
(10)Compute using Algorithm 3;
(11)end if
(12)if   then
(13)Compute using Mutation M.4;
(14)end if
(15)Update population;
(16) end for
(17) Print
(1)if   then
(2)Compute
(3)if   then
(4)Mutation M.2;
(5)else
(6)Mutation M.3;
(7)end if
(8)else
(9)Crossover C.2;
(10)end if
(1)if   then
(2)Mutation M.1;
(3)else
(4)Crossover C.1;
(5)end if

Initialize the Population. The initial population contains individuals generated within a search space delimited by lower and upper bounds. For each piece of genetic information, the lower and upper bounds can be determined from the information provided by the BLA (location of poles and zeros and static gain).

Real and imaginary values of poles and zeros from the BLA without modifications are introduced as part of the first individual of the initial population. For the rest, mutation M.1 (16) is used but fixing the parent vector as , that is, the BLA.

Therefore, mutation M.1 must be executed times under the above conditions to generate mutated versions of the BLA. Initialization and mutation M.1 are performed considering a search space delimited by lower bound and upper bound which are set around the values of the BLA:where and represent the elements of the user-defined vectors and , respectively, indicating how much the BLA’s pole/zero position can change as the algorithm evolves.

On the other hand, the initial population corresponding to two-dimensional points must be generated within a search space defined horizontally by the minimum () and maximum () amplitude of (the output of ) and vertically by the minimum () and maximum () amplitude of (the input to ). Although and are not known, and depend on the input and . Since is a linear system, the following expressions can be used:where and are the minimum and maximum values of the signal , respectively, and is a scaling factor which depends on the pole/zero locations and the static gain of . Without loss of generality, it is possible to model a Wiener-Hammerstein system considering that both linear blocks have unit gain.

Regarding and , if the input with mean enters regardless of its structure, the mean of the output signal will be equal to . This is not the case when the signal enters the nonlinear block since the gain might provide an offset to . However, since is a linear block, the mean of the output will be equal to the mean of the incoming signal . With this information, the straight line of Figure 6 can be drawn, and and can be found using the following equations:

As can be seen from (28) to (29), the search space for nonlinearity depends on the input signal, output signal, static gain of the BLA, and which is a user-defined parameter. Since both linear subsystems will be estimated with unit gain, neither of these will amplify their input signals, therefore and . For these two conditions to be met, must be less than one. In the same way, it must be observed that and . If is less than one, the first pair of conditions will always be met; however, there is no guarantee that the second pair of conditions will be met. Since it is possible to perform this check prior to the execution of the algorithm, if the second pair of conditions are not met, must be increased, but considering that it must be less than one. It should also be taken into account that if is too large, the search space will be larger than necessary, so the algorithm will cost more to estimate static nonlinearity.

To initialize this portion of genetic information, the points are uniformly distributed between and and located on the straight line shown in Figure 6. These points are introduced as part of the first individual in the population . The corresponding genetic information for the rest of individuals is generated using mutation M.2 (19), but considering that the parent vector is always .

Finally, genetic information corresponding to pole-zero classification is initialized directly using mutation M.4 (26) times.

Evaluate Fitness. Performance of each individual in the population is defined by a fitness criterion which can be calculated using (14), where is obtained from the encoded information in .

The Offspring. Once population has been initialized, for each generation a random integer number will be used to select the parent from which an offspring will be generated. As can be seen in Algorithm 1, not all genetic operators are applied at the same time to generate offspring; this can help to expand diversity and avoid premature convergence.

One or two pieces of the offspring genetic information will be randomly selected for modification according to their respective genetic operators. A random number chooses between modifying the portion related to static nonlinearity using Algorithm 2 or the portion of genetic information related to pole/zero locations using Algorithm 3. The probability for this selection is handled by the control parameter defined aswhere is the rate at which the probability will decrease from initial probability to final probability as generations pass; therefore, . If , the probability of modifying the genetic information of nonlinearity in the first generations will be high, while the probability of modifying the location of poles and zeros will be low. On the other hand, if and , in the final generations, the algorithm will modify with equal probability both portions of genetic information. The selection of these values is justified by the fact that pole/zero locations are known and they will only be fine-tuned within a suitable search space to amend possible errors in the BLA estimation, whereas nonlinearity is completely unknown, so the algorithm should focus more on this portion of genetic information during first generations.

Variation of genetic information corresponding to the classification of poles and zeros for both LTI subsystems is handled by a comparison between a random number and the probability . The value of probability is defined by the user and will be constant throughout the evolution of the algorithm. Figure 7 shows the behaviour of the control parameters (probabilities) used to select the portions of genetic information that will be modified in each generation.

Algorithm 2 is used to modify the genetic information related to nonlinear static function. The control parameter indicates the probability with which the mutation (either M.2 or M.3) or crossover C.2 will be used. Probability of selecting M.2 or M.3 is variable with respect to the generations. During first generations, mutation M.3 is not necessary, since the nonlinearity can be captured thanks to the two-dimensional points movements due to mutation M.2 and crossover C.2 operations. Since it is very likely that nonlinearity includes one or more curvatures, as the algorithm evolves mutation M.3 will be required to concentrate as many points as possible on these curvatures. The variable probability for selection between both mutations is defined bywhere is a user-defined parameter indicating the minimum probability with which the mutation M.2 can be selected. Note that according to (31) and Algorithm 2, the maximum probability is and occurs in the first generation. As the algorithm evolves, this probability will decrease linearly until it reaches in the last generation. When Algorithm 2 is required a random number will allow us to select either a mutation or crossover C.2. If a mutation is selected, a new random number will allow us to select between mutation M.2 or mutation M.3.

On the other hand, Algorithm 3 is used to modify the genetic information related to pole/zero locations using mutation M.1 or crossover C.1. The control parameter indicates the probability with which each genetic operation will be used. Since crossover C.1 causes offspring to inherit genetic information from the best individual, a small value of may lead to premature convergence, whereas a value closer to will cause the algorithm to converge very slowly. When Algorithm 3 is required, a random number will determine the genetic operation to be used.

Update. It is based on a competition between the generated offspring and the individuals of the population. The contestant with the best fitness will be the one who wins the competition. From a randomly selected individual, the offspring starts to compete until defeating an individual; when this happens the descendant will take his place in the population and the algorithm continues with the next generation. If the offspring comes to compete with all individuals and could not win, this will be discarded and the algorithm will pass to the next generation.

4. Application of WH-EA and Results

WH-EA was tested on a numerical example and on the benchmark for nonlinear system identification in (SYSID’09) [9], where a Wiener-Hammerstein system is selected as test object. The benchmark is not intended as a competition, but as a tool to compare the possibilities of different methods to deal with this specific nonlinear structure.

For both cases, the BLA was estimated with the MATLAB System Identification Toolbox [60] using a Box-Jenkins (BJ) structure. Besides, trends and means were only removed for the BLA identification. The following parameters of the algorithm were set in common for both estimates: ; ; ; ; in addition, initial and final standard deviations for mutations were set to 20 and 1, respectively.

4.1. Numerical Example

A Wiener-Hammerstein system with the following structure was designed (where tansig is the hyperbolic tangent sigmoid transfer function):

A Gaussian excitation signal of 6 dB was filtered with a cut-off frequency of 6 Hz and used as input signal. The system was simulated and 120000 input/output samples were recorded and separated in two parts: the estimation data set for identification purposes and the test data set for validation purposes (in both data sets first 1000 samples were ignored to avoid transient effects). Furthermore, additive white Gaussian noise with a Signal-to-Noise Ratio (SNR) of 45.32 dB was added to the output.

The identification of the BLA was carried out and the model obtained was expressed in factored form:

The root mean square of the error (eRMS) obtained with this linear model on test data was of 0.0414.

According to the BLA structure, vector was coded with , , , and as follows:

The search space for nonlinear function was defined with , while elements of and were set to 0.01 (except bounds for the zero that were set to 0.1, since it influences slightly the dynamics and should have freedom of movement during tuning).

WH-EA was executed 3 times with and different number of points was chosen for the nonlinearity. For all trials, the algorithm was initialized with 60000 individuals and the minimum distance between two points was set to . For each estimated Wiener-Hammerstein model, the eRMS on the test data was computed; in addition, the normalized root mean square error (NRMSE) criterion was used to quantify the goodness of fit between real and captured nonlinearity. The results are reported in Table 1; for all cases piecewise linear interpolation was used to connect the points.

The poles and zeros of the BLA were correctly classified in the three tests carried out. As can be seen in Table 1, the eRMS of the Wiener-Hammerstein models decreased as the quality of the captured nonlinearity increases. A reasonable model was scored with 12 points considering that the RMS noise was . A convergence graph for this model is shown in Figure 8, during WH-EA execution, at th generation the poles and zeros of the BLA were correctly classified, and from there the best individual of each generation conserved the genetic information for this classification. Since the noise RMS is known, at the performance of the model was good enough, so the algorithm could have been stopped. Anyway, generations have been allowed in order to demonstrate the great precision that the algorithm can achieve.

In Figure 9, pole/zero locations of the BLA, the obtained Wiener-Hammerstein model, and real system are compared. Notice how WH-EA has moved initial locations trying to get to the true values improving modelling error.

On the other hand, a graphical comparison between real and captured nonlinearity is shown in Figure 10. Linear subsystems of the estimated Wiener-Hammerstein model are represented by (35), while the ordered pairs for the nonlinear static function are shown in Table 2.

4.2. Nonlinear System Identification Benchmark

The system to be modelled is an electronic nonlinear circuit with a Wiener-Hammerstein structure (see Figure 11). This system was built by Vandersteen [61] and presented as a benchmark problem for system identification by Schoukens et al. [9].

The first linear dynamic system is designed as a third-order Chebyshev filter (pass-band ripple of 0.5 dB and cut-off frequency of 4.4 kHz). The second linear dynamic system is a third-order inverse Chebyshev filter (stop-band attenuation of 40 dB starting at 5 kHz). This system has a transmission zero in the frequency band of interest. This can complicate the identification significantly, because the inversion of such a characteristic is difficult. The system was excited with a filtered Gaussian signal (cut-off frequency 10 kHz). Data used for estimation corresponds to interval , whereas test data corresponds to the remaining part . In order to analyse the performance of estimation methods, the mean value of the simulation error (), the standard deviation of the error (), and the root mean square value of the error (eRMS) must be calculated on test and estimation data [9].

Since first 5000 data samples just contain quantization noise, a set of 95000 input/output data was used to estimate the BLA. Multiple simulations were performed considering different combinations of poles and zeros for the input/output model and for the noise model. For each BJ model, the eRMS on test data set was computed. The BLA was obtained with 6 poles, 5 zeros, and one sample delay for the input/output model and 3 zeros and 3 poles for the noise model. The BLA is fully described with and the pole-zero pattern shown in Figure 12. The eRMS of this linear model was of 56.159 mV on test data and 43.143 mV after removing trends and means. According to the BLA structure, vector was coded with , , , and as follows:

During BLA estimation stage, different noise models were tested and it was observed that all poles, real zeros within the unitary circle, and complex zeros where they are located correspond to the dominant dynamics of the system, while real zeros outside the unitary circle were more likely to vary their location. This information was used to define the search space for refining the location of poles and zeros:

Bounds (37) limit search space in the system dominant dynamics within , while for and , limits are between and , respectively.

Static nonlinearity is represented by piecewise linear functions with points. Its search space was defined with and the minimum distance between two points was calculated with . The algorithm was initialized with 5000 individuals and generations were executed. The performance of the estimated Wiener-Hammerstein model is shown in Table 3. In Figure 13, it is depicted how the algorithm has distributed pole/zero locations for both linear subsystems. Notice how some of them were displaced to improve the modelling error. The outputs of the estimated Wiener-Hammerstein model, the linear model error, and the nonlinear model error on test data are shown in Figure 14, while the DFT spectra of this signals are shown in Figure 15. Captured nonlinearity is plotted in Figure 16.

Final estimated linear blocks and are shown in (38) and (39) respectively, while coordinates for nonlinear function are shown in Table 4. The estimated Wiener-Hammerstein model contains 26 parameters of which 14 are used to represent the static nonlinearity (without end points since they can be located anywhere on their respective end segments; nevertheless, these segments slopes are taken into account). Figure 16 shows how mutation M.2 and mutation M.3 located the two-dimensional points to capture the nonlinearity. As expected, due to the effect of mutation M.3, most of them were concentrated on the curvature.

4.3. Discussion

In contrast to other methods which generate good initial estimates by splitting the poles and zeros of the BLA, WH-EA allows us to identify Wiener-Hammerstein models avoiding high user interaction which is an advantage compared to methods using QBLA, where at least two intermediate procedures are required before fine-tuning all parameters of the Wiener-Hammerstein model.

The eRMS of  mV achieved with WH-EA on test data is quite acceptable considering that the RMS of the quantization noise is  mV. With respect to the initial model (BLA), the error was reduced by a factor of thanks to the captured nonlinearity and the updated pole/zero locations. Table 5 shows other proposals that have been tested on the benchmark. It can be appreciated that the eRMS of this paper is slightly higher than others; however, not all estimated models have the same complexity. Some of them use complex models with a greater number of parameters processing raw data before identification, while in this work, WH-EA is fed raw input/output data without preprocessing operations.

Comparing WH-EA with the proposals of Westwick and Schoukens [42] and Vanbeylen [43] whose models have the same complexity as the model estimated in this paper, the results are quite similar; however, to obtain a good final model with these two proposals, it is required that the BLA be estimated with high precision. In Vanbeylen [43] at the BLA division phase, a false position of a pole or zero could cause the values of the fractional powers to be close to which would cause the user to make a bad decision and the BLA is badly divided. This problem is much more critical in Westwick and Schoukens [42] since the method is based on a graphical comparison between the poles and zeros of the BLA and the poles and zeros of the QBLA. With WH-EA, this problem does not occur, since the evolutionary algorithm contemplates possible errors that can be made in the estimation of the BLA. During the algorithm evolution, the binary code used for the classification of the poles and zeros of the BLA can be changed without user interaction as the false positions of the poles and zeros are corrected. This is an important advantage of WH-EA, since it is very likely that the BLA estimate is subject to errors due to noise and nonlinearity effects; this has been experimentally demonstrated; for this reason, many proposals carry out a final readjustment of the parameters of the Wiener-Hammerstein model.

About the computational complexity, an iteration in WH-EA involves the random selection of an individual from the population, the offspring generation according to the genetic operations indicated in Section 3, the calculation of the objective function of the offspring, and the population update. This algorithm was implemented on MATLAB® software and was run on a personal computer with core i7 processor of 2,6 GHz and 16 Gb of RAM. The objective function was easily implemented using the filter command to simulate both linear subsystems, while the nonlinearity was interpolated using the interp1 command. For the benchmark system, the average time consumed per 50 iterations was 1.16 s.

5. Conclusions

In this paper, a new method (WH-EA) to identify Wiener-Hammerstein systems in a single step is proposed. The proposal estimates all parameters of the Wiener-Hammerstein model based on a customized evolutionary algorithm (WH-EA). Unlike conventional procedures, WH-EA is able to look for the best BLA split capturing at the same time the process static nonlinearity with high precision, solving a single optimization problem. The algorithm is fed with the estimated BLA and its pole/zero locations are subtly modified within an adequate search space to allow its fine-tuning, while piecewise linear function is used for the nonlinear block. The performance of this approach has been evaluated through a numerical example with a complex static nonlinearity and through the well-known benchmark data (SYSID’09). The results show that it is possible, using WH-EA, to identify a Wiener-Hammerstein system with a good precision in a parametric framework avoiding high user interaction and drawbacks involved in using the QBLA. Further research will be related to WH-EA extension for nonlinear multivariable systems by using a multiobjective optimization approach.

WH-EA Parameters

nc:Number of pairs of complex conjugate zeros of the BLA
nr:Number of real zeros of the BLA
mc:Number of pairs of complex conjugate poles of the BLA
mr:Number of real poles of the BLA
n:Number of points to represent nonlinearity
:Static gain of the BLA
, :Initial and final standard deviations for control of the aggressiveness of mutations M.1 and M.2 ()
:Rate with which the standard deviation    decreases from to
:Space over which a gene can move
:Minimum distance between two points on the abscissa axis for mutations M.2 and M.3
:Scale factor to define the search space for static nonlinearity
MaxGen:Generations number
NP:Population size
:Control parameter for selection between mutation M.1 or crossover C.1
:Control parameter for selection between mutation (either M.2 or M.3) or crossover C.2
:Variable probability for selection between mutation M.2 or mutation M.3
:Minimum probability for selection of mutation M.2. The maximum probability for this selection is 1
:Variable probability to choose between modifying static nonlinearity or location of poles and zeros
:Rate at which will decrease from initial probability to final probability
:Probability to modify the genetic information related to the classification of poles and zeros.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was partially supported by the Spanish Ministry of Economy and Competitiveness (Project DPI2015-71443-R) and Salesian Polytechnic University of Ecuador through a Ph.D. scholarship granted to the first author.