Advances in Artificial Intelligence

Volume 2015, Article ID 184318, 10 pages

http://dx.doi.org/10.1155/2015/184318

## Wavelet Network: Online Sequential Extreme Learning Machine for Nonlinear Dynamic Systems Identification

^{1}Department of Computer Science, Kirkuk University, Kirkuk, Iraq

^{2}Department of Electrical and Electronic Engineering, Universiti Putra Malaysia (UPM), Serdang, Selangor 43300, Malaysia

Received 6 May 2015; Revised 29 July 2015; Accepted 31 August 2015

Academic Editor: Jun He

Copyright © 2015 Dhiadeen Mohammed Salih et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

A single hidden layer feedforward neural network (SLFN) trained with the online sequential extreme learning machine (OSELM) algorithm has been applied successfully to many regression problems. However, using an SLFN with OSELM as a black box for nonlinear system identification may produce models of the identified plant whose responses are inconsistent from a control perspective. The cause can be attributed to the random initialization of the SLFN hidden node parameters in the OSELM algorithm. In this paper, a single hidden layer feedforward wavelet network (WN) is introduced with OSELM for nonlinear system identification, aiming at better generalization performance by reducing the effect of the random initialization procedure.

#### 1. Introduction

Huang et al. presented the extreme learning machine (ELM) algorithm for single hidden layer feedforward neural networks (SLFNs) in [1]. It has attracted the interest of many researchers because it offers significant advantages such as better generalization performance, fast learning speed, ease of implementation, and minimal human intervention [2]. The main principle of ELM is to initialize the SLFN input weights, biases, and activation function parameters randomly, with hidden nodes that are either additive or radial basis functions (RBFs), and to determine the output weights analytically in a single step using a least squares solution.

However, the ELM algorithm is based on a fixed network structure: once the hidden node activation function parameters (e.g., the RBF centers and impact factors) are initialized, they are not tuned during the learning phase. Therefore, random initialization of the SLFN hidden node parameters may affect the modeling performance [3]. Improving the SLFN may then require a more complex network, which may lead to ill-conditioning, meaning that an ELM may not be robust enough to capture variations in the data [4].

Wavelet networks (WNs) have been applied successfully in many applications related to system identification and black-box modelling of nonlinear dynamic systems with different batch training algorithms [5–7]. The main advantage of the WN is that the hidden node (wavelet activation function) parameters can be initialized analytically [8]. Moreover, the localization of wavelet decompositions in both the time and frequency domains allows better generalization in nonlinear system identification problems [7].

The similarity between wavelet decomposition theory and single hidden layer feedforward neural networks (SLFNs) inspired Cao et al. [9, 10] to propose the composite function wavelet neural network (CFWNN) structure, to be used with ELM for different applications. The wavelet function parameters, namely, translation and dilation, are initialized from the input data, taking into account the domain of the input space. Results on several real-world benchmark data sets showed that the proposed CFWNN method can achieve better performance.

More recently, a new WN structure with dual wavelet activation functions in the hidden layer nodes was introduced by Javed et al. [3] with the ELM algorithm and called the summation wavelet extreme learning machine (SW-ELM). The proposed SW-ELM showed good accuracy and generalization performance by reducing the impact of the random initialization procedure: the wavelets and other hidden node parameters are adjusted prior to learning.

For many industrial applications, online sequential learning algorithms are preferred over batch learning algorithms. For these, an online sequential extreme learning machine (OSELM) algorithm has been introduced for SLFNs (NN-OSELM) with additive and RBF hidden node functions; it showed better generalization performance and faster training than other well-known sequential learning algorithms [11, 12]. However, NN-OSELM has not yet been verified on nonlinear system identification problems for control applications. The authors in [13] stated that if an RBF network is trained from random initial weights for each subset, it could converge to a different minimum, corresponding to a different set of weights. This random initialization procedure may cause unacceptable learning behaviour in system identification applications and may lead to a different open loop response of the identified system regardless of the modeling accuracy.

To overcome these differences, the RBF can be replaced by a wavelet function, based on the fact that wavelets can control the order of approximation and the regularity through some of their key mathematical properties and their explicit analytical initialization form [13, 14]. In this regard, the authors in [15] introduced a feedforward WN with the OSELM algorithm (WN-OSELM) for nonlinear system identification, where the initialization of the wavelet activation function parameters played a major role in ensuring faster and better learning performance than NN-OSELM. However, that WN-OSELM method was based on fixed numbers of input features and hidden nodes, which is not necessarily optimal.

In this paper, a feedforward WN framework is introduced with OSELM (WN-OSELM) to limit the impact of random initialization. The SLFN hidden node parameters are initialized using a density function with a recursive algorithm [16], and the input weights and biases are initialized using the Nguyen-Widrow approach [17]. Moreover, the optimal input features and hidden nodes are selected using the sequential forward search approach (SQFS) [18] and the final prediction error criterion (FPEC) [19], respectively.

The simulations are carried out on three nonlinear systems and take into account the optimum number of hidden nodes and input features for both the WN and the NN to ensure the generalization property.

#### 2. Preliminaries

In this section, a brief review of the traditional wavelet neural networks and the NN-OSELM is presented.

##### 2.1. Wavelet Network

The standard form of a WN with one output is

$$\hat{y}(\mathbf{x}) = \sum_{i=1}^{N} w_i\, \Psi_i(\mathbf{x}), \tag{1}$$

where $\Psi_i$ refers to the activation function of hidden node $i$, $i = 1, \ldots, N$, with $N \in \mathbb{Z}$, where $\mathbb{Z}$ represents the set of all integers. The wavelet activation functions can be any wavelet mother functions (e.g., Morlet, Mexican-hat, or the first derivative of the Gaussian function). The symbol $w_i$ refers to the connection weights between the hidden nodes and the output node, while $\mathbf{x} = [x_1, x_2, \ldots, x_n]$ is the input vector applied to the $n$ input nodes. The multivariate wavelet mother function in the hidden nodes can be determined by the product of $n$ wavelet mother functions as follows:

$$\Psi_i(\mathbf{x}) = \prod_{k=1}^{n} \psi\left(\frac{x_k - t_{ik}}{d_{ik}}\right). \tag{2}$$

The wavelet function parameters $t_{ik}$ and $d_{ik}$ are called the translation and dilation (scale) parameters, respectively. The translation parameter determines the central position of the wavelet, whereas the dilation parameter controls the spread of the wave. These parameters can be defined from the available data set [14].
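The forward pass of such a network can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the Mexican-hat wavelet is taken in the common unnormalized form $(1 - z^2)e^{-z^2/2}$, and the function names (`mexican_hat`, `wavelet_layer`, `wn_output`) and array names (`t` for translations, `d` for dilations, `w` for output weights) are chosen here for the example and do not come from the paper.

```python
import numpy as np

def mexican_hat(z):
    """Mexican-hat mother wavelet in unnormalized form (an assumption)."""
    return (1.0 - z**2) * np.exp(-0.5 * z**2)

def wavelet_layer(X, t, d):
    """Hidden-layer outputs of a product-form wavelet network.

    X: (M, n) input samples; t, d: (N, n) translations and dilations.
    Returns an (M, N) matrix; entry (j, i) is the product over the n
    input dimensions of psi((x_jk - t_ik) / d_ik).
    """
    Z = (X[:, None, :] - t[None, :, :]) / d[None, :, :]  # shape (M, N, n)
    return np.prod(mexican_hat(Z), axis=2)

def wn_output(X, t, d, w):
    """Single-output network: weighted sum of the hidden-node outputs."""
    return wavelet_layer(X, t, d) @ w

# Small hypothetical example: 2 samples, 2 inputs, 3 hidden nodes.
X = np.array([[0.0, 0.0], [1.0, 0.5]])
t = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5]])
d = np.ones((3, 2))
w = np.array([0.5, -1.0, 2.0])
H = wavelet_layer(X, t, d)
y = wn_output(X, t, d, w)
```

Because each hidden node multiplies one-dimensional wavelets, a node responds only when every input coordinate lies near its translation, which is the localization property the text refers to.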

##### 2.2. NN-OSELM

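For single-layer feedforward neural networks with hidden nodes governed by additive sigmoid or RBF activation functions, and based on the theory of the ELM algorithm [20], the online sequential extreme learning machine (OSELM) was developed in [12] as a unified way to deal with industrial applications in which the training data arrive one by one or chunk by chunk. To explain the principles of OSELM, suppose an SLFN with $N$ hidden nodes and RBF activation functions is expressed as

$$\hat{y}(\mathbf{x}) = \sum_{i=1}^{N} \beta_i\, \phi\left(\lVert \mathbf{x} - \mathbf{c}_i \rVert\right), \tag{3}$$

where $\phi$ is a Gaussian-type radial basis function, $\mathbf{c}_i$ is the center vector of neuron $i$, and $\beta_i$ are the output weights. Now, for a number of input/output training samples $M$ that equals the number of SLFN hidden nodes (i.e., $M = N$), if the input weights, biases, and hidden node parameters are randomly assigned and independent of the training data, a simple analytic way to calculate the estimated output weights $\hat{\beta}$ is to apply the least squares solution to the cost function

$$\min_{\beta} \lVert H\beta - Y \rVert, \tag{4}$$

where $H$ is the SLFN hidden layer output matrix and $Y$ is the real output. Here, the estimated output weights can be determined in a single step by inverting $H$, so that zero training error is achieved. However, when the number of training samples $M$ is greater than the number of hidden nodes $N$, the estimated weights can be obtained using a pseudoinverse of $H$, giving a small nonzero training error $\varepsilon$,

$$\hat{\beta} = H^{\dagger} Y, \tag{5}$$

where $H^{\dagger}$ is the pseudoinverse of $H$. The solution given by (5) can be rewritten as [21]

$$\hat{\beta} = \left(H^{T}H\right)^{-1}H^{T}Y. \tag{6}$$

Based on the above, the learning procedure of the OSELM algorithm consists of two phases, namely, the initial phase and the sequential learning phase. In the initial phase, suppose an initial chunk of training samples $\{(\mathbf{x}_j, y_j)\}_{j=1}^{M_0}$ has arrived such that $M_0 \ge N$; then the estimated weights can be found by

$$\beta^{(0)} = P_0 H_0^{T} Y_0, \tag{7}$$

where

$$P_0 = \left(H_0^{T} H_0\right)^{-1}, \tag{8}$$

$H_0$ and $Y_0$ are the hidden layer output matrix and the real output for the initial chunk, and $\mathbf{x}_j$ is the input vector applied to the input nodes.

Now, for the sequential phase, suppose another chunk of $M_1$ samples $\{(\mathbf{x}_j, y_j)\}_{j=M_0+1}^{M_0+M_1}$ has arrived; then the minimization problem becomes

$$\min_{\beta} \left\lVert \begin{bmatrix} H_0 \\ H_1 \end{bmatrix}\beta - \begin{bmatrix} Y_0 \\ Y_1 \end{bmatrix} \right\rVert, \tag{9}$$

where

$$H_1 = \begin{bmatrix} \phi\left(\lVert \mathbf{x}_{M_0+1} - \mathbf{c}_1 \rVert\right) & \cdots & \phi\left(\lVert \mathbf{x}_{M_0+1} - \mathbf{c}_N \rVert\right) \\ \vdots & \ddots & \vdots \\ \phi\left(\lVert \mathbf{x}_{M_0+M_1} - \mathbf{c}_1 \rVert\right) & \cdots & \phi\left(\lVert \mathbf{x}_{M_0+M_1} - \mathbf{c}_N \rVert\right) \end{bmatrix}, \qquad Y_1 = \begin{bmatrix} y_{M_0+1} \\ \vdots \\ y_{M_0+M_1} \end{bmatrix}, \tag{10}$$

and then

$$\beta^{(1)} = K_1^{-1}\begin{bmatrix} H_0 \\ H_1 \end{bmatrix}^{T}\begin{bmatrix} Y_0 \\ Y_1 \end{bmatrix}, \tag{11}$$

where

$$K_1 = \begin{bmatrix} H_0 \\ H_1 \end{bmatrix}^{T}\begin{bmatrix} H_0 \\ H_1 \end{bmatrix} = P_0^{-1} + H_1^{T}H_1. \tag{12}$$

To express $\beta^{(1)}$ in terms of $\beta^{(0)}$, the detailed derivation can be found in [12], and the result is

$$\beta^{(1)} = \beta^{(0)} + K_1^{-1}H_1^{T}\left(Y_1 - H_1\beta^{(0)}\right). \tag{13}$$

For the subsequent chunks of data samples, the recursive least squares solution is applicable, and the previous arguments can be generalized to $\beta^{(k+1)}$. Suppose the $(k+1)$th chunk of data has arrived, where $M_{k+1}$ is the number of samples in that chunk; then a recursive algorithm for updating $\beta$ can be written as follows:

$$K_{k+1} = K_k + H_{k+1}^{T}H_{k+1}, \qquad \beta^{(k+1)} = \beta^{(k)} + K_{k+1}^{-1}H_{k+1}^{T}\left(Y_{k+1} - H_{k+1}\beta^{(k)}\right). \tag{14}$$

With $P_{k+1} = K_{k+1}^{-1}$, the matrix inversion lemma gives

$$P_{k+1} = P_k - P_k H_{k+1}^{T}\left(I + H_{k+1}P_kH_{k+1}^{T}\right)^{-1}H_{k+1}P_k, \tag{15}$$

and (14) will be

$$\beta^{(k+1)} = \beta^{(k)} + P_{k+1}H_{k+1}^{T}\left(Y_{k+1} - H_{k+1}\beta^{(k)}\right). \tag{16}$$

Finally, the OSELM algorithm for SLFN can be summarized in the following steps:

(1) Initialize the network parameters (input weights, biases, and hidden node parameters) randomly.

(2) Determine $P_0$ and $\beta^{(0)}$ using (8) and (7) for the initial chunk of samples $\{(\mathbf{x}_j, y_j)\}_{j=1}^{M_0}$.

(3) Set $k = 0$.

(4) Determine $P_{k+1}$ and $\beta^{(k+1)}$ using (15) and (16) for the next sample data.

(5) Set $P_k = P_{k+1}$, $\beta^{(k)} = \beta^{(k+1)}$, and $k = k + 1$.

(6) Go to Step (4) until all training data are used.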
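The two phases summarized in the steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses the standard recursive least squares form of OSELM, the names `rbf_hidden`, `OSELM`, `init_phase`, and `update` are hypothetical, and the RBF centers and widths are fixed on a grid (rather than drawn at random) purely so that the example is reproducible.

```python
import numpy as np

def rbf_hidden(X, centers, widths):
    """Gaussian RBF hidden-layer output matrix of shape (M, N)."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * widths[None, :] ** 2))

class OSELM:
    """Sketch of OSELM with fixed hidden-node parameters (assumed names)."""

    def __init__(self, centers, widths):
        self.centers, self.widths = centers, widths
        self.P = None      # inverse of the accumulated H^T H
        self.beta = None   # output weights

    def init_phase(self, X0, Y0):
        # Initial phase: batch least squares on the first chunk (M0 >= N).
        H0 = rbf_hidden(X0, self.centers, self.widths)
        self.P = np.linalg.inv(H0.T @ H0)
        self.beta = self.P @ H0.T @ Y0

    def update(self, X1, Y1):
        # Sequential phase: update P by the matrix inversion lemma,
        # then correct beta using the prediction error on the new chunk.
        H = rbf_hidden(X1, self.centers, self.widths)
        S = np.eye(H.shape[0]) + H @ self.P @ H.T
        self.P = self.P - self.P @ H.T @ np.linalg.solve(S, H @ self.P)
        self.beta = self.beta + self.P @ H.T @ (Y1 - H @ self.beta)

    def predict(self, X):
        return rbf_hidden(X, self.centers, self.widths) @ self.beta

# Hypothetical data: learn y = sin(3x) from chunks of 10 samples.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(60, 1))
Y = np.sin(3.0 * X[:, 0])
centers = np.linspace(-1.0, 1.0, 5)[:, None]
widths = np.full(5, 0.3)

model = OSELM(centers, widths)
model.init_phase(X[:20], Y[:20])
for start in range(20, 60, 10):
    model.update(X[start:start + 10], Y[start:start + 10])

# Batch least squares on all data, for comparison.
H_all = rbf_hidden(X, centers, widths)
beta_batch = np.linalg.lstsq(H_all, Y, rcond=None)[0]
```

With the initial phase solved exactly, the sequentially updated weights should agree with the batch least squares solution over all samples up to rounding error. This is why the hidden node parameters, which are never retrained, determine the quality of the final model.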

#### 3. Nonlinear System Identification Using WN-OSELM

Many nonlinear systems can be represented using the nonlinear autoregressive with exogenous inputs (NARX) model. Taking single-input-single-output systems as an example, this can be expressed by the following nonlinear difference equation:

$$y(t) = f\big(y(t-1), \ldots, y(t-n_y),\, u(t-1), \ldots, u(t-n_u)\big) + e(t), \tag{17}$$

where $y(t)$ and $u(t)$ are the system output and input, $f$ is an unknown nonlinear mapping function, $n_y$ is the number of lagged feedback predictor sequences (output), $n_u$ is the number of lagged feedforward predictors (input), and $e(t)$ is the fitting residual, assumed to be bounded and uncorrelated with the inputs. Several methods can be applied to realize the NARX model, including polynomials, neural networks, and wavelet networks [22]. In this work, the proposed wavelet network with NARX, called the series-parallel structure, is shown in Figure 1, where the real plant outputs are assumed to be available and can be used for prediction so that the stability of the network is guaranteed [23].
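As an illustration of the series-parallel arrangement, the sketch below builds the regressor matrix from measured plant outputs and inputs. The function name `narx_regressors` and the argument names `n_u` and `n_y` (the numbers of input and output lags) are assumptions made for this example.

```python
import numpy as np

def narx_regressors(u, y, n_u, n_y):
    """Build series-parallel NARX regressors from measured data.

    The row for time t is [y(t-1), ..., y(t-n_y), u(t-1), ..., u(t-n_u)]
    and the corresponding target is the measured y(t).
    """
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    start = max(n_u, n_y)  # first time index with a full set of lags
    rows = []
    for t in range(start, len(y)):
        past_y = y[t - n_y:t][::-1]  # most recent output first
        past_u = u[t - n_u:t][::-1]  # most recent input first
        rows.append(np.concatenate([past_y, past_u]))
    return np.array(rows), y[start:]

# Hypothetical short record with two output lags and one input lag.
u = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 20.0, 30.0, 40.0, 50.0]
X_reg, targets = narx_regressors(u, y, n_u=1, n_y=2)
```

Each row of `X_reg` can then be fed to the network as its input vector. Because the lagged outputs are measured values rather than the model's own predictions, prediction errors are not fed back through the network.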