#### Abstract

Due to the strong nonlinearity and transition dynamics between different operating points of the high purity distillation column process, it is difficult to use a single model for modeling such a process. Therefore, the multiple model based approach is introduced for modeling the high purity distillation column plant under the framework of the expectation maximization (EM) algorithm. In this paper, autoregressive exogenous (ARX) models are adopted to construct the local models of this chemical process at different operating points, and the EM algorithm is used for identification of local models as well as the probability that each local model takes effect. The global model is obtained by aggregating the local models using an exponential weighting function. Finally, the simulation performed on the high purity distillation column demonstrates the effectiveness of the proposed method.

#### 1. Introduction

The increasing complexity of the process control systems poses a great challenge for process modeling and parameter identification and control. In practice, most of the industrial processes show a strong nonlinearity and a dynamic nature. Therefore, it is of interest for both academic researchers and industrial practitioners to study how to model these systems and to achieve satisfactory or even optimal control performance. Linear modeling technologies have been quite sophisticated after decades of development; however, due to the inborn nonlinearity of some production processes, the performance of single linear model based controllers or optimizers may be compromised or even unsatisfactory [1]. To overcome the limitations imposed by the nonlinearity of the process, various strategies have been developed for nonlinear process modeling. Some of the popular modeling approaches have been proposed and applied in industrial processes, such as artificial neural networks [2, 3], support vector regression and its extension least squares support vector regression [4], Gaussian process regression [5], and hybrid methods [6]. The obvious feature of these methods is that they do not need any prior knowledge about the system. However, these methods do not always capture the process dynamics well and may lose validity when the system transits among a large range of operation.

The multiple model approach is a good compromise which produces complex models or controllers by piecing a number of simpler subsystems together. Some multiple modeling strategies and applications have been investigated because they are able to represent complex nonlinear processes using the linearization technique [7]. The terminology of linear parameter varying (LPV) modeling method, which is featured by its linear structure and varying model parameters, was introduced by Shamma and Athans [8, 9]. The work on identification of multimodel LPV models can be found in much literature. For example, Zhu and Xu proposed an LPV model based on blended linear models and added weights at the input side [10, 11]. The identification of LPV models in an input-output setting with Box-Jenkins (BJ) model structure was addressed by Zhao et al. and prediction error method was employed for parameter interpolation [12]. Huang et al. studied the work of the multiple LPV model identification using several weighting functions with two scheduling variables [13]. The multiple model based recursive least square parameter identification was illustrated by Chen and Liu [14]. Jin et al. proposed a multiple model LPV approach for nonlinear systems under the framework of the expectation maximization (EM) algorithm [15]. It is an important step on LPV identification in which the model identity is treated as the missing variable in the EM scheme. Based on the EM algorithm, the local models are synthesized with process data to obtain a global model for approximating a nonlinear process. Deng and Huang extended the identification method to nonlinear parameter varying systems in which the local model takes the form of state space equation. The identification problem is solved by using the particle filter based on EM algorithm [16]. The efficiency is illustrated on numerical examples and a pilot-scale hybrid water tank. Chen et al. continued to study the uncertain scheduling variable problem in multiple model based nonlinear identification. Both the linear and the nonlinear dynamics of scheduling variables are taken into account [17].

The distillation column is considered as one of the most important unit operations in chemical engineering [18, 19]. It becomes a benchmark problem in chemical process control for nonlinear modeling technology. The difficulties in modeling and controlling distillation columns lie in their highly nonlinear characteristics. Modeling of distillation columns may be classified into three groups: fundamental modeling, empirical modeling, and hybrid modeling [18]. In this paper, a hybrid solution based on the multiple models is adopted, which combines the two main advantages of linear and nonlinear models. The distillation column model is a nonlinear global model but can be represented as a reunion of linear models, one linear model for each operating point (local model or submodel), where each local model works at different operating point and can be identified through the iteration procedure of the EM algorithm. Finally, the global nonlinear model can be obtained by using the exponential function which takes full account of each operating point.

The remainder of this paper is organized as follows: the high purity distillation column used to test the multiple model method is described in Section 2. Section 3 lays out the mathematical formulation of the multiple model approach in which the ARX model is used as the local model. In Section 4, the identification procedure under the EM algorithm is provided. The simulation on the high purity distillation column with multiple inputs is given in Section 5. Finally, conclusions are given in Section 6.

#### 2. Description of a High Purity Distillation Column

Distillation is simply defined as a process in which a liquid or vapor mixture of two or more substances is separated into its component fractions of desired purity, by the application and removal of heat [20]. The process is based on the fact that the vapor of a boiling mixture will be richer in the components that have lower boiling points. In most cases, the distillation is operated at a continuous steady state. New feed is always being added to the distillation column and products are always being removed. A high purity distillation column is strongly nonlinear, and any realistic study should take this into account [21, 22]. Distillation column has become a favorite subject in the process control engineering field, including the areas of soft-sensor, process modeling, and control. There are a variety of configurations for distillation columns, and each performs specific types of separations [23].

A typical two-product distillation column on a pilot plant scale is shown in Figure 1. Some important notation is summarized in Table 1. It is a nonlinear model of a distillation column with theoretical stages including a reboiler plus a total condenser [24]. In the model, the following assumptions are made:(1)two components (binary separation);(2)one feed and two products;(3)constant relative volatility;(4)constant molar flows (same vapor flow on all stages);(5)no vapor holdup;(6)total condenser.

#### 3. Problem Formulation Using Multiple Model Method

The high purity distillation column as mentioned above is a classic chemical production process; the plant may transit among several operating points. A bank of ARX models is employed to approximate the nonlinear dynamic process around each operating point. Consider the following ARX model [15, 17]: where is the output. is the Gaussian noise with zero mean and variance added to the ARX model. is the regressor vector, where is the input of the nonlinear system, and are the orders of the output and input, and . Many industrial processes are often operated in certain “orderly” ways to meet different production objectives. Such orderly ways are also referred to as operating trajectory consisting of several predesigned operating points which are defined as the scheduling variable, . Here we assume that different operating points are known a priori, corresponding to a certain , namely, .

Around the small region of each of the operation points, the ARX local model can be used to approximate the process dynamics. Here parameter is used to represent the th local model parameters. Let denote the sequence of input data set and let denote the observed output data set . Here a hidden variable is introduced to stand for the identity of each local model that takes effect at sampling time point . So the parameters to be estimated for each local model are . Given all of the past observed data information, the probability of the output can be computed as

Based on the fact that the observed output at th sampling time instant directly depends on the regressor vector as well as the data identity of , (3) can be further written as

Here is the normalized weight standing for the probability that the th submodel takes effect at time . An exponential function is adopted as where is the scheduling variable at the sampling time and is defined as the validity width of the th local model which is limited between (the lower bound for ) and (the upper bound for ).

The missing data set is the random variable denoting the identity of each local model, namely, . The values of input variables, output variables, and the scheduling variable are observed or given a priori, so . The parameters to be estimated from process input and output data under the framework of the EM algorithm are .

#### 4. The ARX Model Identification under the EM Algorithm

##### 4.1. The EM Algorithm Review

The EM algorithm is an efficient iterative procedure for maximum likelihood estimation in the presence of missing data. The main strength of the EM algorithm lies in its ability to handle missing variables in the identification data set [25]. After being first introduced by Raghavan et al., it has found extensive applications in various areas for finding maximum likelihood estimates of parameters in probabilistic models [26]. In the EM procedure, both the complete data log likelihood and the conditional predictive distribution of missing data are needed to be calculated. There are two steps, namely, the expectation step (*E*-step) and the maximization step (*M*-step). The EM algorithm proceeds as follows.(1)Initialization: give an initial value of the model parameter vector.(2)*E*-step: the -function is calculated, given the current parameter estimated from the previous iteration,
(3)*M*-step: maximize the -function with respect to to obtain the new iterative parameter estimate.(4)Iteration: step 2 and step 3 are repeated until the changes in parameters after each iteration are within a specified tolerance level.

The above steps ensure that the log likelihood function of the observed data increases at every step. So the EM algorithm is guaranteed to converge to some maximum of the likelihood function [15]. The convergence performance under general conditions is given in Wu [27]. The flowchart of the algorithm is shown in Figure 2.

##### 4.2. Formulation of the Multiple Model Approach

As mentioned above, if we compute the -function, should be derived at first using the probability chain rule:

Using the Bayesian theory, the joint probability density function such as the first term of (7) can be written as

Similarly, the model identity can be determined from (5), so the second term of (7) can be simplified as

The derivation of the third term in (7) is based on the chain rule and the assumption that the scheduling variable is known a priori and independent of the past observations as well as the model parameters. Based on the fact that most of the chemical plants predesign the operating points, namely, scheduling variable, is derived as

Then we can substitute (8), (9), and (10) into (7), and the joint probability density of the likelihood of the full data set can be rewritten as where

The expectation of the complete data in the procedure of the EM algorithm is

From (13) we need to compute the following probability density function:(1),(2),(3), where is calculated by is given by (5).

The last probability density function to be calculated denotes the conditional probability that the th data point comes from the th local model and can be derived from the Bayesian rule:

##### 4.3. The Summary of the Identification Procedure

Now we give the identification procedure under the frame of the EM algorithm as follows.

*(**1) Initialization.* Set the initial value of parameters to be estimated.

*(**2) E-step.* Using the current parameter , calculate , where can be computed by (5) and is calculated from (15).

*(**3) M-step*. Based on the three terms of the function , only the term as (15) depends on the local model parameter and the process noise variance . Derivative is taken over this term with respect to and , respectively. So the new parameter estimation can be calculated as follows:

As a result, the new of the th ARX local model and are equal to

For the local model validity width , , due to the usage of exponential function, it cannot be calculated from an analytical form directly. The optimization problem searching for the optimal value of can be expressed as

Here a nonlinear numerical optimization is required to obtain the optimal between the and . A constrained nonlinear optimization function named “fmincon” in optimization toolbox of “Matlab” is adopted to search for the optimal at each of the iteration steps [15].

*(**4) Iteration.* Repeat step (2) and step (3), until the converge condition is satisfied. Here the converge condition is defined as the changes of the estimated value between two iterations which are as small as the level .

#### 5. A Distillation Column Simulation

Nonlinear system identification is much more complex than linear system identification. In this section, a high purity distillation column is used to test the performance of the multiple model approach based on the EM algorithm.

It has been shown that the purer the products get, the more nonlinear the system becomes [20]. A distillation column is also a typical example for a MISO system in which there are strong interactions between the variables. In this paper, choosing LV-configuration is motivated by the fact that the LV-configuration is the most commonly used in industrial practice.

Here, the process input variables available for control purpose are the reflux and the boil-up rate . The top product composition variable is the output. The disturbances to a distillation column can come from many sources. Besides the two manipulated variables, they can come from the feed (feed rate and feed composition), from the pressure inside the column and from the cooling water, and so forth. The distillation column is very sensitive to the feed rate disturbance. The feed rate is assumed to be measurable; therefore, its effect can be incorporated in the EM algorithm. In this paper, the feed rate is treated as the scheduling variable [23]. When the column is operated over a relatively wide operating region instead of at a single set point, it reveals a significant nonlinear behaviour.

In the simulation, white noise with a variance of is added in the simulated output data for unknown process noise. The observed input and output data for the high purity distillation column are shown in Figures 3 and 4. The scheduling variable is designed with three operating points at 0.8 kmol/min, 1.0 kmol/min, and 1.2 kmol/min, respectively, which were given in Figure 5. In order to achieve high identification performance the random binary excitation signals with small magnitude are added in the input variables. The feed rate increases by a fixed step size and no additional excitation signal is added. Then applying the multiple model modeling approach and using the EM algorithm to estimate the parameters of each local model, the simulation results of self-validation tests indicate the efficiency of the proposed method. The comparison and the weighting function curve are given in Figures 6 and 7.

**(a)**

**(b)**

To further test the performance of the identified multiple model, the cross-validation is also carried on the case. The data used for cross-validation as shown in Figures 8 and 9 are generated under different feed rates from Figure 5. Two different operating points are selected in the cross-validation simulation as in Figure 10. It can be seen from Figure 11 that the real process data and the model prediction are in good agreement with each other which effectively confirms the efficiency of the proposed algorithm.

**(a)**

**(b)**

#### 6. Conclusions

A multiple model modeling algorithm based on EM algorithm is proposed and applied for a high purity distillation column reaction process. Taking a full consideration of the dynamic, nonlinearity, and complexity, the parameters of each local model are identified under different operating points. The global model can be obtained by using a weighting function, which adopts an exponential form about scheduling variable under the framework of EM algorithm. Simulation results show that the fusion model of multiple models based on the EM algorithm is very effective in the chemical process of a high purity distillation column.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

The authors thank the financial support by the National Natural Science Foundation of China (nos.: 21206053, 21276111, and 61273131) and the paper is partly supported by the 111 Project (B12018), the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), the Fundamental Research Funds for the Central Universities (JUDCF10064), and Jiangsu Innovation Program for Graduates (CXZZ11_0464).