#### Abstract

The semi-infinite time optimal control of a class of stochastically excited Markovian jump nonlinear systems is investigated. Using stochastic averaging, each form of the system is reduced to a one-dimensional partially averaged Itô equation for the total energy. A finite set of coupled dynamical programming equations is then set up based on the stochastic dynamical programming principle and Markovian jump rules, from which the optimal control force is obtained. The stationary response of the optimally controlled system is predicted by solving the Fokker-Planck-Kolmogorov (FPK) equation associated with the fully averaged Itô equation. Two examples are worked out in detail to illustrate the application and effectiveness of the proposed control strategy.

#### 1. Introduction

Markovian jump systems (MJSs) are a class of hybrid systems whose operation modes, or forms, are governed by a Markov process. Such systems can represent complex real systems that experience abrupt variations in their structures and parameters. It is well known that random switching between forms can give rise to unexpected behavior of the whole MJS, even if the dynamics in each form are simple and well understood. Developing methodology for the analysis of such systems is therefore well warranted.

MJSs have been studied with growing interest in recent years, since they were first introduced by Kats and Krasovskii [1]. For dynamics analysis, several important criteria for the stochastic stability of MJSs have been established [2, 3]. The stationary response of stochastically excited nonlinear MJSs was studied by Huan et al. [4]. For the problem of optimal control, Krasovskii and Lidskii were the first to study the LQR control of Markovian jump linear systems [5]. Since then, considerable attention has been paid to the optimal control problem of MJSs [6, 7]. Sworder solved the jump linear Gaussian problem for a finite time horizon [8]. The infinite time horizon problem of linear MJSs was studied by Wonham [9] and by Fragoso and Hemerly [10]. The optimal control solution for linear MJSs subject to Gaussian input and measurement noise was given by Ji and Chizeck [11]. Ghosh et al. investigated the ergodic control problem of switching diffusions [12]. However, most of the published results apply to linear MJSs. Far less is known about the optimal control of nonlinear MJSs, particularly stochastically excited nonlinear MJSs.

Recently, a nonlinear optimal control strategy for stochastically excited nonlinear systems has been proposed by Zhu, based on the stochastic averaging method [13] and the stochastic dynamical programming principle. This strategy has been shown to be quite effective for response alleviation of nonlinear stochastic systems. The goal of this paper is to extend the strategy from nonlinear stochastic systems to nonlinear stochastic MJSs.

An infinite time optimal control problem for a single-degree-of-freedom (SDOF), stochastically excited, nonlinear Markovian jump system is investigated in this paper. The paper is organized as follows. In Section 2, the equation of the SDOF nonlinear Markovian jump system is formulated. The stochastic averaging method is applied to this system in Section 3, where the partially averaged equation of total energy for each form of the original system is obtained. A finite set of coupled dynamical programming equations is then set up in Section 4, from which the optimal control force is determined. The Fokker-Planck-Kolmogorov (FPK) equation associated with the fully averaged Itô equation is also derived and solved. In Section 5, two examples are worked out in detail to illustrate the application and effectiveness of the proposed control strategy.

#### 2. Formulation of Control Problem

Consider an SDOF controlled, stochastically driven nonlinear system with Markovian jump, governed by equation (1). The terms of (1) comprise a nonlinear stiffness; a small parameter; light damping with Markovian jump; the feedback control force; the Markovian jump coefficients of the weak external and/or parametric stochastic excitations; and independent Gaussian white noises with zero means and given intensities.

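The jump parameter is a continuous-time Markovian jump process with a finite discrete state space and a transition matrix whose entries are real numbers: the off-diagonal entries are nonnegative, and each diagonal entry equals the negative sum of the off-diagonal entries in its row. The transition probability over a short time increment is given by (2) [14, 15], in which the remainder term vanishes faster than the time increment itself.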
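As an illustration of how such a jump process can be sampled, the following minimal sketch draws one path of a finite-state continuous-time Markov chain using exponential holding times. The function name `simulate_jump_process` and the 3-form rate matrix are hypothetical choices made here for illustration, not values taken from the paper; forms are indexed from 0.

```python
import random


def simulate_jump_process(rates, s0, t_end, rng):
    """Sample one path of a finite-state continuous-time Markov chain.

    rates[i][j] (i != j) is the transition rate from form i to form j, and
    rates[i][i] is minus the sum of the off-diagonal entries of row i.
    Returns a list of (jump_time, new_form) pairs starting with (0.0, s0).
    """
    path = [(0.0, s0)]
    t, s = 0.0, s0
    while True:
        exit_rate = -rates[s][s]
        if exit_rate <= 0.0:  # absorbing form: no further jumps
            break
        t += rng.expovariate(exit_rate)  # exponential holding time
        if t >= t_end:
            break
        # choose the next form with probability rates[s][j] / exit_rate
        targets = [j for j in range(len(rates)) if j != s]
        weights = [rates[s][j] for j in targets]
        s = rng.choices(targets, weights=weights)[0]
        path.append((t, s))
    return path


# Hypothetical 3-form rate matrix (each row sums to zero).
rates = [[-0.5, 0.3, 0.2],
         [0.4, -0.6, 0.2],
         [0.1, 0.1, -0.2]]
path = simulate_jump_process(rates, s0=0, t_end=50.0, rng=random.Random(1))
```

Sampling holding times directly avoids choosing a time step; a step-based alternative, which follows the short-time transition probability (2) literally, switches form with probability equal to the rate times the step length.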

Our primary goal is to design an optimal controller that minimizes a specified performance index. For the infinite-horizon control problem, the performance index is specified by (3), which involves an expectation operation, the terminal time of control, and a cost function. If the random excitations and the Markovian jump process are ergodic, then the response of the controlled system becomes stationary and ergodic as the terminal time tends to infinity. In this case, the performance index (3) can be rewritten as (4).

Equations (1) and (4) constitute the mathematical formulation of the infinite-horizon optimal control problem for an SDOF nonlinear system undergoing Markovian jump. In this paper, we consider the case of perfect observation, which means that the system state and the value of the jump parameter are available at every time instant.

#### 3. Partially Averaged Equation

Taking the generalized displacement and the generalized momentum as state variables, (1) can be converted to the Itô stochastic differential equations (5) [11, 16], which are driven by independent standard Wiener processes. The Hamiltonian function (total energy) of the Hamiltonian system associated with system (5) is given by (6).

Suppose that the jump process is slow and independent of the system's state. That is, for the original system (1) or (5), the system jumps among its forms (modes) slowly and independently. We consider one form at a time by fixing the Markovian jump process at an arbitrary state and writing the form-dependent quantities of (5) for that state. Since the Hamiltonian is a function of the generalized displacement and the generalized momentum, the Itô equation (7) for the Hamiltonian can be derived from (5) by using the Itô differential rule [17].

In the case of light damping and weak excitations, the Hamiltonian in (7) is a slowly varying process, while the generalized displacement in (5) is a rapidly varying process. Since the slowly varying process governs the long-term behavior of the system, the stochastic averaging method is used to average out the rapidly varying process, which yields the partially averaged Itô equation (8) for the Hamiltonian [13, 18, 19]. Its coefficients are given by (9), with the region of integration and the normalizing parameter specified in (10). Note that the second term on the right-hand side of (8) is still unknown at this stage, since the control force is an as-yet-undetermined function of the system state and the jump parameter.

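Equation (8) is valid only while the jump process remains in the fixed form. Since Markovian jumps are allowed, so that the jump process takes values in the whole finite state space, (8) can be extended to (11), in which the coefficients and the control term change whenever the jump process changes form.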
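To show what a jump-dependent averaged energy equation looks like in the simplest setting, the following sketch simulates it for a linear oscillator, which is an assumption made here and not one of the paper's examples. For a unit-frequency oscillator with form-dependent damping `c_s`, external white noise of intensity `2 * D_s`, and no control, the averaged energy equation has drift `-c_s * H + D_s` and squared diffusion coefficient `2 * D_s * H`. All names and parameter values below are hypothetical.

```python
import math
import random


def simulate_averaged_energy(damping, noise, rates, h0, s0, dt, n_steps, rng):
    """Euler-Maruyama for dH = (-c_s H + D_s) dt + sqrt(2 D_s H) dB with jumps.

    damping[s] = c_s and noise[s] = D_s are hypothetical per-form constants
    of an uncontrolled linear oscillator. Returns the time-averaged energy.
    """
    h, s, total = h0, s0, 0.0
    n_forms = len(rates)
    for _ in range(n_steps):
        # jump step: leave form s for form j with probability rates[s][j] * dt
        r, acc = rng.random(), 0.0
        for j in range(n_forms):
            if j == s:
                continue
            acc += rates[s][j] * dt
            if r < acc:
                s = j
                break
        # diffusion step for the energy in the current form
        drift = -damping[s] * h + noise[s]
        diff = math.sqrt(2.0 * noise[s] * h)
        h += drift * dt + diff * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        h = max(h, 0.0)  # energy cannot be negative
        total += h
    return total / n_steps


damping = [0.1, 0.4]                 # hypothetical c_s for two forms
noise = [0.02, 0.02]                 # hypothetical D_s for two forms
rates = [[-0.2, 0.2], [0.2, -0.2]]   # hypothetical symmetric switching rates
mean_h = simulate_averaged_energy(damping, noise, rates, h0=0.1, s0=0,
                                  dt=0.01, n_steps=300000,
                                  rng=random.Random(0))
```

With a fixed form, this energy process has stationary mean `D_s / c_s` (0.2 and 0.05 for the values above), so under switching the time-averaged energy would be expected to fall between those two single-form values.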

To be consistent with the partially averaged equation (11), the performance index (4) is also partially averaged, which gives (12). The original optimal control problem, (1) and (4), has thus been converted to the partially averaged optimal control problem, (11) and (12). The partially averaged problem is much simpler than the original one, since (11) is only one-dimensional. It has been proved [20] that the optimal control for the partially averaged control problem is a quasi-optimal control for the original one.

#### 4. Optimal Control Law

For the proposed control problem, we introduce the value function:

According to the stochastic dynamical programming principle and Markovian jump rules, the dynamical programming equation (14) can be established [16], in which the coupling coefficients are the transition rates described in (2). Equation (14) is a finite set of second-order differential equations indexed by the form of the system. These equations are coupled through the zero-order terms, which contain the value functions of the other forms.

The necessary condition for minimizing the left-hand side of (14) is

Let the cost function be of the form (16), in which the weighting coefficient is a positive constant. By substituting (16) into (15) and exchanging the order of differentiation with respect to the control force and averaging in (15), the expression (17) for the optimal control force is determined. Note that the derivative of the value function is generally a nonlinear function of the total energy. Hence, the optimal control force is a nonlinear function of the generalized displacement and the generalized momentum; that is, the optimal feedback control is nonlinear. To ensure that the control force is indeed optimal, the following sufficient condition should be satisfied:
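To make the minimization step concrete, the following worked sketch assumes a cost that is quadratic in the control force, as is usual for this type of strategy. The symbols $V(H,s)$ (value function), $g(H,s)$ (state-dependent part of the cost), $R$ (positive weight), $P$ (generalized momentum), and $\langle\cdot\rangle$ (averaging) are notation assumed here for illustration, and the Hamiltonian is assumed to have the form $H = P^2/2 + U(Q)$, so that $\partial H/\partial P = P$.

```latex
\begin{align*}
f(H,u,s) &= g(H,s) + \langle R\,u^{2} \rangle, \qquad R > 0,\\
\frac{\partial}{\partial u}\Bigl[ R\,u^{2}
  + u\,\frac{\partial H}{\partial P}\,\frac{\partial V}{\partial H} \Bigr] &= 0
\;\Longrightarrow\;
u^{*} = -\frac{1}{2R}\,\frac{\partial H}{\partial P}\,\frac{\partial V}{\partial H}
      = -\frac{P}{2R}\,\frac{\partial V}{\partial H},\\
\frac{\partial^{2}}{\partial u^{2}}\Bigl[ R\,u^{2}
  + u\,\frac{\partial H}{\partial P}\,\frac{\partial V}{\partial H} \Bigr] &= 2R > 0 .
\end{align*}
```

Under these assumptions the control acts as a damping force proportional to the momentum, with a gain that depends on the energy and on the current form through the derivative of the value function. The positive second derivative confirms that the stationary point is a minimum.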

Substituting the expression (17) for the optimal control force into (14) yields the final dynamical programming equation (19), with the coefficients defined in (20).

The value function can be obtained by solving (19). The optimal control force is then determined by substituting the value function into expression (17). By inserting the optimal control force into (11) and averaging the control term, the completely averaged Itô equation (21) is obtained. The FPK equation associated with the fully averaged Itô equation (21) is given by (22) [4, 21, 22]. Its solution is the probability density function of the total energy of the optimally controlled system (1) while the system operates in a given form; the density for the uncontrolled system is obtained by setting the control force to zero. The initial condition is given by (23), and the boundary conditions are given by (24).

The FPK equation (22) is a finite set of second-order partial differential equations, which are coupled through the zero-order terms. Equation (22) does not admit an easy solution, either analytically or numerically. In practical applications, however, we are more interested in the stationary solution of (22). In this case, the FPK equation (22) is reduced by setting the time derivative to zero, and the stationary probability density for each form is obtained by solving the reduced equation numerically. The stationary probability density of the total energy is then obtained from the form-wise densities by (25), which contains a normalization constant. The stationary probability density of the generalized displacement and the mean value of the total energy are obtained from (26), in which (27) is the stationary joint probability density of the generalized displacement and generalized momentum.

#### 5. Numerical Example

To demonstrate the application and to assess the effectiveness of the proposed control strategy, numerical results are obtained for two examples.

##### 5.1. Example 1

Consider a controlled van der Pol-Rayleigh oscillator subjected to both external and parametric random excitations, which is capable of independent Markovian jump and is governed by equation (28). The system includes Markovian jump coefficients of linear and quasi-linear damping; Markovian jump coefficients of the external and parametric random excitations; the feedback control force; and independent Gaussian white noises with zero mean and given intensities. The jump parameter is a finite-state continuous-time Markov process that indicates the system's form, with the transition probability defined in (2).

We consider the 3-form case here, so the jump process takes three values. The transition rates between any two forms are prescribed by a 3 × 3 transition matrix.

With the generalized displacement and the generalized momentum defined as in Section 3, the Hamiltonian associated with system (28) is

As a Hamiltonian system, (28) can be expressed as

Upon stochastic averaging, the partially averaged equation (11) for system (28) is obtained, with the coefficients given by

For the proposed control strategy, the partially averaged cost function is of the form of (16) with

Following (17), the expression of the optimal control force is

Substituting the control force (33) into the dynamical programming equation (14) and completing the averaging yield the following final coupled dynamical programming equations:

To solve (34), the value function is assumed to be of the following form:

Since the optimal control force depends on the value function only through its derivative, only the coefficients that enter this derivative need to be determined. They are found by inserting (32) and (35) into (34):

By substituting (35) into (17), with the coefficients determined by (36), the optimal control force is determined. Then the completely averaged FPK equation (37) is obtained, with the coefficients given in (38).

Solving (37) numerically in the stationary case, the stationary probability density of the optimally controlled system can be obtained. The stationary probability densities of the total energy and of the generalized displacement, and the mean value of the total energy, are then obtained using (25) and (26).

Numerical results for a fixed set of system parameters and transition rates are shown in Figures 1–4.

The stationary probability density of the total energy for the uncontrolled and optimally controlled system (28) is shown in Figure 1. As the energy increases, the density for the optimally controlled system (28) decreases much faster than that for the uncontrolled system. The stationary probability density of the displacement is plotted in Figure 2 for the same parameters as in Figure 1. For the optimally controlled system, there is a much higher probability that the displacement lies near its equilibrium. Hence, the displacement density of the optimally controlled system has a much higher peak and a smaller dispersion around the equilibrium than that of the uncontrolled system. This implies that the proposed control strategy is highly effective in attenuating the response of system (28).

Figure 3 shows the mean value of the total energy of the uncontrolled and optimally controlled system as functions of a ratio of transition rates. A higher value of this ratio implies a lower probability that the system operates in one particular form and a higher probability that it operates in the other two forms. In that particular form, the system has the smallest damping coefficients and the largest amplitudes of stochastic excitation among the three forms. As a consequence, the mean energy of the uncontrolled system decreases monotonically as the ratio increases, as shown in Figure 3. The mean energy of the optimally controlled system, however, always remains very low regardless of the ratio, which indicates that the proposed control strategy is very effective and quite robust with respect to the ratio. In Figures 1–3, lines are obtained by numerical solution of (37), while dots are obtained by direct simulation of system (28). The dots match the corresponding lines closely, demonstrating the validity and accuracy of the proposed method. The displacements of the uncontrolled and optimally controlled system (28) are shown in Figure 4, from which the effect of the optimal control on the displacement can be seen directly.
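The link between transition rates and the long-run probability of each form, which underlies the discussion of Figure 3, can be sketched as follows. The function solves the balance equations of a finite-state continuous-time Markov chain together with the normalization condition. The rate matrix and the scaling factor `r` of the rates leaving the first form are hypothetical and are not the paper's parameters; forms are indexed from 0.

```python
def stationary_mode_probabilities(rates):
    """Solve pi * rates = 0 with sum(pi) = 1 by Gauss-Jordan elimination."""
    n = len(rates)
    # Row i holds the balance equation sum_j pi_j * rates[j][i] = 0.
    a = [[rates[j][i] for j in range(n)] + [0.0] for i in range(n)]
    # Replace the last (redundant) balance equation by the normalization.
    a[n - 1] = [1.0] * n + [1.0]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(n):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [x - factor * y for x, y in zip(a[row], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def three_form_rates(r):
    """Hypothetical 3-form rate matrix; r scales the rates leaving form 0."""
    return [[-2.0 * r, r, r],
            [1.0, -2.0, 1.0],
            [1.0, 1.0, -2.0]]


probs_slow = stationary_mode_probabilities(three_form_rates(1.0))
probs_fast = stationary_mode_probabilities(three_form_rates(2.0))
```

For this hypothetical matrix, doubling the rates that leave the first form lowers its long-run probability from 1/3 to 1/5, which mirrors the qualitative statement that a larger ratio shifts probability away from one form and toward the others.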

##### 5.2. Example 2

Consider a controlled, stochastically excited Duffing oscillator with Markovian jump, governed by equation (40). The system includes a Markovian jump coefficient of linear damping, a Markovian jump coefficient of the external random excitation, the feedback control force, and a Gaussian white noise with zero mean and intensity 2. Here, we consider the 2-form case, in which the jump process takes two values.

Applying the stochastic averaging method, the original jump system (40) can be approximated by the partially averaged system (11) with

Following (17), the expression of the optimal control force is

Then, the following final dynamical programming equation is obtained:

The value function can be obtained by solving (43) numerically. The optimal control force is then determined by substituting the value function into (17). Substituting the optimal control force (17) into the partially averaged equation (11) and completing the averaging yield the following completely averaged FPK equation:

Solving (44) numerically in the stationary case, the stationary probability densities of the total energy and of the displacement, and the mean value of the total energy, are then obtained from (25) and (26).

The numerical results shown in Figures 5–8 are for a fixed set of system parameters and transition rates.

The stationary probability densities of the total energy and of the displacement of the uncontrolled and optimally controlled system (40) are plotted in Figures 5 and 6, respectively. The mean value of the total energy of the uncontrolled and optimally controlled system (40) as functions of the ratio is shown in Figure 7. It is seen that the optimal control force alleviates the response of system (40) significantly. This effect can be seen more clearly in Figure 8. Again, the analytical results obtained by solving (44) match closely those from digital simulation of the original system (40).
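A direct simulation of a jump Duffing oscillator of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation used for the figures: the parameter values, the unit linear and cubic stiffness, and the noise level `noise_d` are hypothetical, and the velocity feedback passed as `control` is a simple stand-in rather than the optimal force (17), which requires the numerical solution of (43).

```python
import math
import random


def simulate_duffing_jump(beta, e, rates, control, dt, n_steps, rng,
                          omega2=1.0, alpha=1.0, noise_d=1.0):
    """Semi-implicit Euler-Maruyama for
    q'' + beta_s q' + omega2 q + alpha q^3 = u + e_s W(t),
    where W(t) is Gaussian white noise of intensity 2 * noise_d and the
    form s switches according to the rate matrix. Returns the time average
    of the total energy H = p^2/2 + omega2 q^2/2 + alpha q^4/4.
    """
    q, p, s, total = 0.0, 0.0, 0, 0.0
    for _ in range(n_steps):
        # switch form with probability rates[s][j] * dt
        r, acc = rng.random(), 0.0
        for j in range(len(rates)):
            if j != s:
                acc += rates[s][j] * dt
                if r < acc:
                    s = j
                    break
        u = control(q, p, s)
        force = -beta[s] * p - omega2 * q - alpha * q ** 3 + u
        dw = math.sqrt(2.0 * noise_d * dt) * rng.gauss(0.0, 1.0)
        p += force * dt + e[s] * dw  # update momentum first
        q += p * dt                  # then displacement, using the new momentum
        total += 0.5 * p * p + 0.5 * omega2 * q * q + 0.25 * alpha * q ** 4
    return total / n_steps


beta = [0.05, 0.2]                   # hypothetical damping per form
e = [0.3, 0.15]                      # hypothetical excitation amplitude per form
rates = [[-0.1, 0.1], [0.1, -0.1]]   # hypothetical switching rates
h_free = simulate_duffing_jump(beta, e, rates, lambda q, p, s: 0.0,
                               0.01, 100000, random.Random(0))
h_damped = simulate_duffing_jump(beta, e, rates, lambda q, p, s: -0.5 * p,
                                 0.01, 100000, random.Random(0))
```

Both runs use the same seed, so they see the same noise and the same jump times; the difference in time-averaged energy then reflects only the added feedback. Replacing the stand-in with an interpolated optimal force would follow the same pattern.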

#### 6. Conclusions

In this paper, we proposed a nonlinear stochastic optimal control strategy for SDOF stochastically excited Markovian jump systems in which the jump parameter takes values in a finite discrete set. In the slow jump case, the original system was first reduced, by stochastic averaging, to one governed by a finite set of one-dimensional partially averaged Itô equations for the total energy. Using the stochastic dynamical programming principle and Markovian jump rules, a finite set of coupled dynamical programming equations was derived, from which the optimal control force was obtained. The resulting optimal control force has been proved to be a quasi-optimal control for the original system. The strategy was applied to two nonlinear oscillators capable of independent Markovian jumps. Numerical results showed that the strategy is fairly robust and effective in reducing the stationary response of the controlled system. Thus, it is potentially promising for practical control applications after further research.

The proposed method has the potential to be extended to the optimal control of multi-DOF Markovian jump systems. However, much work remains before concrete conclusions can be drawn. In particular, mathematical tools need to be developed to solve the dynamical programming equation (14) in the multi-DOF case.

#### Disclaimer

Opinions, findings, and conclusions expressed in this paper are those of the authors and do not necessarily reflect the views of the sponsors.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

This work was supported by the Natural Science Foundation of China (through Grants no. 11272279, 11372271, 11432012, 11321202, and 11272201) and 973 Program (no. 2011CB711105).