Abstract

We study stochastic optimal bounded control for minimizing the stationary response of strongly nonlinear oscillators under combined harmonic and wide-band noise excitations. The stochastic averaging method and the dynamical programming principle are combined to obtain the fully averaged Itô stochastic differential equations, which approximately describe the original controlled strongly nonlinear system. The stationary joint probability density of the amplitude and phase difference of the optimally controlled system is obtained by solving the corresponding reduced Fokker-Planck-Kolmogorov (FPK) equation. An example is given to illustrate the proposed procedure, and the theoretical results are verified by Monte Carlo simulation.

1. Introduction

The well-known tool for solving stochastic optimal control problems is the dynamical programming principle proposed by Bellman [1]. According to this principle, a stochastic optimal control problem can be transformed into the problem of finding a solution to the so-called Hamilton-Jacobi-Bellman (HJB) equation. However, in most cases the HJB equation cannot be solved analytically [2], especially in the multidimensional case. A powerful technique for solving stochastic optimal control problems is to combine the stochastic averaging method [3] with Bellman's dynamic programming principle. The idea is to replace the original stochastic system by the averaged one and then apply the dynamic programming principle to the averaged system. It has been shown that the optimal control for the averaged system is nearly optimal for the original system [4]. This combination strategy has two notable advantages. First, the dimension of the averaged system is reduced remarkably, so the corresponding HJB equation is low-dimensional. Second, the diffusion matrix of the original stochastic system is usually singular, while that of the averaged system is usually nonsingular; this enables the HJB equation for the averaged system to admit a classical rather than a viscosity solution. In recent years, a notable nonlinear stochastic optimal control strategy for quasi-Hamiltonian systems under random excitations has been proposed by Zhu, based on the stochastic averaging method for quasi-Hamiltonian systems and the stochastic dynamical programming principle [3]. Examples have shown that this strategy is very effective and efficient.

In practice, the excitations of dynamical systems can be classified as either deterministic or random. Random excitation is often modeled as Gaussian white noise, wide-band colored noise, or bounded noise. On the other hand, the magnitudes of the control forces are usually limited due to saturation in the actuators; that is, the control forces are bounded. The optimal bounded control of linear or nonlinear systems under random excitations has been studied by many researchers [5–8]. In all these studies, the systems are subjected to random excitations alone. However, many physical or mechanical systems are subjected to both random and deterministic harmonic excitations. For example, such combined excitations arise in the study of stochastic resonance [9] or of the uncoupled flapping motion of helicopter rotor blades in forward flight under the effect of atmospheric turbulence [10]. Due to the presence of the deterministic harmonic excitation, the response of the stochastic system is not time homogeneous and its stationary behavior is not easy to capture. The stochastic averaging method can approximate the original system by time-homogeneous stochastic processes. By using the stochastic averaging method, many researchers have studied linear or strongly nonlinear systems under combined harmonic and wide-band random excitations [11–16]. On the other hand, the optimal bounded control of a linear or nonlinear oscillator subject to combined harmonic and Gaussian white noise excitations has been studied [17–19]. However, Gaussian white noise is an idealized model; in most cases, the random excitation should be modeled as wide-band colored noise. So far, little work has been done on the optimal bounded control of a strongly nonlinear oscillator subject to combined harmonic and wide-band noise excitations [20].

In the present paper, a procedure for designing the optimal bounded control to minimize the stationary response of a strongly nonlinear oscillator under combined harmonic and wide-band colored noise excitations is proposed. In Section 2, based on the stochastic averaging method, the equation of motion of a weakly controlled strongly nonlinear oscillator under combined harmonic and wide-band noise excitations is reduced to partially averaged Itô stochastic differential equations. In Section 3, a dynamical programming equation for the control problem of minimizing the response of the system is formulated from the partially averaged Itô stochastic differential equations by applying the dynamical programming principle. The optimal control law is determined by the dynamical programming equation and the control constraint. In Section 4, the reduced FPK equation governing the stationary joint probability density of the amplitude and phase difference of the optimally controlled system is established. In Section 5, a nonlinearly damped Duffing oscillator under combined harmonic and wide-band noise excitations is taken as an example to illustrate the application of the proposed procedure. All theoretical results are verified by Monte Carlo simulation.

2. Partially Averaged Itô Stochastic Differential Equations

Consider a weakly controlled strongly nonlinear oscillator subject to weak linear or nonlinear damping and weak external and/or parametric excitation by a combination of harmonic functions and wide-band colored noises. The equation of motion of the system has the following form where the term is the stiffness, which is assumed to be an arbitrary nonlinear function of . is a small parameter; accounts for light damping and weak harmonic excitation with frequency ; represent weak external and/or parametric random excitations; are wide-band stationary and ergodic noises with zero mean and correlation functions or spectral densities ; denotes a weak feedback control force. The repeated subscripts indicate summation.
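
For orientation, a representative member of this class may be written as follows (the symbols and the ordering of the small parameter are assumptions made here for illustration and need not coincide exactly with (2.1)):
\[
\ddot{X} + g(X) = \varepsilon f\bigl(X,\dot{X},\Omega t\bigr) + \varepsilon^{1/2} h_{k}\bigl(X,\dot{X}\bigr)\,\xi_{k}(t) + \varepsilon u\bigl(X,\dot{X}\bigr), \qquad k = 1,2,\ldots,n,
\]
where $g(X)$ is the strongly nonlinear stiffness, $\varepsilon f$ collects the light damping and the weak harmonic excitation of frequency $\Omega$, $\xi_{k}(t)$ are the wide-band noises, and $\varepsilon u$ is the weak bounded control.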

When , (2.1) describes a large class of physical or structural systems. Under the conditions specified by Xu and Cheung [21], the solutions of system (2.1) can be expressed in the following form: where , , , and are all stochastic processes. is the potential energy of the system (2.1). The functions and are called generalized harmonic functions. Obviously, is the instantaneous frequency of the system (2.1).
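
As a reminder of the generalized-harmonic-function representation (written here with assumed symbols, following the usual convention of Xu and Cheung), the solution is expressed as
\[
X(t) = A(t)\cos\Phi(t), \qquad \dot{X}(t) = -A(t)\,\nu\bigl(A,\Phi\bigr)\sin\Phi(t), \qquad \Phi(t) = \phi(t) + \theta(t),
\]
with the instantaneous frequency following from energy conservation at fixed amplitude,
\[
\nu(A,\Phi) = \frac{\sqrt{2\,[\,U(A) - U(A\cos\Phi)\,]}}{A\,\lvert\sin\Phi\rvert},
\]
where $U$ denotes the potential energy. Here $\cos\Phi$ and $\sin\Phi$, with the amplitude-dependent frequency $\nu(A,\Phi)$, play the role of the generalized harmonic functions.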

When , (2.1) degenerates into the following nonlinear conservative oscillator: The average frequency of the conservative oscillator (2.6) can be obtained by the following formula: Then the following approximate relation holds:
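
For a conservative oscillator $\ddot{x} + g(x) = 0$ with potential energy $U(x)$ (a standard result, stated here with the notation assumed above), the period of the free oscillation at amplitude $A$ and the corresponding average frequency are
\[
T(A) = 2\int_{x_{1}}^{x_{2}} \frac{dx}{\sqrt{2\,[\,U(A) - U(x)\,]}}, \qquad \omega(A) = \frac{2\pi}{T(A)},
\]
where $x_{1}$ and $x_{2}$ are the turning points of the cycle, determined by $U(x_{1}) = U(x_{2}) = U(A)$.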

Treating (2.2) as a generalized van der Pol transformation from to , one can obtain the following equations for and : where

According to the Stratonovich-Khasminskii limit theorem [22, 23], and converge weakly to a two-dimensional diffusion Markov process in a time interval of order as , which can be represented by the following Itô stochastic differential equations: where are independent unit Wiener processes,

System (2.1) is subjected to harmonic excitation, and two cases can be distinguished: the resonant case and the nonresonant case. In the nonresonant case, the harmonic excitation has no effect on the first approximation of the response. Thus, we are interested in the resonant case, namely, where and are relatively prime small positive integers and is a small detuning parameter. In this case, multiplying (2.13) by and utilizing the approximate relation (2.8) yield

Introduce a new angle variable such that which is a measure of the phase difference between the response and the harmonic excitation. Then (2.14) can be rewritten as

Using the Itô differential formula, one can obtain the following Itô stochastic differential equations for , , and :

Obviously, and are slowly varying processes, while is a rapidly varying process. Averaging the drift and diffusion coefficients in (2.17) with respect to yields the following partially averaged Itô stochastic differential equations: where

Herein, denotes averaging with respect to from 0 to , and are the diffusion coefficients.
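
In explicit form, this deterministic time averaging is the average over one period of the rapidly varying phase, that is (with the angle-bracket notation assumed),
\[
\bigl\langle \,\cdot\, \bigr\rangle_{\Phi} = \frac{1}{2\pi}\int_{0}^{2\pi} (\,\cdot\,)\, d\Phi ,
\]
applied to the drift and diffusion coefficients of (2.17) while the amplitude and the phase difference are held fixed.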

Note that two averaging procedures are involved in this section: stochastic averaging and deterministic time averaging. The procedures for obtaining and in (2.12) constitute the stochastic averaging, while those for obtaining and constitute the deterministic time averaging. To complete the averaging and obtain the explicit expressions of and , the functions and in (2.12) are expanded into Fourier series with respect to , and the approximate relation (2.8) is used.

3. Dynamical Programming Equation and Optimal Control Law

For mechanical or structural systems, is the response amplitude and is the phase difference between the response and the harmonic excitation; together they represent the averaged state of system (2.1). Usually, only the response amplitude is of concern, so it is meaningful to control it over a semi-infinite or finite time interval. Consider the ergodic control problem of system (2.18) in a semi-infinite time interval with the following performance index: Herein, is a cost function. Based on the stochastic dynamical programming principle [24], the following simplified dynamical programming equation can be established from the first equation of (2.18): where is called the value function; denotes the minimum value of with respect to ; is the optimal performance index; indicates the control constraint.
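
For a one-dimensional averaged amplitude equation, the ergodic dynamical programming equation referred to above takes the following generic form (written with assumed symbols: $\gamma$ for the optimal average cost, $V(A)$ for the value function, $\bar{m}$ and $\bar{\sigma}^{2}$ for the averaged drift and diffusion coefficients, $f$ for the cost function, and $b$ for the control bound):
\[
\gamma = \min_{|u| \le b}\left[ f(A,u) + \bar{m}(A,u)\,\frac{dV}{dA} + \frac{1}{2}\,\bar{\sigma}^{2}(A)\,\frac{d^{2}V}{dA^{2}} \right].
\]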

The optimal control law can be determined by minimizing the right-hand side of (3.2) with respect to under the control constraint. Suppose that the control constraint is of the form where is a positive constant representing the maximum control force. The control enters the right-hand side of (3.2) only through the term This expression attains its minimum value for , depending on the sign of the coefficient of in (3.4). So the optimal control is where sgn denotes the sign function.
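
The minimization itself is elementary: if the control appears in (3.2) only through a term that is linear in $u$, say $c\,u$ with a coefficient $c$ not depending on $u$ (the symbols $c$ and $b$ are used here only for illustration), then under the constraint $|u| \le b$,
\[
\min_{|u|\le b} c\,u = -\,b\,\lvert c \rvert, \qquad \text{attained at } u^{*} = -\,b\,\operatorname{sgn}(c),
\]
so the optimal control takes its extreme value with sign opposite to that of the coefficient multiplying $u$.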

It is reasonable to assume that the cost function is a monotonically increasing function of , because the controller must do more work to suppress larger amplitudes . Then is positive. Under the conditions specified in [21], and are both positive. Equation (3.5) then reduces to

Equation (3.6) implies that the optimal control is a bang-bang control. has a constant magnitude . It is in the opposite direction of and changes its direction at.

For the optimal bounded control of system (2.18) in a finite time interval, the following performance index is taken: where is the final control time and is the final cost function; denotes the expectation operator. The following simplified dynamical programming equation can be established from the first equation of (2.18): where is the value function. Equation (3.8) is subject to the following final-time condition:

It is obvious that the same optimal control law as that in (3.6) can be derived if the control constraint is of the form of (3.3).

Note that the optimal control for the averaged system (2.18) is nearly optimal for the original system (2.1). For simplicity, it is referred to here as the optimal control for both the original and the averaged systems.

4. Stationary Response of Optimally Controlled System

Inserting from (3.6) into (2.18) to replace and averaging, one obtains the following fully averaged Itô stochastic differential equations for and : where

The reduced FPK equation for the optimally controlled system is of the following form: where is the stationary joint probability density of the amplitude and the phase difference . Since is a periodic function of , it satisfies the following periodic boundary condition with respect to :

The boundary condition with respect to is which implies that is a reflecting boundary. The other boundary condition is

In addition to the boundary conditions, the stationary joint probability density satisfies the following normalization condition:

Usually, the partial differential equation (4.3) can be solved only numerically.

5. Example

To illustrate the strategy proposed in the previous sections, take the following controlled, nonlinearly damped Duffing oscillator as an example. The Duffing oscillator is a typical model in nonlinear analysis. The equation of motion of the system is of the form where , , , , , and are positive constants; is the feedback control with the constraint defined by (3.3); are independent stationary and ergodic wide-band noises with zero mean and rational spectral densities

can be regarded as the output of the following first-order linear filter: where are Gaussian white noises in the sense of Stratonovich with intensities . Note that the maximum value of is . It is assumed that , , and are all small.
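
As an illustration of how such filtered wide-band noise can be generated numerically, the sketch below integrates a first-order linear filter driven by Gaussian white noise with the Euler-Maruyama scheme. It is a minimal sketch: the filter coefficient alpha, the intensity D, and the stated relation between D and the white-noise increment are assumed placeholder conventions, not the values or notation of (5.3).

```python
import numpy as np

def filtered_noise(alpha, D, dt, n_steps, seed=0):
    """Sample path of a first-order linearly filtered noise.

    Assumed filter:  d(xi)/dt = -alpha * xi + w(t),  with w(t) Gaussian white
    noise of intensity 2*D, so the Wiener increment over a step dt has
    standard deviation sqrt(2*D*dt).  Integrated by Euler-Maruyama.
    """
    rng = np.random.default_rng(seed)
    xi = np.empty(n_steps)
    xi[0] = 0.0
    for i in range(1, n_steps):
        dW = rng.normal(0.0, np.sqrt(2.0 * D * dt))
        xi[i] = xi[i - 1] - alpha * xi[i - 1] * dt + dW
    return xi

# Example: two independent wide-band noises with placeholder parameters.
dt = 0.02
xi1 = filtered_noise(alpha=1.0, D=0.01, dt=dt, n_steps=100_000, seed=1)
xi2 = filtered_noise(alpha=1.0, D=0.01, dt=dt, n_steps=100_000, seed=2)
```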

For the system (5.1), the instantaneous frequency defined by (2.4) has the following form: It can be approximated by the following finite sum with a relative error of less than 0.03%: where
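
For a hardening Duffing stiffness of the form $g(X) = \omega_{0}^{2}X + \alpha X^{3}$ (assumed here as the stiffness part of (5.1); the damping terms do not enter this calculation), the energy relation of Section 2 gives
\[
\nu^{2}(A,\Phi) = \frac{2\,[\,U(A) - U(A\cos\Phi)\,]}{A^{2}\sin^{2}\Phi}
              = \omega_{0}^{2} + \frac{\alpha A^{2}}{2}\bigl(1 + \cos^{2}\Phi\bigr),
\]
and the averaged frequency is, to leading order, $\omega(A) \approx \sqrt{\omega_{0}^{2} + \tfrac{3}{4}\alpha A^{2}}$.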

Then the averaged frequency of the system (5.1) can be approximated by . In the case of primary external resonance,

Introducing the new angle variable defined by (2.15) and completing the procedures described in Sections 2–4, we obtain the following fully averaged Itô stochastic differential equations for and : where and are the drift and diffusion coefficients, respectively; and are given in the appendix.

The stationary joint probability density of the optimally controlled system (5.1) is governed by the following reduced FPK equation:

Solving the FPK equation (5.9) by the finite difference method under the conditions (4.4)–(4.7), the stationary joint probability density of the amplitude and the phase difference of the optimally controlled system (5.1) can be obtained. Furthermore, the stationary mean amplitude of the optimally controlled system (5.1) can be obtained as follows:
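
As a minimal numerical sketch of this last step (assuming the joint density has already been computed on a regular grid; the array and grid names are placeholders), the normalization and the stationary mean amplitude follow from two-dimensional quadrature:

```python
import numpy as np

def mean_amplitude(p, a_grid, phi_grid):
    """Compute E[A] from a gridded stationary joint density p(a, phi).

    p        : 2-D array, p[i, j] approximates p(a_grid[i], phi_grid[j])
    a_grid   : 1-D array of amplitude grid points
    phi_grid : 1-D array of phase-difference grid points over one period
    """
    # Marginal density of the amplitude: integrate over the phase difference.
    p_a = np.trapz(p, phi_grid, axis=1)
    # Enforce the normalization condition numerically.
    p_a = p_a / np.trapz(p_a, a_grid)
    # Stationary mean amplitude E[A] = integral of a * p_a(a) da.
    return np.trapz(a_grid * p_a, a_grid)
```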

To check the accuracy of the proposed method, Monte Carlo digital simulation of the original system (5.1) is performed. The sample functions of the independent wide-band noises were generated by passing Gaussian white noises through the linear filter (5.3). The response of system (5.1) was obtained numerically by using the fourth-order Runge-Kutta method with time step 0.02. The long-time solution after 1,500,000 steps was regarded as the stationary ergodic response. One hundred samples were used. For every sample, the amplitude and angle variable were calculated from step 1,500,001 to step 2,000,000 to obtain the statistical probability density of . Figures 1 and 2 show a typical sample function of the wide-band noise and of the control force , respectively. Figure 3 shows of the optimally controlled system (5.1). It is seen that the theoretical result agrees very well with that from digital simulation. Figures 4(a) and 4(b) show the sample functions of the displacement and velocity of the system (5.1), respectively. It is obvious that the bounded control can reduce both the displacement and the velocity. The amplitude of the original system (5.1) is also reduced by the control, as verified by Figure 5.
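
The sketch below outlines such a simulation. It is illustrative only: the explicit form of (5.1) and of the optimal control law are not reproduced in this section, so the sketch assumes a hardening Duffing oscillator with linear-plus-nonlinear damping, a single external wide-band noise, placeholder parameter values, and a velocity-opposing bang-bang force as a stand-in for the control of Section 3.

```python
import numpy as np

# Placeholder parameters (illustrative only, not the values used in the paper).
omega0, alpha = 1.0, 1.0      # linear and cubic stiffness coefficients
beta1, beta2 = 0.05, 0.05     # damping coefficients (assumed damping form)
E, Omega = 0.2, 1.1           # harmonic excitation amplitude and frequency
b = 0.1                       # bound on the control force
dt, n_steps = 0.02, 2_000_000

def rhs(t, y, xi, u):
    """Assumed stand-in for (5.1):
    x'' + (beta1 + beta2*x**2)*x' + omega0**2*x + alpha*x**3
        = E*cos(Omega*t) + xi(t) + u."""
    x, v = y
    acc = (-(beta1 + beta2 * x**2) * v - omega0**2 * x - alpha * x**3
           + E * np.cos(Omega * t) + xi + u)
    return np.array([v, acc])

rng = np.random.default_rng(0)
# Wide-band noise from a first-order linear filter (cf. the earlier sketch).
alpha_f, D = 1.0, 0.01
xi = np.zeros(n_steps)
for i in range(1, n_steps):
    xi[i] = xi[i - 1] - alpha_f * xi[i - 1] * dt + rng.normal(0.0, np.sqrt(2.0 * D * dt))

y = np.array([0.0, 0.0])
amp = []
for i in range(n_steps - 1):
    t = i * dt
    # Bang-bang control opposing the velocity (assumed illustrative form).
    u = -b * np.sign(y[1])
    # Classical fourth-order Runge-Kutta step; noise held constant over the step.
    k1 = rhs(t, y, xi[i], u)
    k2 = rhs(t + dt / 2, y + dt / 2 * k1, xi[i], u)
    k3 = rhs(t + dt / 2, y + dt / 2 * k2, xi[i], u)
    k4 = rhs(t + dt, y + dt * k3, xi[i], u)
    y = y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    if i >= 1_500_000:
        # Amplitude from the energy envelope: U(A) = v**2/2 + U(x).
        energy = 0.5 * y[1]**2 + 0.5 * omega0**2 * y[0]**2 + 0.25 * alpha * y[0]**4
        A = np.sqrt((-omega0**2 + np.sqrt(omega0**4 + 4 * alpha * energy)) / alpha)
        amp.append(A)

print("stationary mean amplitude (one sample):", np.mean(amp))
```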

Note that the system (5.1) is strongly nonlinear. Increasing the nonlinearity coefficients or in (5.1), one can see that the agreement between the theoretical results and the Monte Carlo simulation remains acceptable (see Figures 6 and 7). This demonstrates that the proposed method can handle strongly nonlinear problems, even when the nonlinearity is extremely strong.

6. Conclusions

In the present paper, a procedure combining the stochastic averaging method and Bellman's dynamic programming for designing the optimal bounded control to minimize the response of strongly nonlinear systems under combined harmonic and wide-band noise excitations has been proposed. The procedure consists of applying the stochastic averaging method to weakly controlled strongly nonlinear systems under combined harmonic and wide-band noise excitations, establishing the dynamical programming equation for the control problem of minimizing the response based on the partially averaged Itô stochastic differential equations and the dynamical programming principle, and determining the optimal control law from the dynamical programming equation and the control constraint without solving the dynamical programming equation. The stationary joint probability density and mean amplitude of the optimally controlled averaged system are then obtained by solving the reduced FPK equation associated with the fully averaged Itô stochastic differential equations. A nonlinearly damped Duffing oscillator with hardening stiffness has been taken as an example to illustrate the application of the proposed procedure. The comparison between the theoretical results and those from Monte Carlo simulation shows that the proposed procedure works quite well even when the nonlinearity is extremely strong. The results show that the response amplitude of the system can be reduced remarkably by the feedback control.

The advantages of the proposed method are obvious. Note that, in the example, the wide-band noise is generated by a first-order linear filter. In principle, one could apply the stochastic dynamical programming method to the extended system to study the optimal control problem. However, the corresponding HJB equation would be five-dimensional or four-dimensional, and the corresponding reduced FPK equation would be four-dimensional, which is very difficult to solve. After stochastic averaging, the original system is represented by a two-dimensional time-homogeneous diffusion Markov process of amplitude and phase difference with a nondegenerate diffusion matrix. The dynamical programming equation derived from the averaged equations is two-dimensional or one-dimensional, and the corresponding reduced FPK equation is two-dimensional, which is easy to solve. The other advantage of the proposed procedure is that it is not necessary to solve the dynamical programming equation to obtain the optimal control law. Furthermore, the proposed method can be extended to multi-degree-of-freedom (MDOF) systems easily. This will be our future work.

Appendix

The drift coefficients in (5.8) are as follows:

The diffusion coefficients in (5.8) are as follows:

Acknowledgments

The work reported in this paper was supported by the National Natural Science Foundation of China under Grant nos. 10802030 and 10902096 and by the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant no. 200802511005.