Abstract

We study optimization problems involving eigenvalues of symmetric matrices. We present a nonsmooth optimization technique for a class of nonsmooth functions which are semi-infinite maxima of eigenvalue functions. Our strategy uses generalized gradients and space decomposition techniques suited for the H∞ norm and other nonsmooth performance criteria. For the class of max-functions that possess the so-called primal-dual gradient structure, we compute smooth trajectories along which certain second-order expansions can be obtained. We also give the first- and second-order derivatives of the primal-dual function in the space of decision variables under some assumptions.

1. Introduction

H∞ output feedback control is an important example of a design problem where the feedback controller has to respond favorably to several performance specifications. Typically in H∞ synthesis, the H∞ channel is used to enhance the robustness of the design. Due to its prominence in practice, H∞ control has been addressed in various ways over the years.

In nominal H∞ synthesis, feedback controllers are computed via semidefinite programming (SDP) [1] or algebraic Riccati equations [2]. When structural constraints on the controller are added, the synthesis problem is no longer convex. Some of the problems above have even been recognized as NP-hard or as rationally undecidable. These mathematical concepts indicate the inherent difficulty of H∞ synthesis under constraints on the controller. The H∞ synthesis problem involves finding an output feedback control matrix that minimizes the H∞ norm of a certain closed-loop transfer function, subject to the constraint that the controller is stabilizing. This is a challenging problem, and even finding a stabilizing controller can be difficult. Indeed, if the entries of the controller matrix are restricted to lie in prescribed intervals, then finding a stabilizing controller is an NP-hard problem [3].

H∞ feedback controller synthesis was one of the motivating applications for the development of our work. We consider a linear time-invariant dynamical system in the standard LFT form, with state, measured output, command input, and a performance channel consisting of an exogenous input and a regulated output. To cancel the direct transmission from the command input to the measured output, the corresponding feedthrough term is assumed to be zero. This is without loss of generality (see [4], Chapter 17).

Let the controller be a static output feedback gain; then the closed-loop state-space data and transfer function are obtained by substituting the feedback law into the plant. Dynamic controllers can be addressed in the same way by prior augmentation of the plant (3) (see, e.g., [5]).
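To fix ideas, here is a minimal sketch of the closed-loop construction for a static output feedback gain under the standard LFT partition; the matrix names (A, B1, B2, C1, C2, D11, D12, D21, K) and the assumption of zero feedthrough from the command input to the measured output are ours, since the paper's own display was not recoverable.

```python
import numpy as np

def closed_loop(A, B1, B2, C1, C2, D11, D12, D21, K):
    """Closed-loop state-space data for the static output feedback u = K y,
    assuming the standard LFT partition with zero feedthrough from u to y."""
    A_cl = A + B2 @ K @ C2          # closed-loop dynamics
    B_cl = B1 + B2 @ K @ D21        # performance input to state
    C_cl = C1 + D12 @ K @ C2        # state to performance output
    D_cl = D11 + D12 @ K @ D21      # direct performance feedthrough
    return A_cl, B_cl, C_cl, D_cl
```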

In H∞ synthesis, we compute the controller to minimize the H∞ norm of the closed-loop transfer function (see, e.g., [4]). The standard approach to H∞ synthesis in the literature uses the Kalman-Yakubovich-Popov Lemma and leads to a bilinear matrix inequality (BMI) [6]. Here we use a different and much more direct approach based on our 𝒰𝒱-decomposition method. The advantage is that Lyapunov variables can be avoided, which is beneficial because they are a source of numerical trouble: not only does their number grow quadratically with the system order, but they may also cause strong disparity between the optimization variables. The price to be paid for avoiding them is that a difficult semi-infinite and nonsmooth program has to be solved. To synthesize a dynamic controller of prescribed order, the objective is defined as a semi-infinite maximum of maximum eigenvalue functions of the closed-loop frequency response, which is nonsmooth and nonconvex with two sources of nonsmoothness, the infinite max-operator and the maximum eigenvalue function; here the conjugate transpose of the complex frequency-response matrix enters the construction.

The application we have in mind is optimizing the H∞-norm, which is structurally of the form (6), where the inner map is an operator with values in the space of symmetric or Hermitian matrices, equipped with the usual scalar product, and where the outer function is the maximum eigenvalue function on that space.
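To make this structure concrete, the sketch below evaluates an H∞-type objective of the form (6) on a finite frequency grid: the maximum over frequency of the largest eigenvalue of G(jω)^H G(jω), that is, the squared H∞ norm of the closed loop. The grid, the function names, and the use of the squared norm are our assumptions, and a finite grid only yields a lower bound on the semi-infinite maximum.

```python
import numpy as np

def transfer(A, B, C, D, w):
    """Closed-loop transfer matrix G(jw) = C (jwI - A)^{-1} B + D."""
    n = A.shape[0]
    return C @ np.linalg.solve(1j * w * np.eye(n) - A, B) + D

def hinf_objective(A, B, C, D, omegas):
    """Grid approximation of  max_w lambda_max( G(jw)^H G(jw) ),
    i.e. the squared H-infinity norm of the closed loop.  A finite grid
    only yields a lower bound on the true semi-infinite maximum."""
    vals = []
    for w in omegas:
        G = transfer(A, B, C, D, w)
        vals.append(np.linalg.eigvalsh(G.conj().T @ G).max())  # largest eigenvalue
    return max(vals)
```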

The above problem (5) can be recast as a case of (6). The program we wish to solve in this paper is the unconstrained minimization (7), where the objective function has the form (6).

The objective is nonsmooth, with two possible sources of nonsmoothness: (a) the infinite max-operator and (b) the nonsmoothness of the maximum eigenvalue function, which may lead to nonsmoothness of the inner function even for a fixed frequency.

Optimization of the H∞-norm is a prominent application in feedback synthesis, pioneered by Polak and coworkers; see, for instance, [7, 8] and the references given there. Existing methods for the H∞ synthesis problem are often based on first reformulating the problem into one involving linear matrix inequalities (LMIs) and an additional nonconvex rank constraint or nonconvex equality constraint. Methods for solving such reformulations include those based on the linearization method [9], the alternating projections method [10], the augmented Lagrangian method [11], and sequential semidefinite programming [12]. The synthesis problem can also be reformulated into a problem involving bilinear matrix inequalities (BMIs); works dealing with such reformulations include [12, 13] (see also the references therein). A disadvantage of these approaches is that they require the introduction of Lyapunov variables. As the number of Lyapunov variables grows quadratically with the number of state variables, the total number of variables can be quite large, and even problems of moderate size can lead to numerical difficulties [14].

In this paper, the H∞ synthesis problem is posed as an unconstrained, nonsmooth, nonconvex minimization problem and requires special optimization techniques. Our approach avoids the use of Lyapunov variables; hence, it is well suited for optimizing our reformulation of the synthesis problem. We develop a local nonsmooth optimization strategy, a superlinear space decomposition algorithm, which is suited for optimizing the H∞-norm. Problem (7) carries underlying smoothness information, so we can adopt a variable space decomposition approach. Moreover, since problem (7) has the special structure called primal-dual gradient (PDG) structure, introduced in [15], it is possible to identify smooth tracks, and we can therefore design a method with a fast convergence rate. The approach taken to solve this problem is based on the recently developed local optimization algorithm presented in [15, 16]. The 𝒰𝒱-space decomposition method was introduced in [15] (see also [17, 18]); moreover, it has been applied to many problems such as nonlinear programming and second-order cone programming (see [19-22]). The idea is to decompose the space into two orthogonal subspaces 𝒰 and 𝒱 at a point such that the nonsmoothness of the objective is concentrated essentially on 𝒱, while its smoothness appears on the 𝒰-subspace. More precisely, for a given point, 𝒱 is the subspace spanned by the differences of elements of the Clarke subdifferential at that point, and the space is decomposed as the direct sum of the two orthogonal subspaces 𝒰 and 𝒱. We then define the primal-dual Lagrangian, an approximation of the original function, and show that along certain manifolds it can be used to create a second-order expansion for a nondifferentiable function. As a result, we can design an algorithmic framework that makes a step in the 𝒱-space, followed by a 𝒰-Newton step, in order to obtain superlinear convergence, and show that it improves the situation considerably.
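The decomposition itself is easy to compute once a few subgradients at the point of interest are available. The following sketch builds orthonormal bases of 𝒱 (spanned by differences of subgradients) and 𝒰 (its orthogonal complement); it only illustrates the construction described above, with made-up names, and is not the algorithm developed in the paper.

```python
import numpy as np

def uv_decomposition(subgradients, tol=1e-10):
    """Orthonormal bases of V = span{g_i - g_0} and U = V^perp from a finite
    sample of subgradients g_0, ..., g_m at the point of interest."""
    G = np.asarray(subgradients, dtype=float)   # one subgradient per row
    n = G.shape[1]
    D = (G[1:] - G[0]).T                        # columns span V
    if D.size == 0:                             # a single subgradient: V = {0}
        return np.empty((n, 0)), np.eye(n)
    Q, s, _ = np.linalg.svd(D, full_matrices=True)
    rank = int((s > tol).sum())
    return Q[:, :rank], Q[:, rank:]             # (basis of V, basis of U)

# Example: f(x) = |x_1| + x_2**2 at the origin has extreme subgradients (1, 0)
# and (-1, 0); V is the x_1-axis (nonsmooth), U is the x_2-axis (smooth).
V_basis, U_basis = uv_decomposition([[1.0, 0.0], [-1.0, 0.0]])
```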

The rest of the paper is organized as follows. In Section 2, we recall some basic concepts of decomposition theory. In Section 3, we reformulate these problems as unconstrained finite-max optimization problems under the hypothesis that the largest eigenvalue has multiplicity one. We also mention some of the issues involved in trying to solve such problems. Using the primal-dual gradient (PDG) structure, we give an important conclusion about the second-order expansion of the function. Section 4 parallels the approach of Section 3 and presents a different way to deal with the case of multiple largest eigenvalues. The paper ends with some concluding remarks.

2. Preparation and Preliminary Results

We recall the 𝒰𝒱-theory developed in [15]. Let the objective be a finite-valued convex function. For a given point, we start by defining a decomposition of the space. The subspaces 𝒰 and 𝒱 are equivalently defined as follows. In other words, 𝒰 is the subspace along which the function appears to be differentiable at the point in question. We also have the following result, which is stated in [15].

Proposition 1. Let the function be proper and convex; for a given point, one has the following. (1) 𝒱 is the subspace parallel to the affine hull of the subdifferential at the point, and 𝒰 is its orthogonal complement. (2) For any subgradient in the relative interior of the subdifferential, 𝒰 and 𝒱 are, respectively, the normal and tangent cones to the subdifferential at that subgradient, where ri stands for the relative interior of a given set.
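Since the displayed definitions above were not recoverable, we record the standard form in which these subspaces are defined in [15]; the notation below is a reconstruction and should be read as such.

```latex
% For a finite-valued convex f and any subgradient \bar g \in \partial f(\bar x):
\mathcal{V}(\bar x) = \operatorname{lin}\bigl(\partial f(\bar x) - \bar g\bigr),
\qquad
\mathcal{U}(\bar x) = \mathcal{V}(\bar x)^{\perp},
\qquad
\mathbb{R}^n = \mathcal{U}(\bar x) \oplus \mathcal{V}(\bar x).
```

The decomposition does not depend on the particular subgradient chosen.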

We now recall the Clarke generalized gradient for locally Lipschitz functions.

Definition 2 (see [23, 24]). Let the function be locally Lipschitz; its generalized gradient at a given point is defined through the generalized directional derivative of the function at that point in a given direction.

The following results come from [23]; we will use these properties in later sections and omit their proofs.

Proposition 3. Suppose one has a finite collection of functions, each of which is Lipschitz near a given point, and define their pointwise maximum. Then the Clarke subdifferential of the maximum is contained in the convex hull of the subdifferentials of the active functions; if each function is regular at the point, then equality holds and the maximum is regular there.
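For the reader's convenience, the standard statement of this result from Clarke [23] reads as follows; the displayed formulas of Proposition 3 were lost in extraction, so the notation below is a reconstruction rather than the paper's own.

```latex
f(x) = \max_{1 \le i \le m} f_i(x),
\qquad
\partial f(x) \subseteq \operatorname{conv}\bigl\{\, \partial f_i(x) : i \in I(x) \,\bigr\},
\qquad
I(x) = \{\, i : f_i(x) = f(x) \,\},
```

with equality, and regularity of f at x, whenever each f_i is regular at x.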

Notation. We introduce the basic notation used in the remainder of the paper. We work with the space of symmetric matrices and the cone of positive semidefinite symmetric matrices, equipped with the Frobenius scalar product. We consider the multiplicity of the largest eigenvalue of a given matrix, the submanifold of matrices for which this multiplicity is fixed, the eigenspace associated with the largest eigenvalue together with an orthonormal basis of it, and the tangent and normal spaces to the submanifold at the matrix. We also use the adjoint of a linear operator. Much of the additional notation comes from [25, 26].

3. 𝒰𝒱-Space Decomposition for a Single Eigenvalue

3.1. Theory of the Single Eigenvalue Function

In this section we will analyse the case where the multiplicity of the largest eigenvalue is one at all active frequencies. This is motivated by practical considerations, because nonsmoothness (b) never occurred in our tests. The necessary changes required for the general case will be discussed in Section 4.

Lemma 4. For a closed-loop stabilizing controller, the set of active frequencies is either finite or comprises all frequencies; that is, every frequency is active.

A system for which every frequency is active is called all-pass. This is rarely encountered in practice. For the technical formulas we will concentrate on those controllers for which the set of active frequencies, or peaks, is finite.


In [27], three approaches to semi-infinite programming are discussed: exchange of constraints, discretization, and local reduction. We will use a local reduction method here. The main ideas are recalled below.

Let be a local solution of (7). Indexing the active frequencies at , we suppose that the following conditions are satisfied.

Assumption 5. Consider
(i) ,  ,
(ii) ,  ,
(iii) ,  for every .

These assumptions define the setting denoted as the standard case in semi-infinite programming [27]. The three conditions express the fact that the frequencies are the strict global maximizers of . Notice that condition (iii) is the finiteness hypothesis already mentioned, justified by Lemma 4.

Lemma 6. Under conditions (i)–(iii), the neighborhood of may be chosen such that for every . In particular, for every .

So we have that program (6) is locally equivalent to the following standard nonlinear program (12); we may then solve (12) via the so-called 𝒰𝒱-decomposition method.
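A plausible rendering of the locally reduced program (12), consistent with the local-reduction argument above (the original display was not recoverable, so the notation is ours), is

```latex
\min_{x}\; f(x),
\qquad
f(x) = \max_{1 \le i \le p} f_i(x),
\qquad
f_i(x) = \lambda_{\max}\bigl(F(x, \omega_i(x))\bigr),
```

where ω_1(x), ..., ω_p(x) denote the locally unique smooth extensions of the active frequencies near the local solution.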

Assumption 7. The gradients of the active functions at the local solution are linearly independent.

Under the hypothesis of Assumption 7, local convergence of this approach will be assured because this guarantees that (12) satisfies the linear independence constraint qualification hypothesis.

We denote , and ,  , stands for the eigenvector associated with the largest eigenvalue of .

Next we exhibit a special kind of structure of the objective, called primal-dual gradient (PDG) structure.

Proposition 8. There exists a ball about the point of interest on which the multiplicity of the largest eigenvalue remains equal to one, so the structure functions are smooth on this ball; in addition, (1) the structure functions agree with the objective at the point; (2) a corresponding relation holds for each of them on the ball; (3) the multiplier set is the unit simplex given below; (4) on the basis of the property of the subdifferential of maximum functions, a vector belongs to the subdifferential at a point of the ball if and only if it is a convex combination of the structure-function gradients whose multipliers vanish on the inactive indices.

We have the following result.

Theorem 9. Suppose the active frequency set is finite. Then the Clarke subdifferential of the objective at the point is the following set, where the active index set collects the indices at which the maximum is attained.

Proof. Because the objective is locally a finite maximum of smooth functions, we can directly make use of the formula for the Clarke subdifferential of a finite maximum and the derivative of the eigenvalue function with multiplicity one, and the proof is done.
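As a numerical companion to the multiplicity-one derivative formula used in this proof, the sketch below evaluates the gradient of x ↦ λ_max(F(x)) via the classical expression q^H (∂F/∂x_j) q, where q is a unit eigenvector of the (simple) largest eigenvalue; the operator F, the NumPy vector x, and the finite-difference approximation of ∂F/∂x_j are illustrative choices of ours.

```python
import numpy as np

def lambda_max_gradient(F, x, h=1e-6):
    """Gradient of x -> lambda_max(F(x)) at a point where the largest
    eigenvalue is simple: d/dx_j lambda_max = q^H (dF/dx_j) q, with q a unit
    eigenvector.  dF/dx_j is approximated by central differences; F must
    return a symmetric/Hermitian matrix."""
    _, Q = np.linalg.eigh(F(x))
    q = Q[:, -1]                                # eigenvector of the largest eigenvalue
    grad = np.zeros(x.size)
    for j in range(x.size):
        e = np.zeros(x.size); e[j] = h
        dF = (F(x + e) - F(x - e)) / (2.0 * h)  # central difference in x_j
        grad[j] = np.real(q.conj() @ dF @ q)
    return grad
```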

Theorem 10. Suppose Assumptions 5 and 7 hold. Then one has the following results at the point. (1) The Clarke subdifferential of the objective has the following expression. (2) Let 𝒱 denote the subspace generated by the subdifferential; then the second formula holds, where lin stands for the linear hull of a set.

Proof. With Theorem 9 and Assumption 5, we can get conclusion (1).
For conclusion (2), it follows from the definition of the 𝒱-space that the second formula holds. The proof is completed.

Remark 11. (i) Since the subspaces 𝒰 and 𝒱 generate the whole space, every vector can be decomposed along its 𝒰- and 𝒱-components at the point. In particular, any vector can be expressed as the sum of its 𝒰-component and its 𝒱-component.
(ii) From Theorem 10, the 𝒰-component of a subgradient is the same as that of any other subgradient at the point.

3.2. Smooth Trajectory and Second-Order Properties

Given the data above, the Lagrangian-like function of the reduced problem can be formulated as follows.

Theorem 12. Suppose Assumption 7 holds. Then, for all sufficiently small steps, the following hold. (i) The solution of the nonlinear system in the stated variables is unique and is given by a smooth function. (ii) For the solution function in (i), one has the stated expression.
The trajectory   is and In particular,  , , and .(iii).

Proof. (i) Differentiating the left hand side of (26) with respect to gives This Jacobian at is , which is nonsingular because of Assumption 7. There is also a Jacobian with respect to , so by the implicit function theorem, there is a function defined on a neighborhood of such that .
(ii) From (i), we have that is . Thus, the Jacobian and exist and are continuous. Differentiating the system with respect to , we obtain that or, in matrix form, . Using the expression , we have that By virtue of the continuity of , is nonsingular. Hence, Furthermore, is because , is ; then is . Thus, and are . From the definition of the spaces, we have . Hence, . So and .
(iii) The conclusion can be directly obtained in terms of (i) and the definition of .
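To illustrate the primal track on the simplest possible example (a toy finite-max function, not the H∞ objective; the functions and the point are our own), take f(x) = max(x₂, x₁²) at the origin, where both branches are active: 𝒱 is the x₂-axis, 𝒰 is the x₁-axis, and the track v(u) keeps both branches active, so it is of second order in u and tangent to 𝒰.

```python
import numpy as np
from scipy.optimize import brentq

# Toy finite-max function: f(x) = max(f1, f2), f1(x) = x[1], f2(x) = x[0]**2.
# At xbar = (0, 0) both branches are active; V = span{(0,1)}, U = span{(1,0)}.
f1 = lambda x: x[1]
f2 = lambda x: x[0] ** 2

def primal_track(u):
    """For a U-step u, return the V-component v(u) that keeps both branches
    active, i.e. solves f1(u, v) = f2(u, v).  Here the residual is v - u**2,
    so v(u) = u**2 = O(u^2): the track is tangent to U at xbar."""
    residual = lambda v: f1((u, v)) - f2((u, v))
    return brentq(residual, -10.0, 10.0)

for u in (0.1, 0.01):
    v = primal_track(u)
    print(u, v, max(f1((u, v)), f2((u, v))))   # f on the track is the smooth map u -> u**2
```

Running the loop prints v(u) ≈ u², confirming that the objective restricted to the track is a smooth function of the 𝒰-variable.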

So far we have developed a primal track. Now we turn our attention to an associated dual object, which is also a smooth function; we study a multiplier vector function, which depends on the structure function gradients and an arbitrary subgradient at the point.

Lemma 13. Given that , the system with , has a unique solution , which is given by in particular, for all .

Theorem 14. Given that , at the trajectory , one has

Proof. According to the definition in (25) and item (iii) of Theorem 12, we get the desired result.

Theorem 15. Suppose Assumption 7 holds; then, for all sufficiently small steps, the following assertions are true. (i) The function is smooth and satisfies the Lagrangian-like relation. (ii) The gradient is given explicitly and, in particular, takes a simpler form at the point itself. (iii) The Hessian is given explicitly and, in particular, takes a simpler form at the point itself.

Proof. (i) This conclusion follows from of .
(ii) Using the chain rule, the differential of the Lagrangian-like function can be written as follows. Multiplying each equation by the appropriate multiplier, summing the results, and using the preceding fact yields an identity; taking the transpose of this expression and combining it with (6.11) in [28] yields the desired result.
At the point itself the quantities simplify, and by Remark 11(ii) we have the stated expression.
(iii) Differentiating the relevant equation with respect to the variable, we obtain an identity which, by the proof of Theorem 6.3 in [28], gives the stated formula; at the point itself it simplifies as claimed. We finish the proof.

Theorem 16. Suppose Assumption 7 holds. Then, for sufficiently small steps, the following second-order expansion of the objective holds along the trajectory:

Proof. From the definition of , we have Since , we get Therefore,
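The shape of this expansion, in the standard notation of the space decomposition theory (a reconstruction, since the display in Theorem 16 was not recoverable), is

```latex
f\bigl(\bar x + u \oplus v(u)\bigr)
  = f(\bar x) + \langle \bar g_{\mathcal U}, u \rangle
  + \tfrac{1}{2} \langle H u, u \rangle + o\bigl(\|u\|^2\bigr),
```

where ḡ_𝒰 is the common 𝒰-component of the subgradients at the point and H is the 𝒰-Hessian of the primal-dual Lagrangian.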

4. 𝒰𝒱-Decomposition for Multiple Eigenvalues

4.1. 𝒰𝒱-Theory of the Multiple Eigenvalue Function

The working hypothesis of the previous section was that the leading eigenvalues had multiplicity one for all frequencies in the active set and for all points in a neighborhood of the limit point. This hypothesis is motivated by our numerical experience, where we have never encountered multiple eigenvalues. This is clearly in contrast with experience in pure eigenvalue optimization problems. However, our approach is still functional if the hypothesis of single eigenvalues at active frequencies is abandoned. Based on the weaker assumption that the eigenvalue multiplicities at the limit point are known for all active frequencies, and on the information at the current iterate, we have a reliable technique to estimate these multiplicities.

This situation has been discussed by several authors (see, e.g., [27, 29-31]). Consider an active frequency at which the largest eigenvalue has multiplicity greater than one. We replace the maximum eigenvalue function by the average of the first eigenvalues up to that multiplicity. This function is smooth and convex in a neighborhood of the smooth manifold of matrices whose largest eigenvalue has that multiplicity, and it coincides with the maximum eigenvalue function on this manifold. Then we may replace the nonsmooth information contained in the maximum eigenvalue by the smooth information contained in the averaged function by adding the constraint that the matrix stays on the manifold. The manifold has a known codimension and, in a neighborhood, may be described by smooth equations, which has been shown independently in [16, 32]. The extension to semi-infinite eigenvalue optimization is clear under the finiteness assumption (iii). We may then approach minimization of the H∞-norm along the same lines and obtain the finite program, where the multiplicity of the largest eigenvalue at each active frequency appears and where we use an orthonormal basis associated with the corresponding eigenspace. According to the foregoing analysis, we can transform the above constrained optimization problem into the following form. For convenience, we introduce the following notation.

Since the structure is unchanged, we keep the same label for the objective. In view of the properties of indicator functions, we can transform (56) into an unconstrained optimization problem. Thus, for problem (56), we just need to deal equivalently with the following form, where the last term denotes the indicator function of the corresponding set.
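To make the eigenvalue-averaging device of this section concrete, here is a minimal numerical sketch (function and argument names are ours): the average of the r largest eigenvalues of a symmetric matrix, which coincides with the maximum eigenvalue on the manifold where it has multiplicity r and is a smooth function of the matrix near that manifold.

```python
import numpy as np

def avg_top_eigenvalues(X, r):
    """Average of the r largest eigenvalues of a symmetric/Hermitian matrix X.
    On the manifold where lambda_max(X) has multiplicity r this average equals
    lambda_max(X) itself, and it is smooth near that manifold."""
    w = np.linalg.eigvalsh(X)       # eigenvalues in ascending order
    return w[-r:].mean()

# Example: X = diag(2, 2, 1) lies on the multiplicity-2 manifold; the average
# of its two largest eigenvalues equals lambda_max = 2.
print(avg_top_eigenvalues(np.diag([2.0, 2.0, 1.0]), 2))
```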

Similarly to Proposition 8, we can show that problem (58) possesses the PDG structure.

Proposition 17. The function in (58) is a primal-dual gradient structured (PDG) function.

Proof. First, in [32] Shapiro and Fan gave the fact that the average of the first eigenvalues is a smooth function near the corresponding manifold, so the structure functions here are smooth as well. In this way, there exists a ball about the point and a dual multiplier set with the following properties: (1) the structure functions agree with the objective at the point; (2) each structure function is smooth on the ball; (3) the multiplier set is a closed convex set such that (a) in the multiplicity-one case it reduces to the set defined in (14), (b) it contains the appropriate unit vectors, and (c) each index is assigned a positive weight by some multiplier; (4) a vector belongs to the subdifferential if and only if it is the corresponding combination of the structure-function gradients, with multipliers vanishing on the inactive indices; moreover, the multipliers satisfy dual feasibility.

Assumption 18. The gradients of the active structure functions are linearly independent.

Theorem 19. Suppose Assumption 18 holds. Then one has the following results at . (1)The Clarke subdifferential of has the following expression: where .(2)Let denote the subspace generated by the subdifferential . Then where ,

4.2. Smooth Trajectory and Second-Order Properties

We give the smooth trajectory information for this function in the following theorem; its proof is similar to that of Theorem 12.

Theorem 20. Suppose Assumption 18 holds. Then, for all sufficiently small steps, the following hold. (i) The nonlinear system in the stated variables has a unique solution, given by a smooth function. (ii) For this function, one has the stated expression.
The trajectory is , and In particular, , and .(iii), and the smooth trajectory is tangent to at .(iv), and .

Now we turn our attention to an associated dual object, which is also a smooth function.

Theorem 21. Suppose Assumption 18 holds. Then, along the trajectory and relative to a subgradient at the point, for all sufficiently small steps the linear system in the stated variables has a unique solution, given in terms of the quantities defined in Theorem 20.

Next we consider the following primal-dual function

Theorem 22. Given , at the trajectory , one has

Theorem 23. Suppose Assumption 18 holds. Then, for all sufficiently small steps, the following assertions are true. (i) The function is smooth. (ii) The gradient of the primal-dual function is given explicitly and, in particular, takes a simpler form at the point itself. (iii) The Hessian of the primal-dual function is given explicitly in terms of an associated matrix function and, in particular, takes a simpler form at the point itself.

Proof. (i) Because the trajectory is smooth, the claim follows from the corresponding item of the above theorem. At the same time, since Assumption 18 holds, using (68) gives one identity, and a further identity follows. Multiplying the above equations by the corresponding multipliers, respectively, and summing, we get the Lagrangian-like expression in item (i).
(ii) Using the chain rule, the differentials of the Lagrangian-like functions (69) and (77) can be written as follows. Multiplying each equation by the appropriate multipliers, summing the results, and using the preceding fact yields an identity; taking the transpose of this expression and combining it with (6.11) in [28] yields the desired result.
At the point itself the quantities simplify and, by Theorem 19, we obtain the stated expression.
(iii) Differentiating the following equation with respect to , we obtain where . It follows from the proof of Theorem 6.3 in [28] that Then when , where .

We call the corresponding Hessian matrix of the primal-dual function at the point a basic 𝒰-Hessian and denote it accordingly. Using second-order 𝒰-derivatives, we can specify second-order expansions and give related necessary conditions for the optimization problem.

Theorem 24. Suppose Assumption 18 holds and . Then for small enough, there holds the second-order expansion of along the trajectory ,

Proof. From the definition of , we have Since , we get Therefore,

Corollary 25. Suppose Assumption 18 holds and the point is a local minimizer of (56). Then the first-order optimality condition holds and the associated basic 𝒰-Hessian is positive semidefinite.

5. Conclusions

In this paper, we mainly study the 𝒰𝒱-theory to optimize the H∞-norm or other nonsmooth criteria which are semi-infinite maxima of maximum eigenvalue functions. We use a methodology from semi-infinite programming to obtain a local nonlinear programming model and apply the space decomposition method. Exploiting the so-called PDG structure that this problem possesses, a Lagrangian-like theory is applied to this class of functions. Under some assumptions, we obtain the first- and second-order derivatives of the primal-dual Lagrangian function. This method can operate well in practice.

As for further work, in this paper we only give the theoretical analysis for this special class of eigenvalue optimization problems; we will continue to study an implementable algorithm, and we will extend the approach from convex eigenvalue problems to nonconvex cases, whose related theory will be investigated in later papers.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors would like to thank Yuan Lu from Shenyang University for numerous fruitful discussions and an anonymous referee for constructive criticism that helped to improve the presentation. This work was supported by the National Natural Science Foundation of China (Grant nos. 11171049, 11226230, and 11301347) and by the General Project of the Education Department of Liaoning Province (L2012427).