Research Article  Open Access
A Nonlinear Lagrange Algorithm for Stochastic Minimax Problems Based on Sample Average Approximation Method
Abstract
This paper presents an implementable nonlinear Lagrange algorithm for stochastic minimax problems based on the sample average approximation method. In its second step the algorithm minimizes a nonlinear Lagrange function in which the original functions are replaced by their sample average approximations, and the Lagrange multiplier is updated through its sample average approximation. Under a set of mild assumptions, it is proven that the sequences of solutions and multipliers obtained by the proposed algorithm converge to the Kuhn-Tucker pair of the original problem with probability one as the sample size increases. Finally, numerical experiments on five test examples are performed, and the numerical results indicate that the algorithm is promising.
1. Introduction
Consider the stochastic minimax problems of the form where , is a random vector supported on the probability space , , denotes expectation with respect to the distribution of , and is well defined. Problem (1) arises in various settings such as inventory theory, robust optimization, and engineering, and it has drawn much attention in recent years; see, for example, [1–5].
A nonlinear Lagrange function for problem (1) can be established based on Zhang and Tang [6]; that is, where is the Lagrange multiplier and , and is a controlling parameter. The properties of function (2) were investigated in [6], and the convergence analysis of the corresponding nonlinear Lagrange algorithm was presented in [7]. Although function (2) overcomes the nondifferentiability of the objective function in problem (1), the exact numerical evaluation of the expected value in (2) is very difficult, because either the distribution of the random vector is unknown or the multidimensional integral is too complex to compute.
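To illustrate how a smooth function can stand in for the nonsmooth maximum, the following minimal sketch assumes that function (2) has the exponential (log-sum-exp) form t ln(sum_i u_i exp(f_i(x)/t)). Formula (2) itself is not reproduced in this text, so this form, the name `nonlinear_lagrange`, and the two toy component functions are assumptions made purely for illustration. For multipliers in the unit simplex, the value of such a function lies between max_i f_i + t ln u_j (with j the index of the largest component) and max_i f_i, so it approaches the maximum as t decreases.

```python
import math

def nonlinear_lagrange(f_values, u, t):
    """Assumed form t * ln(sum_i u_i * exp(f_i / t)), evaluated stably."""
    shift = max(f_values)  # subtract the largest value before exponentiating
    total = sum(ui * math.exp((fi - shift) / t) for fi, ui in zip(f_values, u))
    return shift + t * math.log(total)

def components(x):
    # Hypothetical deterministic components f_1, f_2 (not from the paper).
    return [(x - 1.0) ** 2, (x + 1.0) ** 2]

u = [0.5, 0.5]  # multipliers in the unit simplex
for t in (1.0, 0.1, 0.01):
    values = components(0.3)
    # The smooth value approaches max(values) as t decreases.
    print(t, nonlinear_lagrange(values, u, t), max(values))
```

The shift by the largest component only guards against overflow for small t; it does not change the value of the function.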
The sample average approximation (SAA) method [8–15] is a well-behaved approach for bypassing this difficulty. The idea of the SAA method is to generate a random sample of the random variable with sample size and approximate the involved expected value function by the corresponding sample average function . Inspired by the SAA method, we present the SAA function of as follows: where . Furthermore, we will propose an implementable nonlinear Lagrange algorithm based on SAA function (3), in which function (3) is minimized and the Lagrange multiplier is updated by its SAA form. Under some mild assumptions on problem (1), we will show that the sequences of solutions and multipliers generated by the SAA method-based nonlinear Lagrange algorithm converge to the Kuhn-Tucker pair of the original problem with probability one as the sample size increases.
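The basic SAA idea can be sketched in a few lines. The integrand below, its closed-form expectation, and the names `sample_average` and `integrand` are hypothetical choices made for illustration and are not taken from the paper; the sample is drawn from the uniform distribution on [0, 1] only because its expectation is then known exactly.

```python
import random

def sample_average(integrand, x, sample):
    """Sample average (1/N) * sum_j integrand(x, xi_j) over the given sample."""
    return sum(integrand(x, xi) for xi in sample) / len(sample)

def integrand(x, xi):
    # Hypothetical integrand: for xi ~ U(0, 1),
    # E[(x - xi)^2] = x^2 - x + 1/3.
    return (x - xi) ** 2

rng = random.Random(0)  # fixed seed so the run is repeatable
x = 0.25
exact = x * x - x + 1.0 / 3.0
for n in (10, 1000, 100000):
    sample = [rng.random() for _ in range(n)]  # i.i.d. sample of size n
    approx = sample_average(integrand, x, sample)
    print(n, approx, abs(approx - exact))
```

By the law of large numbers the printed error should typically shrink as the sample size grows, although it need not decrease monotonically for any single run.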
The remainder of this paper is organized as follows. Preliminaries are given in Section 2. The SAA method-based nonlinear Lagrange algorithm and its convergence analysis are established in Section 3. Section 4 reports the numerical results obtained by using the proposed algorithm to solve five test examples. Finally, conclusions are drawn in Section 5.
2. Preliminaries
This section prepares for the convergence analysis of the proposed SAA method-based nonlinear Lagrange algorithm. We first state the assumptions on problem (1) and then list some results that are essential to our discussion. Finally, we recall the nonlinear Lagrange algorithm in [7].
Let denote the Kuhn-Tucker (KT) pair of problem (1). Let be small enough and define . The Lagrange function for problem (1) is defined by . Set and . We list the following assumptions on problem (1), which will be used in the subsequent theoretical analysis.
(A1) is twice continuously differentiable on .
(A2) There exists a nonnegative measurable function such that is finite and for every the inequality holds with probability one.
(A3) The random sample is independent and identically distributed.
(A4) satisfies the KT condition. That is,
(A5) The strict complementarity condition holds; that is, for .
(A6) The linear independence constraint qualification holds; that is, is a set of linearly independent vectors.
(A7) For all satisfying , , it holds that where is a constant.
Definition 1 (see [11]). For nonempty sets and in , one denotes by the distance from to and by the deviation of the set from the set .
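For finite point sets, the two quantities in Definition 1 can be computed directly. The sketch below assumes the standard forms dist(x, A) = inf over x' in A of ||x - x'|| and D(A, B) = sup over x in A of dist(x, B) with the Euclidean norm; the formulas are not reproduced in this text, and the function names are hypothetical.

```python
import math

def dist(x, A):
    """Distance from the point x to the finite set A (Euclidean norm)."""
    return min(math.dist(x, a) for a in A)

def deviation(A, B):
    """Deviation of the finite set A from the finite set B."""
    return max(dist(a, B) for a in A)

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0)]
# The deviation is not symmetric: A contains a point far from B,
# while every point of B already belongs to A.
print(deviation(A, B), deviation(B, A))
```

The asymmetry is the reason the convergence statements below are phrased in terms of the deviation of the SAA solution set from the true solution set rather than in terms of a symmetric distance.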
Lemma 2 (Heine-Cantor theorem; see [16]). If is a continuous function and is compact, then is uniformly continuous, where and are two metric spaces.
Note. An important special case is that every continuous function from a closed interval to the real numbers is uniformly continuous.
Lemma 3. Define for . Suppose that converges to with probability one uniformly on for . Then converges to with probability one uniformly on .
Proof. From the given condition, for , one has that, for any , there exists such that when , holds with probability one for any .
Let . Thus we have that, for any , when , for any , the following holds with probability one:
which means that Lemma 3 is true.
Algorithm 4. We have the following.
Step 1. Choose , where , , and small enough and set .
Step 2. Solve
and obtain the optimal solution .
Step 3. If , then stop. Otherwise go to Step 4.
Step 4. Update the Lagrange multiplier by
Step 5. Set and return to Step 2.
3. The SAA Method-Based Nonlinear Lagrange Algorithm and Its Convergence
In view of the numerical difficulty of Algorithm 4 and motivated by the SAA method, we first provide the following implementable nonlinear Lagrange algorithm based on the SAA method. We then establish the convergence analysis of the SAA method-based algorithm under assumptions (A1)–(A7) in this section.
The implementable SAA method-based Algorithm 5 is presented as follows.
Algorithm 5. We have the following.
Step 1. Choose , where , small enough, , and is large enough. Set .
Step 2. Solve
and obtain the optimal solution .
Step 3. If , then stop. Otherwise go to Step 4.
Step 4. Update the Lagrange multiplier by
Step 5. Set and return to Step 2.
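The steps of Algorithm 5 can be sketched on a one-dimensional toy problem. Everything specific in this sketch is an assumption made for illustration: the exponential (log-sum-exp) form of the nonlinear Lagrange function and the matching multiplier update, the two integrands F_1(x, ξ) = (x - 1)^2 + (ξ - 0.5) and F_2(x, ξ) = (x + 1)^2 - (ξ - 0.5) with ξ uniform on [0, 1], the golden-section search used in place of a general unconstrained solver, and the stopping test on the multiplier change. None of these is taken from the paper.

```python
import math
import random

def lagrange(f_values, u, t):
    """Assumed form t * ln(sum_i u_i * exp(f_i / t)), evaluated stably."""
    shift = max(f_values)
    return shift + t * math.log(
        sum(ui * math.exp((fi - shift) / t) for fi, ui in zip(f_values, u)))

def golden_min(phi, lo, hi, tol=1e-10):
    """Golden-section search for a minimizer of a unimodal phi on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if phi(c) < phi(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

def saa_nonlinear_lagrange(sample, t=0.5, u0=(0.9, 0.1), tol=1e-8, max_iter=50):
    # The noise enters the toy integrands additively, so their sample
    # averages only depend on d = mean(xi) - 0.5.
    d = sum(sample) / len(sample) - 0.5
    f_hat = lambda x: [(x - 1.0) ** 2 + d, (x + 1.0) ** 2 - d]
    u = list(u0)
    x = 0.0
    for _ in range(max_iter):
        # Step 2: minimize the SAA nonlinear Lagrange function in x.
        x = golden_min(lambda z: lagrange(f_hat(z), u, t), -3.0, 3.0)
        # Step 4: multiplier update (assumed exponential form).
        f = f_hat(x)
        shift = max(f)
        w = [ui * math.exp((fi - shift) / t) for fi, ui in zip(f, u)]
        total = sum(w)
        u_new = [wi / total for wi in w]
        # Step 3: hypothetical stopping test on the multiplier change.
        done = max(abs(p - q) for p, q in zip(u_new, u)) < tol
        u = u_new
        if done:
            break
    return x, u, d

rng = random.Random(0)
sample = [rng.random() for _ in range(5000)]  # fixed i.i.d. sample
x_N, u_N, d_N = saa_nonlinear_lagrange(sample)
print(x_N, u_N, d_N / 2.0)
```

For this toy problem the SAA minimax solution is d/2, where the two sample average functions cross, and the true solution is 0, so the last printed value gives a reference for the computed iterate. The sample is drawn once and reused in every iteration, which is a design choice of this sketch.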
Taking into account the local convergence analysis of Algorithm 4 given in [7], next we will study the convergence of the sequence pair obtained by Algorithm 5 on . Let and denote the optimal value and the set of optimal solutions of and and denote the optimal value and the set of optimal solutions of , respectively. Set .
Theorem 6. If assumptions (A1)–(A3) hold and converges to with probability one for some , then the following statements hold: (i) converges to with probability one uniformly on ; (ii) converges to and tends to 0 with probability one as .
Proof. (i) Let , where . Then one has
Now we prove that converges to with probability one uniformly on . Considering the definition of , we have
It follows from assumption (A1) and Theorem 7.48 in [11] that both and are continuous at on . Consequently, for any , there exist constants and such that and for . Since is continuous at on and from Lemma 2, we have that is uniformly continuous on ; that is, for any and , there exists such that, for , it holds that
Furthermore, from Theorem 7.48 in [11] we know that converges to with probability one uniformly on , which means that, for the above given , there exists such that when , the inequality
holds with probability one for any . In view of formula (15) and formula (16), one draws the conclusion that, for any , there exists such that when , it holds that
with probability one for any . That is, converges to with probability one uniformly on .
In view of converging to with probability one, being bounded on with probability one, , and formula (14), we obtain that converges to with probability one uniformly on as . Moreover, one gets that converges to with probability one uniformly on as from Lemma 3.
Next we prove that converges to with probability one uniformly on . Let and . Hence, one has
From the above discussion, we know that , for any . Since is continuous at on and by Lemma 2, we obtain that is uniformly continuous on the interval . That is, for any and , there exists such that, for , it holds that
Moreover, for the given , there exists such that, for , it holds that
with probability one for any . From formulas (19) and (20), it follows that, for any , there exists such that, for , the following inequality holds:
with probability one for any . Combined with formula (18), this proves statement (i).
(ii) Statement (ii) follows from statement (i) and Theorem 5.3 in [11]. This completes the proof of Theorem 6.
Theorem 7. If assumptions (A1)–(A3) hold, letting , , then, for any , the following statements hold: (i) converges to with probability one for ; (ii) converges to with probability one uniformly on ; (iii) tends to , and tends to 0 with probability one as .
Proof. (i) We prove statement (i) by mathematical induction.
(a) Let ; then for we have
Considering , we have that converges to with probability one from Theorem 6. For , it holds that
Noting that the first term on the right-hand side of formula (23) converges to 0 with probability one by Theorem 7.48 in [11] and the second term converges to 0 with probability one since is continuous, we obtain that converges to with probability one. Moreover, since is continuous at on , one gets that converges to with probability one. Then it follows from the properties of convergent sequences that converges to with probability one for .
(b) When , we assume that converges to with probability one for . We next prove that, when , converges to with probability one for .
Let ; then for one has
From Theorem 6, we know that converges to with probability one as . By a similar proof process to that in (a), we have that converges to with probability one for . For , one has that
Noting that the first term of (25) tends to 0 with probability one as for converging to with probability one and being bounded on with probability one and the second term tends to 0 with probability one as for , we obtain that converges to with probability one. Then it follows from properties of convergent sequence that converges to with probability one for .
According to (a) and (b), statement (i) holds.
(ii) Statement (ii) follows from statement (i) and Theorem 6.
(iii) Statement (iii) follows from statement (ii) and Theorem 5.3 in [11].
The above theorem shows that the sample average approximation Lagrange multiplier converges to its counterpart with probability one, and the optimal value and optimal solutions of the subproblem converge to their counterparts of the subproblem with probability one under some mild conditions. Next we will analyze the convergence of Algorithm 5 under some mild conditions.
Theorem 8. If assumptions (A1)–(A7) hold, letting , then there exist and such that, for any , the sequence pair () converges to the KT pair () with probability one.
Proof. Under assumptions (A1) and (A4)–(A7), from Theorem 3.1 in [7], we have that there exist and such that, for any and , the following inequality holds:
where is a constant, which implies that the pair tends to the KT pair of the original problem (1) as .
Since assumptions (A1)–(A3) hold and , it follows by Theorem 7 that the pair converges to the pair with probability one as .
Furthermore, since
the conclusion is obtained.
Remark 9. Theorem 8 shows that, under some mild assumptions, the sequence pair generated by Algorithm 5 converges locally to the KT pair of the original problem (1) with probability one as and when the controlling parameter is less than the threshold .
4. Numerical Results
This section presents the numerical results obtained by applying Algorithm 5 to five test examples, which are adapted from deterministic optimization problems in the literature [17, 18]. All numerical experiments are run in Matlab 7.1 on the same computer, which has an Intel Core i3-2310M CPU @ 2.10 GHz and 2 GB of memory.
In the experiments, the sample with sample size is generated by in Matlab 7.1. For each problem, we choose , , , , , and , respectively, for comparison. The initial value for each example. The unconstrained minimization problem in Step 2 of Algorithm 5 is solved by the BFGS quasi-Newton method combined with the Wolfe inexact line search rule, and the control precision is in this step. The stopping criterion in Step 3 is where .
The numerical results are reported in Tables 1–5, in which , , iter., , and represent the sample size, the value of the controlling parameter, the number of iterations, the error between the solution obtained by Algorithm 5 and the optimal solution of problem (1), and the error between the optimal value obtained by Algorithm 5 and the optimal value of problem (1), respectively.





Example 1 (Hald-Madsen [17]). Consider the unconstrained stochastic minimax problem (1), in which is uniformly distributed on and are given by