This paper presents a general and comprehensive description of Optimization Methods and Algorithms from a novel viewpoint. It is shown, in particular, that Direct Methods, Iterative Methods, and Computer Science Algorithms belong to a well-defined general class of both Finite and Infinite Procedures, characterized by suitable descent directions.

1. Introduction

The dichotomy between Computer Science and Numerical Analysis has been for many years the main obstacle to the development of eclectic computational tools. With the latter term the author indicates the capability of implementing algorithms properly adaptable to particular environmental requirements and, therefore, optimized for this aim.

Since the formulation of a problem requires the preliminary definition of the variables and the functions involved in the model, the antithesis between finite and continuous applied mathematics is even stronger from a computational point of view.

In Computer Science, problems are typically defined on discrete sets (graphs, integer variables and so forth) and are characterized by procedures formalized in a finite number of steps.

Direct Methods, which are classical tools of Numerical Analysis, can in fact be considered algorithms according to the standard Computer Science definitions. However, the presence of ill-conditioned matrices can seriously affect the practical implementation of Direct Methods. On the other hand, Iterative Methods are based in the majority of cases on the convergence of a sequence approximating the optimal solution of a problem defined in a continuous range. Proper stopping rules on the truncation error reduce the latter computational scheme to a finite process but, unfortunately, in many cases the theoretical result is affected by a variety of numerical instability problems, thereby preventing a precise forecast of the true number of iterations required to achieve the desired approximation.

Furthermore, Linear Programming, Convex Quadratic Programming, and the unconstrained minimization of a symmetric positive definite bilinear form are continuous problems that can be exactly solved with a finite number of steps. This proves that the distinction between algorithms and infinite iterative procedures is not always characterized by the discrete or the continuous range of the variables involved in the problem.

Most Numerical Analysis methods are based upon the application of the Fixed Point theorem, which assures the convergence of the iterative scheme by means of a contraction of the distance between successive terms of the sequence approximating the optimal solution.

Gradient methods are usually considered in the literature as particular procedures in the frame of optimization techniques, for classical unconstrained or constrained problems.

The main aim of the present paper is to show that Gradient or Gradient-type methods represent the fundamental computational tool to solve a wide set of continuous optimization problems, since they are based on a unitary principle that applies both to finite and to infinite procedures.

Moreover, some classical discrete optimization algorithms can also be viewed in the framework of Gradient-type methods.

Hence, the gradient approach makes it possible to deal with problems involving variables defined both in a continuous range and in a discrete one, by utilizing finite or infinite procedures in a quite general perspective.

It is essential to underline that ABS methods [1], which represent a remarkable class of algorithms for solving linear and nonlinear equations, are founded on a quite different approach. Roughly speaking, ABS methods construct a set of spanning matrices in $R^n$ by performing an adaptive, parameter-dependent optimization associated with the dimension of the subspace. In many ABS methods the choice of the set of optimal parameters is, in fact, crucial in order to identify by a unified approach the structural features of the optimization algorithms. Parameter dependence is not present in Gradient-type methods.

It is important to emphasize that the typical finiteness of Computer Science algorithms is characterized by classes of Gradient-type methods converging to an isolated point of a suitable sequence, generated by the procedure.

Furthermore, the most recent algorithms for Local Optimization can be precisely described by Gradient-type methods in a general framework. As a matter of fact, Interior Point techniques [2, 3] and Barrier Algorithms [4] represent a wide set of Gradient-type methods for NonLinear Programming.

Moreover, a fundamental role in this new approach is played by the properties of suitable Structured matrices associated with the optimization procedures. Advanced Linear Algebra Techniques are, in fact, essential to construct low-complexity algorithms.

We point out, in particular, the techniques based on Fast Transforms and the corresponding approximations by algebras of simultaneously diagonalized matrices [5–10].

The utilization of Advanced Linear Algebra Techniques in NonLinear Programming opens a new research field, leading in many cases to a significant improvement both in the efficiency and in the practical application of Gradient-type methods for problems of operational interest [11, 12].

In Deterministic Global Optimization structured matrices allow remarkable results in the frame of the Tunneling techniques, by using the classical $\alpha BB$ approach [9].

The novel results on Tensor computation [13] are a promising area of research to improve the efficiency of global optimization algorithms for large-scale problems, and particularly for the effective construction of more general sets of Repeller matrices in the Tunneling phases [14, 15]. This approach can also have important consequences in Nonlinear Integer Optimization (see the pioneering work in [16]), taking into account the more recent results concerning the discretization of the problem by continuation methods (see, e.g., [17]).

Therefore, this survey also aims at finding in-depth general relationships between Local Optimization techniques and Deterministic Global Optimization algorithms in the frame of Advanced Linear Algebra Techniques.

2. The Gradient and the Gradient-Type Approach

Let
$$\min_{\mathbf{x}\in A} f(\mathbf{x}),\quad A\subseteq R^n, \tag{2.1}$$
be an unconstrained problem to be solved.

By assuming $f(\mathbf{x})\in C^1(A)$, the simplest heuristic procedure to deal with (2.1) is to determine the stationary points of $f(\mathbf{x})$ in $A$, that is, $\mathbf{x}^*\in A:\ \nabla f(\mathbf{x}^*)=\mathbf{0}$, by the recursive computational scheme
$$\mathbf{x}^{(k+1)}=\mathbf{x}^{(k)}-\lambda_k\nabla f(\mathbf{x}^{(k)}),\quad k=0,1,\dots \tag{2.2}$$
with $\mathbf{x}^{(0)}\in A$ being the initial point of the procedure, $-\nabla f(\mathbf{x}^{(k)})$ the direction of maximum local decrease, and $\lambda_k>0$ the one-dimensional step-size.

For all $k$, $\lambda_k$ is computed such that
$$\mathbf{x}^{(k+1)}\in A,\quad f(\mathbf{x}^{(k+1)})\le f(\mathbf{x}^{(k)}), \tag{2.3}$$
and the sequence $\{\mathbf{x}^{(k+1)}\}$ satisfies the condition
$$\lim_{k\to\infty}\nabla f(\mathbf{x}^{(k)})=\mathbf{0}. \tag{2.4}$$
The iterative method (2.2) is a particular case of the following general Gradient-type method:
$$\mathbf{x}^{(k+1)}=\mathbf{x}^{(k)}-\lambda_k\mathbf{s}^{(k)},\quad k=0,1,\dots, \tag{2.5}$$
where $\mathbf{s}^{(k)}$ is a descent direction, that is, $\mathbf{s}^{(k)T}\nabla f(\mathbf{x}^{(k)})>0 \iff \cos(-\nabla f(\mathbf{x}^{(k)}),\mathbf{s}^{(k)})\ge 0$.

The following theorem generalizes a well-known result shown in [18].

Theorem 2.1 (see [7]). If $\mathbf{s}^{(k)}$ is a descent direction in $\mathbf{x}^{(k)}$ for a function $f(\mathbf{x})$, then
$$\mathbf{s}^{(k)}=-A_k^{-1}\nabla f(\mathbf{x}^{(k)}) \tag{2.6}$$
with $A_k$ being a symmetric positive definite (spd) matrix.
Moreover, the following property holds:
$$\cos\left(-\nabla f(\mathbf{x}^{(k)}),\mathbf{s}^{(k)}\right)\ge c>0 \iff \operatorname{cond}(A_k)\le M,\quad \forall k. \tag{2.7}$$

Remark 2.2. Particular cases of descent directions can be obtained by setting
$A_k=I$ (Steepest Descent method),
$A_k=\nabla^2 f(\mathbf{x}^{(k)})$ (Newton-Raphson method),
$A_k\approx\nabla^2 f(\mathbf{x}^{(k)})$ (general quasi-Newton and classical BFGS methods),
$A_k$ structured $\approx\nabla^2 f(\mathbf{x}^{(k)})$ (low-complexity BFGS-type methods)
(see [5, 6, 18, 19]).
It is useful to underline that the general theory of admissible directions for unconstrained optimization [20] is also a special case of (2.5). By setting, in fact, for a given $\gamma>0$:
$$D\left(\gamma,\mathbf{x}^{(k)}\right)=\left\{\mathbf{s}^{(k)}\in R^n:\ \|\mathbf{s}^{(k)}\|=1,\ \nabla f(\mathbf{x}^{(k)})^T\mathbf{s}^{(k)}\ge\gamma\,\|\nabla f(\mathbf{x}^{(k)})\|\right\}, \tag{2.8}$$
one can obtain other Gradient-type methods described by (2.5).

The iterative scheme described by Algorithm 1 contains several ingredients of a general Gradient-type method (see [20]).

(a) Given $\{\gamma_k\}$, $\{\sigma_k\}$, $k=0,1,\dots$ s.t.
$\inf\{\gamma_k\}>0$, $\inf\{\sigma_k\}>0$
Let $\mathbf{x}^{(0)}\in R^n$ be a starting point
For a given vector $\mathbf{s}^{(k)}\in D(\gamma_k,\mathbf{x}^{(k)})$, set
$\mathbf{x}^{(k+1)}=\mathbf{x}^{(k)}-\lambda_k\mathbf{s}^{(k)}$, with $\lambda_k\in(0,\sigma_k\|\nabla f(\mathbf{x}^{(k)})\|)$:
$f(\mathbf{x}^{(k+1)})\approx\min_\mu\{f(\mathbf{x}^{(k)}-\mu\mathbf{s}^{(k)}),\ 0<\mu\le\sigma_k\|\nabla f(\mathbf{x}^{(k)})\|\}$

The convergence of Algorithm 1 is guaranteed by the following result (see again [20]).

Theorem 2.3. Let $f(\mathbf{x})\in C^1(A)$, $A$ open $\subset R^n$. Let $K=\{\mathbf{x}\in R^n:\ f(\mathbf{x})\le f(\mathbf{x}^{(0)})\}\subset A \Rightarrow \mathbf{x}^{(0)}\in K$. Then, for all $\{\mathbf{x}^{(k)}\}$ evaluated by Algorithm 1:
(i) $\mathbf{x}^{(k)}\in K$, for all $k$;
(ii) if $\mathbf{x}^{(k+1)}\ne\mathbf{x}^{(k)}$, $\{\mathbf{x}^{(k)}\}$ has at least an Extremal Point (EP) $\mathbf{x}^*$;
(iii) every EP $\mathbf{x}^*$ of $\{\mathbf{x}^{(k)}\}$ is a stationary point, that is, $\nabla f(\mathbf{x}^*)=\mathbf{0}$.
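The scheme of Algorithm 1 and the monotonicity stated in item (i) of Theorem 2.3 can be illustrated by a minimal sketch. It assumes the normalized gradient as the direction (which belongs to the set in (2.8) with $\gamma=1$), a constant $\sigma_k=\sigma$, and a golden-section search as the approximate one-dimensional minimization; the function names `golden_section` and `gradient_type_method` are hypothetical and not taken from [20].

```python
import math

def golden_section(phi, hi, iters=60):
    """Approximately minimize phi on [0, hi] (assumes phi is unimodal there)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = 0.0, hi
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if phi(c) < phi(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

def gradient_type_method(f, grad, x0, sigma=1.0, tol=1e-8, max_iter=500):
    """Sketch of Algorithm 1 with s = grad / ||grad|| (an assumed choice)."""
    x = list(x0)
    history = [f(x)]
    for _ in range(max_iter):
        g = grad(x)
        norm_g = math.sqrt(sum(gi * gi for gi in g))
        if norm_g <= tol:
            break
        s = [gi / norm_g for gi in g]  # unit vector with grad^T s = ||grad||
        phi = lambda mu: f([xi - mu * si for xi, si in zip(x, s)])
        lam = golden_section(phi, sigma * norm_g)  # step in (0, sigma * ||grad||]
        x_new = [xi - lam * si for xi, si in zip(x, s)]
        if f(x_new) > history[-1]:
            break  # no further decrease at working precision
        x = x_new
        history.append(f(x))
    return x, history

# Example: a convex quadratic with minimizer (1, -2).
f = lambda x: (x[0] - 1.0) ** 2 + 4.0 * (x[1] + 2.0) ** 2
grad = lambda x: [2.0 * (x[0] - 1.0), 8.0 * (x[1] + 2.0)]
x_min, values = gradient_type_method(f, grad, [5.0, 3.0])
```

The list `values` records $f(\mathbf{x}^{(k)})$ at every accepted iterate, so that the non-increasing behaviour required by (2.3) can be inspected directly.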

Remark 2.4. Notice that Theorem 2.3 can also be applied in the case of classical Computer Science algorithms. As a matter of fact, if condition (ii) is not verified, then, by definition, $\exists\tilde{k}:\ \mathbf{x}^{(\tilde{k}+1)}=\mathbf{x}^{(\tilde{k})}$, implying $\nabla f(\mathbf{x}^{(\tilde{k})})=\mathbf{0}$, that is, the convergence to a stationary point in a finite number of steps $\tilde{k}$. Moreover, the convergence to an isolated point $\hat{\mathbf{x}}$ of the sequence $\{\mathbf{x}^{(k)}\}$ can be proven ab absurdo by showing that
$$\text{if } \hat{\mathbf{x}}\in EP \Rightarrow \forall k>k_0,\quad f(\mathbf{x}^{(k)})-f(\mathbf{x}^{(k+1)})>c_0,\quad c_0>0. \tag{2.9}$$
We will see in Sections 3 and 4 that the convergence in a finite number of steps of a given iterative procedure can be verified in this way both for the unconstrained problems and for the constrained ones.

3. Local Unconstrained Optimization

Let $C$ and $\mathbf{b}$ be an spd matrix of order $n$ and an $n$-dimensional vector, respectively.

It is well known that the problem
$$\min\ \frac{1}{2}\mathbf{x}^TC\mathbf{x}-\mathbf{b}^T\mathbf{x},\quad \mathbf{x}\in R^n \tag{3.1}$$
can be exactly solved in at most $n$ steps by the Conjugate Gradient (CG) method [21], which represents a direct method to solve (3.1). The quadratic form associated with an spd matrix $C$ is, in fact, a convex function.
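The finite termination of the CG method on (3.1) can be checked on a small instance. The following is a minimal sketch of the standard CG recurrences, written with plain Python lists; the helper names and the $3\times 3$ test matrix are illustrative assumptions, not data from [21].

```python
def matvec(C, x):
    return [sum(cij * xj for cij, xj in zip(row, x)) for row in C]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(C, b, x0, tol=1e-24):
    """Minimize 0.5 x^T C x - b^T x for spd C; performs at most n steps."""
    x = list(x0)
    r = [bi - ci for bi, ci in zip(b, matvec(C, x))]  # residual = -gradient
    p = list(r)
    rr = dot(r, r)
    steps = 0
    while steps < len(b) and rr > tol:
        Cp = matvec(C, p)
        alpha = rr / dot(p, Cp)                  # exact line minimization
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ci for ri, ci in zip(r, Cp)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]  # C-conjugate direction
        rr = rr_new
        steps += 1
    return x, steps

# Illustrative spd matrix (eigenvalues 3 - sqrt(3), 3, 3 + sqrt(3)).
C = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x_cg, n_steps = conjugate_gradient(C, b, [0.0, 0.0, 0.0])
residual = [bi - ci for bi, ci in zip(b, matvec(C, x_cg))]
```

In exact arithmetic the residual vanishes after at most $n=3$ steps; in floating point it is only zero up to rounding error.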

However, it can also be proved that the application of the procedure defined in (2.2), that is, the Steepest Descent method, always requires an infinite number of iterations, apart from the trivial case $\mathbf{x}^{(0)}=\mathbf{x}^*$. The latter result shows that the existence of a finite procedure to solve (3.1) does not depend only on the role played by convexity, but is also the consequence of a sort of optimal matching between the problem and the corresponding algorithm, which in this case is the CG method. On the other hand, the latter method can also be interpreted as an iterative method in the family of the following fixed point procedures:
$$\mathbf{x}^{(k+1)}=\frac{(rI-C)\mathbf{x}^{(k)}+\mathbf{b}}{r} \tag{3.2}$$
with $r$ being a suitable scalar parameter. By setting, in fact,
$$H=I-\frac{C}{r},\qquad D=\begin{pmatrix}\frac{1}{r} & 0 & \cdots & 0\\ 0 & \frac{1}{r} & \ddots & \vdots\\ \vdots & \ddots & \ddots & 0\\ 0 & \cdots & 0 & \frac{1}{r}\end{pmatrix}, \tag{3.3}$$
one can obtain the classical iterative scheme
$$\mathbf{x}^{(k+1)}=H\mathbf{x}^{(k)}+D\mathbf{b}. \tag{3.4}$$
Since $\|H\|_s=1-1/\operatorname{cond}(C)$, (3.4) is convergent if the original matrix $C$ is well conditioned.
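As an illustration of the fixed point scheme (3.2), the sketch below iterates $\mathbf{x}\leftarrow((rI-C)\mathbf{x}+\mathbf{b})/r$ on a small spd system. The choice of $r$ as the largest absolute row sum of $C$ (an upper bound on the largest eigenvalue) is an assumption made here for the example; the text only requires $r$ to be a suitable scalar.

```python
def fixed_point_solve(C, b, r, iterations=200):
    """Iterate x <- ((r I - C) x + b) / r starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        Cx = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [(r * x[i] - Cx[i] + b[i]) / r for i in range(n)]
    return x

# Illustrative spd matrix; r bounds its largest eigenvalue from above.
C = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
r = max(sum(abs(cij) for cij in row) for row in C)
x_fp = fixed_point_solve(C, b, r)
residual_fp = [b[i] - sum(C[i][j] * x_fp[j] for j in range(3)) for i in range(3)]
```

With this choice of $r$ every eigenvalue of $H=I-C/r$ lies in $[0,1)$, so the iteration contracts; the closer the smallest eigenvalue of $C$ is to zero, the slower the contraction.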

Moreover, if $\hat{\mathbf{x}}$ is the optimal solution of (3.1), the truncation error of the method is
$$\|\mathbf{x}^{(k)}-\hat{\mathbf{x}}\|_2\le\left(1-\frac{1}{\operatorname{cond}(C)}\right)^k\|\mathbf{x}^{(0)}-\hat{\mathbf{x}}\|_2. \tag{3.5}$$
In the case of the CG method, one can prove the inequality
$$\|\mathbf{x}^{(k)}-\hat{\mathbf{x}}\|_2\le 2\left(\frac{\sqrt{\operatorname{cond}(C)}-1}{\sqrt{\operatorname{cond}(C)}+1}\right)^k\|\mathbf{x}^{(0)}-\hat{\mathbf{x}}\|_2. \tag{3.6}$$
Equation (3.6) shows that, if the dimension $n$ is huge and the matrix $C$ is well conditioned, from a computational point of view it is more convenient to implement the CG method as a classical iterative procedure with a stopping rule based on the above inequality.

So, once again, the distinction between Numerical Analysis direct methods (or Computer Science algorithms) and infinite procedures cannot be considered as the fundamental classification rule in computational mathematics.

In the case of the Steepest Descent method, the truncation error is
$$\|\mathbf{x}^{(k)}-\hat{\mathbf{x}}\|_2\le 2\left(\frac{\operatorname{cond}(C)-1}{\operatorname{cond}(C)+1}\right)^k\|\mathbf{x}^{(0)}-\hat{\mathbf{x}}\|_2. \tag{3.7}$$
The difference between (3.6) and (3.7) clearly indicates the greater efficiency of the CG method.
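The gap between the three bounds can be made concrete by counting how many iterations each one needs before its right-hand side drops below a given relative tolerance. The sketch below is only arithmetic on the bounds (3.5)-(3.7); the function names are hypothetical, and the prefactor 2 used for the CG and Steepest Descent bounds is passed explicitly as an assumption read from the inequalities.

```python
import math

def iterations_for(rate, eps, factor=1.0):
    """Smallest k with factor * rate**k <= eps, for 0 < rate < 1."""
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must lie strictly between 0 and 1")
    return max(0, math.ceil(math.log(eps / factor) / math.log(rate)))

def compare_bounds(cond, eps):
    """Iteration counts implied by the bounds for a given cond(C) > 1."""
    root = math.sqrt(cond)
    return {
        "fixed point (3.5)": iterations_for(1.0 - 1.0 / cond, eps),
        "CG (3.6)": iterations_for((root - 1.0) / (root + 1.0), eps, factor=2.0),
        "steepest descent (3.7)": iterations_for(
            (cond - 1.0) / (cond + 1.0), eps, factor=2.0),
    }

counts = compare_bounds(cond=100.0, eps=1e-6)
```

Because the CG rate depends on the square root of the condition number, its count grows much more slowly with $\operatorname{cond}(C)$ than the other two.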

In [22] the finite version of the CG method was extended to a family of nonquadratic functions, including the following important sets:
$$F(\mathbf{x})=\frac{\mathbf{x}^TC\mathbf{x}}{(\mathbf{c}^T\mathbf{x})^2},\quad \mathbf{x}\in X, \tag{3.8}$$
$$G(\mathbf{x})=\mathbf{x}^TC\mathbf{x}\,(\mathbf{c}^T\mathbf{x})^k,\quad k \text{ integer},\quad \mathbf{x}\in X, \tag{3.9}$$
where $X=\{\mathbf{x}\in R^n:\ \mathbf{c}^T\mathbf{x}>0\}$.

According to the classical definition, the function $F$ indicated in (3.8) is called conic. If $k=-2$, then $G(\mathbf{x})\equiv F(\mathbf{x})$.

Hence, $G$ represents a class of nonquadratic functions for which the optimal solution can be found with a finite number of steps if the matrix $C$ is spd.

As a matter of fact, the following result holds.

Theorem 3.1 (see [22] Theorem 3.1, Lemmas 3.2 and 5.1). Let $G(\mathbf{x})$ be defined as in (3.9). Then the minimum problem
$$\min G(\mathbf{x}),\quad \mathbf{x}\in X, \tag{3.10}$$
can be solved in at most $n$ steps.

Let us now consider some generalizations of convexity, which play an important role in global optimization (see [7]).

Let $\mathbf{s}^{(k)}$ be a descent direction in $\mathbf{x}^{(k)}$ for a function $f(\mathbf{x})$. The importance of the following definitions will be shown in the next results of this section.

Definition 3.2. A function $f(\mathbf{x})\in C^1(R^n)$ is called algorithmically convex if for all $\mathbf{x}^{(k)},\mathbf{x}^{(k+1)}$ evaluated by an algorithm of type (2.5), one has
$$\left(\mathbf{s}^{(k+1)}-\mathbf{s}^{(k)}\right)^T\left(\mathbf{x}^{(k+1)}-\mathbf{x}^{(k)}\right)\ge 0. \tag{3.11}$$

Definition 3.3. A function $f(\mathbf{x})\in C^1(R^n)$ is called weakly convex if for all $\mathbf{x}^{(k)},\mathbf{x}^{(k+1)}$ evaluated by an algorithm of type (2.5), the following inequality holds:
$$\frac{\left\|\nabla f(\mathbf{x}^{(k+1)})-\nabla f(\mathbf{x}^{(k)})\right\|^2}{\left(\nabla f(\mathbf{x}^{(k+1)})-\nabla f(\mathbf{x}^{(k)})\right)^T\left(\mathbf{x}^{(k+1)}-\mathbf{x}^{(k)}\right)}\le M. \tag{3.12}$$

Definition 3.4. Let $\mathbf{s}^{(k)}=-A_k^{-1}\nabla f(\mathbf{x}^{(k)})$, for all $k$, be descent directions of an algorithm of type (2.5) applied to problem (2.1). Then the method is called secant if the matrix $A_k$ solves the secant equation:
$$A_k\left(\mathbf{x}^{(k)}-\mathbf{x}^{(k-1)}\right)=\nabla f(\mathbf{x}^{(k)})-\nabla f(\mathbf{x}^{(k-1)}). \tag{3.13}$$

Definition 3.2 is clearly a generalization of convexity. As a matter of fact, if $\mathbf{s}^{(k)}=\nabla f(\mathbf{x}^{(k)})$, then $f(\mathbf{x})\in C^1(A)$, $A\subseteq R^n$, is convex if and only if (3.11) is verified for all $\mathbf{x}^{(k)},\mathbf{x}^{(k+1)}\in A$ (see [23]).

Definition 3.3 is also a generalization of convexity. In [24], in fact, it is proved that if $f(\mathbf{x})\in C^1(A)$, $A\subseteq R^n$, is convex, then (3.12) is satisfied for all $\mathbf{x}^{(k)},\mathbf{x}^{(k+1)}\in A$. So (3.12) is a necessary, but not sufficient, condition for a function $f$ to be convex.

Definition 3.4 is an $n$-dimensional generalization of the classical secant iterative formula to compute the zeroes of the derivative of a function $f(x)\in C^1(R^1)$, that is,
$$x_{k+1}=\frac{f'(x_k)\,x_{k-1}-f'(x_{k-1})\,x_k}{f'(x_k)-f'(x_{k-1})}. \tag{3.14}$$
Observe, in fact, that (3.14) can be rewritten as
$$x_{k+1}=x_k-\frac{f'(x_k)}{a_k},\qquad a_k\,(x_k-x_{k-1})=f'(x_k)-f'(x_{k-1}). \tag{3.15}$$
Hence, the expression of $a_k$ is the 1-dimensional version of (3.13).
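The equivalence between (3.14) and (3.15) can be verified numerically. The sketch below implements both forms of the one-dimensional secant step and iterates the second one; the test derivative $f'(x)=x^3-2x-5$ and the function names are assumptions chosen only for illustration.

```python
def secant_step_314(df, x_prev, x_curr):
    """One step of the secant formula in the form (3.14)."""
    return (df(x_curr) * x_prev - df(x_prev) * x_curr) / (df(x_curr) - df(x_prev))

def secant_step_315(df, x_prev, x_curr):
    """The same step in the form (3.15), with a_k from the 1-D secant equation."""
    a_k = (df(x_curr) - df(x_prev)) / (x_curr - x_prev)
    return x_curr - df(x_curr) / a_k

def find_stationary_point(df, x0, x1, tol=1e-12, max_iter=50):
    """Iterate (3.15) until f'(x) is numerically zero."""
    x_prev, x_curr = x0, x1
    for _ in range(max_iter):
        if abs(df(x_curr)) < tol or df(x_curr) == df(x_prev):
            break  # converged, or a_k would be zero
        x_prev, x_curr = x_curr, secant_step_315(df, x_prev, x_curr)
    return x_curr

# f(x) = x^4/4 - x^2 - 5x has derivative f'(x) = x^3 - 2x - 5.
df = lambda x: x ** 3 - 2.0 * x - 5.0
first_314 = secant_step_314(df, 2.0, 3.0)
first_315 = secant_step_315(df, 2.0, 3.0)
x_stat = find_stationary_point(df, 2.0, 3.0)
```

Both forms return the same first iterate, and the scalar `a_k` plays the role that the matrix $A_k$ plays in the secant equation (3.13).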

The following result is proved in [7].

Theorem 3.5 (see also [18]). Let $\mathbf{s}^{(k)}=-A_k^{-1}\nabla f(\mathbf{x}^{(k)})$ be descent directions of a secant method, that is, satisfying (3.13), applied to problem (2.1). Moreover, let conditions (2.7) and (3.12) be verified. Then, $\exists\{\nabla f(\mathbf{x}^{(k_i)})\}$ such that
$$\lim_{i\to\infty}\nabla f(\mathbf{x}^{(k_i)})=\mathbf{0}. \tag{3.16}$$

Remark 3.6. Theorem 3.5 shows that global convergence for a quasi-Newton secant method applied to problem (2.1) can be obtained if the function $f(\mathbf{x})$ is weakly convex and the matrices $A_k$ approximating $\nabla^2 f(\mathbf{x}^{(k)})$ are well conditioned.

Remark 3.7. By utilizing Armijo-Goldstein-Wolfe's method [18] and setting $\mathbf{s}^{(k)}=\nabla f(\mathbf{x}^{(k)})$, the step $\lambda_k$ in (2.5) is such that for all $k$
$$\left(\nabla f(\mathbf{x}^{(k+1)})-\nabla f(\mathbf{x}^{(k)})\right)^T\left(\mathbf{x}^{(k+1)}-\mathbf{x}^{(k)}\right)>0,\qquad f(\mathbf{x}^{(k+1)})<f(\mathbf{x}^{(k)}). \tag{3.17}$$
Hence, by Definition 3.2, in this case the function $f(\mathbf{x})$ is also algorithmically convex. For general descent directions $\mathbf{s}^{(k)}$, evaluated by a quasi-Newton secant method, inequality (3.11) is not always satisfied.

4. Local Constrained Optimization

Quadratic Programming (QP) is defined in the following way:
$$\min\ \mathbf{x}^TC\mathbf{x}+\mathbf{c}^T\mathbf{x},\quad A\mathbf{x}=\mathbf{b},\quad \mathbf{x}\ge\mathbf{0} \tag{4.1}$$
with $C$ being a symmetric semidefinite positive (ssdp) matrix of order $n$ and $A$ a matrix with $m$ rows and $n$ columns.

Remark 4.1. Let $P=\{\mathbf{x}\in R^n:\ A\mathbf{x}=\mathbf{b},\ \mathbf{x}\ge\mathbf{0}\}$. The optimal solution of (4.1) can be located at any point of $P$. Hence, (4.1) is a continuous problem which cannot be immediately reduced to a finite problem as in the case $F(\mathbf{x})=\mathbf{c}^T\mathbf{x}$, that is, Linear Programming (LP).
Let us consider, for instance, the following problems:
$$\min\ x_1^2-3x_2,\quad 2x_1-x_2\ge 4,\quad -x_1-2x_2\ge -16,\quad -2x_1+4x_2\ge -8,\quad -7x_1+8x_2\ge -35,\quad x_1\ge 0,\ x_2\ge 0, \tag{4.2}$$
$$\min\ x_1^2+4x_2^2-4x_1-24x_2+40,\quad -7x_1-6x_2+42\ge 0,\quad -5x_1+3x_2+10\ge 0,\quad x_1\ge 0,\ x_2\ge 0. \tag{4.3}$$

The optimal solution of (4.2) is the point $(3,2)$, which is on the boundary of $P$ but is not a vertex. On the other hand, problem (4.3) has its optimal solution at the interior point $(2,3)$. However, QP can be solved in general in a finite number of steps by means of Frank-Wolfe's algorithm [25]. So, QP can be considered as a finite continuous constrained optimization problem.
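The location of the two optimal solutions can be checked by brute force. The sketch below evaluates both problems on a uniform grid and counts the constraints that are active at the best grid point; the grid search is only an illustrative check chosen here, not a method discussed in the text, and the helper names are hypothetical.

```python
def grid_minimum(f, constraints, upper=16, cells_per_unit=10, tol=1e-9):
    """Best feasible point of f on a uniform grid over [0, upper]^2."""
    best_value, best_point = None, None
    n = upper * cells_per_unit
    for i in range(n + 1):
        for j in range(n + 1):
            x = (i / cells_per_unit, j / cells_per_unit)
            if all(g(x) >= -tol for g in constraints):
                value = f(x)
                if best_value is None or value < best_value:
                    best_value, best_point = value, x
    return best_value, best_point

def count_active(constraints, x, tol=1e-9):
    """Number of constraints g(x) >= 0 that hold with equality at x."""
    return sum(1 for g in constraints if abs(g(x)) <= tol)

# Problem (4.2); every constraint is written as g(x) >= 0.
f1 = lambda x: x[0] ** 2 - 3.0 * x[1]
cons1 = [lambda x: 2 * x[0] - x[1] - 4, lambda x: -x[0] - 2 * x[1] + 16,
         lambda x: -2 * x[0] + 4 * x[1] + 8, lambda x: -7 * x[0] + 8 * x[1] + 35,
         lambda x: x[0], lambda x: x[1]]

# Problem (4.3).
f2 = lambda x: x[0] ** 2 + 4 * x[1] ** 2 - 4 * x[0] - 24 * x[1] + 40
cons2 = [lambda x: -7 * x[0] - 6 * x[1] + 42, lambda x: -5 * x[0] + 3 * x[1] + 10,
         lambda x: x[0], lambda x: x[1]]

value1, point1 = grid_minimum(f1, cons1)
value2, point2 = grid_minimum(f2, cons2)
active1 = count_active(cons1, point1)
active2 = count_active(cons2, point2)
```

A vertex of a polyhedron in the plane needs at least two active constraints, so one active constraint indicates a boundary point that is not a vertex, and none indicates an interior point.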

The following question arises: does QP characterize the boundary separating finite continuous constrained optimization problems from infinite ones? In other words, do there exist more general nonlinear constrained optimization problems that can be solved in a finite number of iterations? Since in the unconstrained case we have shown in the previous section that there exist nonquadratic problems that can be exactly solved in a finite number of iterations by utilizing the CG method, the answer is expected to be positive.

Given a convex function $f(\mathbf{x})\in C^1(S)$, $S$ convex $\subseteq R^n$, Convex Programming with Linear Constraints (CPLC) is defined as
$$\min f(\mathbf{x}),\quad A\mathbf{x}=\mathbf{b},\quad \mathbf{x}\ge\mathbf{0}. \tag{4.4}$$
Problem (4.4) can be solved by the Reduced Gradient (RG) algorithm or by the Gradient Projection (GP) method [23, 26, 27].

Assuming š“ with maximum rank and taking into account Remark 2.4, one can introduce the following.

Definition 4.2. Let $\hat{f}(\mathbf{x})\in C^1(S)$, $S$ convex $\subseteq R^n$, be a convex function.
Let $\hat{P}=\{\mathbf{x}\in R^n:\ \hat{A}\mathbf{x}=\hat{\mathbf{b}},\ \mathbf{x}\ge\mathbf{0}\}$ be a nonempty polyhedron. The corresponding CPLC problem (4.4) is a finite continuous constrained optimization problem if and only if there exist a convergent Gradient-type method (2.5) and a positive real number $c_0$ such that, if (2.5) would require an infinite number of steps, then
$$\mathbf{x}^{(k)}\in\hat{P},\ \forall k,\qquad \inf_k\left\{\hat{f}(\mathbf{x}^{(k)})-\hat{f}(\mathbf{x}^{(k+1)})\right\}\ge c_0,\quad \forall k>k_0,\quad c_0>0. \tag{4.5}$$
Equation (4.5) clearly implies that $\exists k^*:\ \hat{f}(\mathbf{x}^{(k^*)})=\min_{\mathbf{x}\in\hat{P}}\hat{f}(\mathbf{x})$.

The importance of Definition 4.2 can be pointed out by the next result, showing the relationship between (4.4) and a particular linear optimization problem.

Theorem 4.3. Let $\mathbf{x}^{(k)}$ be an admissible solution of (4.4). Let $\mathbf{s}^{(k)}$ be a descent direction in $\mathbf{x}^{(k)}$ for the function $f(\mathbf{x})$. Then $\mathbf{s}^{(k)}$ is an admissible descent direction for (4.4) if $A\mathbf{s}^{(k)}=\mathbf{0}$.
Moreover, for any fixed $\hat{\mathbf{x}}^{(k)}$ the optimal solution $\mathbf{s}^*$ of the problem
$$\min\ \nabla f(\hat{\mathbf{x}}^{(k)})^T\mathbf{s},\quad A\mathbf{s}=\mathbf{0},\quad \|\mathbf{s}\|=1 \tag{4.6}$$
is given by
$$\mathbf{s}^*=\frac{\left(I-A^T(AA^T)^{-1}A\right)\nabla f(\hat{\mathbf{x}}^{(k)})}{\left\|\left(I-A^T(AA^T)^{-1}A\right)\nabla f(\hat{\mathbf{x}}^{(k)})\right\|}. \tag{4.7}$$
By setting $\mathbf{c}=\nabla f(\hat{\mathbf{x}}^{(k)})$ and $\mathbf{x}=\mathbf{s}$, it was proven in [28] that (4.6) is equivalent to a general LP problem, that is,
$$\min\ \mathbf{c}^T\mathbf{x},\quad A\mathbf{x}=\mathbf{b},\quad \mathbf{x}\ge\mathbf{0}. \tag{4.8}$$
Furthermore, if $T=\{\mathbf{x}\in R^n_+,\ A\mathbf{x}=\mathbf{0},\ \|\mathbf{x}\|=1\}$, the following result holds (see [29]).

Theorem 4.4. Given a suitable integer $L$ and the function
$$g(\mathbf{x})=\sum_{j=1}^n\ln\frac{\mathbf{c}^T\mathbf{x}}{x_j}=n\ln\mathbf{c}^T\mathbf{x}-\sum_{j=1}^n\ln x_j, \tag{4.9}$$
then (4.6), and hence (4.8), are equivalent to finding a point $\mathbf{x}^*$:
$$\mathbf{x}^*\in T,\quad g(\mathbf{x}^*)<-2nL. \tag{4.10}$$
Moreover, it is possible to determine a real number $c_0$ and a sequence $\{\mathbf{x}^{(k)}\}\in T$ by a GP algorithm with a suitable scaling procedure (see again [29]) such that
$$g(\mathbf{x}^{(k+1)})<g(\mathbf{x}^{(k)})-c_0. \tag{4.11}$$

By Theorem 4.4 and Definition 4.2 it follows that there exists a Gradient-type method (2.5) solving LP in a finite number of steps. Hence LP is a finite continuous constrained optimization problem. It is important to underline that the latter result is not a consequence of the intrinsic finiteness of the set of the possible optimal solutions (the vertices of a polyhedron) as in the classical simplex algorithm.
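The projected direction appearing in Theorem 4.3 can be computed explicitly in the simplest setting. The sketch below assumes that $A$ consists of a single row $\mathbf{a}^T$, so that $(AA^T)^{-1}$ reduces to the scalar $1/(\mathbf{a}^T\mathbf{a})$ and no linear solver is needed; this restriction and the function name are assumptions made for the example.

```python
import math

def projected_direction(a, g):
    """Normalized projection of g onto {s : a^T s = 0} (single-row A)."""
    scale = sum(ai * gi for ai, gi in zip(a, g)) / sum(ai * ai for ai in a)
    p = [gi - scale * ai for ai, gi in zip(a, g)]  # (I - a a^T / a^T a) g
    norm = math.sqrt(sum(pi * pi for pi in p))
    if norm == 0.0:
        raise ValueError("the gradient has no component in the null space of A")
    return [pi / norm for pi in p]

a = [1.0, 1.0, 1.0]          # single constraint row (illustrative)
g = [1.0, 2.0, 4.0]          # gradient at the current point (illustrative)
s_star = projected_direction(a, g)
```

The resulting vector has unit norm and satisfies $A\mathbf{s}=\mathbf{0}$, so moving along it keeps the equality constraints of (4.4) satisfied.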

Given the convex functions $f(\mathbf{x}),h_1(\mathbf{x}),h_2(\mathbf{x}),\dots,h_m(\mathbf{x})\in C^1(S)$, $S$ convex $\subseteq R^n$, let us now consider the general Convex Programming (CP) problem:
$$\min f(\mathbf{x}),\quad h_i(\mathbf{x})\le 0,\ i=1,2,\dots,m,\quad \mathbf{x}\ge\mathbf{0}. \tag{4.12}$$
The following property is well known [23, 26].

Definition 4.5. Letting $\hat{\mathbf{x}}\ge\mathbf{0}$ and $I=\{i:\ h_i(\hat{\mathbf{x}})=0\}$, then the constraints of (4.12) are qualified if one of the following conditions is satisfied:
$$\exists\,\mathbf{x}^*\ge\mathbf{0}:\ h_i(\mathbf{x}^*)<0,\quad i=1,2,\dots,m, \tag{4.13}$$
$$h_i(\hat{\mathbf{x}}) \text{ is locally concave in } \hat{\mathbf{x}},\quad \forall i\in I,\ \forall\hat{\mathbf{x}}. \tag{4.14}$$
If $h_i(\mathbf{x})=\mathbf{c}_i^T\mathbf{x}$, then (4.14) is trivially satisfied for all $\hat{\mathbf{x}}$, for all $i\in I$.

So, from Definition 4.5 we deduce that the constraints of CPLC problem (4.4) are always qualified. Assuming in (4.4) š“ with maximum rank, we clearly obtain a condition equivalent to (4.13).

Definition 4.6. A set $C\subseteq R^n$ is called a convex cone if
$$\mathbf{x}\in C\Longrightarrow\lambda\mathbf{x}\in C,\ \forall\lambda>0;\qquad \forall\mathbf{x}^{(1)},\mathbf{x}^{(2)}\in C,\ \forall\, 0\le\lambda_1,\lambda_2\le 1,\quad \lambda_1\mathbf{x}^{(1)}+\lambda_2\mathbf{x}^{(2)}\in C. \tag{4.15}$$
The following theorem was proved in [30] in a general Hilbert space (see Theorem 2.3).

Theorem 4.7. Let $S_1,S_2\subseteq R^n$ be closed convex cones, and let $S_1^o$ denote the interior of $S_1$. Assume that $S_1^o\ne\emptyset$.
Then the corresponding conic feasibility problem
$$\text{find } \mathbf{x}\in S_1^o\cap S_2 \tag{4.16}$$
can be solved in a finite number of steps.

The technique utilized to prove Theorem 4.7 is based upon the so-called Method of Alternative Projections (MAP) (see [31]).
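A minimal sketch of the alternating projection idea is given below for two closed convex cones in $R^2$: the nonnegative orthant (playing the role of $S_1$) and a half-plane through the origin (playing the role of $S_2$). It is a generic illustration with hypothetical function names, not the specific procedure used in the proof of Theorem 4.7 in [30].

```python
def project_orthant(x):
    """Projection onto S1 = {x : x >= 0}."""
    return [max(0.0, xi) for xi in x]

def project_halfspace(x, a):
    """Projection onto S2 = {x : a^T x >= 0}."""
    ax = sum(ai * xi for ai, xi in zip(a, x))
    if ax >= 0.0:
        return list(x)
    t = ax / sum(ai * ai for ai in a)
    return [xi - t * ai for ai, xi in zip(a, x)]

def alternating_projections(x0, a, max_iter=100, tol=1e-12):
    """Alternate the two projections until the iterate lies in both sets."""
    x = list(x0)
    for k in range(max_iter):
        y = project_halfspace(project_orthant(x), a)
        if all(yi >= -tol for yi in y):  # y is in S2 by construction; test S1
            return y, k + 1
        x = y
    return x, max_iter

a = [1.0, -1.0]                      # S2 = {x : x1 - x2 >= 0}
point, sweeps = alternating_projections([-1.0, 3.0], a)
```

For this starting point a single sweep already returns a point with strictly positive coordinates, that is, a point in the interior of the orthant that also belongs to the half-plane.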

Theorem 4.7 was extended in [30] (see Proposition 2.1) by assuming $S_1$ and $S_2$ to be closed convex sets, thereby proving that a convex feasibility problem is equivalent to a conic feasibility problem. However, the open question remains how to express explicit formulas for the projection operators to convert the algorithm from $S_1$ and $S_2$ to the conified closed sets $\operatorname{con}(S_1)$ and $\operatorname{con}(S_2)$ in the case of nonlinear and nonquadratic problems. The Linear Matrix Inequality (LMI) feasibility problem was, in fact, efficiently solved in the literature (see [32]).

Remark 4.8. Theorem 4.7 can be applied to CPLC problem (4.4), by assuming
$$S_1=\{\mathbf{x}\in R^n:\ A\mathbf{x}=\mathbf{b},\ \mathbf{x}\ge\mathbf{0}\},\qquad S_2^{(k)}=\left\{\mathbf{x}\in R^n:\ f(\mathbf{x})\le t^{(k)},\ t^{(k)}\in R\right\},\quad k=1,2,\dots,k_0. \tag{4.17}$$

Hence, explicit formulas for the projection operators for suitable classes of nonlinear convex feasibility problems in terms of the corresponding conified sets might allow one to solve CPLC problem (4.4) in the nonquadratic case with a finite number of steps. By utilizing Theorem 3.1, we can prove, in fact, the following important theorem.

Theorem 4.9 (see [33]). Consider the particular CPLC problem
$$\min\ \mathbf{x}^TC\mathbf{x}\,(\mathbf{c}^T\mathbf{x})^{-2},\quad C \text{ spd},\quad A\mathbf{x}=\mathbf{b},\quad -\mathbf{c}^T\mathbf{x}\le 0,\quad \mathbf{x}\ge\mathbf{0}. \tag{4.18}$$
Assume that the optimal solution $\mathbf{x}^*$ of problem (4.18) is such that $-\mathbf{c}^T\mathbf{x}^*<0$. Then, (4.18) can be converted into a convex feasibility problem by utilizing a proper modification of the Alternative Projection method, and the latter algorithm converges to the optimal solution with a finite number of steps.

Remark 4.10. Given the convex set of feasible solutions
$$S_1=\left\{\mathbf{x}\in R^n:\ A\mathbf{x}=\mathbf{b},\ -\mathbf{c}^T\mathbf{x}\le 0,\ \mathbf{x}\ge\mathbf{0}\right\}, \tag{4.19}$$
the proof of Theorem 4.9 is essentially based upon the following computational ingredients:

(a) by Theorem 4.7, one can convert the closed convex set defined in (4.19) into a closed convex cone;
(b) by Theorem 3.1, the extended version of the CG method and a suitable projection algorithm can be applied to problem (4.18), thereby obtaining convergence with a finite number of steps.

5. Global Optimization

One can prove the following global convergence theorem [7].

Theorem 5.1. Consider Problem (2.1), where $f(\mathbf{x})\in C^2(R^n)$.
Let $f_{\min}$ be the value of the optimal solution. Assume that
$$\forall\epsilon_a\in\Re^+,\ \exists\epsilon_s\in\Re^+:\ \|\nabla f(\mathbf{x}^{(k)})\|>\epsilon_s\quad \text{except for } k:\ f(\mathbf{x}^{(k)})-f_{\min}<\epsilon_a. \tag{5.1}$$
If in an iterative scheme of BFGS-type,
$$\mathbf{x}^{(k+1)}=\mathbf{x}^{(k)}-\lambda_kB^{(k)-1}\nabla f(\mathbf{x}^{(k)}),\qquad \left(B^{(k)}=\varphi\left(\tilde{B}^{(k-1)},\dots\right)\right),\quad \forall k, \tag{5.2}$$
the following conditions are satisfied:
$$\frac{\left\|\nabla f(\mathbf{x}^{(k+1)})-\nabla f(\mathbf{x}^{(k)})\right\|^2}{\left(\nabla f(\mathbf{x}^{(k+1)})-\nabla f(\mathbf{x}^{(k)})\right)^T\lambda_k\mathbf{d}^{(k)}}=\frac{\|\mathbf{y}_k\|^2}{\mathbf{y}_k^T\mathbf{s}_k}\le M, \tag{5.3}$$
$$\operatorname{cond}\left(B^{(k)}\right)\le N. \tag{5.4}$$
Then
$$\forall\epsilon_a\in\Re^+,\ \exists k^{**}:\ \forall k>k^{**},\quad f(\mathbf{x}^{(k)})-f_{\min}<\epsilon_a. \tag{5.5}$$
Theorem 5.1 points out the following three conditions for a global optimization BFGS-type method.

Condition (5.1) assumes an optimal matching between the BFGS-type algorithm and the function $f$ [34]. Condition (5.3) is equivalent to (3.12), that is, $f(\mathbf{x})$ is weakly convex (see [24]).

Condition (5.4) can be easily satisfied by modifying the matrices $B^{(k)}$ by a restarting procedure, because every descent direction is associated with an spd matrix (see Theorem 2.1).

Let us now consider the classical "box-constrained" problem:
$$\min f(\mathbf{x}),\quad \mathbf{x}_L\le\mathbf{x}\le\mathbf{x}_U. \tag{5.6}$$
Let $\mathbf{x}^L_{c(m)}\le\mathbf{x}_{c(m)}\le\mathbf{x}^U_{c(m)}$ denote the current box at iteration $m$.

Set
$$\alpha_{\mathbf{x}_{c(m)}}=\max\left\{0,\ -\frac{1}{2}\min_i\lambda_i\left\{\nabla^2 f(\mathbf{x}_{c(m)})\right\}\right\}, \tag{5.7}$$
$$L_{c(m)}(\mathbf{x}_{c(m)})=f(\mathbf{x}_{c(m)})+\alpha_{\mathbf{x}_{c(m)}}\left(\mathbf{x}^L_{c(m)}-\mathbf{x}_{c(m)}\right)^T\left(\mathbf{x}^U_{c(m)}-\mathbf{x}_{c(m)}\right). \tag{5.8}$$
The following global convergence theorem holds (see [35, 36]).

Theorem 5.2. Consider Problem (5.6) and assume $f(\mathbf{x})\in C^2$. These hypotheses imply
$$\operatorname{cond}\left(\nabla^2 f(\mathbf{x})\right)\le N,\qquad \forall m\ \exists\,\alpha^*_m=\max_{\mathbf{x}_{c(m)}}\alpha_{\mathbf{x}_{c(m)}}. \tag{5.9}$$
Set
$$f^L_{c(m)}=\inf_{\mathbf{x}_{c(m)}}L_{c(m)}(\mathbf{x}_{c(m)}),\qquad f^U_{c(m)}=f\left(\frac{\mathbf{x}^L_{c(m)}+\mathbf{x}^U_{c(m)}}{2}\right). \tag{5.10}$$
Then, it follows for all $m$:
$$f^L_{c(m)}\le f^L_{c(m+1)}\le\min_{\mathbf{x}_{c(m+1)}}f(\mathbf{x}_{c(m+1)})\equiv\min_{\mathbf{x}}f(\mathbf{x}), \tag{5.11}$$
$$f^U_{c(m)}\ge f^U_{c(m+1)}\ge\min_{\mathbf{x}}f(\mathbf{x})\ge f^L_{c(m)}. \tag{5.12}$$
Moreover, for all $\epsilon_a>0$, $\exists m^*$: for all $m\ge m^*$:
$$f^U_{c(m)}-f^L_{c(m)}<\epsilon_a,\qquad \left\|\mathbf{x}^U_{c(m)}-\mathbf{x}^L_{c(m)}\right\|_2\le\sqrt{\frac{4\epsilon_a}{c}},\quad c \text{ constant}. \tag{5.13}$$
Theorem 5.2 can be immediately extended to Problem (2.1), by assuming a growth condition on the function $f(\mathbf{x})$.

In fact, we have the following.

Corollary 5.3. Given $f(\mathbf{x})\in C^2(R^n)$:
$$\lim_{\|\mathbf{x}\|\to\infty}f(\mathbf{x})=+\infty. \tag{5.14}$$
Equation (5.14) implies
$$\exists K_0:\ \min_{\mathbf{x}\in R^n}f(\mathbf{x})\equiv\min_{\|\mathbf{x}\|\le K_0}f(\mathbf{x}),\qquad \left\|\nabla^2 f(\mathbf{x})\right\|\le c_1,\quad \|\mathbf{x}\|\le K_0. \tag{5.15}$$
Assume
$$\left\|\nabla^2 f(\mathbf{x})^{-1}\right\|\le c_2\quad \text{and hence}\quad \operatorname{cond}\left(\nabla^2 f(\mathbf{x})\right)\le c_1c_2. \tag{5.16}$$
Then, the convergence results proved for (5.6) can be applied to (2.1).
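The convex underestimator built from (5.7) and (5.8) with the box-wide constant $\alpha^*_m$ can be illustrated in one dimension, where the only eigenvalue of the Hessian is $f''(x)$. The sketch below estimates $\alpha^*$ by sampling $f''$ on a grid, which is an assumption made for brevity and not a rigorous bound; the test function $f(x)=x^4-4x^2$ and the function name are also illustrative.

```python
def alpha_bb_underestimator(f, d2f, lo, hi, samples=401):
    """Return (alpha_star, lower) for a 1-D function on the box [lo, hi]."""
    grid = [lo + (hi - lo) * i / (samples - 1) for i in range(samples)]
    # 1-D version of (5.7), maximized over the sampled box.
    alpha_star = max(max(0.0, -0.5 * d2f(x)) for x in grid)

    def lower(x):
        # f plus a nonpositive quadratic term that vanishes at both box ends
        return f(x) + alpha_star * (lo - x) * (hi - x)

    return alpha_star, lower

f = lambda x: x ** 4 - 4.0 * x ** 2          # nonconvex on [-2, 2]
d2f = lambda x: 12.0 * x ** 2 - 8.0
alpha_star, lower = alpha_bb_underestimator(f, d2f, -2.0, 2.0)
check_grid = [-2.0 + 0.05 * i for i in range(81)]
```

For this example the added term is $4(x^2-4)$, so the underestimator reduces to $x^4-16$: it is convex, coincides with $f$ at the two ends of the box, and lies below $f$ everywhere inside it.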

We can fruitfully combine the results of Theorems 5.1 and 5.2, by proving the following.

Theorem 5.4. Consider Problem (5.6) and assume $f(\mathbf{x})\in C^2(R^n)$.
If in a BFGS-type iterative scheme
$$\mathbf{x}^{(k+1)}=\mathbf{x}^{(k)}-\mu_k\mathcal{B}^{(k)-1}\nabla f(\mathbf{x}^{(k)}), \tag{5.17}$$
the following conditions are satisfied:
$$\mathbf{x}_L\le\mathbf{x}^{(k)}\le\mathbf{x}_U,\quad \forall k, \tag{5.18}$$
$$\operatorname{cond}\left(\mathcal{B}^{(k)}\right)\le N, \tag{5.19}$$
$$\frac{\left\|\nabla f(\mathbf{x}^{(k+1)})-\nabla f(\mathbf{x}^{(k)})\right\|^2}{\left(\nabla f(\mathbf{x}^{(k+1)})-\nabla f(\mathbf{x}^{(k)})\right)^T\lambda_k\mathbf{d}^{(k)}}=\frac{\|\mathbf{y}_k\|^2}{\mathbf{y}_k^T\mathbf{s}_k}\le M, \tag{5.20}$$
then (5.17) is convergent to the optimal solution of (5.6).

Proof. By the assumptions it follows: $\operatorname{cond}(\nabla^2 f(\mathbf{x}))\le N$.
Hence, by (5.7) we have for all $m$:
$$\exists\,\alpha^*_m=\max_{\mathbf{x}_{c(m)}}\alpha_{\mathbf{x}_{c(m)}}. \tag{5.21}$$
Set
$$\tilde{L}_{c(m)}(\mathbf{x}_{c(m)})=f(\mathbf{x}_{c(m)})+\alpha^*_m\left(\mathbf{x}^L_{c(m)}-\mathbf{x}_{c(m)}\right)^T\left(\mathbf{x}^U_{c(m)}-\mathbf{x}_{c(m)}\right). \tag{5.22}$$
Therefore, by (5.21) and (5.22), for all $m$:
$$L_{c(m)}(\mathbf{x}_{c(m)})\ge\tilde{L}_{c(m)}(\mathbf{x}_{c(m)}),\ \forall\mathbf{x}_{c(m)};\qquad \tilde{L}_{c(m)}(\mathbf{x}_{c(m)}) \text{ convex } \forall\mathbf{x}_{c(m)}. \tag{5.23}$$
So,
$$f^L_{c(m)}=\inf_{\mathbf{x}_{c(m)}}L_{c(m)}(\mathbf{x}_{c(m)})=\min_{\mathbf{x}_{c(m)}}\tilde{L}_{c(m)}(\mathbf{x}_{c(m)}),\quad \forall m. \tag{5.24}$$
Let $\mathbf{x}^{(\tilde{k}_m)}_{c(m)}$ be a local minimum in the box $c(m)$.
If $\mathbf{x}^L_{c(m+1)}\le\mathbf{x}^{(\tilde{k}_m)}_{c(m)}\le\mathbf{x}^U_{c(m+1)}$ and $f\left((\mathbf{x}^L_{c(m+1)}+\mathbf{x}^U_{c(m+1)})/2\right)\ge f\left(\mathbf{x}^{(\tilde{k}_m)}_{c(m)}\right)$, then define
$$f^U_{c(m+1)}=f\left(\mathbf{x}^{(\tilde{k}_m)}_{c(m)}\right). \tag{5.25}$$
Else, set
$$\mathbf{x}^{(0)}_{c(m+1)}=\frac{\mathbf{x}^L_{c(m+1)}+\mathbf{x}^U_{c(m+1)}}{2},\qquad f^U_{c(m+1)}=f\left(\mathbf{x}^{(\tilde{k}_{m+1})}_{c(m+1)}\right) \tag{5.26}$$
with $\mathbf{x}^{(\tilde{k}_{m+1})}_{c(m+1)}$ being a local minimum evaluated by the starting point $\mathbf{x}^{(0)}_{c(m+1)}$ and contained in the box $c(m+1)$. Since the assumptions of Theorem 5.2 are satisfied, by the results of [7] (see Theorem 2 and Corollary 2), it follows that (5.19), (5.20) imply that for all $\epsilon_b>0$, $\exists\left\{\mathbf{x}^{(\tilde{k}_{m_i})}_{c(m_i)}\right\}$:
$$f^U_{c(m_i)}\ge f^U_{c(m_i+1)},\qquad \left\|\nabla f\left(\mathbf{x}^{(\tilde{k}_{m_i})}_{c(m_i)}\right)\right\|<\epsilon_b. \tag{5.27}$$
Applying Theorem 5.2, by inequalities (5.11) and (5.27) and by setting $\epsilon=\max\{\epsilon_a,\epsilon_b\}$, we have that for all $\epsilon>0$, $\exists\left\{\mathbf{x}^{(\tilde{k}_{m_i})}_{c(m_i)}\right\}$:
$$\left\|\nabla f\left(\mathbf{x}^{(\tilde{k}_{m_i})}_{c(m_i)}\right)\right\|_2<\epsilon,\qquad \left\|\mathbf{x}^U_{c(m_i)}-\mathbf{x}^L_{c(m_i)}\right\|_2\le\sqrt{\frac{4\epsilon}{c}},\qquad f\left(\mathbf{x}^{(\tilde{k}_{m_i})}_{c(m_i)}\right)-f_{\min}\le f^U_{c(m_i)}-f^L_{c(m_i)}<\epsilon. \tag{5.28}$$
This completes the proof.

Although the local minimization phases are performed effectively by the iterative scheme (5.17), the convergence of the method to the global minimum is usually very slow by the very nature of the $\alpha BB$ approach. In particular, the number of upper bounds $f^{U}_{c(m_i)}$ and corresponding boxes $m_i$ required to obtain a satisfactory approximation can be unacceptable from a computational point of view. In order to overcome this problem, a fast determination of "good" local minima is essential.

More precisely, by the utilization of terminal repellers and tunneling techniques [14], one can build algorithms based on a sequence of cycles, where each cycle has two phases, that is, a local optimization phase and a tunneling one. The main aim of these procedures is to build a favourable sequence of local minima (maxima), thereby determining a set of possible candidates for the global minimum (maximum) more efficiently.

By injecting suitable "tunneling phases" into the method, one can avoid the unfair entrapment in a "bad" local minimum, that is, the situation in which the condition
$$f^{U}_{c(m+1)}=f\left(\mathbf{x}^{(\tilde{k}_{m+1})}_{c(m+1)}\right)=f\left(\mathbf{x}^{(\tilde{k}_{m})}_{c(m+1)}\right)=f^{U}_{c(m)}\gg f_{\min}\qquad(5.29)$$
is verified for several iterations. For this purpose, the power of the repellers utilized in the tunneling phases plays a crucial role. The classical and well-known use of scalar repellers [14, 34] is often unsuitable when the dimension $n$ of the problem assumes values of operational interest. A structured repeller matrix, based on the sum of a diagonal matrix and a low-rank one [15], can be constructed to overcome the latter difficulty.

Let $\mathbf{x}^{(\tilde{k})}$ be an approximation of a local minimizer for $f(\mathbf{x})\in C^{1}$.

A matrix $\mathcal{A}^{(\tilde{k})}$ is called a repeller matrix for $\mathbf{x}^{(\tilde{k})}$ if there exists $\hat{\mathbf{x}}$ such that
$$\hat{\mathbf{x}}=\mathbf{x}^{(\tilde{k})}-\mathcal{A}^{(\tilde{k})}\nabla f\left(\mathbf{x}^{(\tilde{k})}\right),\qquad f(\hat{\mathbf{x}})<f\left(\mathbf{x}^{(\tilde{k})}\right).\qquad(5.30)$$
The repeller matrix $\mathcal{A}^{(\tilde{k})}$ for any given computed local minimizer $\mathbf{x}^{(\tilde{k})}$ can be approximated in the following way (see [15]):
$$\mathcal{A}^{(\tilde{k})}\approx\lambda^{(\tilde{k})}\left(\frac{I}{I+\mu}\right)+R^{-1},\qquad 2\le\operatorname{rank}(R)\le 4,\qquad(5.31)$$
with $\lambda^{(\tilde{k})}$ being the maximal scalar repeller [34], that is,
$$\lambda^{(\tilde{k})}=\frac{\epsilon_a}{\left\|\nabla f\left(\mathbf{x}^{(\tilde{k})}\right)\right\|^{2}},\qquad\left\|\nabla f\left(\mathbf{x}^{(\tilde{k})}\right)\right\|\ll\sqrt{\epsilon_a},\qquad\epsilon_a\ \text{desired precision},\qquad(5.32)$$
and with $R$ having the following structure:
$$R=\mu_1\mathbf{p}\mathbf{p}^{T}+\mu_2\mathbf{q}\mathbf{q}^{T}+\mu_3\mathbf{p}\mathbf{r}^{T}+\mu_4\mathbf{r}\mathbf{q}^{T},\qquad\mathbf{p},\mathbf{q},\mathbf{r}\ \text{suitable vectors},\quad\mu_1,\mu_2,\mu_3,\mu_4\ \text{scalars}.\qquad(5.33)$$
In this way, the application of a BFGS-type method can be effectively extended to the tunneling phases and hence to the whole global optimization scheme (see [9, 33]).
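The ingredients of the repeller test can be tried on a small example. The sketch below covers only the scalar repeller of (5.32), the low-rank term of (5.33), and the descent test of (5.30); it does not implement the structured approximation (5.31). All function names, the double-well test function, and the chosen values of $\epsilon_a$ are assumptions made for illustration.

```python
def scalar_repeller(grad, eps_a):
    """Scalar repeller of (5.32): lambda = eps_a / ||grad||^2."""
    return eps_a / sum(g * g for g in grad)


def low_rank_term(p, q, r, mu):
    """R = mu1 p p^T + mu2 q q^T + mu3 p r^T + mu4 r q^T, as in (5.33)."""
    mu1, mu2, mu3, mu4 = mu
    n = len(p)
    return [[mu1 * p[i] * p[j] + mu2 * q[i] * q[j]
             + mu3 * p[i] * r[j] + mu4 * r[i] * q[j]
             for j in range(n)] for i in range(n)]


def repelled_point(x, A, grad):
    """Candidate point x_hat = x - A grad of (5.30)."""
    return [xi - sum(a * g for a, g in zip(row, grad))
            for xi, row in zip(x, A)]


def is_repeller(f, x, A, grad):
    """Descent test of (5.30): f(x_hat) < f(x)."""
    return f(repelled_point(x, A, grad)) < f(x)


# Illustrative double-well function (an assumption, not from the text):
# a shallow local minimum near x1 = 0.97 and a deeper one near x1 = -1.02.
def f(x):
    return (x[0] ** 2 - 1.0) ** 2 + 0.2 * x[0] + x[1] ** 2


def grad_f(x):
    return [4.0 * x[0] * (x[0] ** 2 - 1.0) + 0.2, 2.0 * x[1]]


x_tilde = [0.974, 0.0]          # approximate local minimizer
g = grad_f(x_tilde)             # tiny gradient, about 4.2e-5
lam = scalar_repeller(g, 8e-5)  # eps_a chosen for illustration
A_scalar = [[lam, 0.0], [0.0, lam]]
escaped = is_repeller(f, x_tilde, A_scalar, g)
```

With $\epsilon_a=8\cdot10^{-5}$ the candidate point lands in the deeper basin and the test (5.30) succeeds, whereas $\epsilon_a=10^{-4}$ overshoots it and the test fails. This sensitivity to a single scalar is one way to see why a richer matrix structure can be useful.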

The structure in (5.33) can be generalized by using the recent Tensor-Train (TT)-cross approximation theory [13].

It is well known, in fact, that a rank-p matrix can be recovered from a cross of p linearly independent columns (or rows). Therefore, an arbitrary matrix can be interpolated by a pseudoskeleton approximation (see [15] and again [13]). In particular, since a repeller matrix is not arbitrary and possesses some hidden structure, it is fundamental to discover a low-parametric representation, which can be useful in the tunneling phases.
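The recovery of a rank-p matrix from a cross can be checked directly with the standard skeleton decomposition $A=C\hat{A}^{-1}R$, where $C$ collects p columns of $A$, $R$ collects p rows, and $\hat{A}$ is the p-by-p block at their intersection, assumed nonsingular. The following sketch uses exact rational arithmetic; the function names and the small test matrix are illustrative assumptions and are not taken from [13] or [15].

```python
from fractions import Fraction


def solve(M, B):
    """Solve M X = B by Gauss-Jordan elimination in exact arithmetic.

    M is p x p and assumed nonsingular; B is p x m.
    """
    p = len(M)
    aug = [[Fraction(v) for v in M[i]] + [Fraction(v) for v in B[i]]
           for i in range(p)]
    for col in range(p):
        # Pick any row with a nonzero entry in this column as pivot.
        pivot = next(r for r in range(col, p) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(p):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc
                          for vr, vc in zip(aug[r], aug[col])]
    return [row[p:] for row in aug]


def skeleton(A, rows, cols):
    """Rebuild A from the cross given by index lists rows and cols."""
    n, m = len(A), len(A[0])
    C = [[A[i][j] for j in cols] for i in range(n)]      # n x p columns
    R = [[A[i][j] for j in range(m)] for i in rows]      # p x m rows
    core = [[A[i][j] for j in cols] for i in rows]       # p x p intersection
    X = solve(core, R)                                   # core^{-1} R
    return [[sum(C[i][k] * X[k][j] for k in range(len(cols)))
             for j in range(m)] for i in range(n)]


# Rank-2 example: A = u1 v1^T + u2 v2^T (values chosen for illustration).
u1, v1 = [1, 2, 3, 4], [1, 0, 1, 2]
u2, v2 = [1, 0, -1, 2], [0, 1, 1, -1]
A = [[u1[i] * v1[j] + u2[i] * v2[j] for j in range(4)] for i in range(4)]
A_rebuilt = skeleton(A, rows=[0, 1], cols=[0, 1])
```

Only the two selected rows and the two selected columns of `A` are read by `skeleton`, so the full matrix is rebuilt from a number of entries that grows linearly with its size.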

An operational cross approximation method, evaluating large close-to-rank-$p$ matrices in $\mathcal{O}(np^{2})$ time complexity and by computing $\mathcal{O}(np)$ elements, was shown in [37].

6. Discrete Optimization

A well-known family of Computer Science methods is represented by the so-called Greedy algorithms. The simplest application of this type of procedure is the standard Knapsack Problem (KP), that is,
$$\max\ \mathbf{c}^{T}\mathbf{x},\qquad\mathbf{a}^{T}\mathbf{x}\le b,\qquad\mathbf{x}\ge\mathbf{0},\ \text{integer}.\qquad(6.1)$$
The Greedy approach is essentially a generalization of the classical Dynamic Programming (DP) methods, which are based on the Bellman Principle. By utilizing the DP computational scheme and assuming $y$ integer, problem (6.1) can be reduced to the recursive solution of the following family of problems:
$$\max\ \left(\mathbf{c}^{(k)}\right)^{T}\mathbf{x}^{(k)},\qquad\left(\mathbf{a}^{(k)}\right)^{T}\mathbf{x}^{(k)}\le y,\qquad\mathbf{x}^{(k)}\ge\mathbf{0},\ \text{integer},\qquad 1\le k\le n,\quad 1\le y\le b,\ \text{integer},\qquad(6.2)$$
where $\mathbf{c}^{(k)},\mathbf{a}^{(k)},\mathbf{x}^{(k)}$ indicate the vectors associated with the first $k$ components of $\mathbf{c},\mathbf{a},\mathbf{x}$, respectively.

Given $k$ and $y$, let $\psi_k(y)$ be the value of the objective function corresponding to the optimal solution of problem (6.2).

The algorithm computes $\psi_k(y)$ by the recursive formula
$$\psi_k(y)=\max\left\{\psi_{k-1}(y),\ \psi_k\left(y-a^{(k)}_{k}\right)+c^{(k)}_{k}\right\}.\qquad(6.3)$$
By (6.3), the optimal value of (6.2) is determined by a generalized discrete Steepest Descent algorithm, since $c^{(k)}_{k}$ is the $k$-th component of the gradient of the objective function and represents, in fact, the increase associated with the choice of the $k$-th object.

Therefore, formula (6.3) is based on a discrete Steepest Descent approach, and the value $\psi_k\left(y-a^{(k)}_{k}\right)+c^{(k)}_{k}$ assures that the corresponding solution is admissible.
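The recursion (6.3) can be written as a short table-filling routine. The sketch below assumes positive integer weights and an integer capacity, and takes $\psi_0(y)=0$ as the starting row. The function name and the sample data are illustrative assumptions.

```python
def knapsack_value(c, a, b):
    """Optimal value of the integer knapsack problem (6.1) via (6.3).

    c: values, a: positive integer weights, b: integer capacity.
    psi[y] holds psi_k(y) for the current k; it is updated in place.
    """
    psi = [0] * (b + 1)                  # psi_0(y) = 0 for every y
    for k in range(len(c)):
        # Increasing y makes psi[y - a[k]] already refer to psi_k,
        # so object k may be chosen more than once, as (6.3) allows.
        for y in range(a[k], b + 1):
            psi[y] = max(psi[y],                  # psi_{k-1}(y)
                         psi[y - a[k]] + c[k])    # psi_k(y - a_k) + c_k
        # For y < a[k] the entry keeps the value psi_{k-1}(y).
    return psi[b]


best = knapsack_value(c=[10, 40, 50], a=[3, 4, 5], b=9)
```

The inner loop starts at `y = a[k]`, so the term $\psi_k(y-a^{(k)}_{k})$ is only evaluated for nonnegative arguments. This is how the admissibility mentioned above appears in the code.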

Integer Nonlinear Programming with Linear Constraints problems (INPLCs) can be transformed into continuous GO problems over the unit hypercube [17]. In order to reduce the difficulties caused by the introduction of undesirable local minimizers, a special class of continuation methods, called smoothing methods, can be introduced [38]. These methods deform the original objective function into a function whose smoothness is controlled by a parameter. Of course, the success of the latter approach depends on the existence of a suitable smoothing function.

Hence, the Gradient-type methods for Global Optimization of Section 4 can also be applied to INPLCs.

7. Conclusions

In this paper we have tried to demonstrate that Gradient or Gradient-type methods lead both to a general approach to optimization problems and to the construction of efficient algorithms.

In particular, we have shown that the class of problems for which the optimal solution can be obtained in a finite number of steps is larger than canonical unconstrained Convex Quadratic problems or Convex Quadratic Programming. Moreover, we have pointed out that the classical distinction between Direct Methods and Iterative Methods cannot be considered as a fundamental classification of techniques in Numerical Analysis. Many optimization problems can be, in fact, solved in a finite number of steps by suitable hybrid efficient algorithms (see [33]).

Furthermore, if the matrices involved in the computation are well conditioned, the superiority of Iterative Methods with respect to Direct ones, which is a typical feature of the $(CG)$ algorithm, can be proved in a more general context (see again [33]).

Several heuristic and ad hoc algorithms in operational environments can be considered, in fact, as particular cases of a general Gradient-type approach to the problem. In some cases, surprisingly enough, the convergence of Iterative Methods can be guaranteed only by utilizing a special Line-Search Minimization algorithm (see, for instance, the Fletcher-Reeves method in conjunction with the Armijo-Goldstein-Wolfe procedure, [18], Theorem 5.8).

It is also important to underline that many combinatorial problems, representing a remarkable benchmark set in Computer Science, can be translated in terms of Gradient-type methods in a general framework.

Once again, we stress that the Fixed Point theorem, which is considered a milestone in Numerical Analysis and guarantees the convergence of most classical Iterative Methods, represents the background for only a subset of Gradient-type methods.


This paper was partially supported by PRIN 2008 N. 20083KLJEZ.