Abstract

Various optimization problems in engineering and management are formulated as nonlinear programming problems. Because of the nonconvex nature of such problems, no efficient general-purpose approach is available for deriving their global optima, and locating a global optimal solution of a nonlinear programming problem remains an important issue in optimization theory. In the last few decades, piecewise linearization methods have been widely applied to convert a nonlinear programming problem into a linear programming problem or a mixed-integer convex programming problem in order to obtain an approximate global optimal solution. In the transformation process, extra binary variables, continuous variables, and constraints are introduced to reformulate the original problem, and these extra variables and constraints largely determine the solution efficiency of the converted problem. This study therefore reviews piecewise linearization methods and analyzes the computational efficiency of the various approaches.

1. Introduction

Piecewise linear functions are frequently used to approximate nonlinear programs whose objective or constraints contain nonconvex functions, at the cost of extra binary variables, continuous variables, and constraints. They naturally appear as cost functions in supply chain problems to model quantity discounts for bulk procurement and fixed charges. For example, the transportation, inventory, and production costs in a supply chain network are often constructed as a sum of nonconvex piecewise linear functions due to economies of scale [1]. Optimization problems with piecewise linear costs arise in many application domains, including transportation, telecommunications, and production planning. Specific applications include variants of the minimum cost network flow problem with nonconvex piecewise linear costs [2–7], the network loading problem [8–11], the facility location problem with staircase costs [12, 13], the merge-in-transit problem [14], and the packing problem [15–17]. Other applications include production planning [18], optimization of electronic circuits [19], operation planning of gas networks [20], process engineering [21, 22], engineering design [23, 24], appointment scheduling [25], and other network flow problems with nonconvex piecewise linear objective functions [7].

Various methods for piecewise linearizing a nonlinear function have been proposed in the literature [26–39]. Two well-known mixed-integer formulations for piecewise linear functions are the incremental cost [40] and the convex combination [41] formulations. Padberg [35] compared the linear programming relaxations of the two mixed-integer programming models for piecewise linear functions in the simplest case, when no additional constraint exists. He showed that the feasible set of the linear programming relaxation of the incremental cost formulation is integral; that is, the binary variables are integer-valued at every vertex of the set. He called such formulations locally ideal. The convex combination formulation, on the other hand, is not locally ideal, and its relaxation strictly contains the feasible set of the linear programming relaxation of the incremental cost formulation. Sherali [42] subsequently proposed a modified convex combination formulation that is locally ideal. Alternatively, Beale and Tomlin [43] suggested a formulation for piecewise linear functions similar to the convex combination model, except that no binary variable is included; the nonlinearities are enforced algorithmically, directly in the branch-and-bound algorithm, by branching on sets of variables, which they called special ordered sets of type 2 (SOS2). It is also possible to formulate piecewise linear functions in a manner similar to the incremental cost model but without binary variables, again enforcing the nonlinearities directly in the branch-and-bound algorithm. Two advantages of eliminating binary variables are the substantial reduction in the size of the model and the ability to exploit the polyhedral structure of the problem [44, 45]. Keha et al. [46] studied formulations of linear programs with piecewise linear objective functions with and without additional binary variables and showed that adding binary variables does not improve the bound of the linear programming relaxation. Keha et al. [47] also presented a branch-and-cut algorithm for solving linear programs with continuous separable piecewise linear cost functions. Instead of introducing auxiliary binary variables and other linear constraints to represent the SOS2 constraints used in the traditional approach, they enforced the SOS2 constraints by branching on them directly.
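For orientation, the two classical models can be sketched as follows; these are the standard textbook forms for a function f(x) with break points a_1 < a_2 < ... < a_m and segment slopes s_k = (f(a_{k+1}) − f(a_k))/(a_{k+1} − a_k), not necessarily the exact expressions of [40, 41]. The incremental cost model accumulates the increments δ_k of x along the segments,

\[
x = a_1 + \sum_{k=1}^{m-1}\delta_k,\qquad
f(x)\approx f(a_1) + \sum_{k=1}^{m-1} s_k\,\delta_k,\qquad
0\le\delta_k\le a_{k+1}-a_k,
\]
\[
\delta_{k+1}\le (a_{k+2}-a_{k+1})\,y_k,\qquad
\delta_k\ge (a_{k+1}-a_k)\,y_k,\qquad
y_k\in\{0,1\},\quad k=1,\dots,m-2,
\]

so that segment k must be used to its full length before segment k + 1 can carry any increment. The convex combination model instead writes x = Σ_k t_k a_k and f(x) ≈ Σ_k t_k f(a_k) with Σ_k t_k = 1 and t_k ≥ 0, and uses binary variables to restrict the positive weights to two adjacent break points; this weight-based representation is revisited in Section 2.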

Due to the broad applications of piecewise linear functions, many studies have addressed this topic. Their main purpose is to find a better way to represent a piecewise linear function or to tighten the linear programming relaxation. A superior representation of piecewise linear functions can effectively reduce the problem size and enhance computational efficiency. However, for expressing a piecewise linear function of a single variable with m break points, most of the methods in textbooks and in the literature require a number of extra binary variables and constraints that grows linearly with m, which may cause a heavy computational burden when m is large. Recently, Li et al. [48] developed a representation method for piecewise linear functions that uses fewer binary variables than the traditional methods. Although their method needs only a logarithmic number of extra binary variables to piecewise linearize a nonlinear function with m break points, the approximation process still requires extra constraints, nonnegative continuous variables, and free-signed continuous variables. Vielma et al. [39] presented a note on Li et al.'s paper and showed that two representations for piecewise linear functions introduced by Li et al. [48] are both theoretically and computationally inferior to standard formulations for piecewise linear functions. Tsai and Lin [49] applied the techniques of Vielma et al. [39] to express a piecewise linear function for solving a posynomial optimization problem. Croxton et al. [31] showed that most models for expressing piecewise linear functions are equivalent to each other. Additionally, it is well known that the numbers of extra variables and constraints required in the linearization process strongly affect the computational performance of the converted problem. This paper therefore reviews and discusses recent advances in piecewise linearization methods. Section 2 reviews the piecewise linearization methods. Section 3 compares the formulations of the various methods in terms of the numbers of extra binary/continuous variables and constraints. Section 4 discusses error evaluation in piecewise linear approximation. Conclusions are drawn in Section 5.

2. Formulations of Piecewise Linearization Functions

Consider a general nonlinear function f(x) of a single variable x, where f(x) is continuous and x lies within the interval [a_1, a_m]. Most commonly used textbooks on nonlinear programming [26–28] approximate the nonlinear function by a piecewise linear function as follows.

First, denote a_k, k = 1, 2, ..., m, as the break points of f(x), with a_1 < a_2 < ... < a_m. Figure 1 indicates the piecewise linearization of f(x).

f(x) can then be approximately linearized over the interval [a_1, a_m] by expressing x and f(x) as weighted combinations of the break points a_k and the function values f(a_k), with nonnegative weights t_k that sum to one, of which only two adjacent t_k's are allowed to be nonzero. In this way, a nonlinear function f(x) is converted into the following expressions.
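In symbols (using weight variables t_k, a standard choice of notation assumed here; the full mixed-integer version appears as Method 1 below), the interpolation part reads

\[
x=\sum_{k=1}^{m} a_k\,t_k,\qquad
f(x)\;\approx\;\sum_{k=1}^{m} f(a_k)\,t_k,\qquad
\sum_{k=1}^{m} t_k=1,\qquad t_k\ge 0,\quad k=1,\dots,m,
\]

with at most two adjacent weights allowed to be positive, so that the approximation always lies on one of the line segments joining (a_k, f(a_k)) and (a_{k+1}, f(a_{k+1})).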

Method 1. Consider
where t_k ≥ 0 for k = 1, 2, ..., m, and y_k ∈ {0, 1} for k = 1, 2, ..., m − 1.

The above expressions involve new binary variables y_k. The number of newly added 0-1 variables for piecewise linearizing a function equals the number of break intervals (i.e., m − 1). If m is large, this may cause a heavy computational burden.
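A common textbook way to enforce the adjacency condition on the weights t_k (shown here as an illustration of the standard formulation, not as a verbatim copy of Method 1) introduces one binary variable y_k per interval:

\[
t_1\le y_1,\qquad
t_k\le y_{k-1}+y_k,\quad k=2,\dots,m-1,\qquad
t_m\le y_{m-1},\qquad
\sum_{k=1}^{m-1} y_k=1,\qquad y_k\in\{0,1\}.
\]

Exactly one interval [a_k, a_{k+1}] is selected, and only the weights of its two end points can be nonzero.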

Li and Yu [33] proposed another global optimization method for nonlinear programming problems in which the objective function and the constraints may be nonconvex. A univariate function f(x) is initially expressed as a piecewise linear function involving a summation of absolute-value terms. Denote s_k (k = 1, 2, ..., m − 1) as the slopes of the line segments between a_k and a_{k+1}, expressed as s_k = (f(a_{k+1}) − f(a_k))/(a_{k+1} − a_k). f(x) can then be written as follows:
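A representation of this type is the standard absolute-value identity for continuous piecewise linear functions, shown here as an illustration rather than as the exact expression of [33]:

\[
f(x)\;\approx\; f(a_1) + s_1\,(x-a_1)
+ \sum_{k=2}^{m-1}\frac{s_k-s_{k-1}}{2}\,\bigl(|x-a_k| + x - a_k\bigr).
\]

Each absolute-value term contributes the slope change s_k − s_{k−1} once x passes the break point a_k, and the term is convex exactly when s_k ≥ s_{k−1}.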

Each such absolute-value term is convex if s_k ≥ s_{k−1}; otherwise it is a nonconvex term that needs to be linearized by adding extra binary variables. By linearizing the absolute-value terms, Li and Yu [33] converted the nonlinear function f(x) into a piecewise linear function as shown below.

Method 2. Consider
where , , , , are upper bounds of and are extra binary variables used to linearize a non-convex function for the interval .

Comparing Method 2 with Method 1, Method 1 uses binary variables to linearize f(x) over the whole interval, whereas the binary variables in Method 2 are applied only to the nonconvex parts of f(x). Method 2 therefore uses fewer 0-1 variables than Method 1. However, the number of binary variables required by Method 2 still grows linearly with the number of nonconvex intervals of f(x).

Another general form of representing a piecewise linear function is proposed in the articles of Croxton et al. [31], Li [32], Padberg [35], Topaloglu and Powell [36], and Li and Tsai [38]. The expressions are formulated as shown below.

Method 3. Consider
where , , and where is a large constant and .

The above expressions require numbers of extra binary variables and constraints that grow linearly with the number of break points m used to represent the piecewise linear function.
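As a generic illustration of this family of formulations (a big-M sketch under notation assumed here, not the verbatim expressions of [31, 32, 35, 36, 38]), binary variables z_k select the active interval and a large constant M deactivates the remaining pieces, with \hat{f}(x) denoting the approximation value:

\[
\sum_{k=1}^{m-1} z_k = 1,\qquad
a_k - M\,(1-z_k)\;\le\; x \;\le\; a_{k+1} + M\,(1-z_k),
\]
\[
f(a_k) + s_k\,(x-a_k) - M\,(1-z_k)\;\le\;\hat{f}(x)\;\le\; f(a_k) + s_k\,(x-a_k) + M\,(1-z_k),\qquad
z_k\in\{0,1\},\quad k=1,\dots,m-1.
\]

When z_k = 1, x is confined to [a_k, a_{k+1}] and \hat{f}(x) coincides with the kth linear piece; all other pairs of constraints are relaxed by M.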

From the above discussion, Methods 1, 2, and 3 require numbers of extra binary variables and constraints linear in m to express a piecewise linear function. When a nonlinear function is approximated by a piecewise linear function, the numbers of extra binary variables and constraints significantly influence the computational efficiency: if fewer binary variables and constraints are used to represent the piecewise linear function, less CPU time is needed to solve the transformed problem. To decrease the number of extra binary variables involved in the approximation process, Li et al. [48] developed a representation method for piecewise linear functions whose number of binary variables is logarithmic in m. Consider the same piecewise linear function discussed above, where x is within the interval [a_1, a_m] and m break points exist within [a_1, a_m]. The key idea is to encode the index of the active break interval by a vector of binary variables, expressed as a binary expansion.
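As a sketch of this encoding (the indexing convention here is an assumption for illustration, not necessarily the exact notation of [48]), number the break intervals θ = 0, 1, ..., m − 2; each index then admits the binary expansion

\[
\theta=\sum_{j=1}^{\lceil\log_2(m-1)\rceil} 2^{\,j-1}\,u_j,\qquad u_j\in\{0,1\},
\]

so that a logarithmic number of binary variables u_j suffices to identify the active interval.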

Let be a set composed of all indices such that . For instance, , .

Denote to be the number of elements in . For instance, , .

To approximate a univariate nonlinear function f(x) by a piecewise linear function, the following expressions are deduced by the method of Li et al. [48].

Method 4. Consider
where , , , and are free continuous variables, and are nonnegative continuous, and all the variables are the same as defined before.

The expressions of Method 4 for representing a piecewise linear function with m break points use a logarithmic number of binary variables, together with extra constraints, nonnegative continuous variables, and 2 free-signed continuous variables. Compared with Methods 1, 2, and 3, Method 4 reduces the number of binary variables used and thereby improves computational efficiency. Although Li et al. [48] developed a way of expressing a piecewise linear function with fewer binary variables, Vielma et al. [39] showed that this representation is theoretically and computationally inferior to standard formulations for piecewise linear functions. Vielma and Nemhauser [50] recently developed a novel piecewise linear expression requiring fewer variables and constraints than existing piecewise linearization techniques for approximating univariate nonlinear functions. Their method needs a logarithmic number of binary variables and constraints to express a piecewise linear function. The formulation is described below.

Let m − 1 be the number of line segments and L = ⌈log₂(m − 1)⌉. Choose an injective function B: {1, 2, ..., m − 1} → {0, 1}^L such that the vectors B(k) and B(k + 1) differ in at most one component for all k = 1, ..., m − 2.

Let , for all  , , and . Some notations are introduced below.

: a set composed of all , where of and for or of for ; that is, .

: a set composed of all , where of and for ,… or of for ; that is, .

The linear approximation of a univariate function f(x), x ∈ [a_1, a_m], by the technique of Vielma and Nemhauser [50] is formulated as follows.

Method 5. Denote by L(x) the piecewise linear approximation of f(x), where a_1, a_2, ..., a_m are the break points of f(x). L(x) can be expressed as

Method 5 uses a logarithmic number of binary variables and constraints, and a number of continuous variables linear in m, to express a piecewise linear function with m − 1 line segments.
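The injective function with the one-component-difference property can be realized, for example, by a reflected binary Gray code; the following sketch (an illustration only, not the exact construction used in [50]) generates such codes:

    from math import ceil, log2

    def gray_codes(num_segments):
        # One 0/1 vector per segment; consecutive vectors differ in
        # exactly one component (reflected binary Gray code).
        n_bits = max(1, ceil(log2(num_segments)))
        codes = []
        for k in range(num_segments):
            g = k ^ (k >> 1)  # Gray code of the integer k
            codes.append([(g >> j) & 1 for j in range(n_bits)])
        return codes

    # Example: 8 segments can be indexed with only 3 binary variables.
    for code in gray_codes(8):
        print(code)

Each segment's code determines, bit by bit, which side of each of the ⌈log₂(m − 1)⌉ dichotomies it belongs to, which is what keeps the number of binary variables logarithmic.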

3. Formulation Comparisons

The comparison of the above five methods in terms of the numbers of binary variables, continuous variables, and constraints is listed in Table 1. The number of extra binary variables in Methods 1 and 3 is linear in the number of line segments. Methods 4 and 5 have a number of extra binary variables that is logarithmic in the number of line segments, and the number of extra binary variables in Method 2 equals the number of concave piecewise line segments. In deterministic global optimization of a minimization problem, inverse, power, and exponential transformations generate nonconvex expressions that need to be linearly approximated in the reformulated problem. Methods 4 and 5 are therefore superior to Methods 1, 2, and 3 in terms of the numbers of extra binary variables and constraints, as shown in Table 1. Moreover, Method 5 has fewer extra continuous variables and constraints than Method 4 when linearizing a nonlinear function.

Till et al. [51] reviewed the literature on the complexity of mixed-integer linear programming (MILP) problems and summarized that the computational complexity grows with the number of constraints and, most critically, exponentially with the number of binary variables in the worst case. Therefore, reducing the numbers of constraints and binary variables has a greater impact on the computational efficiency of solving MILP problems than reducing the number of continuous variables. When a global solution of a nonlinear programming problem is sought by a piecewise linearization method, a linearization that generates a large number of additional constraints and binary variables decreases the computational efficiency and causes a heavy computational burden. According to the above discussion, Method 5 is more computationally efficient than the other four methods. Experimental results from the literature [39, 48, 49] also support this statement.

Beale and Tomlin [43] suggested a formulation for piecewise linear functions using continuous variables in special ordered sets of type 2 (SOS2). Although no binary variables are included in the SOS2 formulation, the nonlinearities are enforced algorithmically, directly in the branch-and-bound algorithm, by branching on sets of variables. Whereas traditional SOS2 branching schemes involve many dichotomies, the piecewise linearization technique in Method 5 induces an independent branching scheme of logarithmic depth and thus provides a significant computational advantage [50]. The computational results in Vielma and Nemhauser [50] show that Method 5 outperforms the SOS2 model without binary variables.

The factors affecting computational efficiency in solving nonlinear programming problems include the tightness of the constructed convex underestimator, the efficiency of the piecewise linearization technique, and the number of transformed variables. An appropriate variable transformation constructs a tighter convex underestimator and requires fewer break points in the linearization process to satisfy the same optimality and feasibility tolerances. Vielma and Nemhauser [50] indicated that the formulation of Method 5 is sharp and locally ideal and thus has favorable tightness properties. They presented experimental results showing that Method 5 significantly outperforms other methods, especially when the number of break points becomes large. Vielma et al. [39] explained that the formulation of Method 4 is not sharp and is theoretically and computationally inferior to standard MILP formulations (the convex combination and logarithmic convex combination models) for piecewise linear functions.

4. Error Evaluation

For evaluating the error of a piecewise linear approximation, Tsai and Lin [49, 52] and Lin and Tsai [53] estimated the gap between the original function and its piecewise linear approximation at the incumbent solution, as indicated in Figure 2. If f_0 is the objective function, f_j is the jth constraint, and x* is the solution derived from the transformed program, then the linearization needs no further refinement once ε_0 ≤ δ_0 and ε_j ≤ δ_j for all j, where ε_0 is the evaluated error in the objective, δ_0 is the optimality tolerance, ε_j is the error in the jth constraint, and δ_j is the feasibility tolerance.
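Under the notation above (the symbols L_0 and L_j for the piecewise linear approximations of f_0 and f_j are introduced here for illustration), the stopping test can be written as

\[
\varepsilon_0=\bigl|f_0(x^*)-L_0(x^*)\bigr|\le\delta_0,
\qquad
\varepsilon_j=\bigl|f_j(x^*)-L_j(x^*)\bigr|\le\delta_j\quad\text{for all } j;
\]

if either inequality is violated, additional break points are inserted and the approximation is refined.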

The accuracy of the linear approximation depends significantly on the selection of break points, and more break points increase the accuracy of the approximation. Since adding numerous break points leads to a significant increase in the computational burden, break point selection strategies can be applied to improve the computational efficiency of solving optimization problems by deterministic approaches. Existing break point selection strategies fall into three categories [54] (a sketch of strategy (ii) follows the list):
(i) add a new break point at the midpoint of each interval of existing break points;
(ii) add a new break point at the point with the largest approximation error in each interval;
(iii) add a new break point at the previously obtained solution point.
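For illustration only (the function name and the uniform sampling of candidate points are assumptions, not taken from [54]), strategy (ii) can be sketched as follows:

    def refine_by_max_error(f, break_points, samples_per_interval=50):
        # For each interval, insert the sampled point where the gap between
        # f and its linear interpolant is largest (strategy (ii)).
        new_points = list(break_points)
        for a, b in zip(break_points[:-1], break_points[1:]):
            slope = (f(b) - f(a)) / (b - a)
            best_x, best_err = None, 0.0
            for i in range(1, samples_per_interval):
                x = a + (b - a) * i / samples_per_interval
                err = abs(f(x) - (f(a) + slope * (x - a)))
                if err > best_err:
                    best_x, best_err = x, err
            if best_x is not None:
                new_points.append(best_x)
        return sorted(new_points)

    # Example: refine an initial grid for f(x) = 1/x on [1, 4].
    print(refine_by_max_error(lambda x: 1.0 / x, [1.0, 2.0, 4.0]))

As the text notes, this strategy (like the midpoint strategy) roughly doubles the number of line segments per iteration, whereas strategy (iii) adds only one break point per iteration.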

According to the deterministic optimization methods for solving nonconvex nonlinear problems [29, 33, 38, 39, 48, 49, 53–56], inverse or logarithmic transformations need to be approximated by piecewise linear functions; for example, a term such as 1/x or ln x must be piecewise linearized using an appropriate break point selection strategy. If a new break point is added at the midpoint of each interval of existing break points or at the point with the largest approximation error, the number of line segments doubles in each iteration; if a new break point is added at the previously obtained solution point, only one break point is added per iteration. How to improve computational efficiency through a better break point selection strategy still requires further investigation and experiments to obtain concrete results.

5. Conclusions

This study provides an overview of some of the most commonly used piecewise linearization methods in deterministic optimization. From the formulation point of view, the numbers of extra binary variables, continuous variables, and constraints have been decreasing in the most recently developed methods, especially the number of extra binary variables, which is the main source of computational burden. Additionally, a good piecewise linearization method must possess tightness properties such as sharpness and local idealness. Since an effective break point selection strategy is important for enhancing the computational efficiency of linear approximation, more work should be done on the optimal positioning of break points. Although a logarithmic piecewise linearization method with good tightness properties has been proposed, finding an approximate global optimum of a large-scale nonconvex problem is still time consuming. Developing an efficient polynomial-time algorithm for solving nonconvex problems by piecewise linearization techniques remains a challenging question. This contribution gives only a few preliminary insights and points toward issues deserving additional research.

Acknowledgment

The research is supported by Taiwan NSC Grants NSC 101-2410-H-158-002-MY2 and NSC 102-2410-H-027-012-MY3.