Abstract

An easy-to-use procedure is presented for improving the ε-constraint method for computing the efficient frontier of the portfolio selection problem endowed with additional cardinality and semicontinuous variable constraints. The proposed method provides not only a numerical plot of the frontier but also an analytical description of it, including the explicit equations of the arcs of parabola it comprises and the change points between them. This information is useful for performing a sensitivity analysis as well as for providing additional criteria to the investor in order to select an efficient portfolio. Computational results are provided to test the efficiency of the algorithm and to illustrate its applications. The procedure has been implemented in Mathematica.

1. Introduction

The portfolio selection problem consists of finding an efficient portfolio in the sense of obtaining a tradeoff between the expected return and the risk of the investment. Most portfolio selection models are based on the original Markowitz model [1, 2], in which the expected return of a given portfolio is measured by , where is the vector of mean returns of the assets and contains the weight of each asset in the portfolio. On the other hand, the risk is measured by , where is the covariance matrix. In general, the matrix is positive semidefinite, but we will assume that it is positive definite. This is the case if the returns of the assets are linearly independent as random variables.

In these terms, the Markowitz model can be formulated as the following quadratic programming problem, which we abbreviate as the continuous variable problem (CP), as opposed to the formulation with semicontinuous variables to be introduced later:
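For reference, a standard statement of the model can be written as follows. The notation is an assumption made here, not taken from the text: $\mu$ is the vector of mean returns, $x$ the vector of weights, $V$ the covariance matrix, $r$ the minimum expected return, and $\mathbf{1}$ the all-ones vector. The sign constraints $x \ge 0$ exclude short sales.

```latex
\begin{aligned}
\text{(CP)}\qquad \min_{x}\quad & x^{T} V x \\
\text{s.t.}\quad & \mu^{T} x \ge r, \\
                 & \mathbf{1}^{T} x = 1, \\
                 & x \ge 0.
\end{aligned}
```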

Here is a minimum expected return specified by the investor. The portfolio selection problem can be thought of in a more natural way as a biobjective problem: to minimize risk and to maximize the expected return. Hence, an optimal portfolio selection must provide an efficient portfolio, that is, a portfolio providing the maximum expected return for a given admissible risk or—which is the same—the minimum risk for a given desired expected return. The risk-return pairs of all the efficient portfolios form the so-called efficient frontier of a given instance of the problem, and so the decision-support techniques designed to assist an investor in selecting a portfolio consist of computing and analyzing the efficient frontier in order to find the efficient portfolio best fitting the investor's preferences about the trade-off between acceptable risk and desired return.

The real world modern portfolio selection problems incorporate into the original Markowitz model many different kinds of additional constraints, reflecting both market conditions and further investor preferences (see, for instance, [3]). Here we address the problem of dealing with the two kinds among these constraints which make the corresponding model more involved from a computational point of view, namely, semicontinuous variable constraints and cardinality constraints. The main feature of models incorporating such constraints is that they are not quadratic (continuous) problems anymore, but become mixed integer (binary) problems. As it will be shown, the efficient frontier of such problems becomes more irregular and new specific computation techniques are required.

Moreover, these irregularities can make the optimal solution of the problem highly sensitive to small variations of the parameters fixed by the investor, which are always vague in nature. Cadenas et al. [4] deal with this issue by means of a fuzzy version of the portfolio selection problem in the continuous variable case. The techniques developed in the present paper make it possible to apply those of [4] to the more general and complex problems we are considering here, in which the sensitivity analysis of the solutions is even more necessary. Sensitivity analysis on (continuous variable) quadratic problems has been studied from different points of view. For the specific case of the portfolio selection problem, the sensitivity to the estimates of expected returns and risk levels is dealt with, for instance, in Goldfarb and Iyengar [5]. A general analysis of the optimal value function in quadratic programming can be found in Hadigheh et al. [6]. See also Best and Grauer [7] for the portfolio selection case.

2. On the Computation of the Efficient Frontier

It is well known [2, 6] that the efficient frontier of (CP) is a continuous curve comprising a finite number of arcs of parabola. The usual way of determining it is the so-called ε-constraint method (EC) (see [8]), which can be described as a two-stage procedure.

(EC1) Calculate a sample of the efficient frontier, that is, solve the problem (CP) (or any of its extensions described below) for a sufficiently large number of values of the required expected return, ranging from the minimum to the maximum possible return of an efficient portfolio, which are calculated previously. In this way a "dotted" representation of the efficient frontier is obtained.

(EC2) Interpolate the pairs (risk, return) by any standard interpolation technique to obtain a continuous curve, or even a smooth one, depending on the specific interpolation technique used.
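The two stages can be illustrated with a minimal Python sketch. It is only a toy: the function `solve_cp_two_assets` is a hypothetical closed-form stand-in for the quadratic solver, valid for two assets with no short sales and with the first asset having the larger mean return, and all names and data are assumptions made for this illustration.

```python
def risk(x, V):
    """Portfolio variance x^T V x for a two-asset portfolio."""
    return sum(x[i] * V[i][j] * x[j] for i in range(2) for j in range(2))

def solve_cp_two_assets(mu, V, r):
    """Toy stand-in for the QP solver: minimum-risk weights with
    x1 + x2 = 1, x >= 0 and expected return >= r. Assumes mu[0] > mu[1]."""
    # Minimum-variance weight of asset 1 when the return constraint is ignored.
    x1_free = (V[1][1] - V[0][1]) / (V[0][0] + V[1][1] - 2.0 * V[0][1])
    x1_free = min(1.0, max(0.0, x1_free))
    # Smallest weight of asset 1 that reaches the required return r.
    x1_needed = (r - mu[1]) / (mu[0] - mu[1])
    x1 = max(x1_free, x1_needed)
    if x1 > 1.0 + 1e-12:
        raise ValueError("return level is infeasible")
    x1 = min(x1, 1.0)
    return (x1, 1.0 - x1)

def ec1_sample(mu, V, num_points):
    """Stage EC1: solve the problem on an equally spaced grid of returns."""
    x_min = solve_cp_two_assets(mu, V, min(mu))  # return constraint inactive
    r_lo = mu[0] * x_min[0] + mu[1] * x_min[1]   # minimum efficient return
    r_hi = max(mu)                               # maximum efficient return
    sample = []
    for k in range(num_points):
        r = r_lo + (r_hi - r_lo) * k / (num_points - 1)
        x = solve_cp_two_assets(mu, V, r)
        sample.append((r, risk(x, V), x))
    return sample

def ec2_interpolate(sample, r):
    """Stage EC2: linear interpolation of the risk and of the portfolio."""
    for (r0, s0, x0), (r1, s1, x1) in zip(sample, sample[1:]):
        if r0 <= r <= r1:
            t = (r - r0) / (r1 - r0)
            risk_r = (1 - t) * s0 + t * s1
            x_r = tuple((1 - t) * a + t * b for a, b in zip(x0, x1))
            return risk_r, x_r
    raise ValueError("r outside the sampled range")
```

In this toy case the interpolated portfolio is exact, because the weights are linear in the return, while the interpolated risk overestimates the true risk between sample points, since the true arc is a parabola.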

This is what most commercial packages actually do (see [8] for a review of the current software situation). Notice that what really matters is not just obtaining a picture of the efficient frontier but knowing the efficient portfolio corresponding to each of its points. Thus, the ε-constraint method also requires an interpolation of the efficient portfolios calculated in the EC1 stage, and this is usually done by linear (vectorial) interpolation, even if the interpolation of EC2 has been nonlinear.

Although the ε-constraint method is the most extensively used procedure [8], it is clear that it provides a limited knowledge of the efficient frontier. In order to make a well-founded decision, it would be very useful to know the change points where an arc of parabola of the efficient frontier joins the next one, since they correspond to different portfolio compositions. Knowing them allows a richer sensitivity analysis than that provided by the Kuhn-Tucker multipliers (if known), and offers the investor the possibility of choosing among portfolios which are similar in risk and return but different in other characteristics (dividends, social responsibility, etc.) that could be considered decisive when the differences in risk and return are minimal.

In specific terms, an investor wishing for of expected return would accept an efficient portfolio providing just if it had better characteristics than the efficient portfolio corresponding to a return of , for instance, if it had a substantially lower risk or a composition that made it preferable for other reasons not reflected in the model because of its secondary importance. This preferable alternative can exist if the initial choice of is near a change point of the efficient frontier.

That is why some attempts can be found in the literature to obtain techniques for computing the exact efficient frontier, that is, for obtaining an analytical (instead of numerical) representation of the frontier, providing the exact efficient portfolio for each risk or return value, the equations of the arcs of parabola, the change points, the Kuhn-Tucker multipliers, and so forth. Markowitz himself provides in [2] the so-called critical line algorithm, a simplex-like procedure dealing with quadratic problems, which was distributed later in an Excel implementation called Optimizer, limited to problems with at most 248 variables [9]. Later, Steuer et al. [8, 10, 11] proposed a completely different algorithm called MPQ (multiparametric quadratic programming) and showed that it is even more powerful than the previous method and can deal with very large instances of (CP). Finally, A. Niedermayer and D. Niedermayer presented in [12] a revised version of the Markowitz algorithm, improving MPQ.

However, all these exact methods are specifically designed for the problem (CP). When additional constraints are incorporated, the efficient frontier is no longer continuous, and the set of possible risk-return pairs is not convex (see Figure 7 for a "typical" efficient frontier in this context). No method is known for computing the exact efficient frontier of such problems, and the ε-constraint method seems to be the only one available.

As Tables 1 and 2 (below) show, this method is useless in practice for large instances of (CP), and a fortiori for large instances of the much more complex problem with the additional constraints described. However, for medium-sized instances, the standard commercial packages like GAMS [13] or LINGO [14] happen to be powerful enough to deal with the EC1 stage of the ε-constraint method in a few minutes (for instance, for a -asset sample, GAMS takes about minutes to calculate a -point sample). The purpose of this paper is to make a proposal regarding the EC2 stage.

The point is that all the interpolation methods used to this end vary from the linear interpolation (providing continuous nonsmooth curves) to other classical, relatively simple, interpolation methods providing smooth curves (see [15]). The main disadvantage of these methods is that they are good ways for approximating smooth curves by continuous or smooth curves, but looking for a smooth curve is not a good idea when we know that the true curve we are trying to capture is not even continuous.

More precisely, our proposal is an algorithm for calculating locally exact pieces of the efficient frontier around each point in the sample calculated at the EC1 stage of the procedure. It does provide a sequence of intervals together with the equations of the arcs of parabola composing the efficient frontier in each interval, as well as a pair of vectors parametrizing the corresponding efficient portfolio as a function of the expected return. It does not necessarily obtain the exact efficient frontier, but it provides an analytical interpolation of a given sample which is the best interpolation that can be obtained from it, in the sense that it is locally exact, that is, it is exact in a neighbourhood of each point of the sample. Moreover, for small problems it can be adapted to an enumeration algorithm providing the exact frontier.

3. The KTEF Procedure

Here we describe the kernel of the interpolation procedure that we propose as an alternative for the EC2 stage in the ε-constraint method. It is applied to the following variant of (CP), where two vectors and of lower and upper bounds for the assets have been incorporated. Hence, we have a continuous bounded variable problem (CBP):

Since it is a continuous (quadratic) problem, its optimal solution could be obtained theoretically by algebraically solving its Kuhn-Tucker conditions. In order to write them, we need to introduce the Lagrangian function: where , are real numbers and , are vectors (, , and being the Kuhn-Tucker multipliers of the problem). Then the Kuhn-Tucker conditions are:
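Under assumed notation, the problem, its Lagrangian, and the Kuhn-Tucker conditions can be written out as follows. None of these symbols is taken from the text: $\mu$ denotes the vector of mean returns, $V$ the covariance matrix, $r$ the required return, $l$ and $u$ the bound vectors, $\lambda_1, \lambda_2$ the scalar multipliers, and $\alpha, \beta$ the vector multipliers of the lower and upper bounds. The sign conventions are one common choice and may differ from the original display.

```latex
\begin{aligned}
\text{(CBP)}\quad & \min_{x}\; x^{T} V x
  \quad \text{s.t.}\quad \mu^{T} x \ge r,\;\; \mathbf{1}^{T} x = 1,\;\; l \le x \le u, \\[4pt]
\text{Lagrangian:}\quad & L(x,\lambda_1,\lambda_2,\alpha,\beta) = x^{T} V x + \lambda_1\,(r - \mu^{T} x) + \lambda_2\,(1 - \mathbf{1}^{T} x)
  + \alpha^{T} (l - x) + \beta^{T} (x - u), \\[4pt]
\text{stationarity:}\quad & 2 V x - \lambda_1 \mu - \lambda_2 \mathbf{1} - \alpha + \beta = 0, \\
\text{primal feasibility:}\quad & \mu^{T} x \ge r,\quad \mathbf{1}^{T} x = 1,\quad l \le x \le u, \\
\text{dual feasibility:}\quad & \lambda_1 \ge 0,\quad \alpha \ge 0,\quad \beta \ge 0, \\
\text{complementary slackness:}\quad & \lambda_1\,(\mu^{T} x - r) = 0,\quad
  \alpha_i\,(x_i - l_i) = 0,\quad \beta_i\,(u_i - x_i) = 0 \quad (i = 1,\dots,n).
\end{aligned}
```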

We see that all of them are linear equalities or inequalities except for the complementary slackness ones. Each complementary slackness condition splits into two alternative linear equations that, when combined, give rise to systems of linear equations and inequalities, where is the number of assets considered (however, since the equations and cannot be satisfied simultaneously, they are immediately reduced to ).
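The growth of the number of systems can be checked with a short Python sketch. It assumes the usual reading of the count: one complementary slackness condition for the return constraint and two per asset (lower and upper bound), each with two branches, and lower bounds strictly below upper bounds. The function names are illustrative.

```python
from itertools import product

def count_systems(n):
    """Number of linear systems for n assets after discarding the
    impossible branch in which an asset sits at both bounds at once."""
    # Per asset: 'L' (weight at the lower bound), 'U' (weight at the upper
    # bound) or 'F' (both bound multipliers equal to zero).
    per_asset = sum(1 for _ in product("LUF", repeat=n))
    # The return constraint is either active or has a zero multiplier.
    return 2 * per_asset

def count_naive(n):
    """Number of systems when every one of the 2n + 1 complementary
    slackness conditions is split independently into two branches."""
    return 2 ** (2 * n + 1)
```

Both counts grow exponentially with the number of assets, which is why the enumeration is only practical for very small instances.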

That is why the explicit resolution of the Kuhn-Tucker conditions is not a viable method, even for a small-sized problem of, for example, variables, since the number of systems to be solved grows exponentially. Consequently, this approach is not dealt with in the literature except for very small instances of the problem (such as the two-asset case [16]), or in the simplest case consisting of problem (CP) without the sign constraints, that is, allowing short sales (see [17, 18]). One of the main ideas that we plan to exploit here is that if a solution of the Kuhn-Tucker conditions for a given value of is known (in our context, as the result of the EC1 stage of the ε-constraint method), such a solution determines a specific case of the complementary slackness conditions, and in turn a single system of linear equations and inequalities that can be solved parametrically on . The result is an exact piece of the efficient frontier of (CBP).

We have called KTEF the algorithm that (partially) solves in this sense the Kuhn-Tucker conditions to calculate a piece of the efficient frontier. In order to present it, we formulate some preliminary considerations.

Let us call and the minimum and the maximum expected return of an efficient portfolio. For the portfolio selection problem becomes infeasible, whereas for the optimal solution is the same as for . Hence, we can assume that . Bearing in mind that we are assuming the variance-covariance matrix to be positive definite, we know that for each level of return the problem has a unique optimal solution (with expected return exactly equal to ), which is its only Kuhn–Tucker point. This implies that the first constraint in (3.1) is satisfied with an equality: Hence, when stating the Kuhn-Tucker conditions, we can take this equation as the first primal feasibility condition and delete the first complementary slackness one.

For each variable , the pair of conditions gives rise to three possibilities:

Hence, in each case, the index set splits into three disjoint subsets , where and . Let us call one of these cases degenerate if it can provide a Kuhn-Tucker point for at most one value of (considered as a parameter of the model). Notice that every case in which contains at most one index is degenerate. Indeed, if , the sets and determine the whole portfolio , so that must be that determined by (3.4). If , the value of is determined by equation and is again determined by (3.4). Since the Kuhn-Tucker conditions cannot provide an interpolation when the given case is degenerate, KTEF stops as soon as this situation is detected, in particular if contains less than two indices. Otherwise, from the sets and , the KTEF procedure solves the Kuhn-Tucker conditions parametrically on , that is, it calculates two vectors and such that the optimal portfolio is for all varying in a certain interval , also determined by KTEF. Moreover, it also calculates the coefficients , , such that the efficient frontier over the above-mentioned interval is the arc of parabola described by the quadratic equation .

See Algorithm 1 for the pseudocode of the KTEF procedure. The details of the calculations, together with the justification that it actually solves the Kuhn-Tucker conditions, can be found in the Appendix. Notice that the output of the algorithm also contains the terms which determine the Kuhn-Tucker multipliers. See the Appendix for their specific meaning.

Inputs , , , , , .
Step 1 Set , , extract the vectors ,
    and the submatrices , and (see (A.1) in the Appendix),
    and the vector of active bounds.
Step 2 If # the case is degenerate (STOP).
Step 3 Calculate the inverse matrix .
Step 4 Calculate , , , , , .
Step 5 Calculate , , , according to (A.14).
Step 6 Calculate , according to (A.11), as well as
     , .
Step 7 Calculate , , , according to (A.15).
Step 8 Define a set of lower bounds for containing:
    (i) ,
    (ii) for provided that ,
    (iii) for provided that ,
    (iv) for provided that , (where ,
         ),
    (v) for provided that (where ,
         ,
Step 9 Define .
Step 10 Define a set of upper bounds for containing:
    (i) for provided that ,
    (ii) for provided that ,
    (iii) for provided that ,
    (iv) for provided that ,
Step 11 Define .
Step 12 If the case is degenerate (STOP).
Step 13 Calculate , , according to (A.19).
Outputs , , , , , , , , , , , , .
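The linear-algebra core of this procedure can be sketched in Python as follows. This is a simplified illustration under assumed notation, not the implementation of Algorithm 1: the argument `status` (one of "L", "U", "F" per asset) encodes a complementary slackness case, the return constraint is taken as an equality, and the computation of the validity interval (Steps 8 to 12) and of the multipliers is omitted. The sketch returns vectors `p`, `q` with portfolio `p + q*r`, and the coefficients of the parabola giving the risk as a function of the return `r`.

```python
def solve_linear(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system (degenerate case)")
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(n):
            if i != col:
                f = M[i][col] / M[col][col]
                M[i] = [a - f * c for a, c in zip(M[i], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ktef_core(mu, V, lower, upper, status):
    """Parametric solution x(r) = p + q*r of the equality part of the
    Kuhn-Tucker conditions for one case, plus the risk parabola."""
    n = len(mu)
    free = [i for i in range(n) if status[i] == "F"]
    if len(free) < 2:
        raise ValueError("degenerate case: fewer than two free assets")
    # Assets fixed at an active bound.
    fixed = {i: (lower[i] if status[i] == "L" else upper[i])
             for i in range(n) if status[i] != "F"}
    k = len(free)
    # Unknowns: free weights, return multiplier, budget multiplier.
    A = [[2.0 * V[i][j] for j in free] + [-mu[i], -1.0] for i in free]
    A.append([mu[j] for j in free] + [0.0, 0.0])   # return equation
    A.append([1.0] * k + [0.0, 0.0])               # budget equation
    # Right-hand side written as rhs0 + rhs1 * r.
    rhs0 = [-2.0 * sum(V[i][j] * fixed[j] for j in fixed) for i in free]
    rhs0.append(-sum(mu[j] * fixed[j] for j in fixed))
    rhs0.append(1.0 - sum(fixed.values()))
    rhs1 = [0.0] * k + [1.0, 0.0]
    z0 = solve_linear(A, rhs0)
    z1 = solve_linear(A, rhs1)
    p, q = [0.0] * n, [0.0] * n
    for i, value in fixed.items():
        p[i] = value
    for pos, i in enumerate(free):
        p[i], q[i] = z0[pos], z1[pos]

    def quad(a, b):
        return sum(a[i] * V[i][j] * b[j] for i in range(n) for j in range(n))

    # risk(r) = coeffs[0]*r**2 + coeffs[1]*r + coeffs[2]
    coeffs = (quad(q, q), 2.0 * quad(p, q), quad(p, p))
    return p, q, coeffs
```

Because the system is linear in the return, two solves (one for the constant part and one for the part proportional to the return) are enough to obtain the whole parametrization.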

4. Computing the Efficient Frontier

In this section, we present a KTEF-based procedure, which we call KTEF-S (see Algorithm 2 for the pseudocode), for performing the second stage of the ε-constraint method (EC2) for the portfolio selection problem endowed with semicontinuous variable and cardinality constraints (additional linear constraints can also be included in our proposal without modifying it essentially, but we will not consider them in practice for the sake of clarity).

Inputs , , , , .
For each
 (a) Let , , , be the vectors obtained from , , , , respectively, by deleting
 the components corresponding to indexes such that .
 (b) Calculate and from according to (3.6).
End
Eliminate the terms in the sequence giving rise to repeated terms in the
sequence .
For each
 (a) Let be the submatrix of obtained by deleting the rows and columns for
 which .
 (b) Call KTEF ( , , , , , ), which provides an interval [ , ]and
 the coefficients ( , , ) of the equation of an arc of parabola.
  on error (KTEF has stopped in a degenerate case) discard the point.
End
(i) Define the functions ( ) given by (4.3).
(ii) Let , where is the minimum of all .
For to
  For to
   (a) Let Roots be the set of real roots of (4.4).
   (b) Append to Points any satisfying (4.5).
   (c) Let Roots be the set of real roots of (4.6).
   (d) Append to Points any satisfying .
  Next .
Next .
(i) Order the vector Points and eliminate repeated entries.
(ii) Let for each .
(iii) Calculate the vector such that is the index where is attained.
(iv) Let , let .
For to the length of
If append to Good the index and append to Change
 the value
Next .
(i) Append to Change the last point of Points.
(ii) Set , , ) (where is a short for ).
Outputs Change, .
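The first part of this procedure (classifying each sample point and discarding repeated cases) can be sketched in Python. The representation is an assumption made for illustration: a sample is a list of pairs (return, portfolio), assets with zero weight are dropped, and each remaining asset is labelled "L", "U" or "F" according to whether its weight is at the lower bound, at the upper bound, or strictly between them.

```python
def case_of(x, lower, upper, tol=1e-9):
    """Return the case (support, status) of a sample portfolio x."""
    # Assets actually held; the rest are removed from the problem.
    support = tuple(i for i, xi in enumerate(x) if xi > tol)
    status = []
    for i in support:
        if abs(x[i] - lower[i]) <= tol:
            status.append("L")
        elif abs(x[i] - upper[i]) <= tol:
            status.append("U")
        else:
            status.append("F")
    return support, tuple(status)

def filter_sample(sample, lower, upper, tol=1e-9):
    """Keep only the first sample point of every distinct case."""
    kept = {}
    for r, x in sample:
        kept.setdefault(case_of(x, lower, upper, tol), (r, x))
    return list(kept.values())
```

Sample points sharing a case would lead to the same arc of parabola, so only one of them needs to be passed on to KTEF.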

Semicontinuous Variables
Portfolios with many small nonzero weights are usually considered unacceptable by many investors and, on the other hand, the investor may also impose upper bounds for the sake of diversification. Since it would be absurd to force the portfolio to contain a minimum amount of each possible asset, we need to declare each weight as a semicontinuous variable, that is, allow it either to take the value 0 or to vary within a given interval between a lower and an upper bound.

Cardinality Constraints
They appear as diversification constraints, introducing into the model an investor's preferences about how many assets an acceptable portfolio must contain, or even how many assets it must contain from several fixed groups of assets.

Semicontinuous variables can be incorporated into the model by means of auxiliary binary variables, obtaining the following semicontinuous variable problem (SCP):

Here each binary variable takes the value 1 if the corresponding asset appears in the portfolio and 0 otherwise. We have added hats to the problem data in order to keep the notation of KTEF when called later. The binary variables can also be used to incorporate the cardinality constraints. For instance, we can impose where and are, respectively, a lower and an upper bound on the number of assets composing the portfolio. Similarly, some bounds can be imposed on the number of assets taken from a specific subset. Any such cardinality constraint (i.e., any condition on the binary variables) can be added without altering our method at all.
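The role of the binary variables can be illustrated with a small feasibility check in Python. It assumes the usual encoding of a semicontinuous weight (lower bound times indicator, at most the weight, at most upper bound times indicator) together with a budget constraint and a single cardinality constraint; the function and parameter names are illustrative.

```python
def is_feasible_scp(x, lower, upper, k_min, k_max, tol=1e-9):
    """Check the budget, semicontinuity and cardinality constraints."""
    # Binary indicator: 1 exactly when the asset is held.
    y = [1 if xi > tol else 0 for xi in x]
    # Budget constraint: the weights add up to one.
    if abs(sum(x) - 1.0) > tol:
        return False
    # Semicontinuity: a weight is either zero or within its bounds.
    for xi, yi, li, ui in zip(x, y, lower, upper):
        if not (li * yi - tol <= xi <= ui * yi + tol):
            return False
    # Cardinality: number of assets held between k_min and k_max.
    return k_min <= sum(y) <= k_max
```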

The input of the KTEF-S algorithm is the output of the first stage of the ε-constraint method (EC1), namely, a dotted sample of the efficient frontier calculated by means of any suitable procedure (for medium-sized problems, many commercial packages like GAMS or LINGO can be used).

For each point (, ), we apply KTEF to the instance of the problem (CBP) obtained by removing the variables from (SCP) (with any set of additional cardinality constraints) such that . This provides an interval and the coefficients of the equation of an arc of parabola, which is a piece of the exact efficient frontier of the problem . In order to compare the arcs defined on the possibly overlapping intervals , we extend them to the functions where is a large enough number (greater than any possible level of risk). Function provides the lowest level of risk that we can find for a given level of return from the fact that we know that is an efficient portfolio for (SCP) (where means that we cannot find any efficient portfolio from this fact). The best risk we can find for a given is . The last part of the KTEF-S calculates the function , which is the best approximation to the efficient frontier that we can get from the sample.

We want to calculate the function expressed as a sequence of lines and arcs of parabola on a respective sequence of intervals. The extreme points of these intervals (i.e., the points where the minimum of the functions changes from being attained at one index to being attained at another one ) can be of three different kinds.

(1) The intersection point of two arcs of parabola corresponding to two different sample points, that is, a point satisfying Notice that it is also necessary for to belong to the domains of both parabolas, that is,

(2) The intersection point of an arc of parabola of an with the first constant piece of another , that is, a point satisfying with , .

(3) The end point of an arc of parabola, that is, one of the points .

Figure 1 shows an example of each type of change point. The KTEF-S procedure calculates the (finite) set Points of all points of type 1, 2, 3, so that the set of all change points will be found as a subset of Points. For technical reasons, we also include the minimum of all . To select the subset Change of change points from the set Points, we order Points and calculate the middle points . Let be the index where the minimum is attained. Since , if , there must be a change point between and , which should be , since it is the only member of Points in that interval. Hence the set Change can be obtained as the set of the points such that . Notice that we never check if really is a change point, but we clearly have , which cannot be a change point. We also define a set Good containing the indexes such that . Hence, if we enumerate the elements of Change as and those of Good as , we have that, for , the minimum is attained at . The data corresponding to indexes outside Good can be dismissed.

The output of the procedure consists of the sequence of change points together with the sequence of coefficients of parabola corresponding to the efficient frontier over the interval .
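The selection of change points can be sketched in Python as follows. The sketch rests on an assumed reading of the extended functions: each piece is a triple (lower end, upper end, parabola coefficients), its function is constant to the left of the interval (equal to the risk at the lower end), follows the parabola inside the interval, and takes a large value `BIG_M` to the right. All names are illustrative.

```python
import math

BIG_M = 1e9  # stands in for the "large enough number" of the text

def make_f(piece):
    lo, hi, (a, b, c) = piece
    def f(r):
        if r > hi:
            return BIG_M
        r = max(r, lo)              # constant piece to the left of lo
        return a * r * r + b * r + c
    return f

def real_roots(a, b, c):
    """Real roots of a*r**2 + b*r + c = 0 (linear case included)."""
    if abs(a) < 1e-15:
        return [] if abs(b) < 1e-15 else [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []
    s = math.sqrt(disc)
    return [(-b - s) / (2.0 * a), (-b + s) / (2.0 * a)]

def change_points(pieces):
    fs = [make_f(p) for p in pieces]
    points = set()
    for lo, hi, _ in pieces:        # type 3: end points of the arcs
        points.update((lo, hi))
    for i, (lo_i, hi_i, ci) in enumerate(pieces):
        for j, (lo_j, hi_j, cj) in enumerate(pieces):
            if i == j:
                continue
            if i < j:               # type 1: two arcs intersect
                diff = (ci[0] - cj[0], ci[1] - cj[1], ci[2] - cj[2])
                for r in real_roots(*diff):
                    if max(lo_i, lo_j) <= r <= min(hi_i, hi_j):
                        points.add(r)
            # type 2: arc i meets the constant left piece of function j
            const_j = fs[j](lo_j)
            for r in real_roots(ci[0], ci[1], ci[2] - const_j):
                if lo_i <= r <= hi_i and r <= lo_j:
                    points.add(r)
    pts = sorted(points)
    mids = [(u + v) / 2.0 for u, v in zip(pts, pts[1:])]
    # Index of the function attaining the minimum at each middle point.
    best = [min(range(len(fs)), key=lambda i: fs[i](m)) for m in mids]
    change, good = [pts[0]], [best[0]]
    for k in range(1, len(best)):
        if best[k] != best[k - 1]:  # the minimising index changes at pts[k]
            change.append(pts[k])
            good.append(best[k])
    change.append(pts[-1])
    return change, good
```

The function returns the ordered change points and, for each interval between consecutive change points, the index of the piece attaining the minimum there.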

5. Applying the KTEF Procedure to the Continuous Case

Although there are more efficient methods for computing the efficient frontier of a linearly constrained (continuous) portfolio selection problem, it should be mentioned that in this case KTEF provides an interesting alternative to the usual ε-constraint method (i.e., the two-stage procedure described in the introduction) which also provides the exact efficient frontier.

The idea is that instead of first calculating a sample for an arbitrary sequence of expected returns, the KTEF algorithm can guide the selection of the sample so that the number of calls to the solver that calculates the sample points is reduced to the minimum necessary to get the exact frontier.

Let us describe this procedure, called KTEF-C, which applies KTEF to the continuous case (see Algorithm 3 for the pseudocode). Besides calling the KTEF procedure, it also uses a subroutine H whose inputs are the data , , , of the model together with a level of return and whose output is the efficient portfolio for that . As we have mentioned, the procedures implemented in the usual commercial optimization packages such as GAMS or LINGO are widely used for sampling efficient frontiers and they are capable of dealing with any reasonable problem.

Inputs , , ,
 uses H, R, KTEF
  Set , an empty sequence,
  
  Set ( , , , , ), ( , , , )
(1) Set , .
While and do
   ,
If then
  If then else
  
  else stop
(2) Set ( , , , , )
  Calculate and from according to (3.6)
  Call KTEF( , , , , , )
    on error (KTEF has stopped in a degenerate case)
   set go to (2)
  Set
  Set
  Set ( , are part of the
  output of KTEF). The new interval should be inserted
  in the right place to preserve the increasing order of the
  sequence
goto (1)
Output

At the beginning of the KTEF-C procedure, another subroutine R is called once in order to calculate as the maximum return that can be attained on the feasible set of (3.1) without the first constraint.

The output of the KTEF-C algorithm is a set containing a sequence of outputs of KTEF, that is, of the form where the intervals are almost disjoint (they have at most their endpoints in common) and cover the whole interval of the efficient frontier, the corresponding vectors parametrize the efficient portfolios, and the parabolas parametrize the efficient frontier. The rest of the data parametrize the Kuhn-Tucker multipliers.

Notice that the loop starting in the line labeled (2) must end after a finite number of iterations, since there is a finite number of possibilities for and and each degenerate case corresponds to at most one value of . Hence, there is just a finite number of possible values for giving rise to a degenerate case. In practice, the probability of choosing an corresponding to a degenerate case is very small, so that the error case will never hold.

Each time the main loop (starting in (1)) is executed, a new nondegenerate interval is found. Since the number of such nondegenerate intervals is finite (because the number of nondegenerate possibilities for the sets and is also finite), the KTEF-C algorithm always stops, and the number of iterations is exactly the number of nondegenerate intervals composing the efficient frontier, that is, the minimum number of iterations needed to compute the whole efficient frontier.

Finally, we note that the nondegenerate intervals cover the whole interval , since if denotes the union of such intervals, then is a closed subset of whose complement is finite, and hence closed. The connectedness of the interval implies that , that is, that all the Kuhn-Tucker points appear in the nondegenerate cases. That is why the degenerate cases can be disregarded.
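The idea of letting KTEF guide the sampling can be sketched in Python as a simple left-to-right sweep. This is a simplification, not the bookkeeping of Algorithm 3: `solve_h` and `ktef` are hypothetical callables standing for the subroutine H and for KTEF, `ktef` is assumed to return a triple (lower end, upper end, data) and to raise `ValueError` in a degenerate case, and the small `step` used to move past an interval is an assumption that could skip intervals shorter than it.

```python
def ktef_c_sweep(r_min, r_max, solve_h, ktef, step=1e-7):
    """Cover [r_min, r_max] with exact pieces, calling the solver
    once per piece found."""
    pieces = []
    r = r_min
    while r <= r_max:
        x = solve_h(r)                 # one call to the external solver H
        try:
            lo, hi, data = ktef(x, r)  # exact piece valid on [lo, hi]
        except ValueError:             # degenerate case: perturb r and retry
            r += step
            continue
        pieces.append((lo, hi, data))
        r = hi + step                  # jump just past the piece found
    return pieces
```

With this scheme the number of solver calls equals the number of pieces found, plus one call for each degenerate return level hit along the way.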

6. Testing the Algorithms

In this section, we present some computational results in order to test the efficiency of our proposed algorithms. We have used a database of historical data of 1000 assets taken from the Russell 2000 stock market index [19]. The percentage of zero entries of all the covariance matrices considered in our computational tests ranges between and , so they are far from being sparse. The EC1 stage of the ε-constraint method has been handled with GAMS and our algorithms for the EC2 stage have been implemented in Mathematica.

For the continuous case, it is well known (see Section 1) that there are much more efficient procedures than the ε-constraint method. For instance, in Table 1 we reproduce a table from [8] comparing the ε-constraint method with the MPQ method proposed by Steuer, Qi and Hirschberger, which calculates the exact efficient frontier. We see that the CPU-times of the MPQ method are substantially better and also that the ε-constraint method becomes impractical for large instances of the problem. We also refer to Table 12.1 in [12], where two variants of the ε-constraint method (one using the so-called "Wolfe-simplex algorithm" and a second one using Matlab) are compared with MPQ, Markowitz's critical line algorithm and the improved version of the latter proposed by those authors. The largest case considered for the ε-constraint method corresponds to a -asset instance and the reported Matlab CPU-time is seconds per point.

On the other hand, for the semicontinuous case no alternative is known, and the CPU-times of solving mixed integer programs are much greater. Table 2 contains the mean CPU-time per point we have obtained for some instances with a different number of assets in the continuous and semicontinuous case. We have obtained better times than those of [8] for the continuous case (but presumably the MPQ results would be similarly improved by a faster computer and the CPU-times for MPQ in Table 1 are better than ours in any case). In the semicontinuous case, the EC1 stage of the ε-constraint method becomes impractical for -asset instances of the problem, and barely useful for -asset instances (for which a, say, -point sample requires about three and a half hours of computations).

However, these considerations concern the EC1 stage of the ε-constraint method whereas our algorithms deal with the EC2 stage. Hence, once it is assumed that the ε-constraint method is to be used (because of its simplicity in the continuous case or out of necessity in the semicontinuous one), the only possible comparison would be with the usual interpolation methods. These methods vary from the simple piecewise linear interpolation (i.e., joining a given sequence of dots with straight lines) to methods that are a bit more sophisticated, guaranteeing that the resulting curve will be differentiable (like spline interpolation [15]). These methods are implemented in almost every commercial package, their computational time is negligible, and it is even disregarded when computing CPU-times of the ε-constraint method. Thus, it is obvious that our algorithms for interpolating a given sample of the efficient frontier by means of the Kuhn-Tucker conditions will necessarily take more time than the usual ones, which simply fit low-degree polynomials. Hence, we can only test our algorithms in the sense of checking that, for those instances of the problem for which the ε-constraint method is viable, the CPU-time added by our interpolation method is acceptable in view of the advantages it provides.

Accordingly, Table 3 contains the CPU-time that the KTEF-S algorithm needs to process one case (i.e., a given choice of sets , , and ) as a function of the number of assets. We need to deal with "time per case" because several points of a given sample can correspond to the same case, and hence even starting with samples of equal length, the number of processed cases may differ.

Our computations show that the CPU-time of KTEF-S depends polynomially (quadratically, in fact) on the number of cases arising from the input sample. For instance, Figure 2 shows that this function fits almost exactly its quadratic least-squares approximation in a -asset example. We have observed the same almost exact fit in all cases we have checked. All the obtained parabolas have a very small second derivative. Table 4 shows some equations of the interpolating parabolas we have obtained.

Let us also remark that the CPU-time corresponding to the calls to KTEF is just a minor percentage of the total CPU-time. For instance, from the seconds used to process the different cases generated from a -point sample in the instance used to generate Figure 2, only correspond to the calls to KTEF. The rest corresponds to the computation of the change points.

7. Analysis of the Efficient Frontier

In this section, we present some examples illustrating the possibilities of analyzing the efficient frontier provided by our algorithms. The main idea is that when the efficient frontier is calculated by means of any of the usual interpolation methods, that is, mathematical techniques for obtaining in the simplest way a continuous or even smooth curve from a finite set of points, the only economic information contained in the result is a finite set of efficient portfolios, since the interpolating arcs have no economic meaning. This suffices to plot the frontier with enough accuracy so that an investor can choose a level of return taking into account the corresponding risk level. On the other hand, the interpolations made by means of our algorithms have a precise economic meaning since, starting from a sufficiently large sample of the frontier, they provide the exact frontier, specifically, a piecewise parametrization of the infinite set of efficient portfolios and in particular the change points, that is, the return values where the composition of the efficient portfolio changes.

This leads to the question of how many points are necessary in order to obtain the exact efficient frontier. In the continuous case, the KTEF-C algorithm determines the exact number of points that are needed, whereas in the semicontinuous case we cannot say anything a priori. Table 5 contains the number of arcs of parabola and the number of portfolio compositions found from different samples for different instances of (SCP). We have checked different instances (they are taken from the database used in the previous section, except for the -asset case, which is considered in Example 7.1) of several sizes but, since there is no obvious way of aggregating the results, we have opted for showing a few representative cases. Notice that Table 5 shows that the complexity of the frontier is not proportional to the number of assets.

In all the cases we have considered, the difference between the number of compositions found from a -point sample and that found from a -point sample does not exceed two additional cases. This means that in the frontier calculated in the first case, there are a few (very small) intervals where a slightly better composition exists. It is clear that finding these small corrections does not compensate for the additional computational effort required by the EC1 stage of the -constraint method (we note also that the number of intervals found does not seem to stabilize, but this concerns the set of constraints that are active for each return level, which is of minor interest for an investor). Hence, our computational results indicate that a -point sample is a reasonable size, at least for -asset instances. However, we must remark that, in practice, it is more convenient to draw a rough version of the efficient frontier so that the investor can choose the particular zone where he or she would invest, according to the tradeoff between risk and return he or she considers acceptable, and then calculate a, say, - or -point sample of that particular zone, which easily provides a more accurate description of it than what we could obtain for the whole frontier from a -point sample. In this way, larger instances of SCP can be handled in a reasonable time. In any case, postprocessing the sample by using the KTEF-S procedure instead of a typical interpolation offers many advantages with only a small additional CPU-time. The most immediate one is obtained at the very first step of the algorithm, where the sample is filtered to retain just one point for each Kuhn-Tucker complementary slackness case.
For instance, as Table 5 shows, a -point sample is immediately reduced to a subsample with fewer than points without any loss of information at all, since the Kuhn-Tucker conditions applied to the reduced sample provide a representation of the efficient frontier which exactly interpolates all the removed points. Hence, our algorithm makes the previously discussed convenience of working with medium-sized or large samples viable. Moreover, our algorithm not only greatly simplifies the output of the standard -constraint method, but also structures and analyzes it, providing a functional, structured efficient frontier, with exact values for the equations of the arcs of parabola, change points, Kuhn-Tucker multipliers, and so forth. Let us illustrate these facts by means of two specific examples.
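The filtering step described above, which keeps one sample point per complementary slackness case, might be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code: the function names `kt_case` and `filter_sample`, the tolerance `tol`, and the three-asset sample are all hypothetical, and only simple bounds are handled (a semicontinuous model would need an extra label for assets excluded from the portfolio).

```python
def kt_case(weights, lower, upper, tol=1e-8):
    """Label each asset 'L' (at lower bound), 'U' (at upper bound) or 'F' (free)."""
    labels = []
    for w, lo, up in zip(weights, lower, upper):
        if abs(w - lo) <= tol:
            labels.append("L")
        elif abs(w - up) <= tol:
            labels.append("U")
        else:
            labels.append("F")
    return tuple(labels)


def filter_sample(sample, lower, upper, tol=1e-8):
    """Keep the first (return, weights) pair found for each distinct case."""
    representatives = {}
    for r, weights in sample:
        case = kt_case(weights, lower, upper, tol)
        representatives.setdefault(case, (r, weights))
    return representatives


# Hypothetical sample of efficient portfolios: (return level, weights).
lower, upper = [0.0, 0.0, 0.0], [0.6, 0.6, 0.6]
sample = [
    (0.10, [0.6, 0.4, 0.0]),
    (0.11, [0.6, 0.3, 0.1]),
    (0.12, [0.5, 0.3, 0.2]),
    (0.13, [0.4, 0.3, 0.3]),
    (0.14, [0.6, 0.35, 0.05]),
]
reps = filter_sample(sample, lower, upper)
print(len(sample), "sample points reduced to", len(reps), "cases")
```

Each retained representative identifies a set of active bounds, which is the only information the Kuhn-Tucker postprocessing needs from the sample.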

Example 7.1. We have computed the efficient frontier of a portfolio selection problem with assets. We have used monthly data over the period January 2001–December 2008 from the Spanish Stock Exchange Interconnection System (SIBE) [20], which integrates the four existing security exchanges in Barcelona, Bilbao, Madrid, and Valencia (for the experiment we have used assets that were quoted every month from January 2001 to December 2008. Specifically, the assets are the following: ABE, ABG, ACS, ACX, ADZ, AGS, ALB, AMP, ANA, AND, ASA, AZK, BAY, BBVA, BDL, BES, BKT, BMA, BTO, BVA, CAF, CEP, CPF, CPL, CUN, DGI, DIN, EAD, ECR, ELE, ENC, EVA, FAE, FCC, FER, FUN, GAM, GAS, GCO, GUI, IBE, IBG, IDO, IDR, ITI, JAZ, LGT, MAP, MCM, MDF, MLX, MVC, NAT, NEA, NHH, OHL, PAC, PAS, PAT, POP, PRS, PSG, PVA, RDM, REE, REP, RIO, SAN, SED, SNC, SOL, SOS, SPS, STG, SYV, TEC, TEF, TST, TUB, TUD, UBS, UNF, UPL, URA, VID, VIS, ZEL, ZOT).

Continuous Case
We have considered a continuous instance in which each weight is bounded in the interval . Figure 3 shows the resulting exact efficient frontier. It comprises arcs of parabola over the intervals of expected returns and of risk levels.

After applying our algorithm, not only do we have the picture of the efficient frontier but also all the related data about the efficient portfolios and Kuhn-Tucker multipliers. This information can be used to perform a sensitivity analysis of a given solution. For instance, if we set a return level , the optimal portfolio contains the following assets: BBVA, BDL, CAF, CEP, CUN, IBE, POP, REE, RIO, SOS, STG, TEF, TST, UNF, UPL, and ZEL. However, in Table 6, we see that this solution is only valid over a very small interval of returns, namely . For in this interval, the efficient portfolio is given by the expression , where

For a return level , the efficient portfolio differs from the original one in four assets. This could be checked simply by solving the problem for this value of . However, our additional computations allow us to trace the changes in the efficient portfolio as the return decreases. This is shown in Table 6, where we see that assets AND, MCM, and IBG enter the portfolio successively and that finally asset BBVA exits. On the other hand, the number of assets in the efficient portfolio also grows if we increase the return level, but this is just a local behavior, since, as Figure 4 shows, the number of assets globally decreases as the return increases. Our method guarantees that the analysis is exact and we can see that there are many unstable portfolios in the sense that a small change in may produce a change in the composition of the portfolio. This analysis can also be used to study the convenience of introducing cardinality constraints into the model. Moreover, the equations parametrizing the efficient frontier (also given in Table 6) provide a sensitivity analysis of the risk with respect to the return level.

Semicontinuous Case
Next we deal with the same data, but considering semicontinuous variables in the range and cardinality constraints specifying that the total number of assets in a portfolio must vary within the range 5–10. We apply KTEF-S to equally spaced samples of the efficient frontier. The number of intervals found is indicated in Table 5. Figure 5(a) shows the whole efficient frontier calculated from a -point sample. The calculations from a -point sample provide an almost equal picture, but the larger the sample, the more accurate is the structure we obtain. For instance, Figure 5(b) shows an enlargement of the neighborhood of the return value . We see that the convexity and the continuity that the frontier shows on a large scale fail when examined more closely, and these details are missed when considering a smaller sample.

We note that the economic theory about the portfolio selection problem relies partially on the continuity and convexity of the frontier, which is granted in the continuous case, but fails in the semicontinuous one, and hence it is relevant to know to what extent it can fail in the specific zone of the frontier where the investor intends to choose an efficient portfolio. On the other hand, if there are different portfolio compositions with similar levels of risk and return, an investor could prefer one of them for other reasons beyond these two values. Hence, knowing the variation in portfolio structures along the frontier is also relevant to making a sensitivity analysis of the problem.

Example 7.2. We now consider five assets from the historical data introduced by Markowitz [2], namely, American Tobacco, AT&T, United States Steel, General Motors, and Atchison, Topeka & Santa Fe. We have established the bounds .

Continuous Case
For this kind of small problem there is no need to call any optimization package. We can apply KTEF to an enumeration of all possible cases for the sets and . More precisely, from the cases, many of them can be removed a priori since they are degenerate, leaving just cases. After applying the algorithm, only provide a piece of the efficient frontier. Figure 6 shows the efficient frontier of the problem in which the six intervals are highlighted with dots. These are listed in Table 7 together with their corresponding sets and , as well as the equation of the corresponding piece of the efficient frontier.
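An enumeration of this kind can be sketched as follows. The Python code below is only an illustration: the function name `enumerate_cases` and the parameter `min_free` are hypothetical, and the degeneracy filter (discarding cases with fewer than two free weights, on the grounds that the capital and return constraints can then hold for at most one return value) is an assumption that need not coincide with the criterion actually used to obtain the counts quoted in the text.

```python
from itertools import product


def enumerate_cases(n_assets, min_free=2):
    """Yield (lower_set, upper_set, free_set) for every assignment of the assets.

    Each asset is either at its lower bound ('L'), at its upper bound ('U')
    or strictly between both ('F'). Cases with fewer than `min_free` free
    assets are skipped (an assumed, simplified degeneracy test).
    """
    for labels in product("LUF", repeat=n_assets):
        lower_set = tuple(i for i, s in enumerate(labels) if s == "L")
        upper_set = tuple(i for i, s in enumerate(labels) if s == "U")
        free_set = tuple(i for i, s in enumerate(labels) if s == "F")
        if len(free_set) >= min_free:
            yield lower_set, upper_set, free_set


all_cases = list(enumerate_cases(5, min_free=0))   # no filtering
kept_cases = list(enumerate_cases(5, min_free=2))  # assumed degeneracy filter
print(len(all_cases), "assignments,", len(kept_cases), "kept")
```

Since the number of assignments grows as a power of three in the number of assets, this approach is only practical for very small instances, in line with the remark in the Conclusions.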

From these equations, we can calculate the derivative of the frontier or, alternatively, notice that it is just the Kuhn-Tucker multiplier associated with the return constraint, which is also given by the algorithm. This derivative allows us in turn to study the smoothness of the frontier. Figure 6 shows also the derivative in the present example, and we see that it is discontinuous at all the points where the equation changes, which means that the efficient frontier is not smooth at these points. For instance, the left and right derivatives at the first change point are, respectively

The difference between them is small, so the discontinuity could be difficult to detect without an exact procedure. On the other hand, the jumps can also be large, as at the last change point, where we have

This means that starting from a return level near , the risk of the optimal portfolio is especially sensitive to a small change in .
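Once each piece of the frontier is available as an explicit parabola, the one-sided derivatives at the change points follow directly from the coefficients. The Python sketch below illustrates the computation; the class `Arc`, the function `derivative_jumps`, and the two arcs are hypothetical and are not taken from Table 7.

```python
from dataclasses import dataclass


@dataclass
class Arc:
    """One piece of the frontier: risk(r) = a*r**2 + b*r + c on [r_min, r_max]."""
    r_min: float
    r_max: float
    a: float
    b: float
    c: float

    def risk(self, r):
        return self.a * r * r + self.b * r + self.c

    def slope(self, r):
        return 2.0 * self.a * r + self.b


def derivative_jumps(arcs):
    """Return (change_point, left_derivative, right_derivative) for consecutive arcs."""
    jumps = []
    for left, right in zip(arcs, arcs[1:]):
        r0 = left.r_max  # change point shared by both arcs
        jumps.append((r0, left.slope(r0), right.slope(r0)))
    return jumps


# Two hypothetical arcs that agree in value at the change point r = 0.2.
arcs = [Arc(0.1, 0.2, 2.0, 0.0, 0.0), Arc(0.2, 0.3, 5.0, -1.0, 0.08)]
for r0, d_left, d_right in derivative_jumps(arcs):
    print(f"r = {r0}: left derivative {d_left:.4f}, right derivative {d_right:.4f}")
```

A nonzero difference between the two values signals a kink in the frontier at that return level, which a generic smooth interpolation of the sample would hide.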

The algorithm allows us to calculate the Kuhn-Tucker multipliers. For instance, Table 8 gives the optimal solution for a desired return . It contains the optimal values for the variables as well as the minimum risk . The last row contains the (nontrivial) multipliers, for example is the multiplier of the return constraint, that is, the ratio between the increase in the minimum risk and the increase in the specified desired return. Notice that the multiplier of the capital constraint is of no interest since the constant on the right-hand side cannot be modified.

Semicontinuous Case
We have considered semicontinuous variables with bounds , and the cardinality constraint (4.2) with , . Applying KTEF-S to an equally spaced -point sample, we get the frontier shown in Figure 7(a). We know that this is in fact the true frontier since we obtain the same result if we apply the exact algorithm that results from replacing the first loop of KTEF-S with an enumeration of all the possibilities for , , , , . Figure 7(b) shows the true frontier together with a standard interpolation of the sample, and we see that there are some remarkable differences. The frontier consists of arcs of parabola with change points, such that the interval between two of them corresponds to an arc or to a vertical line.

8. Conclusions

Solving the Kuhn-Tucker conditions is a theoretical way of tackling the portfolio selection problem which can be used only in very restricted cases in practice, since it gives rise to unacceptable exponential CPU-times. Our results show, however, that the Kuhn-Tucker conditions can be efficiently used as an interpolation procedure for the final stage of the -constraint method, since the CPU-times remain quadratic and, moreover, they turn out to be very small when compared with the CPU-time needed for the first stage.

The interpolation algorithms proposed here are conceptually very simple, and they can be implemented with short code in any general-purpose application such as Mathematica, Matlab, and so forth (the most complicated operation to perform is the computation of an inverse matrix). This feature, together with the popularity of the -constraint method for graphing efficient frontiers, makes our method competitive even with the existing alternatives for the continuous case, since they require more complex implementations that are not easily available to economists who are not specialists in computational tasks and who, with the aid of our proposal, can get much more than a graph with a relatively small additional CPU-time.

Moreover, our interpolation method has been shown to remain effective when applied to problems with semicontinuous variable and cardinality constraints. For this kind of problem, the -constraint method is the only known applicable one, and it requires large samples to provide faithful graphs of the frontier. We have shown that, by means of our nontrivial interpolation method, large samples of about points of the efficient frontier are reduced to a set of fewer than intervals with their corresponding equations, containing even more information than the original sample.

In general, our procedure provides a simple, structured, analytical expression for the efficient frontier, which is easier to handle than the sample it is calculated from.

For small-sized problems, the analytical description of the frontier obtained with our method is exact, whereas for medium-sized instances, we have shown that a -point sample of the whole frontier (or a proportionally reduced sample of a part of it) provides a reasonable approximation of the exact frontier in the sense that larger samples provide very few additional portfolio compositions (valid for very short intervals) that do not compensate for the additional computational effort.

In the continuous case, our method determines the minimal sample that is needed to obtain the exact frontier.

For very small instances (with no more than seven or eight assets), it can be adapted to an exact enumeration algorithm (not to be confused with KTEF-C or KTEF-S) that does not require any sample of the efficient frontier. This can be useful for academic purposes.

We have illustrated by two examples the advantages and possibilities provided by our proposal. In general, the computational results show that the shape of the efficient frontier is different in small and medium-sized instances.

(i) For small-sized problems (notice that many private investors are interested in selecting portfolios from a small-sized set of assets), although they are computationally simple to handle, the shape of the efficient frontier can present many irregularities (discontinuities and sudden changes of slope) which must be taken into account since the risk of an efficient portfolio can be very sensitive to the selected expected return. Hence, the information provided by our method could make an investor move his or her choice to a safer or a more profitable one.

(ii) As the number of assets increases, the efficient frontier becomes more regular at a large scale, and hence its “microscopic” irregularities are not relevant. However, very small changes in the selected expected return can alter the composition of the corresponding optimal portfolio. In this case, our procedure provides the investor with the intervals where each composition is efficient, so that he or she can select according to additional preferences among several options practically indistinguishable with respect to risk and return.

Appendix

Justification of the KTEF Procedure

Let us decompose an arbitrary vector , where is the vector consisting of the components with and consists of those with . In particular, we have , where is the vector of active bounds, that is, is (resp., ) if (resp. ). Similarly, we decompose where contains the rows and columns of corresponding to the indexes , contains those corresponding to indexes and has the rows of those and the columns of those . In these terms, (3.4) and (3.7) become

Similarly, the stationary point conditions for indexes become

Solving this equation gives

Premultiplying by and and using (A.2), we obtain, respectively, where

Solving (A.5) and (A.6) gives where

Of course, this requires not to be . In this way, notice first that since is positive-definite, and . Also

Hence, we will have provided that . But clearly, only if for a certain , and therefore (A.2) give .

Thus, we see that we are considering a degenerate case, since it can only provide a Kuhn-Tucker point for at most one value of (that given by the previous equation). Therefore, we can assume that .

Incorporating (A.8) into (A.4), we obtain , where Therefore, (3.8) is fulfilled taking , . Moreover, decomposing and , where (resp., ) denotes the part of (resp., ) corresponding to the indexes of (resp. ), we can calculate and from the stationary point conditions corresponding to the indexes in and , respectively,

So we have obtained a unique point satisfying the Kuhn-Tucker equalities for the fixed case. However, it must also satisfy the inequalities to be a Kuhn-Tucker point. In order to make these conditions explicit, let us show how the multipliers depend on . From (A.8) and incorporating (3.8) into (A.12), we obtain where

The Kuhn-Tucker inequalities are Equivalently,

These inequalities provide a finite set of lower and upper bounds for , which in turn determine a closed interval . We conclude that the current case provides a Kuhn-Tucker point just for . If , we will again have a degenerate case. Finally, the minimum risk for a given return within the interval is given by where
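Since every Kuhn-Tucker inequality of a fixed case is affine in the return level, the interval of validity is obtained by intersecting half-lines. The following Python sketch illustrates this step only; the function name `feasible_interval`, the encoding of each inequality as a pair `(p, q)` meaning `p + q*r >= 0`, and the numerical example are assumptions made for illustration.

```python
import math


def feasible_interval(constraints, tol=1e-12):
    """Intersect inequalities p + q*r >= 0; return (r_min, r_max) or None if empty."""
    r_min, r_max = -math.inf, math.inf
    for p, q in constraints:
        if q > tol:        # lower bound: r >= -p/q
            r_min = max(r_min, -p / q)
        elif q < -tol:     # upper bound: r <= -p/q
            r_max = min(r_max, -p / q)
        elif p < -tol:     # constant inequality that never holds
            return None
    return (r_min, r_max) if r_min <= r_max else None


# Hypothetical case with two free weights x1 = 2 - 10 r and x2 = 10 r - 1,
# each required to stay within the bounds [0, 0.6].
constraints = [
    (2.0, -10.0),   # x1 >= 0
    (-1.4, 10.0),   # 0.6 - x1 >= 0
    (-1.0, 10.0),   # x2 >= 0
    (1.6, -10.0),   # 0.6 - x2 >= 0
]
print(feasible_interval(constraints))
```

In a full implementation the sign conditions on the multipliers of the active bounds would be appended to the same list, and an interval reduced to a single point would be treated as a degenerate case, as noted in the text.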

The KTEF procedure consists of these calculations arranged as the algorithm described before. Its inputs are the data of the model (3.1) together with the sets of indices and , and it provides among its outputs a parametrization of a piece of the efficient frontier of the problem together with the (possibly empty) interval in which it is valid.
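The core linear-algebra step of these calculations can be sketched in a few lines of Python. This is a simplified illustration and not the KTEF code itself: it assumes that the risk is the quadratic form of the covariance matrix, that the return and capital constraints are imposed as equalities, and a particular sign convention for the two multipliers; the function name `ktef_piece` and its arguments are hypothetical. It solves the Kuhn-Tucker equalities for the free weights as an affine function of the return level and returns the coefficients of the resulting parabola, leaving out the multipliers of the active bounds and the interval of validity.

```python
import numpy as np


def ktef_piece(sigma, mu, fixed, free):
    """Solve the Kuhn-Tucker equalities for one case (illustrative sketch).

    sigma, mu : covariance matrix and vector of mean returns
    fixed     : dict {index: value} of weights held at an active bound
    free      : list of indices of weights strictly between their bounds
    Returns (x0, x1, coeffs) with x(r) = x0 + x1*r and
    risk(r) = coeffs[0]*r**2 + coeffs[1]*r + coeffs[2].
    """
    sigma = np.asarray(sigma, dtype=float)
    mu = np.asarray(mu, dtype=float)
    n, k = len(mu), len(free)
    xb = np.zeros(n)
    for i, value in fixed.items():
        xb[i] = value

    # Linear system in (x_F, multiplier of return, multiplier of capital).
    kkt = np.zeros((k + 2, k + 2))
    kkt[:k, :k] = 2.0 * sigma[np.ix_(free, free)]
    kkt[:k, k] = -mu[free]
    kkt[:k, k + 1] = -1.0
    kkt[k, :k] = mu[free]
    kkt[k + 1, :k] = 1.0

    # The right-hand side is affine in r: rhs(r) = rhs0 + r * rhs1.
    rhs0 = np.zeros(k + 2)
    rhs0[:k] = -2.0 * (sigma @ xb)[free]
    rhs0[k] = -(mu @ xb)
    rhs0[k + 1] = 1.0 - xb.sum()
    rhs1 = np.zeros(k + 2)
    rhs1[k] = 1.0

    sol0 = np.linalg.solve(kkt, rhs0)
    sol1 = np.linalg.solve(kkt, rhs1)

    x0, x1 = xb.copy(), np.zeros(n)
    x0[free] = sol0[:k]
    x1[free] = sol1[:k]
    coeffs = (x1 @ sigma @ x1, 2.0 * (x0 @ sigma @ x1), x0 @ sigma @ x0)
    return x0, x1, coeffs


# Hypothetical two-asset instance with both weights free.
x0, x1, coeffs = ktef_piece(np.diag([0.04, 0.09]), [0.1, 0.2], {}, [0, 1])
print("x(r) =", x0, "+ r *", x1)
print("risk(r) coefficients (r^2, r, 1):", coeffs)
```

A singular system here corresponds to a degenerate case in the sense discussed above, for which `np.linalg.solve` raises `LinAlgError`; a more careful implementation would catch that error and discard the case.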

Acknowledgment

This paper has been partially supported by project TIN2008-06872-C04-02 from the Ministerio de Ciencia e Innovación of Spain.