Abstract
We explore a portfolio constructive model, formulated in terms of the satisfaction of a given set of technical requirements with the minimum number of projects and minimum redundancy. An algorithm derived from robust portfolio modeling is adapted to a vector model, with a suitably modified dominance condition, in order to find the set of nondominated portfolios as solutions of a bicriteria integer linear programming problem. To improve this algorithm, we propose a process that finds an optimal solution of a monocriteria version of the problem, which is then used as a first feasible solution to help find nondominated solutions more rapidly. Next, a sorting process is applied to the input data, or information matrix, which is intended to prune infeasible solutions early in the constructive algorithm. Numerical examples show that the optimization and sorting processes both improve the computational efficiency of the original algorithm. Their limits are also shown on certain complex instances.
1. Introduction
Markowitz provided one of the first comprehensive theoretical frameworks for the portfolio selection problem [1]. In his proposal, each portfolio is evaluated in terms of expected return and risk. The efficient set, or efficient frontier, then corresponds to all portfolios with the largest expected return for a given level of risk. In this framework, the expected return is usually evaluated as the weighted sum of the expected returns of the projects in the portfolio, while the risk is evaluated by the variance of the portfolio. Thus, the investor may select the portfolio on the efficient frontier that best matches her/his needs.
In portfolio selection two vectors are defined [2]. First, the investment proportion vector corresponds to the proportion of money that the investor accepts to invest in each member of a set of securities (projects). The criteria vector, instead, contains the values of the measures evaluating the portfolio. In this sense, an efficient portfolio, in terms of the first vector, is a nondominated portfolio, in the sense of the second one. A multicriteria portfolio selection problem supposes a criteria vector with three or more criteria [3], which is expected to make the computation of nondominated portfolios more difficult. However, we show below that, depending on what is taken into account as an evaluation measure, the generation of portfolios is a hard combinatorial problem even with two criteria.
Since the portfolio selection problem intrinsically incorporates business criteria, budget restrictions, and returns volatility [4], in the literature the problem is formulated as the maximization of the expected return, under uncertainty of returns. When the multicriteria version is considered, several utility functions need to be maximized, subject to constraints defining the feasible portfolios [1, 4, 5]. In this context, nondominated portfolios may be computed using multiobjective algorithms [6], evolutionary methods for multiobjective models [7, 8], or preference programming [9].
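The dominance relation between criteria vectors can be illustrated with a minimal Python sketch. It assumes that every criterion is to be minimized, as with the two criteria used later in this article (number of projects and redundancy). The function names and the sample vectors are hypothetical and serve only as an illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if vector a is at least as good as b on every (minimized)
    criterion and strictly better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(vectors: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the vectors that are not dominated by any other vector."""
    return [v for v in vectors if not any(dominates(w, v) for w in vectors)]


# Hypothetical criteria vectors: (number of projects, redundancy).
example = [(2, 3), (3, 1), (3, 3), (4, 1)]
front = nondominated(example)
print(front)
```

In this toy data, (3, 3) is dominated by (2, 3) and (4, 1) is dominated by (3, 1), so only two vectors remain on the front.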
In this article, we focus on the portfolio generation process when business, budget, or even volatility information is poor. Several stringent situations oblige the analyst to split the project portfolio selection process into at least two phases: technical and business concerns. We place ourselves in the first phase and formulate our problem in terms of the satisfaction of a given set of technical requirements, with the minimum number of projects and minimum redundancy. This is a particular viewpoint from which the selection problem is stated. In fact, as explained below, we are concerned with the generation of interesting portfolios more than with the problem of choosing the best portfolio.
Therefore, we place the problem in a very early phase of the identification of nondominated portfolios. In order to fix ideas, a general multicriteria decision aiding (MCDA) framework is chosen [10], centering our attention on the problem formulation stage, which supposes the definition of the set of potential actions (i.e., alternatives, candidates, or decision subjects) [11]. Nevertheless, as happens in many practical situations, we assume that actions are not well-defined elements at hand. Indeed, Simon was one of the first scientists to propose that actions do not come perfectly defined and represented in decision processes [12]. Thus, when actions are not given in advance, a search process must be activated in order to discover or design them. In some situations, except for abstract or very elementary decision processes, a potential action must be constructed from objects available in a repertory of primitive elements. The project portfolio problem parallels this situation very well. Stipulated as a management activity, portfolio selection is the process of conceiving portfolios from discrete, even interrelated, projects.
We propose an approach to support exploratory search of an analyst, aiding him/her to identify a restricted list of nondominated interesting portfolios from a projects set, satisfying the whole set of requirements. Observing that the most relevant projects in a portfolio do not necessarily correspond to those satisfying the largest number of requirements, but rather to those blending well with other projects, we propose an algorithm that provides a list of all potentially interesting portfolios, based on requirements coverage and minimal collective redundancy. The approach is applicable in situations where detailed business information is poor and nondominated interesting portfolios could be analyzed in further stages of development, when better information is available [13].
This article is organized as follows. In Section 2, the problem is modeled as a bicriteria integer linear program, and a constructive procedure for exploring a set of projects in order to identify nondominated portfolios is defined. A proposition guaranteeing that this procedure converges is proved. Next, an algorithm grounded on the constructive approach is introduced in Section 3. An improvement to this process is proposed, based on the observation that the approach is sensitive to the initial values obtained on one of the criteria of the ILP model. In Section 4 our approach is tested and results are compared to those obtained in a previous work. Section 5 discusses these results. Finally, Section 6 is dedicated to conclusions.
2. Model Formulation
Let us consider a manager involved in a portfolio selection and composition process, looking for different combinations of projects that satisfy some technical requirements. In addition, let us assume that she/he prefers to initially explore a restricted set of projects, only looking for some interesting portfolio alternatives. This limited exploration reveals possibilities that she/he may further consider with new business criteria (costs, benefits, profit, etc.) or market-related features (expansion, collocation, territorial coverage, etc.), among others. Therefore, the problem consists of identifying the set of portfolios satisfying a given number of requirements defined by the analyst, knowing that a systematic exploration might be a very hard, even unreasonable, task. In such a context, is it possible to aid the analyst in the construction of early interesting solutions, using limited information?
Methodologically, this is a decision-making activity unfolding in the following phases [11].
2.1. Problem Formulation
The problem formulation is a statement defining a triplet composed of a set of actions (in our case, portfolios), a set of points of view (e.g., dimensions) considered to evaluate the actions, and a problem statement defining what would be done with the actions (selection, ranking, classification, etc.).
2.2. Evaluation Model
Formally, an evaluation model is a tuple whose elements include a set of criteria, eventually derived from the points of view, allowing the evaluation of the actions in terms of each criterion; a model of the uncertainty regarding the available information; and an aggregation logic defining the way that this information is operated in order to obtain a global conclusion solving the stated problem. The evaluation model produces a process output.
2.3. Recommendation
The output of the evaluation model is translated into the decision maker’s language, verifying that it is technically sound and deployable in the decision maker’s setting and processes.
In the problem formulation phase, the set of actions is frequently supposed to be a known fact or the result of a modeling task, as is usually done in linear programming, for instance, when a set of feasible alternatives is modeled by linear restrictions. Under such an assumption, the elements of this set become the matter of analysis, evaluation, and recommendation. Then, further phases may be applied in a regular way.
In our case, a portfolio is not known in advance and becomes an alternative once it has been conceived or designed as a project composite. We restrict ourselves to the problem of definition of a set of portfolios, considering that a pool of projects is given a priori. Then, a portfolio will be a subset of components covering a set of predefined requirements.
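Finding nondominated portfolios is formulated here as a bicriteria integer linear program. Let us consider the following:
(i) $P = \{p_1, \dots, p_n\}$ is the collection of projects;
(ii) the set of possible portfolios is the collection of subsets of $P$;
(iii) $M = [m_{ij}]$ is the $n \times m$ matrix of projects and requirements, where $m$ is the number of requirements;
(iv) $m_{ij} = 1$ if project $p_i$ covers the requirement $j$, and $m_{ij} = 0$ otherwise;
(v) $x_i = 1$ if project $p_i$ is included in a feasible solution, and $x_i = 0$ if not;
(vi) $y_j$ is the number of times the $j$th requirement is covered.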
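The quantities defined above can be computed directly from a 0/1 information matrix and a 0/1 selection vector. The following Python sketch is only an illustration: the function names are hypothetical, and redundancy is assumed here to be the total excess coverage (each requirement contributes the number of times it is covered beyond the first), which is one possible reading of the definition.

```python
from typing import List


def coverage(matrix: List[List[int]], x: List[int]) -> List[int]:
    """Number of selected projects covering each requirement (one entry per column)."""
    n_requirements = len(matrix[0])
    return [
        sum(matrix[i][j] * x[i] for i in range(len(matrix)))
        for j in range(n_requirements)
    ]


def is_feasible(y: List[int]) -> bool:
    """A portfolio is feasible when every requirement is covered at least once."""
    return all(v >= 1 for v in y)


def redundancy(y: List[int]) -> int:
    """Assumed measure: total coverage in excess of one per requirement."""
    return sum(v - 1 for v in y if v > 1)


# Hypothetical instance: 3 projects (rows) and 4 requirements (columns).
M = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
y_all = coverage(M, [1, 1, 1])   # all three projects selected
y_two = coverage(M, [1, 0, 1])   # first and third projects only
print(y_all, is_feasible(y_all), redundancy(y_all))
print(y_two, is_feasible(y_two), redundancy(y_two))
```

In this toy instance, selecting the first and third projects covers every requirement exactly once, while adding the second project keeps feasibility but raises the redundancy.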
The following program, denoted (1), is a model of the problem, where the whole set of nondominated portfolios simultaneously minimize the number of projects in a composite and the portfolio redundancy, that is, the number of times that requirements are covered by more than one project. Inspecting this model, we observe that, taking into account only the first objective (the number of projects), this program corresponds to an instance of the set covering problem, well known to be NP-hard. In consequence, there are instances where this program cannot be solved by a traditional approach such as the branch and bound method [14]. Instead, we propose an algorithm based on a constructive procedure adapted from a preference programming approach presented in [9], where an efficient algorithm for finding robust nondominated portfolios was introduced. That algorithm was originally presented in the context of imperfect information regarding the evaluation of projects on a number of continuous value functions and their respective weights. Interestingly, the algorithm was also based on the progressive generation of nondominated portfolios, which could be split into two phases: generation of candidates and pruning.
Proposition 1 below guarantees a procedure for finding every nondominated solution in the Pareto front, grounded on the two-phase approach described.
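As a hedged sketch, program (1) may be written in the following form. The notation is an assumption made for this illustration: $n$ projects, $m$ requirements, $m_{ij} \in \{0,1\}$ the coverage of requirement $j$ by project $i$, $x_i$ the selection variables, and $y_j$ the number of times requirement $j$ is covered. Measuring redundancy as the total excess coverage $\sum_j (y_j - 1)$ is likewise an assumption.

```latex
\begin{align}
\min\; & f_1(x) = \sum_{i=1}^{n} x_i \\
\min\; & f_2(y) = \sum_{j=1}^{m} (y_j - 1) \\
\text{s.t.}\; & \sum_{i=1}^{n} m_{ij}\, x_i = y_j, && j = 1, \dots, m \\
& y_j \ge 1, && j = 1, \dots, m \\
& x_i \in \{0, 1\}, && i = 1, \dots, n
\end{align}
```

Under this form, dropping $f_2$ leaves the classical set covering problem: choose the fewest rows of the matrix such that every column is covered at least once.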
Proposition 1. Let , be a set of projects and a portfolio having at most k projects. Equally, let and be defined as the redundancy and coverage levels of , the set of requirements to be satisfied, and the set of nondominated portfolios in given the sets , defined as follows: Then
Proof. Let us proceed by induction. Thus, assume that contains the whole set of nondominated portfolios bringing together 1, 2, or size projects. By construction, contains the whole set of feasible and infeasible project portfolios improving, or at least equaling, the minimum redundancy value computed in the stage. Any portfolio worsening this value is not included in this collection. Therefore, . Let be defined as
Then, it is clear that and , . In consequence, is the set of nondominated portfolios discovered up to the stage . In the th round, will contain the whole set of nondominated portfolios.
In the next section, we present an algorithm based on Proposition 1. Further, an improvement is introduced, noticing that the progressive construction and pruning of project portfolios may be accelerated when a convenient upper bound is set for the minimal redundancy value and a convenient sorting is applied to the information matrix.
3. Algorithm
The main purpose of the algorithm presented in this section is the identification of the whole set of nondominated portfolios. Different algorithms and approaches have been proposed for this task [14]. However, we focus on a solution provided by [9], which we have adapted as a strategy for the program (1). It is based on candidate building and pruning processes that implement a constructive generation of project portfolios. In this algorithm, in order to generate candidates potentially selected as feasible solutions, any portfolio having a redundancy greater than or equal to the minimal redundancy found at the current level of the procedure is pruned. Next, potential feasible solutions are compared to the nondominated candidates found in the preceding iteration. The algorithm is defined in Algorithm 1.
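The build-and-prune idea described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a transcription of Algorithm 1: portfolios are grown one project at a time, redundancy is assumed to be the total excess coverage, and all function and variable names are hypothetical. Pruning is safe under that assumption because adding a project can never decrease redundancy.

```python
from typing import List, Tuple

Portfolio = Tuple[int, ...]


def evaluate(matrix: List[List[int]], portfolio: Portfolio) -> Tuple[bool, int]:
    """Return (covers every requirement, redundancy) for a tuple of project indices."""
    y = [sum(matrix[i][j] for i in portfolio) for j in range(len(matrix[0]))]
    return all(v >= 1 for v in y), sum(v - 1 for v in y if v > 1)


def nondominated_portfolios(matrix: List[List[int]]) -> List[Tuple[Portfolio, int]]:
    """Level-wise constructive search over portfolio sizes 1, 2, ..., n."""
    n = len(matrix)
    best = float("inf")                   # minimal redundancy of feasible portfolios so far
    front: List[Tuple[Portfolio, int]] = []
    open_candidates: List[Portfolio] = [()]   # infeasible portfolios worth extending
    for _size in range(1, n + 1):
        feasible_level, next_open = [], []
        for base in open_candidates:
            start = base[-1] + 1 if base else 0
            for i in range(start, n):     # extend in index order to avoid duplicates
                cand = base + (i,)
                covers, red = evaluate(matrix, cand)
                if red >= best:
                    continue              # pruned: cannot improve on a smaller portfolio
                if covers:
                    feasible_level.append((cand, red))
                else:
                    next_open.append(cand)
        if feasible_level:
            best = min(red for _, red in feasible_level)
            front.extend((c, r) for c, r in feasible_level if r == best)
            next_open = [c for c in next_open if evaluate(matrix, c)[1] < best]
        open_candidates = next_open
        if not open_candidates:
            break
    return front


# Hypothetical instance: 5 projects (rows), 4 requirements (columns).
M = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
]
result = nondominated_portfolios(M)
print(result)
```

On this toy matrix the search returns two portfolios of size 2 with redundancy 1 and one portfolio of size 3 with redundancy 0. No feasible portfolio is found at size 1, so the first level explores every single project before any bound is available, which is the costly behavior that an initial upper bound is meant to avoid.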

The function candidates (Algorithm 2) is the generation module in this approach. It is interesting because it can be changed in order to alter the behavior of the algorithm [15]. Notice that the initial redundancy value is set to a sufficiently large number, which is modified the first time a feasible solution is found (i.e., a portfolio covering every requirement). Actually, the algorithm progressively generates infeasible candidates until a feasible solution is found, which sets the first values for the minimal redundancy and the portfolio size. However, such behavior implies that, depending on the coverage structure of components over requirements, these initial values could be identified only after a very expensive search process.

Indeed, the procedure could be improved if a convenient initial value for the minimal redundancy were known at the very start of the algorithm. Then, let us consider the following program:
In this case, we consider the monocriteria version of the bicriteria ILP program. Its optimal solution does not necessarily solve the original problem, but it gives a good upper bound for the redundancy and the portfolio size. In what follows, the situations with and without an upper bound are compared. Results for both programs are presented in Section 4.
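The role of the upper bound can be illustrated with a small Python sketch. Two assumptions apply: the monocriteria program is taken to minimize the number of projects only, and the ILP solver is replaced here by a brute-force enumeration, which is only practical for tiny instances. The function name and the sample matrix are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Tuple


def smallest_cover(matrix: List[List[int]]) -> Optional[Tuple[Tuple[int, ...], int]]:
    """Return (portfolio, redundancy) for the first minimum-size portfolio
    covering every requirement, or None if no cover exists.

    Brute force stands in for an ILP solver in this sketch.
    """
    n, m = len(matrix), len(matrix[0])
    for size in range(1, n + 1):
        for combo in combinations(range(n), size):
            y = [sum(matrix[i][j] for i in combo) for j in range(m)]
            if all(v >= 1 for v in y):
                # Redundancy assumed to be total excess coverage.
                return combo, sum(v - 1 for v in y)
    return None


# Hypothetical instance: 5 projects (rows), 4 requirements (columns).
M = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
]
bound = smallest_cover(M)
print(bound)
```

The redundancy of the returned portfolio bounds the redundancy of every nondominated portfolio from above, since any larger portfolio must strictly improve on the best redundancy at minimum size to remain nondominated. If the search prunes on "greater than or equal to", the bound would have to be passed as this value plus one so that portfolios attaining it are not discarded.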
4. Results
In order to determine whether differences exist between the original algorithm and the version with an upper bound, both applied to model (1), a group of instances has been defined. An instance corresponds to a set of projects, a set of requirements, and an information matrix. According to exploratory results [15], the instances are tested against different ratios of zeroes, or densities, in the information matrix: 50%, 75%, and 80%. For each instance and expected ratio of zeroes, thirty random matrices have been filled using a Monte Carlo process, in agreement with the given density value. In consequence, each instance has been simulated thirty times and the average time to solution, measured in milliseconds, has been calculated. A MacBook Pro with a 2.3 GHz Intel Core i5 processor and 8 GB of RAM was used for the experiments.
The average time to find the Pareto front for the algorithm with the upper bound (WUB) and the algorithm without that bound (NUB) is presented in Table 1. We compare these results with those found with an Apriori-based algorithm [13, 14]. In this algorithm, which we call AP, the pruning rules are the same: minimum number of objects and redundancy. Only average times less than or equal to 1 minute are reported, which emphasizes cases where both or one of the algorithms respond in a very short period of time. Cases where no entry is shown for an instance mean that the respective model is not capable of solving the problem in such time.
When a particular simulation of the WUB algorithm is considered, the initial value of redundancy is selected as the optimal value obtained in the respective simulation run in NUB. In this way, we force the WUB model to present its best behavior. As expected, time to solution rises as the number of requirements or projects increases. Results show that the WUB model outperforms the others, except in one case (underlined). Density (the ratio of zeroes) is an important condition for the algorithms. NUB and AP are very sensitive to this feature, which leads us to examine how density acts on algorithm performance. We hypothesize that the distribution of zeroes in the information matrix may be important, because it determines the way that the first feasible solution is found.
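A random information matrix with a prescribed ratio of zeroes can be generated as in the following Python sketch. It assumes that the ratio is met exactly by shuffling a fixed number of zeroes and ones, which is one possible reading of the Monte Carlo filling described above; the function name, its parameters, and the seed are hypothetical.

```python
import random
from typing import List, Optional


def random_information_matrix(
    n_projects: int,
    n_requirements: int,
    zero_ratio: float,
    seed: Optional[int] = None,
) -> List[List[int]]:
    """Build a 0/1 matrix (projects x requirements) with the given ratio of zeroes."""
    rng = random.Random(seed)
    total = n_projects * n_requirements
    n_zeros = round(total * zero_ratio)
    cells = [0] * n_zeros + [1] * (total - n_zeros)
    rng.shuffle(cells)  # random placement of the zeroes
    return [
        cells[row * n_requirements:(row + 1) * n_requirements]
        for row in range(n_projects)
    ]


# Hypothetical instance: 10 projects, 8 requirements, 75% of zeroes.
matrix = random_information_matrix(10, 8, 0.75, seed=1)
n_zero_cells = sum(row.count(0) for row in matrix)
print(n_zero_cells, len(matrix), len(matrix[0]))
```

A matrix drawn this way may contain a requirement that no project covers, in which case the instance has no feasible portfolio; whether such draws are kept or rejected is not specified here and would need to be decided by the experimenter.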
In order to determine how the distribution of zeroes impacts the algorithms, a sorting process is applied to the information matrices for every case of the experiment as follows.
(i) Requirements (columns) are sorted from left to right, according to the number of times they are covered.
(ii) Projects (rows) are sorted in descending order according to the number of requirements they cover.
Therefore, two processes are added to the analysis: the WUB and NUB algorithms with an early sorting process applied to the current information matrix, named WUB/Ord and NUB/Ord, respectively. In Table 2, results for WUB and the new processes are compared.
It is observed that WUB/Ord performs better than the other algorithms. This suggests that sorting could have good effects on results. In Table 3, WUB/Ord is compared to two other versions of the Apriori algorithm, the first with a sorting process and the upper limit for redundancy (WAP/Ord) and the second with the sorting process but without the upper limit (NAP/Ord).
These results show that WUB/Ord is not the best method in all cases. For example, it does not solve some instances solved by the Apriori-based processes. Conversely, WUB/Ord may outperform these processes in different situations. In general, we conclude that the distribution of zeroes in the information matrix is a critical issue. Additionally, the sorting process is a good strategy, but it does not scale when facing sparse structures (e.g., ).
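The two sorting steps can be sketched in Python as follows. The column direction is not stated above, so ascending order (least covered requirement first) is assumed here; ties keep their original order. The function name and the sample matrix are hypothetical.

```python
from typing import List, Tuple


def sort_information_matrix(
    matrix: List[List[int]],
) -> Tuple[List[List[int]], List[int], List[int]]:
    """Reorder columns by coverage count and rows by number of covered requirements.

    Returns the sorted matrix plus the row and column orders, so that
    portfolios found on the sorted matrix can be mapped back.
    """
    n_requirements = len(matrix[0])
    column_cover = [sum(row[j] for row in matrix) for j in range(n_requirements)]
    # Assumption: least covered requirements are placed first (leftmost).
    column_order = sorted(range(n_requirements), key=lambda j: column_cover[j])
    # Projects covering the most requirements come first (descending order).
    row_order = sorted(range(len(matrix)), key=lambda i: sum(matrix[i]), reverse=True)
    sorted_matrix = [[matrix[i][j] for j in column_order] for i in row_order]
    return sorted_matrix, row_order, column_order


# Hypothetical instance: 3 projects (rows), 3 requirements (columns).
M = [
    [1, 1, 0],
    [0, 1, 0],
    [1, 1, 1],
]
sorted_M, rows, cols = sort_information_matrix(M)
print(sorted_M, rows, cols)
```

Returning the permutations alongside the matrix is a design choice of this sketch: it lets project indices in the resulting portfolios be translated back to the original numbering.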
5. Discussion
Results show that the performance of the algorithms depends on the number of projects, the number of requirements, and the distribution of zeroes in the information matrix of a given instance. A simple strategy consists of working with small instances. This approach may be found in other studies. Actually, the portfolio composition problem has been modeled using multicriteria frameworks. The rationale of these methods consists of identifying interesting projects and then composing portfolios with them.
In [16] a two-stage combination of discrete and continuous multicriteria decision aid methods is proposed for mutual funds selection and composition. In the first stage, a multicriteria decision method is used to select the most promising mutual funds. In the second stage, a goal programming approach is applied in order to search for the best proportions of mutual funds in the final portfolio. Criteria used for selection and composition are grouped into three categories: criteria regarding the expected outcome of investment in a mutual fund; criteria measuring the risk of obtaining an outcome; and criteria about the efficiency of mutual funds. In [5] a two-level process is applied for the selection and composition of portfolios from a set of projects. In the first level, the ELECTRE TRI decision aid method is used to sort projects according to given categories, for instance, good, average, and bad projects. In the second level, a portfolio in each category is identified as the list of projects satisfying specific constraints.
In [4], the portfolio selection and composition problem is also solved in a two-stage process but in a different way. First, a multicriteria decision analysis is applied to evaluate projects. Next, a knapsack optimization problem, where portfolio benefit is maximized subject to a budget constraint, is solved to find an efficient portfolio. The optimization problem is included in an algorithm searching for the set of efficient portfolios.
Several aspects of the techniques mentioned above may be relevant for our purpose. For instance, following [16] or [5], a process may be applied to filter out uninteresting projects and reduce the size of the instance to be analyzed. In addition, according to [4], the constructive algorithm could be implemented as a succession of optimization problems that help find the Pareto frontier. However, these procedures do not necessarily detect the whole set of efficient solutions. As an illustration, let us consider Table 4, where the information matrix of an instance composed of eight projects and eight requirements is presented. This problem has three solutions, each with a redundancy value of 3: , , and . In this case, any of the exhaustive algorithms we have proposed above identifies these portfolios, while an optimization problem finds just one of them.
Our main purpose is to identify interesting solutions before spending resources and time on information concerning projects and portfolios. Thus, the problem we have established here consists of obtaining a reduced number of portfolios and projects to which limited resources may be devoted for more exhaustive analysis. Indeed, we could classify interesting projects as follows [9]:
(i) core: projects present in every nondominated portfolio,
(ii) borderline: projects present in some of the nondominated portfolios,
(iii) exterior: projects excluded from every nondominated portfolio.
For instance, in the previous example, no project in the nondominated portfolios is core; all of them are borderline. In general, we assume that resources and time are best spent on obtaining better information on core and borderline projects. In other words, the portfolio problem posed by Markowitz could be applied to these interesting projects.
Efficiency is a critical issue for algorithms. As observed in Table 3, the WUB/Ord method works very well when the information matrix is not sparse, or equivalently, when it is filled at least at 50%. In [9] 50 projects were analyzed using the original algorithm we adapted here. In that case, projects were evaluated with regard to four criteria, subject to a budget constraint. The authors showed that the resolution of this problem was time consuming, but that the resolution time improved if an initial interesting nondominated portfolio was introduced early in the algorithm. This idea is used in the WUB/Ord method, by finding the upper limit for the redundancy value, which confirms that such a strategy works well even for the discrete model proposed in this paper.
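The core, borderline, and exterior classification can be computed from a list of nondominated portfolios with a few set operations. The Python sketch below is an illustration only; the function name and the sample portfolios are hypothetical and do not correspond to the instance of Table 4.

```python
from typing import Dict, List, Set, Tuple


def classify_projects(
    n_projects: int, portfolios: List[Tuple[int, ...]]
) -> Dict[str, Set[int]]:
    """Split project indices into core, borderline, and exterior sets."""
    all_projects = set(range(n_projects))
    if not portfolios:
        return {"core": set(), "borderline": set(), "exterior": all_projects}
    members = [set(p) for p in portfolios]
    in_every = set.intersection(*members)   # present in every portfolio
    in_some = set.union(*members)           # present in at least one portfolio
    return {
        "core": in_every,
        "borderline": in_some - in_every,
        "exterior": all_projects - in_some,
    }


# Hypothetical nondominated portfolios over six projects (indices 0 to 5).
front = [(0, 3), (0, 4), (0, 1, 2)]
classes = classify_projects(6, front)
print(classes)
```

In this toy data, project 0 appears in every portfolio and is therefore core, projects 1 to 4 are borderline, and project 5 is exterior.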
6. Conclusions
A portfolio constructive model has been proposed, formulated in terms of the satisfaction of a given set of technical requirements, with the minimum number of projects and minimum redundancy. The approach supports the exploratory search of an analyst, aiding him/her to identify a restricted list of nondominated interesting portfolios from a project set, satisfying the whole set of requirements. It is argued that the approach is applicable in situations where detailed business information regarding projects is poor or difficult to obtain, given the resources available. Nondominated interesting portfolios could be analyzed in further stages of development, when better information is obtainable.
The resolution of an integer linear program allows finding a first optimal solution, used as a feasible result that helps construct new nondominated solutions. Numerical examples show that this process improves the computational efficiency of the original algorithm. We have found that the ratio of zeroes in the information matrix is a critical issue because it determines the time that passes before the first feasible solution is found by the algorithm. Therefore, a sorting process is proposed, which improves the time to find solutions.
Further research needs to be done in order to test pruning with different distributions of zeroes in the information matrix. Additional research also includes analytical and simulation studies concerning the effect of metaheuristics (or hybrid methods) [17] and hyperheuristics [18] on quality and time to solution.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
Broderick Crawford is supported by Grant CONICYT/FONDECYT/REGULAR/1140897. Ricardo Soto is supported by Grant CONICYT/FONDECYT/INICIACION/11130459. Javier Pereira and Fernando Paredes are supported by Grant CONICYT/FONDECYT/REGULAR/1130455.