#### Abstract

Natural gas marketing has evolved considerably since the early 1990s, when a set of liberalizing rules was passed in both the United States and the European Union, eliminating state-driven regulations in favor of open energy markets. These new rules changed many aspects of the energy business, and new research opportunities arose as a result. Econometric studies of natural gas emerged as an important area of study, since natural gas may now be sold and traded in a number of markets, each one responding to potentially different behavioral drivers. In this work, we present a method to differentiate sets of time series based on a regression model relating price, consumption, supply, and other factors. Our objective is to classify different areas, regions, or states into groups or classes that share similar regression parameters. Once obtained, these groups may be used to make assumptions about the corresponding natural gas prices in further studies.

#### 1. Introduction

In the early 1990s, several regulations were passed in the US and the European Union [1–3] changing the way natural gas was marketed and traded. Particularly, this liberalization [4] effectively ended a period in which natural gas was a state-driven industry. The liberalization has also created the emergent natural gas markets, as well as a strong demand for models to better tackle the new problems and profit from this new setting [5, 6].

Owing not only to this liberalization, but also to the new local conditions that are more open to competition, new small players entered the natural gas industry, especially at the local scale. Indeed, the US has over 80 interstate, long-distance pipelines [7], serving different regions with various climatic, demographic, economic, and political circumstances. Natural gas usage in Alabama, for example, intuitively is not the same as in Oregon; thus the market dynamics of the fuel are also different, and this, we presume, should be reflected in some way in the econometric data of the states.

Macroeconomic trends, however, are not the only aspect affected by this setting. In cross-region studies of various aspects of the supply chain, such as the forecasting of demand [8, 9], the balancing of the pipelines after imbalances have been created by the natural gas shippers [10–12], or the dynamics of interstate-intrastate systems [13], one has to take into account the existence of different markets. The existence of a common relationship between price and consumption of natural gas across several zones allows for strong claims of uniformity, which are useful when, for example, we are building scenarios for a stochastic problem. Indeed, if we manage to group the regions into clusters with similar price and consumption functions, we can reduce the number of variables needed in a scenario tree formulation [6, 14].

As such, we specify a regression function that relates many of the most relevant econometric figures for each of the 48 contiguous states of the American Union, modeling price as a function of explicative variables such as natural gas consumption, supply, and storage levels, as well as population (number of customers), oil prices, temperatures, and production. The regression coefficients are then used to divide the set of states into several subsets or groups, obtaining a partition in which all the states in a group share the same regression parameters and can thus be classified as an (implicit) market. The partition is made considering both statistical and nonstatistical characteristics of the obtained regression coefficients. The resulting partitions are then compared with others in terms of their similitude and statistical significance, which validates the goodness of the combination of the dendrogram and GRASP grouping methods.

This paper is organized as follows. The motivation and literature review on natural gas econometric regression is given in Section 2. Section 3 describes the way the regression function is derived, while Section 4 details the method for using the said function to perform the classification. Section 5 presents and discusses the results of the study, and conclusions are given in Section 6.

#### 2. Natural Gas Price-Consumption Model

This work was motivated by our previous research in the natural gas supply chain, specifically developing an optimization model that addresses issues in interstate pipelines. The data used in this model, however, came from different regions, and therefore the time series involved did not necessarily behave in the same way.

As an example, suppose we are trying to model a certain problem that involves forecasting the residential consumption and price of natural gas in the states of Washington and Oregon, that is, four time series. If the robustness of the model is also a concern, then we should additionally consider different forecasting scenarios. Even with only two possible forecasting scenarios for each series (high/low consumption or prices), this translates into $2^4 = 16$ possible behaviors of the econometric parameters. If consumption is expressed as a function of price, however, then the scenario tree has only $2^2 = 4$ branches. Furthermore, if the regression function for both states is the same, then the number of scenarios can be reduced to just two. As the number of states being modeled scales up, that is, when there are more than two parameters of interest, common assumptions like those mentioned above greatly reduce the number of scenarios in a stochastic model, whether for optimization or otherwise.
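
The scenario arithmetic in this example can be sketched in a few lines. This is only an illustration under the assumption that every free series has the same number of forecasting levels and that the levels combine independently; the function name `scenario_count` is hypothetical.

```python
def scenario_count(num_free_series: int, levels_per_series: int = 2) -> int:
    """Joint scenarios when each free series takes the same number of levels."""
    return levels_per_series ** num_free_series


# Price and consumption in two states: four independent series.
all_free = scenario_count(4)
# Consumption written as a function of price: only the two price series are free.
price_driven = scenario_count(2)
# Same regression function in both states: one driving series remains (assumed reading).
shared_function = scenario_count(1)

print(all_free, price_driven, shared_function)
```

The exponential growth in the number of free series is what makes shared regression functions attractive when many states are modeled at once.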

As we studied particular sets of data, it was noted that historical data of consumption and price showed conspicuous properties that could be used for the sake of our aims. Even though these data collections were taken from different states, all pairs of time series showed elastic consumption/demand [15, 16]; exponentially growing price averages [15, 17]; and both series in every pair seemed to be highly correlated to each other.

Indeed, the possibility of characterizing one set of series as a (regression) function of the other was interesting, as it would reduce the amount of data we needed to consider when modeling optimization problems. This is, of course, a common practice in economic and managerial sciences since, for example, demand data is simpler to work with than price data [18]. This is mainly because demand is usually easier to predict, and its behavior is less chaotic than that of prices. Such historical relationships between price and consumption are a common topic of study in economic time series analysis [19], which is mostly performed with the inclusion of other explicative variables, such as the price of substitutes (electricity, coal) and weather conditions.

This is the case for several models where the calculation of elasticities is the primary goal of the study [20]. Log-linear models [21–24] are generally favored because of the ease they provide when computing elasticity figures. However, linear models also have applications in the natural gas industry, such as the Short-Term Integrated Forecasting System (STIFS) used by the United States Energy Information Administration to estimate natural gas demand as a function of several types of important variables related to the energy industry [25].

##### 2.1. Former and Current Approaches

As explained in our previous work [26], a carefully designed regression function can help achieve such strong assumptions. Nevertheless, the study of such relationships and the possibility of forming state clusters based merely upon time series data analysis turned out to be interesting by itself, and we developed two different approaches to partition the collection of states. As we observed, neighboring states showed a large amount of diversity, yet different methods of grouping seemed to place certain states consistently together.

Two major areas of opportunity discovered were the design of the regression function, and the trade-off that each partition algorithm made use of.

Our previous paper [26] aimed at a very definite objective regarding the qualities of the regression model: it had to correlate the consumption and price series of residential natural gas, using the former as the explicative variable because it is easier to forecast. The expression thus obtained served its purpose well, as demonstrated in its application to the optimization models in [27]; nevertheless, a more inclusive approach would involve series that comprise more information. Following the examples found in the literature and our own experience, we found that including more explicative series provided very good results in terms of regression fit. This has led to the model presented in the next section.

Coming back to the partitioning method, the two approaches presented before were as follows.

(i) The first one is the Dendrogram Grouping Method, which “cuts” a binary tree (whose nodes represent regression parameters) based on how close to each other the parameters are with respect to a given metric function and a weight scheme for the entries. This method proved replicative and fast, but it does not provide statistical significance to the grouped states’ parameters (i.e., one state might find that temperature is a significant regressor, whereas some other state in the same group may not).

(ii) The second one is a greedy heuristic that starts with a number of states called “group leaders” and iteratively selects, for each remaining state, the group that suits the state best, based on its coefficient of determination. Because of the large number of regressions performed, this method was reported to be slower and subject to accidental fluctuations, but the final results always guaranteed that all states in one group shared the same significance in their parameters.

In the following sections, we explain how we have improved over our latest approach, adding explicative power and robustness to the partitioning method and, ultimately, creating a better technique to identify similar regions based on their econometric data.

#### 3. Regression Analysis

##### 3.1. Individual Multiple Linear Regression (IMLR)

The model uses the following quantities:

(i) the total number of states and the number of observations per time series (months, in this case);

(ii) the set of the 48 contiguous states of the American Union and the (discrete) time parameter;

(iii) the differenced residential natural gas price in each state at each time;

(iv) the differenced temperature, in Kelvin, shifted to a fixed minimum figure;

(v) the differenced average spot price of oil in the US at each time;

(vi) the differenced number of residential consumers of natural gas in each state at each time;

(vii) the differenced consumption of natural gas in each state at each time.

Notice that all these series are *differenced*, or, more precisely, differenced at a fixed lag from the original values. This is because the original values all tested positive for unit roots in the augmented Dickey-Fuller test. In contrast to the original series, the differenced series prove to be stationary; hence we make use of the latter.
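
As a minimal sketch of the differencing step, the helper below subtracts from each observation the value observed a fixed number of periods earlier. The function name and the default lag are assumptions for illustration, since the lag actually used is not stated here. A unit-root check such as the augmented Dickey-Fuller test would be run separately, for instance with `statsmodels.tsa.stattools.adfuller`, which is not called in this sketch.

```python
def lag_difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be at least 1 and shorter than the series")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Toy trending series (hypothetical values, not taken from the study).
prices = [10, 12, 15, 19, 24]
print(lag_difference(prices))         # first differences
print(lag_difference(prices, lag=2))  # differences two periods apart
```

Note that the differenced series is shorter than the original by exactly the lag, which matters when aligning several differenced series for a regression.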

This is the linear model we devised to relate the above-described series:
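
As a sketch, and taking consumption as the response in line with the concluding remarks, a linear model over the series listed above could be written as follows. All symbol names are assumptions introduced for illustration: $C_{i,t}$ is the differenced consumption in state $i$ at time $t$, $P_{i,t}$ the differenced residential price, $T_{i,t}$ the differenced temperature, $O_t$ the differenced oil spot price, $N_{i,t}$ the differenced number of residential consumers, and $\varepsilon_{i,t}$ the error term. The exact form and notation of model (1) may differ.

```latex
\begin{equation*}
C_{i,t} = \beta_{0,i} + \beta_{1,i} P_{i,t} + \beta_{2,i} T_{i,t}
        + \beta_{3,i} O_{t} + \beta_{4,i} N_{i,t} + \varepsilon_{i,t}
\end{equation*}
```

In this reading, every state $i$ carries its own coefficient vector $(\beta_{0,i}, \dots, \beta_{4,i})$, which is what the grouping methods later compare.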

We chose a Robust Regression Analysis with Huber weights over the traditional least-squares method to fit the series, owing to the nonnormality of the residuals experienced with the latter. Furthermore, because of the steps described in the next sections, heteroskedasticity would likely appear in the residuals once the pooled regression is carried out.
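
To illustrate why Huber weights help, the sketch below fits a line with a single regressor by iteratively reweighted least squares. It is a simplified stand-in, not the fitting routine used in the study: the tuning constant 1.345, the MAD-based scale estimate, the function names, and the toy data are all assumptions.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ym - b * xm, b


def huber_line_fit(x, y, c=1.345, iterations=50):
    """Iteratively reweighted least squares with Huber weights."""
    a, b = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(iterations):
        r = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        center = median(r)
        scale = median(abs(ri - center) for ri in r) / 0.6745  # MAD-based scale
        if scale < 1e-12:  # the fit is already exact for most points
            break
        # Full weight for small residuals, shrinking weight for large ones.
        w = [1.0 if abs(ri) <= c * scale else c * scale / abs(ri) for ri in r]
        a, b = weighted_line_fit(x, y, w)
    return a, b


# Toy data (hypothetical): y = 2 + 3x with small noise and one gross outlier.
x = list(range(10))
y = [2 + 3 * xi + 0.1 * (-1) ** xi for xi in x]
y[9] = 100.0
ols_a, ols_b = weighted_line_fit(x, y, [1.0] * len(x))
rob_a, rob_b = huber_line_fit(x, y)
print(f"OLS slope {ols_b:.2f}, Huber slope {rob_b:.2f}")
```

The ordinary least-squares slope is pulled toward the outlier, while the reweighted fit stays closer to the slope of the bulk of the data.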

While most of the series were reasonably fit by (1), a couple of them showed very erratic behavior in either their natural gas price or consumption series. This is expected insofar as economic forecasting is commonly subject to large instability at time . As the driving force behind short-term fluctuations in natural gas pricing is consumer demand rather than production supply, price was shown to be a significant factor when describing market consumption.

The selection of the descriptive variables was made considering other consumption models in the literature, the available data, and the significance found in the preliminary regression analysis. In particular, electricity prices, natural gas supply and production, and a time index were tested but found not to be significant in most of the states. This was especially interesting in the case of electricity prices, which certain sources cite as usual descriptors for natural gas demand, but which were found to be significant at the 0.05 level in only 12 of the 48 cases and were thus dropped from the model.

The consumption and price of natural gas are endogenous variables as both are correlated to system shocks, such as unstable governments or weather-related events. As an alternative to the use of least squares regression to fit the model given in (1), a two-stage least squares approach could be employed with such instrumental variables as the number of gas producing wells, reserve estimates, and underground storage, to name only a few. However, this approach is not considered here, because the response (reaction) time of consumers’ consumption habits to the shocks is much longer than that to the spot prices set by the market every day.

##### 3.2. Pooled Multiple Linear Regression (PMLR)

Now we address the issue of how one can use the same regression formula for more than one state, which would create several classes of states where demand responds to changes in the descriptors in a similar mode.

Assume that we have split collections of state time series into several classes, with the members of each class sharing a common set of regression parameters. Then the pooled data from the groups would be regressed at the same time, creating *pooled regressions*.

Consider a partition of the set of states into groups and the following model:

Note that this model, called the Pooled Multiple Linear Regression (PMLR) model, has one set of parameters per group for each regressor variable, except for the intercepts, which we allow to be different for each state. In comparison, model (1) has one set of parameters per state.
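
A pooled model of this kind can be sketched as follows, under assumed notation: $G_1, \dots, G_m$ are the groups of the partition, state $i$ belongs to group $G_k$, and $C$, $P$, $T$, $O$, $N$ stand for the differenced consumption, price, temperature, oil price, and number of consumers. The symbols are hypothetical and the exact form of the PMLR model may differ.

```latex
\begin{equation*}
C_{i,t} = \beta_{0,i} + \beta_{1,k} P_{i,t} + \beta_{2,k} T_{i,t}
        + \beta_{3,k} O_{t} + \beta_{4,k} N_{i,t} + \varepsilon_{i,t},
\qquad i \in G_k,\; k = 1, \dots, m
\end{equation*}
```

Only the intercept keeps the state index $i$; the slope coefficients carry the group index $k$, so all states in a group share them.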

How should one define the partition of the set of states? A good partition is expected to deliver groups of more or less congruent sizes, while maintaining a high individual $R^2$ value for each state. A good partition method should also be replicative (i.e., the same partition is obtained for the same group of states), be fast enough, and preserve statistical significance.

#### 4. Dendrogram-GRASP Grouping Method (DGGM)

In this section, a combination of both grouping methods mentioned in [26] into a GRASP heuristic is proposed. The resulting technique inherits the replicative property of the dendrogram method, while retaining the statistical significance of the heuristic algorithm.

##### 4.1. Dendrograms

Dendrograms are binary trees in which two observation vectors form the (sub-)branches of a higher branch, so that

(i) these two observation vectors are “closer” to each other than to any other observation,

(ii) the higher branch is not an observation *per se*, but a new, artificial vector formed by some linear combination of the two.

The term “closer” is interpreted with respect to some metric (e.g., the Euclidean metric), while the artificial observations are produced by the weighted combination method. Once the dendrogram is formed, it is cut down from the root, generating (sub-)dendrograms from the branches resulting from the cut. The height of the cut is determined according to one of several criteria (the number of subdendrograms produced, the maximum allowed membership for a subdendrogram, etc.). The leaves pertaining to a given subdendrogram pool their regression data together and form one group for the PMLR.

Previous experiments [26] have shown that what is called the “average euclidean” metric [28] delivers satisfactorily high $R^2$ levels with better homogeneity in the resulting groups than other linkage function options.
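
The idea of building a dendrogram and cutting it into a fixed number of subtrees can be sketched with a small agglomerative routine. This sketch uses standard average linkage on Euclidean distances (the mean pairwise distance between clusters), which may differ from the weighted combination of vectors described above; the function name and toy points are hypothetical. Merging until `k` clusters remain is equivalent to cutting the resulting tree so that exactly `k` subtrees are left.

```python
from math import dist


def average_linkage_clusters(points, k):
    """Merge the two closest clusters (mean pairwise distance) until k remain."""
    clusters = [[i] for i in range(len(points))]

    def avg_dist(a, b):
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        # Closest pair of clusters; ties are broken by cluster position.
        _, p, q = min(
            (avg_dist(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Toy coefficient vectors (hypothetical): two tight pairs and one isolated point.
vectors = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
groups = average_linkage_clusters(vectors, 3)
print(groups)
```

For 48 states this brute-force search is fast enough; library routines such as those in `scipy.cluster.hierarchy` would be the usual choice for larger inputs.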

##### 4.2. GRASP Heuristics

GRASP stands for Greedy Randomized Adaptive Search Procedure; it is a metaheuristic, that is, a general method designed to provide good, but not necessarily optimal, results in problems for which finding an optimal solution is otherwise too complicated, especially combinatorial problems [29].

In summary, our GRASP approach starts with a seed formed by several one-state groups; then, for each state, it identifies those groups that deliver the highest goodness-of-fit ($R^2$) values once the data for the current state is pooled with that of the group. These groups form the Restricted List of Candidates, or RLC. A group from the RLC is chosen at random, and the current state is added to it, pooling its data with those already in the group. Once all the states are in place, a number of swaps and movements are performed to try to improve the $R^2$ values of the resulting groups.

It is important to note that setting the values for the GRASP routine is rather subjective, since there is no definite objective to be achieved. Indeed, one cannot determine what number of groups is optimal, or which is the best way to define the greedy function. For example, one could prefer to increase the grouped $R^2$ value in each group rather than the average of the individual $R^2$ values in that group, or vice versa. This trade-off is captured by the greedy function, which is parametrized by the individual/grouped weight. In it, the averaged quantity is understood as the average over all of the observations belonging to its argument if the latter is a state, or as the average over the observations of the states in its argument if the latter is a set of states.
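
One form consistent with the weight semantics used in the experiments (a weight of 0 considers only the single states' $R^2$ values, a weight of 1 only the groups') is a convex combination. The symbols below are assumptions for illustration: $w$ is the individual/grouped weight, $R^2_{G \cup \{s\}}$ is the coefficient of determination of the pooled regression after adding state $s$ to group $G$, and $\overline{R^2}_{G \cup \{s\}}$ is the average of the individual coefficients of the states in the enlarged group. The exact expression used in the study may differ.

```latex
\begin{equation*}
g_w(s, G) = w \, R^2_{G \cup \{s\}} + (1 - w) \, \overline{R^2}_{G \cup \{s\}},
\qquad 0 \le w \le 1
\end{equation*}
```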

For the local search, we use an improvement function to decide whether it is convenient to move a state from one group to another. It is parametrized by the improvement weight.

##### 4.3. Dendrogram-GRASP Algorithm

The following algorithm is used to classify the set of 48 contiguous states of the United States into groups that share a common regression function.

(1) Initialize the values for each of the time series in each of the 48 states. Set a seed size, a maximum number of groups, an RLC size, an individual/grouped weight, an individual/grouped threshold, an improvement weight, a relative improvement threshold, and a maximum number of local search steps.

*Seed Selection*

(2) Perform an IMLR on each of the 48 sets of time series, obtaining the vector of regression parameters and the coefficient of determination of each state.

(3) Form a dendrogram of 48 leaves with these parameter vectors, using the average euclidean mean as the linkage function, and cut it so that the number of subtrees equals the seed size.

(4) Select the state with the highest coefficient of determination from each of the obtained groups and call it that group’s leader. Define the one-state groups obtained as the initial partition. All the nonselected (spare) states form the active set.

*Greedy Process*

(5) For each state in the active set,

(a) pool the data of the state with the data of each of the formed groups and perform a pooled regression; select as many groups as the RLC size allows, taking those with the highest value of the greedy function, and form the RLC;

(b) choose randomly one of the groups from the RLC.

(i) If none of the candidate groups in the RLC delivers a greedy function value that reaches the individual/grouped threshold and we have not yet reached the maximum number of groups, create a new group containing only the current state, remove the state from the active set, and update all the parameters.

(ii) Otherwise, assign the state to the chosen group, remove it from the active set, and update all the parameters.

(6) All of the states are now partitioned into the groups, and we can begin the local search.

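*Local Search*

(7) For each local search step, up to the maximum number of steps,

(a) randomly select one of the formed groups and one state in that group; select another group; compute the improvement function for the current assignment;

(b) remove the state’s data from its group and pool the same data with the other group; compute the improvement function again;

(c) if the relative improvement does not reach the relative improvement threshold, remove the state from the new group and return it to its original group; otherwise, continue.

(8) Report the obtained groups as the desired partition.

(9) End.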
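
The greedy construction phase (steps 4 to 6) can be sketched as below. This is a simplified illustration, not the procedure as implemented in the study: the pooled regressions are replaced by a hypothetical stand-in score, `fit_score`, that measures how close a state's single coefficient is to the mean coefficient of a group, and all names, parameters, and toy values are assumptions. The local search phase is omitted.

```python
import random
from statistics import mean


def fit_score(state, group, coefs):
    """Hypothetical stand-in for the greedy function: closeness of a state's
    coefficient to the group's mean coefficient, mapped into (0, 1]."""
    return 1.0 / (1.0 + abs(coefs[state] - mean(coefs[m] for m in group)))


def grasp_construct(coefs, leaders, rlc_size, threshold, max_groups, rng):
    """Assign every non-leader state to a group drawn from its candidate list."""
    groups = [[leader] for leader in leaders]
    active = [s for s in coefs if s not in leaders]
    for state in active:
        # Rank the groups by score and keep the best ones as the candidate list.
        ranked = sorted(groups, key=lambda g: fit_score(state, g, coefs), reverse=True)
        rlc = ranked[:rlc_size]
        no_good_fit = all(fit_score(state, g, coefs) < threshold for g in rlc)
        if no_good_fit and len(groups) < max_groups:
            groups.append([state])         # open a new single-state group
        else:
            rng.choice(rlc).append(state)  # random pick among the candidates
    return groups


# Toy coefficients (hypothetical): two natural clusters and one isolated state.
coefs = {"A": 0.0, "B": 0.1, "C": 5.0, "D": 5.2, "E": 20.0}
partition = grasp_construct(coefs, ["A", "C"], rlc_size=1, threshold=0.5,
                            max_groups=3, rng=random.Random(0))
print(partition)
```

With a candidate list of length 1 the construction is fully greedy and therefore deterministic; longer lists trade some fit for diversity across repeated runs.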

##### 4.4. Partition Similarity

To determine the similitude of two partitions, we use an expression that, roughly speaking, counts the number of coincidences found in two partitions and divides it by the total number of possible coincidences, given the sizes of the groups in each partition. While there are many debatable ways to measure the similitude between partitions with a different number of elements, this method was chosen because it is normalized. Indeed, it always returns 1 when both partitions are identical and always returns 0 when there are no coincidences between the two partitions, that is, when no two states share a group in both partitions and no state is single-grouped in both partitions.

Consider two arbitrary partitions of the set of states.

A singleton indicator function assumes the value 1 if a group contains a single state in the first partition and this state also forms a group-singleton in the second partition.

For every pair of states, a pair indicator function assesses whether they share a group in a given partition.

When the singleton indicator has the value 1, we say that we have a (one-state) coincidence, which means that the state has been found incompatible with other states twice, no matter which method formed the partitions.

Similarly, if the pair indicator returns 1 for two states in a group from the first partition, we say that we have a (two-state) coincidence; that is, in both partitions, the two states are members of the same group.

To measure the number of coincidences between two partitions, we use a counting function (8), defined for each group and controlled by a binary parameter.

If the parameter equals 1, then the function counts the number of coincidences of either type that pairs of states reveal in the group in comparison with the groups they belong to in the other partition. Conversely, if the parameter equals 0, then we simply count the total number of possible coincidences for the states in the group. Note that the counting function is not necessarily symmetric with respect to the pair of partitions: exchanging the two partitions need not give the same value.

The similitude function (9) is defined as the ratio of the coincidences counted to the total number of possible coincidences.

Notice that if there is at least one group in either partition containing more than one element, then the count of possible coincidences for that group is at least 1, whereas if there exists no such group in either partition, then every group is a singleton and the count of possible coincidences is again 1 for each group. Therefore, the denominator is never 0, which makes this function well defined.

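Lemma 1. *Let two partitions of the set of states be given, and let the similitude function be defined by (9). The following statements are true:*

(1) *the similitude function is symmetric in its two arguments;*

(2) *the similitude equals 1 if and only if the two partitions are identical;*

(3) *the similitude equals 0 if and only if there are neither one-state nor two-state coincidences between the two partitions;*

(4) *the similitude always lies between 0 and 1, inclusive.*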

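*Proof.* (1) This is easy to see from the structure of the function.

(2) Let the two partitions be identical. If a group is a singleton, then it yields a one-state coincidence, so it is counted once in both the numerator and the denominator. Otherwise, if the order of the group is greater than one, then the second term in (8) (the definition of the counting function) assumes the same value whether the parameter equals 1 or 0. Therefore, the numerator and denominator in (9) are equal.

Conversely, if there exists a group in one partition that differs from every group in the other, then its count of coincidences is strictly less than its count of possible coincidences. Since the count of coincidences never exceeds the count of possible coincidences for any group, it follows that the numerator in (9) (defining the similitude) is strictly smaller than the denominator, and therefore the similitude is strictly less than 1.

(3) If there is at least one one-state coincidence, or a two-state coincidence, then the numerator in (9) is larger than 0, and therefore the similitude is positive.

Conversely, since the counting function is nonnegative for every group, a similitude of 0 means that both terms in the numerator are zero, which is only possible if the singleton indicator vanishes for every group and the pair indicator vanishes for every pair of states, which means that there is no coincidence of any type between these two partitions.

(4) The first inequality follows from the fact that the numerator in (9) is nonnegative and the denominator is positive. The second inequality comes from the same argument as in item (2); that is, the numerator is either equal to or strictly less than the denominator.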
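
The properties in Lemma 1 can be checked numerically on a similarity measure built in the same spirit. The function below is one possible reading, not the expression (9) itself: it counts pairs of states grouped together in both partitions plus states that are singletons in both, and divides twice that count by the number of possible coincidences in the two partitions combined. All names are hypothetical.

```python
from itertools import combinations


def _pairs(partition):
    """All unordered pairs of states that share a group."""
    return {frozenset(p) for g in partition for p in combinations(sorted(g), 2)}


def _singletons(partition):
    """All states that form a group on their own."""
    return {next(iter(g)) for g in partition if len(g) == 1}


def similitude(p, q):
    """Coincidences between p and q over the possible coincidences in both."""
    coincidences = (len(_pairs(p) & _pairs(q))
                    + len(_singletons(p) & _singletons(q)))
    possible = (len(_pairs(p)) + len(_singletons(p))
                + len(_pairs(q)) + len(_singletons(q)))
    return 2 * coincidences / possible


p = [{"A", "B", "C"}, {"D"}]
q = [{"A", "B"}, {"C"}, {"D"}]
r = [{"A", "D"}, {"B"}, {"C"}]

print(similitude(p, p), similitude(p, q), similitude(p, r))
```

Every non-empty group contributes at least one possible coincidence (a pair or a singleton), so the denominator is never zero, mirroring the well-definedness argument above.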

#### 5. Experimental Results

This section presents the results of the numerical experimentation performed on a number of time series pertaining to each of the 48 data sets. The values for the historical natural gas prices, consumption, and number of consumers, as well as the oil spot prices, were taken from the US Energy Information Administration, whereas the temperature figures for each state were obtained from the US Department of Commerce National Oceanic and Atmospheric Administration [30].

##### 5.1. IMLR Results

The first step was to perform the IMLR for the 48 sets of time series; this provided the regression parameters for the dendrogram formation. The five time series corresponding to every state had 240 monthly observations each.

Individual regression models showed coefficients of determination $R^2$ with an average of 0.77 and a minimum of 0.61. Normality and heteroskedasticity were not tested because Robust Regression with Huber weights was used. Randomness of the residuals was tested, and high $p$-values were found for many states.

##### 5.2. Dendrogram-GRASP Grouping Results

There are two main aspects we wanted to consider when evaluating the effectiveness of the Dendrogram-GRASP approach: how replicative it is, and how good a partition it produces. The first issue is evaluated by examining how good and how similar the partitions are that come from the same seed (as opposed to those that come from randomly generated seeds). The goodness of a partition is measured with the average group coefficient of determination (respectively, the average state coefficient of determination), calculated across all the groups (respectively, states) of the partition.

There are, however, a number of different design parameters that should be included in the experimentation. Each experimental observation consists of the generation of 10 partitions, using the following parameters.

(i) A seed choice: the dendrogram seed (DDR), a random seed common to all 20 partitions (FIX), and a random seed for each partition (RND).

(ii) The individual versus grouped weight, which determines what is more important when adding a new state to an existing group in the GRASP routine: the values considered in the experimentation are 0 (only the single states’ $R^2$ values are considered), 0.5, and 1 (only the groups’ $R^2$ values are important).

(iii) The new group threshold: the closer its value is to 1, the more likely new single-state groups will be created in the GRASP routine.

(iv) The length of the restricted candidate list.

(v) The number of local search moves.

(vi) The local search individual/grouped weight.

The starting number of groups was fixed at 10, and the maximum number of groups allowed was set at 15. Each combination of levels was replicated 20 times. This resulted in 5760 experimental observations.

In each observation, we calculated the average similitude between the various partitions involved, as well as their similitude with a randomly created partition. The compared similitudes were as follows:

(i) the average similitude of the dendrogram partition to each of the 20 GRASP partitions (DG);

(ii) the average similitude of a random partition and each of the 20 GRASP partitions (GR);

(iii) the average similitude of the 20 GRASP partitions among themselves (GG).

The first part of the analysis consisted in testing all the experimental observations. After that, only the most convenient levels were kept.

Tables 1 and 2 present a summary of the results of the experimental runs. The first three data columns show the average similarities for each of the three comparisons of interest, whereas the last two columns show the average of the individual and grouped coefficients of determination.

A quick look at this table suggests that the similitude figures are characteristically low: the average similarity of an arbitrary partition to a randomly formed one, calculated using all the observations, is 0.0947. This will be called the partitions’ randomness. If columns 3 and (particularly) 5 approach the average randomness for this experiment, the partition method is not very efficient. This especially concerns three of the cases, whose similarity measures are fairly low. Fortunately, in all these cases the average GG similarities were found to be statistically different from (higher than) their respective GR similarities according to the Wilcoxon signed-rank (WSR) test.
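
For reference, the statistic behind the Wilcoxon signed-rank test can be computed as sketched below. The function name and the paired values are hypothetical and serve only to show the mechanics; they are not results from the study. Obtaining a p-value additionally requires the null distribution of the statistic, which a routine such as `scipy.stats.wilcoxon` provides and which is not reproduced here.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W+, W-): rank sums of the positive and negative paired differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the ranks i+1, ..., j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


# Hypothetical paired similarities for four observations (illustration only).
gg = [0.30, 0.25, 0.40, 0.35]
gr = [0.10, 0.12, 0.09, 0.11]
w_plus, w_minus = wilcoxon_signed_rank(gg, gr)
print(w_plus, w_minus)
```

A rank sum concentrated on one sign, as in this toy case, is the pattern that leads the test to report a systematic difference between the paired samples.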

The average $R^2$ values in columns 6 and 7 do not deviate much from the averages across all the observations, 0.602 and 0.624, respectively, with the exception of the grouped coefficient for one of the levels. It is clear that certain similarity values for some levels are consistently lower than others. There is, for example, a very large difference between the average DG similitude obtained using a DDR seed and that obtained using a RND or FIX seed. Based on this, we decided to discard those levels whose averages are not only considerably lower but whose observations are also determined to be different by a WSR test.

Now let us look at each of the levels we should consider dropping. The first factor, the GRASP new group threshold, shows very similar GG figures and equally similar $R^2$ values across its levels. We decided to keep the factor levels intact, in case these figures change once other levels are removed.

Seeds are more difficult to assess. The FIX seed shows lower values than the DDR one, but still higher than the RND. One level of the individual/grouped weight shows much better numbers in all but the grouped $R^2$ entry. Because of this, we pick it as the only level for the later study. In contrast, the local search weight is better at value 1, except again in the grouped column. This result is very counterintuitive. However, the two weights serve a similar purpose at different parts of the process, so this behavior might indeed be justified.

The RLC length and the number of local search moves were introduced to add variation in the GRASP routine, and their results appear separately in Table 2. This is because, while their similitude values work in the same way as for the other factors, the measurements per observation are not the average across all 10 partitions in the observation, but rather the maximum obtained. In a common GRASP routine, the process is repeated several times and the best solution is adopted. For our case, this means that we should choose the best of the 20 partitions in each observation, and this choice becomes the result for that observation. Arguably, both the individual and grouped average maximum coefficients of determination show little difference. In particular, the differences are deemed not large enough to justify the trade-off with similarity in all cases. While this was expected from the extended RLC size, the poor results obtained by the local search suggest that we should rethink our local search procedure in the future.

Based on similarity alone, we decided to eliminate the poorest levels and kept only a single-group state list and a zero-swaps local search for the second part of the analysis. After deciding to drop several levels, we will rewrite the results table including only the accepted levels, to see how the figures change once the poorest results are winnowed.

The much smaller Table 3 is the consequence of fixing four of the factors and eliminating the RND seed choice, which results in 100 observations. Now the similitudes look much better: we have a sample average of 0.438 and a maximum of 0.477, which means that, for the parameters chosen, the similitudes obtained are remarkably higher than the average randomness.

For the first factor, the new group threshold, the similitudes differ very little, and the same holds for the determination coefficients on all accounts. However, for the seed levels, the DDR seed clearly favors similitude between the seed and the resulting partition. Similitude among resulting partitions is also good at the RND partition, which could indicate that the particular FIX seed was initially a bad choice when compared to either an average partition seed or one selected in a methodical way.

The coefficients of determination present a rather interesting development. The individual coefficients are decent enough when compared to the ones from the dropped levels, but there is a dramatic drop in the group figures, which decreased from an average of around 0.53 to as low as 0.299. This happens because, while focusing on similitude, we chose a level that yields a mean of only 0.366, as opposed to the 0.706 obtained with another level. In Table 2, however, we see the greater maximum because it was relevant to that table. If we were to remake Table 3 using a different value for this level, similitudes would fall by around 10%, but the average group determination coefficients would increase to roughly 0.43, which is much better than with the chosen level. Maximum values for the different $R^2$ figures, corresponding to those in Table 3, remain mostly unchanged.

#### 6. Concluding Remarks

In this paper, we propose and justify a heuristic method to group several zones based on a regression function that estimates several factors related to the natural gas demand. The groups thus obtained share key information regarding the behavior of natural gas-related historic econometric data.

We start by developing a linear regression model that relates historic residential natural gas consumption to several explanatory variables, such as the residential price, number of consumers, and temperature. This model, inspired by several examples in the literature, fits the time series employed well and has good predictive power, but it is by no means the only one that can be used, nor necessarily the best.
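As an illustration only, the following minimal sketch fits a linear model of this general kind by ordinary least squares and reports its coefficient of determination. The data are synthetic and noise-free, the function names `ols` and `r_squared` are hypothetical, and the choice of regressors (intercept, price, consumers, temperature) is an assumption that merely echoes the variables named above; it is not the exact specification used in the study.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def r_squared(X, y, beta):
    """Coefficient of determination of the fitted linear model."""
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Synthetic (assumed) monthly data for one hypothetical state.
price = [10, 12, 11, 14, 13, 9]
consumers = [1.0, 1.5, 2.2, 1.8, 3.0, 2.5]   # in thousands
temperature = [5, 8, 15, 22, 18, 3]
true_beta = [50.0, -2.0, 10.0, -0.8]          # assumed coefficients

X = [[1.0, p, c, t] for p, c, t in zip(price, consumers, temperature)]
y = [sum(b * x for b, x in zip(true_beta, row)) for row in X]

beta = ols(X, y)
r2 = r_squared(X, y, beta)
print([round(b, 6) for b in beta], round(r2, 6))
```

Because the synthetic response is generated without noise, the fit recovers the assumed coefficients up to rounding; with real time series the coefficient of determination would of course fall below 1.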

The results of each of the 48 regressions performed are then used to create dendrogram-based partitions, which are in turn used as the starting point in a GRASP routine. The latter, while tending to form rather dissimilar partitions (compared to the dendrogram grouping), has the advantage of adding statistical significance to all the regressions in all the groups formed.
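The dendrogram step can be pictured with a minimal sketch, assuming that each state is represented by its vector of estimated regression coefficients and that states are merged bottom-up by average-linkage Euclidean distance until a chosen number of groups remains. The linkage rule, the distance, the function name `agglomerate`, and the coefficient values are all assumptions for illustration; the subsequent GRASP refinement is not shown.

```python
from itertools import combinations
from math import dist


def agglomerate(points, n_groups):
    """Merge the two closest clusters (average linkage) until n_groups remain."""
    clusters = [[name] for name in points]

    def linkage(c1, c2):
        # Average pairwise Euclidean distance between members of two clusters.
        total = sum(dist(points[a], points[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_groups:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]   # i < j, so index j is still valid
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical (price coefficient, temperature coefficient) pairs per state.
coeffs = {
    "A": (-0.30, 1.10),
    "B": (-0.28, 1.05),
    "C": (-0.90, 0.20),
    "D": (-0.95, 0.25),
    "E": (-0.50, 0.70),
}

seed_partition = agglomerate(coeffs, 3)
print(seed_partition)  # [['A', 'B'], ['C', 'D'], ['E']]
```

Cutting the merge sequence at a given number of groups corresponds to cutting a dendrogram at a given height; the resulting partition would then serve as a seed for the randomized search.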

We tested several parameters in an experimental design consisting of more than 4300 observations, six factors, and two or three levels per factor. Using ad hoc and nonparametric selections, we tried to obtain a good combination of parameters, namely, one that delivers high similitude between partitions obtained from the same seed and a satisfactory goodness of the pooled regressions.

Similitude is measured by a standardized function which equals 0 if there are no common groups between two partitions of a fixed set and 1 if both partitions are identical. We were able to obtain experimental conditions with similitudes (mostly) above 0.43, which are deemed good considering that the average randomness of a partition in the study is around 0.09.
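To make the two boundary properties concrete, here is a minimal sketch of one function that satisfies them: the Jaccard index taken over the groups of two partitions. This is an assumed stand-in chosen for illustration, not necessarily the standardized function used in the study, and the state labels are arbitrary.

```python
def similitude(partition_a, partition_b):
    """Share of groups common to both partitions (Jaccard index over groups)."""
    groups_a = {frozenset(g) for g in partition_a}
    groups_b = {frozenset(g) for g in partition_b}
    union = groups_a | groups_b
    if not union:
        return 1.0  # two empty partitions are treated as identical
    return len(groups_a & groups_b) / len(union)


p1 = [{"AL", "GA"}, {"OR", "WA"}, {"TX"}]
p2 = [{"AL", "GA"}, {"OR"}, {"WA", "TX"}]
p3 = [{"AL"}, {"GA", "OR"}, {"WA", "TX"}]

print(similitude(p1, p1))  # 1.0 (identical partitions)
print(similitude(p1, p2))  # 0.2 (one common group out of five distinct groups)
print(similitude(p1, p3))  # 0.0 (no common groups)
```

Any measure with the same endpoints would fit the description; this one simply counts whole groups and gives no credit for groups that overlap only partially.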

It is encouraging that, using the regression function proposed herein, the GRASP routine worked well by itself and also when combined with the dendrogram partitioning method. Unfortunately, the inclusion of randomness did not yield good results: it offered no increase in the goodness of the partitions but caused a considerable decrease in similitude when a long RLC was used. The proposed local search approach was found to have a negative impact on the similitude values, though not overly so. At the same time, however, it heavily affected the values of the grouped coefficients of determination when the maximum values were considered in the selection but the averaged values were examined in the end results. The “goodness” of the regressions, as discussed, must then be judged with a more nuanced approach.

The entire framework summarized here is intended to provide a way to identify individuals (states, in this case) with common econometric behavior by means of statistically significant information. Such results have helped us in the past in the context of optimization theory (by greatly decreasing the number of variables in stochastic problems), and we believe this technique has other applications in economic analysis.

The planned future work includes enhancing the robustness of the method by designing better GRASP RLC and local search procedures, trying sampled regressions when forming large groups to save time, and studying how different data sets and regression models would work in combination with the Dendrogram-GRASP approach proposed in the paper.

#### Conflict of Interests

The authors declare that there is no conflict of interests for any of the authors of the paper.

#### Acknowledgments

The research activity of the first author was financially supported by the R&D Department (Cátedra de Investigación) CAT-174 of the Tecnológico de Monterrey (ITESM), Campus Monterrey and by the SEP-CONACYT Project CB-2008-01-106664, Mexico. Also, the work of the fourth author was supported by the National Council of Science and Technology (CONACyT) of Mexico as part of the Project CB-2011-01-169765, PROMEP 103.5/11/4330, and PAICYT 464-10.