Advances in Operations Research

Konstantinos Salpasaranis, Vasilios Stylianakis, "A Hybrid Genetic Programming Method in Optimization and Forecasting: A Case Study of the Broadband Penetration in OECD Countries", Advances in Operations Research, vol. 2012, Article ID 904797, 32 pages, 2012. https://doi.org/10.1155/2012/904797

A Hybrid Genetic Programming Method in Optimization and Forecasting: A Case Study of the Broadband Penetration in OECD Countries

Academic Editor: Yi-Kuei Lin
Received 23 Mar 2012
Revised 21 Jun 2012
Accepted 22 Jun 2012
Published 02 Oct 2012

Abstract

A hybrid genetic programming method (hGP) is proposed for fitting and forecasting broadband penetration data. The hGP places some well-known diffusion models, such as those of Gompertz, Logistic, and Bass, in the initial population of solutions in order to accelerate the algorithm. The solution models produced by the hGP are used to fit and forecast the adoption of broadband. We investigate the fitting performance of the hGP, and we use the hGP to forecast the broadband penetration in OECD (Organisation for Economic Co-operation and Development) countries. The results of the optimized diffusion models are compared to those of the hGP-generated models. The comparison indicates that the hGP manages to generate solutions with high-performance statistical indicators. The hGP cooperates with the existing diffusion models, thus allowing multiple approaches to forecasting. The modified algorithm is implemented in the Python programming language, which is fast in execution time, compact, and user friendly.

1. Introduction

Many methods have been proposed for predicting the penetration of new technology in a community. The subject has been described and analyzed extensively in the international literature [16].

Examples of the above methods are the diffusion models for the adoption of new technologies. The diffusion models are mathematical functions that follow an S-shaped curve in time. The diffusion models used in this study are the Gompertz, Logistic, and Bass models [4]. The parameters of the models have been estimated by regression analysis [5].

A genetic algorithm (GA) is a probabilistic search method that uses the Darwinian principle of natural selection to find an appropriate solution to a specific problem [7]. Genetic programming (GP) is more general than GA, because the produced solution corresponds to a new program [8]. The application of GP to optimization problems has produced some important forecasting tools [7, 8].

Generally, a GP run begins with a set of randomly chosen initial functions (solutions); this set is called the population. A chromosome is one program solution of the GP, and each chromosome is assigned a fitness value by evaluation. The next generation results from a Darwinian selection process in which the best chromosomes, according to their fitness values, are selected. Some of the selected chromosomes are randomly combined (crossover) to generate new chromosomes (offspring). Mutation also occurs, in which part of a randomly selected chromosome is changed. Chromosomes with better fitness values have a higher probability of being selected, so the chromosomes of a new generation tend to have better overall fitness than those of past generations. The whole process is repeated until an end condition is met [8, 9].

This paper is structured as follows. First, the diffusion models are briefly presented. The basic structure of the new modified GP (hGP) and its analysis follow. The next section contains the analysis of the results, and, finally, the conclusions are presented. Appendices A, B, and C provide the syntax of some produced models, the statistical indicators, and the estimation formulas of the modified GA.

2. Diffusion Models

Typically, the diffusion process of innovative technologies in a society, such as broadband access, follows a sigmoid curve [6]. In this paper, the following well-known diffusion models have been used.

2.1. Logistic Model

The logistic diffusion model is the best known of the sigmoid curves. The model is described by (2.1):

$$Y(t) = \frac{S}{1 + e^{-(a + bt)}} \tag{2.1}$$

where the diffusion of a new product in a society at time $t$ is presented as $Y(t)$. Also, $Y(t)$ is a time-dependent function, and $S$, $a$, and $b$ are constant parameters. The parameter $S$ is the limit of the function $Y(t)$. When time $t \to \infty$, then $Y(t) \to S$ [5].

2.2. Gompertz Model

The Gompertz model was introduced as a law of mortality [10]. The equation that we use is the known format of (2.2), as follows:

$$Y(t) = S\,e^{-a e^{-bt}} \tag{2.2}$$

We can also use the alternative format, Gompertz with constant, which is described in (2.3):

$$Y(t) = S\,e^{-a e^{-bt}} + c \tag{2.3}$$

where $Y(t)$ is a time-dependent function and $S$, $a$, $b$, and $c$ are constant parameters [5].

2.3. Bass Model

The Bass model proposes that the market for a new product consists of two major categories: innovators and imitators.

Firstly, the innovators purchase the new product, and the imitators follow afterwards.

This model’s function is an extended logistic curve, where the cumulative adoption $Y(t)$ of the new technology at time $t$ is presented in (2.4):

$$Y(t) = S\,\frac{1 - e^{-B(t - t_0)}}{1 + C\,e^{-B(t - t_0)}} \tag{2.4}$$

In (2.4), parameter $S$ corresponds to initial purchasers of the new technology product. Parameter $B$ is the sum of the innovators and imitators coefficients, $p$ and $q$, respectively. This is presented as $B = p + q$.

The parameter $C$ is $C = q/p$, and $t_0$ is a constant [4, 11].
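As a rough illustration, the three model families can be written as plain Python functions. The functional forms below are read off the optimized model strings listed in Appendix A.1; the function and argument names (`logistic`, `gompertz`, `bass`, `S`, `a`, `b`, `c`, `p`, `q`, `t0`) are chosen here for illustration and are not taken from the authors' program.

```python
import math


def logistic(t, S, a, b):
    """Logistic curve: S / (1 + e^{-(a + b t)})."""
    return S / (1.0 + math.exp(-(a + b * t)))


def gompertz(t, S, a, b, c=0.0):
    """Gompertz curve S * e^{-a e^{-b t}}; a nonzero c gives 'Gompertz with constant'."""
    return S * math.exp(-a * math.exp(-b * t)) + c


def bass(t, S, p, q, t0):
    """Bass curve with B = p + q and C = q / p, shifted in time by the constant t0."""
    decay = math.exp(-(p + q) * (t - t0))
    return S * (1.0 - decay) / (1.0 + (q / p) * decay)


def oecd_logistic(t):
    # Parameters copied from the Logistic row for the OECD overall dataset (Appendix A.1).
    return logistic(t, 0.267462866067, -2.58338763745, 0.0419392545351)


def oecd_bass(t):
    # Parameters copied from the Bass row for the OECD overall dataset (Appendix A.1).
    return bass(t, 0.26746298117, 4.30693372098e-08, 0.0419391296103, -267.18543419)
```

With these parameters the fitted Bass curve nearly coincides with the fitted Logistic curve, which is consistent with the almost identical statistical indices reported for the two models in the results tables.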

3. Genetic Programming Method

The hGP implements a hybrid strategy that consists of two parts: nonlinear regression analysis and a modified genetic algorithm. The flowchart of Figure 1 shows the parts of the hGP.

Firstly, a random set of solutions is created. In parallel, the candidate diffusion models are optimized by regression analysis, and the resulting models are inserted into the initial set of solutions. This completes the initial population of the hGP. Then, each solution is evaluated by the fitness function, and the best solutions are inserted into a sorted list; the initial generation is now ready. The termination criterion of the program is the maximum number of generations. Next, selection takes place among the best solutions: randomly chosen best solutions are combined by crossover, and another randomly chosen solution is changed by mutation. The new solutions are optimized by regression, which yields the next generation. Finally, the whole process is repeated.
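The control flow described above can be sketched as follows. This is only a skeleton under stated assumptions: `run_hgp` and all of its arguments are hypothetical names, the operators are passed in as callables rather than implemented, and the fallback to the two best solutions when nothing passes the precision limit is an assumption made here. The toy usage optimizes a single number instead of a model string, purely to show that the loop runs.

```python
import random


def run_hgp(seeds, random_solution, fitness, crossover, mutate, refine,
            n_random, max_generations, precision_limit, rng):
    """Hypothetical control loop; every callable is supplied by the caller."""
    # Initial population: regression-optimized seed models plus random solutions.
    population = [refine(s) for s in seeds]
    population += [random_solution(rng) for _ in range(n_random)]
    ranked = sorted(population, key=fitness)          # sorted list of solutions
    for _ in range(max_generations):                  # termination criterion
        # Candidates for reproduction: fitness below the precision limit.
        # Falling back to the two best is an assumption for the empty case.
        elite = [c for c in ranked if fitness(c) < precision_limit] or ranked[:2]
        children = list(crossover(rng.choice(elite), rng.choice(elite), rng))
        children.append(mutate(rng.choice(elite), rng))
        children = [refine(c) for c in children]      # regression after each operator
        ranked = sorted(set(ranked + children), key=fitness)[:len(population)]
    return ranked


# Toy usage: "chromosomes" are plain floats and the target value is 3.0.
rng = random.Random(0)
result = run_hgp(
    seeds=[2.5],
    random_solution=lambda r: r.uniform(-10.0, 10.0),
    fitness=lambda x: (x - 3.0) ** 2,
    crossover=lambda a, b, r: ((a + b) / 2.0, (2.0 * a + b) / 3.0),
    mutate=lambda x, r: x + r.gauss(0.0, 0.1),
    refine=lambda x: x,
    n_random=5, max_generations=50, precision_limit=0.5, rng=rng)
```

Because the seed solutions enter the sorted list from the first generation, the best fitness in the list can never be worse than that of the best seed, which is the intuition behind seeding the population with optimized diffusion models.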

3.1. Solution Representation

In GP, each chromosome is not a fixed-length character string but a program that is a possible solution to the problem [12]. A chromosome in hGP, specifically, is presented as a string of characters in the Python programming language or as a parse tree. For example, the chromosome is presented in Figures 2 and 3 as string and parse tree, respectively [12].

The functions used in hGP are addition (+), subtraction (−), protected division (/), multiplication (*), and the exponential (written E(·) in the model strings of Appendix A), because these are the functions that appear in the diffusion models.

The parse tree consists of nodes. The terminal nodes are referred to as leaves. The leaves of the tree contain the variables or the constants of the program. The nonterminal nodes of the tree correspond to the function set of the hGP.
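One way to see the string and parse-tree views side by side is Python's standard `ast` module, which turns an expression string into a tree that can be walked node by node. The sketch below evaluates a chromosome string at a given time `t`; it is not the authors' code. The names `evaluate` and `protected_div` are hypothetical, and returning 1.0 on division by zero is a common GP convention assumed here, since the text does not define its protected division.

```python
import ast
import math
import operator


def protected_div(a, b):
    # Assumed convention: a zero denominator yields 1.0 instead of an error.
    return a / b if b != 0 else 1.0


_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: protected_div}


def _walk(node, t):
    if isinstance(node, ast.Constant):                 # leaf: numeric constant
        return float(node.value)
    if isinstance(node, ast.Name) and node.id == "t":  # leaf: the time variable
        return t
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, (ast.USub, ast.UAdd)):
        value = _walk(node.operand, t)
        return -value if isinstance(node.op, ast.USub) else value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_walk(node.left, t), _walk(node.right, t))
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id == "E" and len(node.args) == 1):
        return math.exp(_walk(node.args[0], t))        # exponential function E(x)
    raise ValueError("unsupported node: " + ast.dump(node))


def evaluate(chromosome, t):
    """Evaluate a chromosome string such as '0.3/(1+E(-(2+0.05*t)))' at time t."""
    # The typeset minus sign (U+2212) used in printed model strings is normalized.
    tree = ast.parse(chromosome.replace("\u2212", "-"), mode="eval")
    return _walk(tree.body, float(t))
```

Walking the tree explicitly, instead of calling `eval` on the string, makes it possible to substitute protected division for Python's `/` and to reject any node outside the function set.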

3.2. Initial Population

The initial population is generated by merging a number of randomly produced chromosomes with the diffusion models that have been optimized by regression analysis. The initial diffusion models’ chromosomes are the optimized Logistic, Gompertz family, and Bass models. It should be noted that the diffusion models may differ according to the problem. The parameters of the initial diffusion models are optimized by nonlinear regression analysis using the Levenberg-Marquardt algorithm.

From a programmer’s point of view, each chromosome is a string data type which is presented as a parse tree in the internal structure of the Python programming language. The population appears as a list of strings. This list corresponds to the first generation for the program.
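A minimal sketch of such a population, assuming the random chromosomes are built recursively from the function set, might look like this. The generator `random_chromosome`, its probabilities, and the constant range are illustrative assumptions; the two seed strings are the optimized Logistic and Gompertz models for the OECD overall dataset as listed in Appendix A.1.

```python
import ast
import random

BINARY_FUNCTIONS = ["+", "-", "*", "/"]


def random_chromosome(rng, depth=3):
    """Return a random expression string over t, constants, + - * / and E()."""
    if depth == 0 or rng.random() < 0.2:
        # Leaf: the time variable or a positive constant.
        return "t" if rng.random() < 0.5 else repr(round(rng.uniform(0.01, 5.0), 4))
    if rng.random() < 0.3:
        # Unary exponential, written E(...) as in the model strings.
        return "E(-" + random_chromosome(rng, depth - 1) + ")"
    left = random_chromosome(rng, depth - 1)
    right = random_chromosome(rng, depth - 1)
    return "(" + left + rng.choice(BINARY_FUNCTIONS) + right + ")"


def initial_population(seed_models, n_random, rng):
    """Merge the regression-optimized seed strings with random chromosomes."""
    return list(seed_models) + [random_chromosome(rng) for _ in range(n_random)]


seeds = [
    "0.267462866067/(1+E(-(-2.58338763745+0.0419392545351*t)))",   # Logistic
    "0.311174465387*E(-3.43140971411*E(-0.0230181532619*t))",      # Gompertz
]
population = initial_population(seeds, n_random=8, rng=random.Random(42))
```

Every string in `population` parses as a Python expression, so the same list can be handled either as text or as parse trees.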

3.3. Evaluation

During the evaluation process, many functions can serve as fitness indicators (computed against the real data). In this implementation, two different fitness functions are used, as follows.

In the fitting process, each chromosome is evaluated with the sum of squared error (SSE), as in Appendix C in (C.1). In forecasting, the evaluation function corresponds to the weighted sum of squared error (wSSE) function, as in (C.2).
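The two fitness functions can be sketched as below. The SSE is the standard sum of squared differences between observed and predicted values. For the weighted variant, the exact weights of (C.2) are not reproduced in this section, so `weights` is left as a caller-supplied argument; the function names are hypothetical.

```python
def sse(observed, predicted):
    """Sum of squared errors between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def wsse(observed, predicted, weights):
    """Weighted SSE; the weighting scheme itself is an assumption of the caller."""
    return sum(w * (o - p) ** 2 for o, p, w in zip(observed, predicted, weights))


observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
fit_error = sse(observed, predicted)                         # 4.0
forecast_error = wsse(observed, predicted, [0.0, 0.0, 0.5])  # 2.0
```

With all weights equal to 1, `wsse` reduces to `sse`, so the fitting case is a special case of the forecasting one.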

3.4. Selection

In hGP, the chromosomes with the best fitness values (smaller than a precision limit) are inserted into a Python list. The members of the list are sorted according to their fitness value. Figure 4 shows the structure of the sorted list. Some chromosomes are then randomly chosen from this list in order to produce the next generation using the crossover and mutation processes.
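A small sketch of this step, with hypothetical helper names, keeps only the chromosomes whose fitness is below the precision limit, sorts them in ascending order of fitness, and then draws candidates at random from the sorted list.

```python
import random


def build_sorted_list(population, fitness, precision_limit):
    """Return (fitness, chromosome) pairs below the limit, best first."""
    scored = [(fitness(c), c) for c in population]
    kept = [pair for pair in scored if pair[0] < precision_limit]
    kept.sort(key=lambda pair: pair[0])
    return kept


def choose_candidates(sorted_list, rng, k=2):
    """Pick k distinct chromosomes at random for crossover or mutation."""
    return [c for _, c in rng.sample(sorted_list, k)]


# Toy fitness values standing in for SSE results.
toy_fitness = {"model_a": 0.3, "model_b": 0.7, "model_c": 0.1}
sorted_list = build_sorted_list(list(toy_fitness), toy_fitness.get, precision_limit=0.5)
parents = choose_candidates(sorted_list, random.Random(0), k=2)
```

Here `model_b` is excluded because its fitness exceeds the limit of 0.5, which mirrors the role of the "upper limit of the precision" parameter listed in the hGP parameter tables.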

3.5. Crossover

As mentioned above, in the implementation of hGP, each chromosome is a string data type or a parse tree in the internal structure of the programming language. In the crossover process, two parents are randomly selected from the list of chromosomes with the best fitness values.

A crossover point is randomly chosen in the first parent string. Also another crossover point is randomly chosen in the second parent string. The first child (offspring) is generated when the substring of the first parent, which begins at the crossover point, is replaced by the substring from the second parent, which begins at the second crossover point. The second child is generated by the crossing over of the other parts (substrings) of the parents’ strings. In Figures 5 and 6, the crossover operation is presented.
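Read literally, the description corresponds to swapping the tails of the two parent strings at their crossover points. The sketch below implements that reading and adds a syntactic validity check with retries, which is an assumption made here because arbitrary cut points can produce strings that no longer parse; the crossover points in the paper's figures may instead be aligned with subtrees. All function names are hypothetical.

```python
import ast
import random


def crossover(parent1, parent2, point1, point2):
    """Swap the tails of two chromosome strings at the given points."""
    child1 = parent1[:point1] + parent2[point2:]
    child2 = parent2[:point2] + parent1[point1:]
    return child1, child2


def is_valid(chromosome):
    """True if the string is a syntactically valid Python expression."""
    try:
        ast.parse(chromosome, mode="eval")
        return True
    except SyntaxError:
        return False


def random_crossover(parent1, parent2, rng, attempts=100):
    """Try random cut points until both children parse; else return the parents."""
    for _ in range(attempts):
        point1 = rng.randrange(1, len(parent1))
        point2 = rng.randrange(1, len(parent2))
        children = crossover(parent1, parent2, point1, point2)
        if all(is_valid(c) for c in children):
            return children
    return parent1, parent2


example = crossover("1+2*t", "3/E(t)", 2, 2)   # ("1+E(t)", "3/2*t")
```

Syntactic validity is a weaker condition than being a sensible model, so in a full program the children would still have to pass evaluation and regression before entering the sorted list.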

3.6. Mutation

In the mutation process, a chromosome is randomly chosen from the list of chromosomes with the best fitness values. The point in the string at which the mutation will take place is also randomly chosen. The mutation replaces the function (+, −, /, *, or the exponential) found at the mutation point with a new random function.

The mutation operation is presented in Figures 7 and 8.
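A possible sketch of this operator works on the parse tree rather than on raw characters, so that a minus sign inside a constant such as `4.3e-08` is never mistaken for a function. It is restricted to the four binary functions, because replacing a binary function with the unary exponential would change the shape of the tree; that restriction, and the name `mutate`, are assumptions of this sketch. It relies on `ast.unparse`, available from Python 3.9.

```python
import ast
import random

BINARY_OPS = (ast.Add, ast.Sub, ast.Mult, ast.Div)


def mutate(chromosome, rng):
    """Replace one randomly chosen binary function with a different one."""
    tree = ast.parse(chromosome, mode="eval")
    nodes = [n for n in ast.walk(tree)
             if isinstance(n, ast.BinOp) and isinstance(n.op, BINARY_OPS)]
    if not nodes:
        return chromosome                      # nothing to mutate
    target = rng.choice(nodes)                 # random mutation point
    alternatives = [op for op in BINARY_OPS if not isinstance(target.op, op)]
    target.op = rng.choice(alternatives)()     # new random function
    return ast.unparse(tree)                   # back to a string chromosome


original = "0.3/(1+E(-(2+0.05*t)))"
mutated = mutate(original, random.Random(1))
```

The returned string is re-serialized by `ast.unparse`, so its spacing and parentheses may differ from the input even though the tree differs in exactly one function node.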

4. Results

In this section, the dataset, the statistic indicators, and the results of the study will be presented. The results will also be analysed in order to provide a satisfactory prediction for the broadband penetration in OECD countries.

4.1. Dataset

In this study, the proposed method has been applied to two different datasets. The first dataset presents the overall OECD broadband penetration. The data came from the OECD portal [13], which presents the total fixed (wired) broadband penetration in OECD countries. According to OECD reports, the overall broadband penetration is rapidly growing. The dataset covers the period from the second quarter (Q2) of 2002 until the fourth quarter (Q4) of 2010 and comprises 18 data points.

The method is also applied to the second dataset, which concerns three innovator countries in fixed (wired) broadband technology adoption [14], namely, Sweden, The Netherlands, and Denmark. This dataset covers the period from the second quarter (Q2) of 2001 until the fourth quarter (Q4) of 2010, that is, 20 data points. Thus, the fitting and forecasting ability of hGP are both tested in markets that have reached or are going to reach the saturation point of the penetration curve.

4.2. Statistic Indicators

As mentioned above, the fitness function of each individual is the sum of squared error (SSE) for the fitting process. The results are analyzed by the estimation of some widely used statistical indices. The main indices of our analysis are the mean absolute percentage error (MAPE), the mean square error (MSE), the mean absolute error (MAE), and the root mean square error (RMSE).

The statistic indicators of MAPE, MSE, MAE and RMSE are presented in Appendix B.
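For reference, the four indices can be computed as in the sketch below, which uses the standard textbook definitions; the formulas of Appendix B are not reproduced in this section, so their exact form (for example, MAPE as a fraction rather than a percentage) is assumed here. The last lines check that the reported figures are consistent with MSE = SSE/n and RMSE = sqrt(MSE), using the Logistic SSE and RMSE reported for the 18-point OECD dataset in Table 3.

```python
import math


def mape(observed, predicted):
    """Mean absolute percentage error, returned as a fraction."""
    return sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / len(observed)


def mse(observed, predicted):
    """Mean square error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(mse(observed, predicted))


# Consistency check against the OECD Logistic row: SSE = 0.000134615, n = 18.
rmse_from_sse = math.sqrt(0.000134615 / 18)
reported_rmse = 0.00273471
```

The value of `rmse_from_sse` agrees with `reported_rmse` to within rounding, which suggests that the MSE entries missing from some tables could be recovered as SSE divided by the number of data points.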

4.3. Fitting Results

As mentioned above, the statistical indicator used for the fitting process is the SSE. This section presents the fitting performance of the models generated by the hGP and compares the optimized diffusion models with the hGP models. The hGP models and the optimized diffusion models used for fitting are listed in Appendix A.1.

4.3.1. Fitting Results for the Overall OECD Broadband Penetration

Table 1 contains the initialization parameters for the execution of hGP concerning the whole dataset of the OECD countries.


Table 1. Parameters of the hGP

| Parameter | Value |
|---|---|
| Maximum number of generations | 500 |
| Evaluation function | SSE |
| Upper limit of the precision for the candidates for crossover and mutation | 0.5 |

An example of the fitting performance for the first five hGP models, ranked by their fitness value (SSE), is presented in Figure 9. The graphs show the broadband penetration percentage (y-axis) against time (x-axis) as a date. Each time point on the x-axis corresponds to a six-month period. The corresponding statistical indices SSE, MAPE, MSE, RMSE, and MAE of the produced hGP models are presented in Table 2.


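Table 2. Statistical indices of the hGP for OECD overall dataset

| Model name | MAPE | SSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| hGP Model 1 | 0.007863 | 4.53047E−05 | | 0.001586 | 0.001301 |
| hGP Model 2 | 0.009163 | | | 0.001653 | 0.001271 |
| hGP Model 3 | 0.010628 | | | 0.00166 | 0.001363 |
| hGP Model 4 | 0.010628 | | | 0.00166 | 0.001363 |
| hGP Model 5 | 0.010628 | | | 0.00166 | 0.001363 |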

Also, the fitting performance of the optimized diffusion models, according to their fitness value (SSE), is presented in Figure 10 and their statistical indices in Table 3.


Table 3. Statistical indices of the optimized diffusion models for OECD overall dataset

| Model name | MAPE | SSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| Logistic | 0.015544 | 0.000134615 | | 0.00273471 | 0.00211296 |
| Bass | 0.015544 | 0.000134616 | | 0.00273472 | 0.00211297 |
| Gompertz with constant | 0.022755 | 0.000166627 | | 0.00304254 | 0.00249006 |
| Gompertz | 0.033951 | 0.000316133 | | 0.00419082 | 0.0037315 |

According to Tables 2 and 3, the hGP method achieves better statistical indices than the optimized diffusion models. For example, the first hGP model shows an SSE value of 4.53047E−05, whereas the Logistic model shows 0.000134615. It should be noted that hGP was executed for 500 generations, which is a moderate number of generations. Thus, better results could be achieved with a larger number of generations.

4.3.2. Fitting Results for Sweden, The Netherlands, and Denmark Broadband Penetration

Table 4 contains the initialization parameters for the execution of hGP on the datasets for Sweden, The Netherlands, and Denmark.


Table 4. Parameters of the hGP

| Parameter | Value |
|---|---|
| Maximum number of generations | 500 |
| Evaluation function | SSE |
| Upper limit of the precision for the candidates for crossover and mutation | 0.5 |

The fitting performance of the first five hGP models, ranked by their fitness value (SSE), and the performance of the optimized diffusion models are presented in Figures 11 and 12 for Sweden, Figures 13 and 14 for The Netherlands, and Figures 15 and 16 for Denmark, respectively. The corresponding statistical indices SSE, MAPE, MSE, RMSE, and MAE of the hGP models and of the optimized diffusion models are presented in Tables 5 and 6 for Sweden, Tables 7 and 8 for The Netherlands, and Tables 9 and 10 for Denmark.


Table 5. Statistical indices of the hGP for Sweden dataset

| Model name | MAPE | SSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| hGP Model 1 | 0.008675744 | 7.46756E−05 | | 0.001932299 | |
| hGP Model 2 | 0.009181524 | | | 0.002110357 | |
| hGP Model 3 | 0.033258076 | 0.000264358 | | 0.003635641 | |
| hGP Model 4 | 0.048504881 | 0.000515968 | 0.000103702 | 0.005079212 | |
| hGP Model 5 | 0.04856113 | 0.000517456 | 0.000104259 | 0.005086532 | |


Table 6. Statistical indices of the optimized diffusion models for Sweden dataset

| Model name | MAPE | SSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| Logistic | 0.055827 | 0.001689 | | 0.009191 | 0.007991 |
| Bass | 0.055827 | 0.001689 | | 0.009191 | 0.007991 |
| Gompertz with constant | 0.083396 | 0.002071 | 0.000104 | 0.010175 | 0.009002 |
| Gompertz | 0.086425 | 0.003551 | 0.000178 | 0.013326 | 0.011753 |


Table 7. Statistical indices of hGP for The Netherlands dataset

| Model name | MAPE | SSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| hGP Model 1 | 0.013729 | 0.00018787 | | 0.003065 | 0.002343 |
| hGP Model 2 | 0.012842 | 0.00018848 | | 0.00307 | 0.002304 |
| hGP Model 3 | 0.01377 | 0.00019367 | | 0.003112 | 0.002264 |
| hGP Model 4 | 0.01377 | 0.00019367 | | 0.003112 | 0.002264 |
| hGP Model 5 | 0.010985 | 0.00020294 | | 0.003185 | 0.002155 |


Table 8. Statistical indices of the optimized diffusion models for The Netherlands dataset

| Model name | MAPE | SSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| Logistic | 0.015446 | 0.00023692 | | 0.003442 | 0.002548 |
| Bass | 0.015447 | 0.00023692 | | 0.003442 | 0.002548 |
| Gompertz with constant | 0.031535 | 0.000412654 | | 0.004542 | 0.003828 |
| Gompertz | 0.07946 | 0.001367898 | | 0.00827 | 0.007501 |


Table 9. Statistical indices of hGP for Denmark dataset

| Model name | MAPE | SSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| hGP Model 1 | 0.045512 | 0.000566649 | | 0.005323 | 0.004332 |
| hGP Model 2 | 0.047466 | 0.000599483 | | 0.005475 | 0.004452 |
| hGP Model 3 | 0.047487 | 0.000599823 | | 0.005476 | 0.004453 |
| hGP Model 4 | 0.047489 | 0.000599875 | | 0.005477 | 0.004454 |
| hGP Model 5 | 0.047531 | 0.000600572 | | 0.00548 | 0.004456 |


Table 10. Statistical indices of the optimized diffusion models for Denmark dataset

| Model name | MAPE | SSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| Logistic | 0.049134 | 0.001110701 | | 0.007452 | 0.006494 |
| Bass | 0.049134 | 0.001110704 | | 0.007452 | 0.006494 |
| Gompertz with constant | 0.049134 | 0.001110701 | | 0.007452 | 0.006494 |
| Gompertz | 0.049134 | 0.001110704 | | 0.007452 | 0.006494 |

According to Tables 5 and 6, the hGP method achieves better statistical indices than the optimized diffusion models. For example, the first hGP model achieves an SSE value of 7.46756E−05, whereas the Logistic model achieves an SSE value of 0.001689.

Once again, according to Tables 7 and 8, the hGP method achieves better statistical indices than those of the optimized diffusion models.

Finally, according to Tables 9 and 10, the hGP method achieves better statistical indices than the optimized diffusion models. The first hGP model achieves an SSE value of 0.000566649, whereas the Logistic model achieves 0.001110701.

4.4. Forecasting Results

The forecasting results of the generated models by hGP are presented in this section as well as the comparison of the optimized diffusion models with the hGP’s models. The statistic indicator used for the forecasting process is the wSSE. Both hGP and the optimized diffusion models concerning the forecasting performance are presented in Appendix A.2.

4.4.1. Forecasting Results for the Overall OECD Broadband Penetration

Table 11 contains the initialization parameters for the execution of hGP for the forecasting process. The whole dataset of the OECD countries contains 18 data points (2 points per year from 2002 to 2010). For a 2-year prediction, the forecasting method uses the first 14 points of the dataset for hGP training and computes the statistical indices on this 14-point subset. The graph of the forecasting performance is shown in Figure 17. In every graph, the forecast period window is marked by the blue rectangle.
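The split between training points and forecast window can be sketched in a few lines. With two data points per year, a 2-year horizon corresponds to the last 4 points of a series; the helper name `split_for_forecast` is hypothetical, and the series below are placeholder indices rather than real penetration data.

```python
def split_for_forecast(series, horizon_points=4):
    """Split a series into training points and the held-out forecast window."""
    if not 0 < horizon_points < len(series):
        raise ValueError("horizon must be shorter than the series")
    return series[:-horizon_points], series[-horizon_points:]


# Placeholder series: only the lengths matter for this illustration.
oecd_points = list(range(18))        # 18 half-yearly points
country_points = list(range(20))     # 20 half-yearly points

oecd_train, oecd_holdout = split_for_forecast(oecd_points)
country_train, country_holdout = split_for_forecast(country_points)
```

The models are trained and scored (wSSE, MAPE, and so on) on the training part only, while the held-out part is used afterwards to judge the quality of the forecast.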


Table 11. Parameters of the hGP

| Parameter | Value |
|---|---|
| Maximum number of generations | 500 |
| Evaluation function | wSSE |
| Upper limit of the precision for the candidates for crossover and mutation | 0.09 |

The corresponding statistical indices of the produced forecasting hGP models, computed on the training points, are presented in Table 12.


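Table 12. Statistical indices of the hGP for OECD forecasting (14-point training)

| Model name | MAPE | wSSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| hGP Model 1 | 0.009744 | 3.67091E−05 | | 0.002005 | 0.001412 |
| hGP Model 2 | 0.014944 | | | 0.002195 | 0.001608 |
| hGP Model 3 | 0.014527 | | | 0.002319 | 0.001683 |
| hGP Model 4 | 0.019881 | | | 0.002579 | 0.002041 |
| hGP Model 5 | 0.020006 | | | 0.00262 | 0.002063 |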

The forecasting performance of the optimized diffusion models, according to their fitness value (wSSE) for the 14 training points, is presented in Figure 18 and their statistical indices in Table 13.


Table 13. Statistical indices of the optimized diffusion models for OECD forecasting (14-point training)

| Model name | MAPE | wSSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| Logistic | 0.016716 | 6.13566E−05 | | 0.002679 | 0.002025 |
| Bass | 0.016716 | | | 0.002679 | 0.002025 |
| Gompertz with constant | 0.021155 | | | 0.002917 | 0.002349 |
| Gompertz | 0.022379 | | | 0.003284 | 0.002678 |

According to Tables 12 and 13, the hGP method achieves better statistical indices than the optimized diffusion models. The first hGP model achieves a wSSE value of 3.67091E−05, whereas the Logistic model achieves 6.13566E−05. The comparison of the models’ residuals against time (data points), especially for the last 4 data points (the forecast period), shows that the hGP models perform better (see Figure 19).

4.4.2. Forecasting Results for the Sweden-The Netherlands-Denmark Broadband Penetration

Table 14 contains the initialization parameters for the execution of hGP in the forecasting process. The datasets of Sweden, The Netherlands, and Denmark each contain 20 data points (2 points per year from 2001 to 2010). For a 2-year prediction, the forecasting method uses 16 data points of each dataset for hGP training and computes the statistical indices on this training subset. The graphs of the forecasting performance of the hGP and of the optimized diffusion models are shown in Figures 20 and 21 for Sweden, Figures 23 and 24 for The Netherlands, and Figures 26 and 27 for Denmark, respectively. The corresponding statistical indices for the training data are presented in Tables 15 and 16 for Sweden, Tables 17 and 18 for The Netherlands, and Tables 19 and 20 for Denmark. Finally, the statistical indices MAPE and MAE, which describe the forecasting performance of the models over a 2-year forecasting horizon, are presented in Figures 22, 25, and 28 for the three countries.


Table 14. Parameters of the hGP

| Parameter | Value |
|---|---|
| Maximum number of generations | 500 |
| Evaluation function | wSSE |
| Upper limit of the precision for the candidates for crossover and mutation | 0.9 |


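Table 15. Statistical indices of hGP for Sweden dataset (16-point training)

| Model name | MAPE | wSSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| hGP Model 1 | 0.011145 | 5.53E−05 | | 0.002192 | 0.001692 |
| hGP Model 2 | 0.008093 | | | 0.002171 | 0.001546 |
| hGP Model 3 | 0.008504 | | | 0.002166 | 0.001538 |
| hGP Model 4 | 0.008105 | | | 0.002176 | 0.001549 |
| hGP Model 5 | 0.008151 | | | 0.002192 | 0.001557 |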


Table 16. Statistical indices of the optimized diffusion models for Sweden dataset (16-point training)

| Model name | MAPE | wSSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| Logistic | 0.054539 | 0.000433 | | 0.007089 | 0.006376 |
| Bass | 0.054539 | 0.000433 | | 0.007089 | 0.006376 |
| Gompertz with constant | 0.06964 | 0.000513 | | 0.008114 | 0.007267 |
| Gompertz | 0.054088 | 0.000825 | | 0.009202 | 0.007889 |


Table 17. Statistical indices of hGP for The Netherlands dataset (16-point training)

| Model name | MAPE | wSSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| hGP Model 1 | 0.014673 | 0.000100 | | 0.002929 | 0.002194 |
| hGP Model 2 | 0.017209 | 0.000103 | | 0.002973 | 0.00227 |
| hGP Model 3 | 0.017209 | 0.000103 | | 0.002973 | 0.00227 |
| hGP Model 4 | 0.017223 | 0.000103 | | 0.002975 | 0.002271 |
| hGP Model 5 | 0.017253 | 0.000103 | | 0.002976 | 0.002272 |


Table 18. Statistical indices of the optimized diffusion models for The Netherlands dataset (16-point training)

| Model name | MAPE | wSSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| Logistic | 0.018228 | 7.31E−05 | | 0.002712 | 0.002069 |
| Bass | 0.018228 | 7.31E−05 | | 0.002712 | 0.002069 |
| Gompertz with constant | 0.033352 | 0.000162 | | 0.004199 | 0.003502 |
| Gompertz | 0.074251 | 0.000437 | | 0.007639 | 0.006336 |


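Table 19. Statistical indices of the hGP for Denmark dataset (16-point training)

| Model name | MAPE | wSSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| hGP Model 1 | 0.024844 | 3.39655E−05 | | 0.002692 | 0.002018 |
| hGP Model 2 | 0.024808 | | | 0.003201 | 0.002468 |
| hGP Model 3 | 0.022951 | | | 0.003233 | 0.002483 |
| hGP Model 4 | 0.022944 | | | 0.003233 | 0.002483 |
| hGP Model 5 | 0.022944 | | | 0.003233 | 0.002483 |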


Table 20. Statistical indices of the optimized diffusion models for Denmark dataset (16-point training)

| Model name | MAPE | wSSE | MSE | RMSE | MAE |
|---|---|---|---|---|---|
| Logistic | 0.058957 | 0.000319962 | | 0.006582 | 0.005759 |
| Bass | 0.048066 | 0.000362021 | | 0.006434 | 0.005734 |
| Gompertz with constant | 0.05776 | 0.000562689 | | 0.007756 | 0.007007 |
| Gompertz | 0.042059 | 0.000683336 | | 0.008125 | 0.006943 |

Forecasting Results for Sweden Broadband Penetration

According to Tables 15 and 16, the hGP method again achieves better statistical indices than the optimized diffusion models for the training subset. The first hGP model has a wSSE value of 5.53E−05, whereas the Logistic model has 0.000433. It should be mentioned that the hGP performs satisfactorily after the 16th data point, with small residuals (errors), MAPE, and MAE for the last 4 data points (2-year forecast horizon, see Figure 22).

Forecasting Results for The Netherlands Broadband Penetration

According to Tables 17 and 18, the first hGP model achieves a lower MAPE than the optimized diffusion models for the 16-point training subset (0.014673 versus 0.018228 for the Logistic and Bass models), whereas its wSSE value of 0.00010 is higher than the 7.31E−05 of the Logistic and Bass models. Both the hGP and the diffusion models, especially the Logistic and Bass models, perform well after the 16th data point, with small residuals (errors) for the last 4 data points (2-year forecast horizon, see Figure 25).

Forecasting Results for Denmark Broadband Penetration

Finally, according to Tables 19 and 20, the hGP method once again achieves better statistical indices than the optimized diffusion models for the training subset. The first hGP model achieves a wSSE value of 3.39655E−05, whereas the Logistic model achieves 0.000319962. The hGP performs well after the 16th data point, with small residuals (errors), MAPE, and MAE for the last 4 data points (2-year forecast horizon, see Figure 28).

5. Conclusions

This paper introduces a new GP method that produces fitting and forecasting solution models with good performance. The whole process is assisted by the insertion of some widely used diffusion models, such as the Logistic, Gompertz family, and Bass models, into the initial population of chromosomes. The hGP method was applied to a dataset concerning the overall broadband penetration of the OECD countries, as well as to the datasets of Sweden, The Netherlands, and Denmark, which are pioneers in broadband technologies. Both the fitting and the forecasting performance of the method showed satisfactory statistical indices.

The proposed method differs from the classic GP method in several respects. First, some well-known diffusion models (Logistic, Gompertz family, and Bass), optimized by regression analysis, are inserted into the randomly generated initial population. Second, the regression process is applied to each chromosome after each crossover and mutation operation in order to maximize the algorithm’s efficiency. Also, the crossover and mutation processes operate on individuals randomly selected from a sorted list that contains the chromosomes with the best performance.

In general, the hGP can be considered a tool that generates solutions with good performance in fitting and forecasting broadband penetration.

In this paper, the forecasting horizon of hGP was two years ahead. A further investigation of the hGP performance for a longer forecast horizon would be desirable. It should be noted that the hGP method performance could be further improved with the insertion of more functions into the function set. A future study with an enrichment of the function set of the hGP could thus be considered.

Appendices

A.

This section presents some of the produced hGP models, as well as the optimized diffusion models, whose fitting and forecasting performance was analysed above. Each produced model corresponds to a Python string in the program.

A.1. Fitting Performance Models

For more details, see Tables 21, 22, 23, 24, 25, 26, 27, and 28.


Table 21. Fitting performance: hGP models for OECD overall

| Model name | Model |
|---|---|
| hGP Model 1 | 0.371142995572*E(−3.18695928215*E(−0.0179031441739*t))*E(−E(−4.31337138346e+11*E(−0.39450322724*t))*(3.81808835539*t−0.290266301465+3.81536111536*t)) |
| hGP Model 2 | 0.310471455673*E(−3.11349554655*E(−0.0204726299602*t))**E(−0.168943728131/(1+E(−(−50.6231851732+0.733563132765*t)))) |
| hGP Model 3 | 0.297974863522*E(−3.04030207778*E(−0.0232771010194*t))*E(−0.12721433893*E(−109810.431808/(1+E(−(−40.4802143374+0.418592190558*t))))) |
| hGP Model 4 | 0.297974843236*E(−3.0403025557*E(−0.0232771051305*t))*E(−0.12721423417*E(−70692.9894579/(1+E(−(−40.0400529746+0.418595561014*t))))) |
| hGP Model 5 | 0.297974680462*E(−1.8377143802*E(−0.0232772109071*t))*E(−1.20258571767*E(−0.0232769748624*t))*E(−0.1272152801*E(−22946.6443056/(1+E(−(−38.9144349592+0.418589930059*t))))) |


Table 22. Fitting performance: optimized diffusion models for OECD overall

| Model name | Model |
|---|---|
| Logistic | 0.267462866067/(1+E(−(−2.58338763745+0.0419392545351*t))) |
| Bass | 0.26746298117*(1−E(−(4.30693372098e−08+0.0419391296103)*(t−(−267.18543419))))/(1+(0.0419391296103/4.30693372098e−08)*E(−(4.30693372098e−08+0.0419391296103)*(t−(−267.18543419)))) |
| Gompertz with constant | 0.24779967624*E(−6.09503667714*E(−0.0312988127799*t))+0.0323712682778 |
| Gompertz | 0.311174465387*E(−3.43140971411*E(−0.0230181532619*t)) |


Table 23. Fitting performance: hGP models for Sweden

| Model name | Model |
|---|---|
| hGP Model 1 | 0.316195914503/(1+E(−((−24.3613307631+((0.248179188862*t))−((+2856.86270612−0.73365067168*(−7632.23570515−264.609582528*t))*((−188.226187141*(1/(1+E(−(−11.2194594526−0.0269181697779*t))))))))))) |
| hGP Model 2 | 0.635230289913/(1+E(−((−859.98155469−13.6682736089*t))/(1+E(−(−0.246865293706−0.112785107324*t))+(−645.208699231864.791617138+15.8723835928*t)))) |
| hGP Model 3 | 0.322283774013/(1+E(−(0.0238625468755+0.297883008597*(12.8929141638−(−134.989713256+2.79343973525*(98.6539018027−0.344395312981*t))/(1+E(−(5.312543151582.46101246618−0.118615909978*t))+5.93885038216))))) |
| hGP Model 4 | 0.317301851063/(1+(E(−(−4.86018930519−1.40562027222+2.17969243374*E(−(−225.362926623/(1+E(5.79151748628−0.00873094773031*t)))))))) |
| hGP Model 5 | 0.317329645302*(1/(1+E((−(−6.28781400917+0.520317506477*(E(−((−22.2379905754−21.8755007663−(−42.6520973363/(1+E(−(3.89062429403+0.258885588424−0.00909501063577*t))))))))))))) |


Table 24. Fitting performance: optimized diffusion models for Sweden

| Model name | Model |
|---|---|
| Logistic | 0.336282902699/(1+E(−(−2.56246214+0.0518475273171*t))) |
| Bass | 0.336282965867*(1−E(−(2.4303762338e−08+0.0518474673078)*(t−(−231.654837703))))/(1+(0.0518474673078/2.4303762338e−08)*E(−(2.4303762338e−08+0.0518474673078)*(t−(−231.654837703)))) |
| Gompertz with constant | 0.288901526679*E(−8.20983958007*E(−0.0446528448837*t))+0.0500790650845 |
| Gompertz | 0.366385197367*E(−3.31412845356*E(−0.0301554112794*t)) |


Table 25. Fitting performance: hGP models for The Netherlands

| Model name | Model |
|---|---|
| hGP Model 1 | 0.00328297044203*E(−((−1.62805354054−1.51126326886*E(−0.308998348091*E(0.00681524096293*t))*E(−32.6517776745*E(−2.63952059485−0.0444857287095*t))*E(E(−0.00300820120766*t))*E(E(−6.16053667129*E(−7.75605043494*E(−1.862862714−195.916190767*E(−250.192549588*E(−0.0440511725397*t))−1.01338834582))))))) |
| hGP Model 2 | 0.00272274693312*E(−((−1.4875958793−4.39774477765*E(+1.48779357596e−06*E(+0.0712548718888*t))*E(−25.1750484592*E(−2.51499099805−0.0357061452836*t))*E(E(−0.00188093694928*t))*E(E(−59.713187786*E(−697.16935125*120.774228698*E(−0.127948050362*t))*E(−0.0885251390425*t))−2.0188443226)))) |
| hGP Model 3 | 0.00279956582383*E(−((−1.87785817855−0.691671611195*E(−0.937233488237*E(−0.0154741062254*t))*E(−29.8530474333*E(−2.62003795345−0.0517069348527*t))*E(E(−0.0036068106375*t))*E(E(−20.7938314693*E(−104.667728363*E(30.4510810431−64.4593053792*E(−3388.8924389*E(−0.174118745631*t))+31.2957260778))))))) |
| hGP Model 4 | 0.00279956582383*E(−((−1.87785817855−0.691671611195*E(−0.937233488237*E(−0.0154741062254*t))*E(−29.8530474333*E(−2.62003795345−0.0517069348527*t))*E(E(−0.0036068106375*t))*E(E(−20.7938314693*E(−104.667728363*E(+30.4510810431−64.4593053792*E(−3388.8924389*E(−0.174118745631*t))+31.2957260778))))))) |
| hGP Model 5 | 0.0056229772664*(((E(E(−567628.398566*t))*E(E(−5.98069338075*E(−0.246097777946*t))*E(0.00111530764774*t))+E(E(0.000864502471227*t))*E(E(−18544.326574*E(−0.147581997545*t))*E(−0.0398044163448*t))*E(E(−2.07368114239*E(t))+E((−3.39589351006*E(−0.0540701741002*t))+1.12064956451))))) |


Fitting performance—optimized diffusion models for The Netherlands
| Model name | Model |
| --- | --- |
| Logistic | `0.383817447797/(1+E(−(−2.99687030519+0.0615659713049*t)))` |
| Bass | `0.383817510204*(1−E(−(4.51096518494e−08+0.0615658365913)*(t−(−180.776347814))))/(1+(0.0615658365913/4.51096518494e−08)*E(−(4.51096518494e−08+0.0615658365913)*(t−(−180.776347814))))` |
| Gompertz with constant | `0.362060867532*E(−7.28281633264*E(−0.0455738781315*t))+0.0321086407123` |
| Gompertz | `0.408631571889*E(−4.42176509808*E(−0.0373156465688*t))` |
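
As a rough illustration of how the tabulated diffusion expressions can be evaluated, the following Python sketch writes the Logistic, Gompertz, and Bass forms as functions and plugs in the Netherlands fitting coefficients listed above. It assumes that E denotes the exponential function and that t is the time index used in the paper (its unit is not restated here). The function and parameter names (`logistic`, `gompertz`, `bass`, `s`, `a`, `b`, `c`, `m`, `p`, `q`, `t0`) are chosen for this example only and are not taken from the authors' code.

```python
import math


def logistic(t, s, a, b):
    """Logistic curve in the tabulated form s / (1 + E(-(a + b*t)))."""
    return s / (1.0 + math.exp(-(a + b * t)))


def gompertz(t, s, a, b, c=0.0):
    """Gompertz curve s * E(-a * E(-b*t)) + c; c = 0 gives the plain Gompertz."""
    return s * math.exp(-a * math.exp(-b * t)) + c


def bass(t, m, p, q, t0):
    """Bass curve in the tabulated form, with a time shift t0."""
    decay = math.exp(-(p + q) * (t - t0))
    return m * (1.0 - decay) / (1.0 + (q / p) * decay)


# Coefficients copied from the Netherlands fitting table (optimized diffusion models).
NL_LOGISTIC = dict(s=0.383817447797, a=-2.99687030519, b=0.0615659713049)
NL_BASS = dict(m=0.383817510204, p=4.51096518494e-08,
               q=0.0615658365913, t0=-180.776347814)
NL_GOMPERTZ_C = dict(s=0.362060867532, a=7.28281633264,
                     b=0.0455738781315, c=0.0321086407123)
NL_GOMPERTZ = dict(s=0.408631571889, a=4.42176509808, b=0.0373156465688)

if __name__ == "__main__":
    # Print the four fitted curves side by side for a few time points.
    for t in (0, 20, 40, 60, 80):
        print(t,
              round(logistic(t, **NL_LOGISTIC), 4),
              round(bass(t, **NL_BASS), 4),
              round(gompertz(t, **NL_GOMPERTZ_C), 4),
              round(gompertz(t, **NL_GOMPERTZ), 4))
```

With these coefficients the Logistic and Bass curves are nearly indistinguishable, because the fitted Bass coefficient `p` is very small compared with `q`. For large t each curve approaches its saturation level, which is the leading coefficient (plus the constant for the Gompertz with constant).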


Fitting performance—hGP models for Denmark
| Model name | Model |
| --- | --- |
| hGP Model 1 | `5.63426994728*E(−1618550.49488*E(−15.2920199213*E(−3.43879264146*E(−1.28765774838*E(E(−39343593.2921*E(−19.8648947917*E(((−0.155432332908*(E(−0.740392625374*E(E(−0.43418452549−0.184927854534+0.0142460870633*t))))))))))))))` |
| hGP Model 2 | `1.54576267797*E(−65887807.219*E(−17.6521068857*E(((−0.328946437325*(E(−0.937705801568*E(E(−0.300061035521−0.249644860355+0.0126941999487*t)))))))))` |
| hGP Model 3 | `1.55049461946*E(−25211043.951*E(−16.6892960686*E(−0.348864738858*E(−0.937562881193*E(E(−0.702455330352*0.778787005609+0.012669874873*t))))))` |
| hGP Model 4 | `1.55178035993*E(−21975484.4721*E(−16.5513605349*((E(−0.352162700827*(E(−0.938395470029*E(E(−0.295038781436−0.252306036288+0.0126692295308*t)))))))))` |
| hGP Model 5 | `1.56240433165*E(−4303572.92965*E(−14.9161171585*E(−0.395041661739*E(−0.944193764022*E(E(−0.257102951003−0.289694506506−0.000548574587926+0.0126461044806*t))))))` |


Fitting performance—optimized diffusion models for Denmark
| Model name | Model |
| --- | --- |
| Logistic | `0.386701712691/(1+E(−(−2.76922996844+0.058481412812*t)))` |
| Bass | `0.386701824675*(1−E(−(7.48033939618e−08+0.0584811962598)*(t−(−184.676718196))))/(1+(0.0584811962598/7.48033939618e−08)*E(−(7.48033939618e−08+0.0584811962598)*(t−(−184.676718196))))` |
| Gompertz with constant | `0.364158594712*E(−6.08132991138*E(−0.042851861025*t))+0.0339245270247` |
| Gompertz | `0.411115617425*E(−3.86441490696*E(−0.0356437099203*t))` |

A.2. Forecasting Performance Models

For more details, see Tables 29, 30, 31, 32, 33, 34, 35, and 36.


Forecasting performance—hGP models for OECD overall
| Model name | Model |
| --- | --- |
| hGP Model 1 | `0.266691660828*(1*E(−4.5558998293*E(+0.395188436078*E(+6.76421029076e−07*t))*E(−0.0364443088831*t))+E(−32.929440994*E(−14.9394803737*E(−0.0140586124973*t))*((t))−2.16676076882))` |
| hGP Model 2 | `0.266691660828*(1*E(−4.5558998293*E(+0.395188436078*E(+6.76421029076e−07*t))*E(−0.0364443088831*t))+E(−32.929440994*E(−14.9394803737*E(−0.0140586124973*t))*((t))−2.16676076882))` |
| hGP Model 3 | `0.260458088163*(1*E(+20.7470896309*E(−1.45363179622*E(+0.0230415722335*t))−((E(−0.0405561643755*t))*E(+0.68313768074*E(+0.549850481378*E(−6.35620276975e−06*t))+E(+0.68313768074*E(+0.549850481378*((−0.120362874712*t))−20243.8982316))))))` |
| hGP Model 4 | `0.260458088163*(1*E(+20.7470896309*E(−1.45363179622*E(+0.0230415722335*t))−((E(−0.0405561643755*t))*E(+0.68313768074*E(+0.549850481378*E(−6.35620276975e−06*t))+E(+0.68313768074*E(+0.549850481378*((−0.120362874712*t))−20243.8982316))))))` |
| hGP Model 5 | `0.503856681046/(1+E(−(−4.93152601976*E(−297363.42521*E(−13.3313258294*E(−0.00150407130185*t))+E(−5.0614088398*E((1−((−16.026189607*t))−284.509095804)))))))` |


Forecasting performance—optimized diffusion models for OECD overall
| Model name | Model |
| --- | --- |
| Logistic | `0.277910816466/(1+E(−(−2.58069789463+0.040512555172*t)))` |
| Bass | `0.277911999052*(1−E(−(1.4531815579e−07+0.0405120087548)*(t−(−245.790644296))))/(1+(0.0405120087548/1.4531815579e−07)*E(−(1.4531815579e−07+0.0405120087548)*(t−(−245.790644296))))` |
| Gompertz with constant | `0.291419722502*E(−4.90258116522*E(−0.0259253838506*t))+0.0253819182253` |
| Gompertz | `0.395094721399*E(−3.35529164695*E(−0.0182708343745*t))` |


Forecasting performance—hGP models for Sweden
| Model name | Model |
| --- | --- |
| hGP Model 1 | `7.20927380763/(1+E(−(−43.9996049107*E(+2.70293611052−(−(+2.803498099−8.15837562235*E(−1.70915730227*E(−1.74720440508−(−(−1.20189076128−0.464224975147*E(−1.02239739844−(−(E(+0.0141394755816*t))*E(−5.74593580027*E(−0.254782583675*t))))))))))))))` |
| hGP Model 2 | `0.310010281135*E(−8.27016729973*E(−1.60830108009*E(−0.00327207897458−(−(E(−0.119609366795*t))*−0.614180508706−10774.1877418*E(−4.50941741619*E((E(−0.00432500059282*t))))))))` |
| hGP Model 3 | `0.309720977447*E(−3.32613006656*E(−1828787.41004*E(−21.0309802655*E(−0.00529622245859*t))−(−(−0.380077801407*0.0270286551485*t))*(E(E(E(−251239.306145*E(−13.3031814351*E(−0.00387849535444*t))))))))` |
| hGP Model 4 | `0.309960585568*E(−9.58732389121*E(−1.6194568498*E(+0.0756117709535−(−(E(−0.118396422714*t))*−0.553628776298−22454.3177439*E(−4.79688961086*E((E(−0.00396787850031*t))−3.14178858275e−05))))))` |
| hGP Model 5 | `0.309820654157*E(−14.6146178732*E(−1.72422423849*E(+0.218999757571−(−(E(−0.116856333856*t))*−0.430780829643−509200.484595*E(−5.9633868508*E((E(−0.00290035446778*t))))))))` |


Forecasting performance—optimized diffusion models for Sweden
| Model name | Model |
| --- | --- |
| Logistic | `0.375347548406/(1+E(−(−2.50124733322+0.0454566554106*t)))` |
| Bass | `0.375348107159*(1−E(−(4.05055161391e−08+0.0454564711241)*(t−(−251.439978251))))/(1+(0.0454564711241/4.05055161391e−08)*E(−(4.05055161391e−08+0.0454564711241)*(t−(−251.439978251))))` |
| Gompertz with constant | `0.370148682751*E(−5.03987923004*E(−0.0313473706267*t))+0.0375384813167` |
| Gompertz | `0.493581707395*E(−3.13558666165*E(−0.0214071552547*t))` |


Forecasting performance—hGP models for The Netherlands
| Model name | Model |
| --- | --- |
| hGP Model 1 | `0.253326242314*E(−25.4808891372*E(−0.0589928095313*t))*(E(+23.1207979455*E(−0.0648090808052*t))+0.521707030392)` |
| hGP Model 2 | `0.383595542499*(1−E(−(1.73831772618e−08+0.0617339056621)*(t−(−195.681455628))))/(1+(0.0617339056621/1.73831772618e−08)*E(−(1.73831772618e−08+0.0617339056621)*(t−(−195.681455628))))` |
| hGP Model 3 | `0.38359548724/(1+E(−(((−3.00265784135++0.0617339638782*t)))))` |
| hGP Model 4 | `0.383619520777/(1+E(−(+30.303671841−30.3139537846*E(−6.97443602173e−06*t))*(E(+5.67696636631))))` |
| hGP Model 5 | `−6.99832857574e−05/(1−1.00018242854*E(((+0.00367050894083*E(−0.0617182268173*t))−((0.00104546915744*t))*(E(−(−(−642.723570678−0.745098043066*t)))))))` |
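
The hGP expressions in these tables are printed as plain strings, so one way to check a model numerically is to evaluate the string directly. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes E stands for the exponential function, converts the typographic minus sign used in the tables to an ASCII hyphen, and evaluates hGP Model 1 for The Netherlands. The helper name `evaluate_model` and the constant `NL_HGP_MODEL_1` are hypothetical.

```python
import math


def evaluate_model(expr: str, t: float) -> float:
    """Evaluate a tabulated model string at time t.

    Assumptions: E means exp, t is the only variable, and the string comes
    from a trusted source (eval must never be used on untrusted input).
    """
    ascii_expr = expr.replace("\u2212", "-")  # typographic minus -> ASCII hyphen
    return eval(ascii_expr, {"__builtins__": {}}, {"E": math.exp, "t": t})


# hGP Model 1, forecasting performance, The Netherlands (copied from the table).
NL_HGP_MODEL_1 = (
    "0.253326242314*E(−25.4808891372*E(−0.0589928095313*t))"
    "*(E(+23.1207979455*E(−0.0648090808052*t))+0.521707030392)"
)

if __name__ == "__main__":
    for t in (0, 25, 50, 100, 200):
        print(t, round(evaluate_model(NL_HGP_MODEL_1, t), 4))
```

For large t both inner exponentials vanish, so this particular model levels off at 0.253326242314*(1+0.521707030392), which is close to the saturation level of the Logistic-type models in the same table.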


Forecasting performance—optimized diffusion models for The Netherlands
| Model name | Model |
| --- | --- |
| Logistic | `0.37898025831/(1+E(−(−3.01882999246+0.062714536225*t)))` |
| Bass | `0.378980333001*(1−E(−(2.34739032545e−08+0.0627144548701)*(t−(−187.825594969))))/(1+(0.0627144548701/2.34739032545e−08)*E(−(2.34739032545e−08+0.0627144548701)*(t−(−187.825594969))))` |
| Gompertz with constant | `0.368021712377*E(−6.9150399399*E(−0.0442055310107*t))+0.0308188707983` |
| Gompertz | `0.433015517173*E(−4.14875467224*E(−0.0340271367285*t))` |


Forecasting performance—hGP models for Denmark
| Model name | Model |
| --- | --- |
| hGP Model 1 | `0.370274249473/(1+E(−(−1.4932887728−1.6616878321−(0.0657816856919−0.0657829197159)*(t*(t−(−54744.3143422)−(+26.3542094658)*(t−46.1208987279)*(t+25.8109328343−(+47.3153899385−46.1208987279)*(t/(1+(−2.94897421702/+14.4559028944)*E(−(E(−(E(−(E(−(−69.3535656648+1.1508868672*t))))))))))))))))` |
| hGP Model 2 | `0.367675661705/(1+E(−(−1.54302066592−1.68154541948−(0.0657658786124−0.0657664779573)*(t*(t−(−56772.6811399)−(−57863.1392956−57863.1735902−(2053.24155389*(t++9.11367676744−(0.0657743518948−0.06575839253)*(t*(t−(−95622.1907084)−(+3.71296243976)*(t++153.381465941)*(t−69.4331806343)))))))))))` |
| hGP Model 3 | `0.368792284508/(1+E(−(−1.59162101334−1.73786706106−(0.0657680461973−0.0657550491287)*(t*(t−9961.43677147−(0.0655352138407−0.0654902811262)*(t*(t−(−(−(−0.417805111612)*(t−87094.1655681+161159.989637)*(t−115.359792771))))))))))` |
| hGP Model 4 | `0.368796568856/(1+E(−(−1.55630251856−1.77353320735−(0.0654733227916−0.0655131749375)*(t*(t−(−(−(−0.0997652183839)*(t+10381.0881184+11105.3767878++11089.8380866−(0.0661623277964−0.0667280200974)*(t*(t−(−(+4067.51578195++3972.08191893)*(t−118.562082811))))))))))))` |
| hGP Model 5 | `0.368796569031/(1+E(−(−1.59261997915−1.73721570272−(0.0654378132468−0.0655261015789)*(t*(t−(−(−(−726.874412652)+740.122972564−(0.0656174439567−0.0654501910722)*(t*(t−(−(148193.4147−(+601.176257868++624.503460915)*(t/(1+(−4.14155341813/5.62371847802)*E(−(E(−(−26.4655872285+0.152122244894*t)))))))))))))))))` |


Forecasting performance—optimized diffusion models for Denmark
| Model name | Model |
| --- | --- |
| Logistic | `0.405722295441/(1+E(−(−2.70897871256+0.0546356858504*t)))` |
| Bass | `0.412729366859*(1−E(−(0.00143820912913+0.0495453230285)*(t−(−20.4381091217))))/(1+(0.0495453230285/0.00143820912913)*E(−(0.00143820912913+0.0495453230285)*(t−(−20.4381091217))))` |
| Gompertz with constant | `0.433911203761*E(−4.34698049161*E(−0.033071306649*t))+0.0190975994121` |
| Gompertz | `0.476280140231*E(−3.53102816452*E(−0.0290174259708*t))` |

B.

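The MAPE is presented in (B.1). The sum is over the time period $t = 1, \dots, T$, $y_t$ is the raw data actual value for time $t$, and $\hat{y}_t$ is the estimated model's value:

$$\mathrm{MAPE} = \frac{1}{T}\sum_{t=1}^{T}\left|\frac{y_t - \hat{y}_t}{y_t}\right|. \tag{B.1}$$

MSE, MAE, and RMSE are presented in (B.2), (B.3), and (B.4), respectively:

$$\mathrm{MSE} = \frac{1}{T}\sum_{t=1}^{T}\left(y_t - \hat{y}_t\right)^2, \tag{B.2}$$

$$\mathrm{MAE} = \frac{1}{T}\sum_{t=1}^{T}\left|y_t - \hat{y}_t\right|, \tag{B.3}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(y_t - \hat{y}_t\right)^2}. \tag{B.4}$$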
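
The four error measures can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions. The function names are chosen for this example, and `mape` is assumed to return a fraction (multiply by 100 for a percentage), which may differ from the scaling used in the paper's tables.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%)."""
    return sum(abs((y - f) / y) for y, f in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((y - f) ** 2 for y, f in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(y - f) for y, f in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not taken from the OECD data.
    actual = [1.0, 2.0, 4.0]
    predicted = [1.1, 1.8, 4.4]
    print(mape(actual, predicted), mse(actual, predicted),
          mae(actual, predicted), rmse(actual, predicted))
```

Note that MAPE is undefined when an actual value is zero, so in practice the first observations of a penetration series may need to be excluded or handled separately.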

C.

In the fitting process, each chromosome is evaluated with the sum of squared error (SSE), as in

$$\mathrm{SSE} = \sum_{t=1}^{T}\left(y_t - \hat{y}_t\right)^2. \tag{C.1}$$

In (C.1), the sum is over the time period $t = 1, \dots, T$. Also, $y_t$ is the real data for time $t$, and $\hat{y}_t$ is the model's value [15].

In forecasting, the evaluation function corresponds to the weighted sum of squared error (wSSE) function, as in

$$\mathrm{wSSE} = \sum_{t=1}^{T} w_t\left(y_t - \hat{y}_t\right)^2. \tag{C.2}$$

In this function, a weight $w_t$ is used in order to focus on the time interval near the last observed (training) data.
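
The difference between the two evaluation functions can be illustrated with a short Python sketch. The exact weighting scheme is not reproduced in this text, so the linearly increasing weights produced by `linear_weights` are a hypothetical choice made only for this example. Any weights that grow toward the last training point would serve the same purpose.

```python
def sse(actual, predicted):
    """Sum of squared errors over the whole training period."""
    return sum((y - f) ** 2 for y, f in zip(actual, predicted))


def weighted_sse(actual, predicted, weights):
    """Weighted sum of squared errors; larger weights penalize errors more."""
    return sum(w * (y - f) ** 2 for w, y, f in zip(weights, actual, predicted))


def linear_weights(n):
    """Hypothetical weights rising linearly from 1/n (oldest) to 1 (most recent)."""
    return [(i + 1) / n for i in range(n)]


if __name__ == "__main__":
    # Toy numbers for illustration only.
    actual = [1.0, 2.0, 3.0]
    early_miss = [1.5, 2.0, 3.0]   # error on the oldest observation
    late_miss = [1.0, 2.0, 3.5]    # same-sized error on the newest observation
    w = linear_weights(len(actual))
    print(sse(actual, early_miss), sse(actual, late_miss))
    print(weighted_sse(actual, early_miss, w), weighted_sse(actual, late_miss, w))
```

Both candidates have the same SSE, but the weighted version ranks the one that misses the most recent observation as worse, which is the behavior a forecasting-oriented fitness function is meant to have.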

Acknowledgment

The authors thank Professor Yi-Kuei Lin, National Taiwan University of Science and Technology, Taiwan, for his constructive comments and suggestions, which helped to improve the quality of this paper.

References

  1. N. Meade and T. Islam, “Modelling and forecasting the diffusion of innovation – a 25-year review,” International Journal of Forecasting, vol. 22, no. 3, pp. 519–545, 2006.
  2. Z. Griliches, “Hybrid corn: an exploration in the economics of technological change,” Econometrica, vol. 25, no. 4, pp. 501–522, 1957.
  3. E. Mansfield, “Technical change and the rate of imitation,” Econometrica, vol. 29, pp. 741–766, 1961.
  4. F. M. Bass, “A new product growth for model consumer durables,” Management Science, vol. 50, no. 12, pp. 1825–1832, 2004.
  5. S. Konstantinos and S. Vasilios, “A new empirical model for short-term forecasting of the broadband penetration: a short research in Greece,” Modelling and Simulation in Engineering, vol. 2011, Article ID 798960, 10 pages, 2011.
  6. E. M. Rogers, Diffusion of Innovations, The Free Press, New York, NY, USA, 5th edition, 2003.
  7. J. H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, Mich, USA, 1975.
  8. J. R. Koza, “Genetic programming as a means for programming computers by natural selection,” Statistics and Computing, vol. 4, no. 2, pp. 87–112, 1994.
  9. W. Lee and H. Y. Kim, “Genetic algorithm implementation in Python,” in Proceedings of the 4th Annual ACIS International Conference on Computer and Information Science (ICIS '05), pp. 8–12, July 2006.
  10. C. Christodoulos, C. Michalakelis, and D. Varoutas, “On the combination of exponential smoothing and diffusion forecasts: an application to broadband diffusion in the OECD area,” Technological Forecasting and Social Change, vol. 78, no. 1, pp. 163–170, 2011.
  11. N. Meade and T. Islam, “Forecasting with growth curves: an empirical comparison,” International Journal of Forecasting, vol. 11, no. 2, pp. 199–215, 1995.
  12. K. Li, Z. Chen, Y. Li, and A. Zhou, “An application of genetic programming to economic forecasting,” in Current Trends in High Performance Computing and Its Applications, Proceedings of the International Conference on High Performance Computing and Applications, Part I, pp. 71–80, Springer, 2005.
  13. OECD Broadband portal, 2011, http://www.oecd.org/dataoecd/22/12/39574779.xls.
  14. OECD Broadband portal, 2011, http://www.oecd.org/dataoecd/22/13/39574788.xls.
  15. M. A. Kaboudan, “Forecasting with computer-evolved model specifications: a genetic programming application,” Computers and Operations Research, vol. 30, no. 11, pp. 1661–1681, 2003.

Copyright © 2012 Konstantinos Salpasaranis and Vasilios Stylianakis. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

