Abstract

The objective of this paper is to present a short study of the overall broadband penetration in Greece. A new empirical deterministic model is proposed for the short-term forecast of cumulative broadband adoption. The fitting performance of the model is compared with that of widely used diffusion models for the cumulative adoption of new telecommunication products, namely, the Logistic, Gompertz, Flexible Logistic (FLOG), Box-Cox, Richards, and Bass models. The models are fitted to official broadband penetration data for Greece. The comparison suggests that the empirical model yields satisfactory statistical indicators for both fitting and forecasting performance. The paper also stresses the need for further research and performance analysis of the model in other, more mature broadband markets.

1. Introduction

The diffusion of innovative products is the process by which an innovation is adopted by society. An innovative product, once it has been widely adopted, can be the basis for developing other innovations. Modern diffusion theory, which concerns the construction of diffusion models for product adoption, has been a topic of academic interest since the 1950s, and the literature in this area is extensive. This brief historical overview presents some important papers.

Most of the diffusion models used in market studies are S-shaped curves. S-shaped curves (sigmoids) describe the growth of various phenomena in physics, biology, and the social sciences [1]. The Logistic model was used by Griliches (1957) to explain the adoption of hybrid corn in the USA [2]. The linear form of the model was also used by Mansfield (1961) [3, 4]. A model derived from a study of plant growth is that of Richards (1959) [5, 6]. The theory of the diffusion of new products adopted by a social system was presented by Rogers (1962) [7].

Rogers distinguishes five categories of consumers according to how quickly they purchase new products: innovators, early adopters, early majority, late majority, and laggards. This study is a standard reference for diffusion theory. The theory of Rogers is a description of the life cycle of a new product [7].

The Gompertz forecasting model was proposed by Gregg, Hassel, and Richardson (1964) [8]. According to Chow (1967), the Gompertz model is better than the Logistic model in explaining computer demand [9, 10]. Bass (1969) proposed a growth model that accurately forecast peak color TV sales [11]. The Bass model introduces the coefficient of innovation into the diffusion equation, which makes it suitable for modeling product adoption [11, 12]. Bewley and Fiebig (1988) used the flexible logistic growth (FLOG) model, as well as the Box-Cox model, with applications in telecommunications [13, 14].

In conclusion, the objective of diffusion models is to describe and predict market trends [1].

2. Diffusion Models

In general, diffusion models are deterministic time functions, and they use historical data to estimate the parameters of the diffusion process of a product’s life cycle. These models are used to forecast the cumulative adoption of a new technology [15].

Diffusion models are divided into at least two categories, according to whether the saturation level is constant (nondynamic models) or changes over time (dynamic models).

The differential equation that describes the fundamental diffusion model is

$$\frac{dY(t)}{dt} = \delta(t)\,\left[S(t) - Y(t)\right], \tag{1}$$

where $S(t)$ is the estimated diffusion saturation level at time $t$, $Y(t)$ is the diffusion penetration, and the function $\delta(t)$ is the diffusion coefficient. In dynamic diffusion models, the saturation level depends on time. The difference $S(t) - Y(t)$ corresponds to the cumulative number of potential adopters, $P(t)$, at time $t$. So, (1) can be written as

$$\frac{dY(t)}{dt} = \delta(t)\,P(t). \tag{2}$$

In this paper, the formulations of the Gompertz, Logistic, Richards, Flexible Logistic (FLOG), and Box-Cox models are used, as well as that of the Bass model. A short review of these models follows.
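As an illustration of the fundamental diffusion equation, the following minimal sketch integrates it numerically with the forward Euler method. The function name `simulate_diffusion`, the constant coefficient 0.5, the constant saturation level 25, and the step size are all assumptions chosen for the example; they are not values from this study. With a constant coefficient and a constant saturation level, the equation has a simple exponential solution, which is used as a check.

```python
import math

def simulate_diffusion(delta, saturation, y0, t_end, dt=0.001):
    """Forward-Euler integration of dY/dt = delta(t) * (S(t) - Y(t)).

    delta and saturation are callables of time; y0 is the initial penetration.
    """
    steps = int(round(t_end / dt))
    y, t = y0, 0.0
    for _ in range(steps):
        potential = saturation(t) - y      # remaining potential adopters
        y += dt * delta(t) * potential     # growth proportional to that potential
        t += dt
    return y

# Hypothetical constant coefficient and saturation level (nondynamic case).
y_numeric = simulate_diffusion(lambda t: 0.5, lambda t: 25.0, y0=1.0, t_end=10.0)

# Closed-form solution for constant delta and S: Y(t) = S - (S - Y0) * exp(-delta * t)
y_exact = 25.0 - (25.0 - 1.0) * math.exp(-0.5 * 10.0)
```

A dynamic model is obtained in this sketch simply by passing a time-varying `saturation` callable.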

2.1. Gompertz Model

The Gompertz model is described by the following equations: where is the estimated diffusion level at time , and parameter is the saturation level [16]. The parameters that determine the model are , , . Parameter is related to the year that diffusion reaches the point of inflection, and parameter measures the “diffusion speed”. Parameter is a constant (4) and is called Gompertz with constant.
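The following sketch evaluates a Gompertz curve under an assumed, commonly used parameterization, Y(t) = c + S·exp(−exp(−a − b·t)). The symbol names `s`, `a`, `b`, and `c` and the numeric values are assumptions for illustration and may differ from the notation of equations (3) and (4). In this form the inflection point falls at t = −a/b, where the curve reaches S/e above the constant.

```python
import math

def gompertz(t, s, a, b, c=0.0):
    """Gompertz curve in an assumed common form: c + s * exp(-exp(-a - b*t)).

    s: saturation level, a: location parameter, b: diffusion speed,
    c: optional additive constant ("Gompertz with constant").
    """
    return c + s * math.exp(-math.exp(-a - b * t))

# Hypothetical parameter values, not estimates from the paper.
s, a, b = 25.0, -3.0, 0.3
t_inflection = -a / b                      # time at which -a - b*t = 0
level_at_inflection = gompertz(t_inflection, s, a, b)
```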

2.2. Logistic Model

The general form of the logistic model is given by where and constants. Various functions derive from the logistic model such as the following.

2.2.1. Fisher-Pry Model

When the parameters of the logistic model are and and time , the model is known as linear logistic or Fisher-Pry model. The linear logistic model has an inflection point that occurs when the diffusion level is the half of its saturation level.
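The half-saturation property of the inflection point can be checked numerically. The sketch below assumes the common form Y(t) = S / (1 + exp(−a − b·t)) for the linear logistic curve; the function name `fisher_pry` and the parameter values are hypothetical. It locates the steepest point of the curve on a grid and compares it with t = −a/b.

```python
import math

def fisher_pry(t, s, a, b):
    """Linear logistic (Fisher-Pry) curve in an assumed common form."""
    return s / (1.0 + math.exp(-a - b * t))

# Hypothetical parameter values, not estimates from the paper.
s, a, b = 25.0, -4.0, 0.4
t_inflection = -a / b

# Central-difference slope on a grid; the steepest point marks the inflection.
h = 1e-4
times = [i * 0.1 for i in range(201)]
slopes = [(fisher_pry(t + h, s, a, b) - fisher_pry(t - h, s, a, b)) / (2 * h)
          for t in times]
t_steepest = times[slopes.index(max(slopes))]
level_at_steepest = fisher_pry(t_steepest, s, a, b)
```

In this example `level_at_steepest` is close to `s / 2`, in line with the statement that the inflection occurs at half the saturation level.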

2.2.2. Flexible Logistic Model

The form of Flexible Logistic model (FLOG) derives from the logistic, where , and The FLOG model locates the point of inflection anywhere between its upper and lower bounds [4]. The parameters are and [16].

2.2.3. Box-Cox Model

The Box-Cox model comes from the logistic, where . The general logistic parameters are and [16].

2.3. Richards Model

The Richards’ growth model introduces an additional parameter to the linear logistic model [17]: where constants .

2.4. Bass Model

The Bass model includes two categories of adopters: the innovators, at the early stage of the diffusion, and the imitators, afterwards. The diffusion level at time $t$, $Y(t)$, is expressed as

$$Y(t) = S\,\frac{1 - e^{-(p+q)t}}{1 + \frac{q}{p}\,e^{-(p+q)t}},$$

where $S$ is the saturation level of adoption, $p$ is the innovation coefficient, and $q$ is the imitation coefficient [6, 11].
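A minimal sketch of the cumulative Bass curve in its commonly cited closed form is given here. The function name `bass` and the coefficient values are assumptions for illustration only; they are not estimates from this study.

```python
import math

def bass(t, s, p, q):
    """Cumulative Bass adoption: s * (1 - e^{-(p+q)t}) / (1 + (q/p) e^{-(p+q)t}).

    s: saturation level, p: innovation coefficient, q: imitation coefficient.
    """
    decay = math.exp(-(p + q) * t)
    return s * (1.0 - decay) / (1.0 + (q / p) * decay)

# Hypothetical coefficients: a small innovation effect and a stronger imitation effect.
s, p, q = 25.0, 0.01, 0.4
curve = [bass(t, s, p, q) for t in range(0, 41)]
```

The curve starts at zero adoption, grows slowly while innovators dominate, accelerates as imitation takes over, and approaches the saturation level `s`.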

3. The Empirical Diffusion Model

3.1. General Aspect of the Model

An empirical model for short-term forecasting of the broadband penetration is proposed, described by the following discrete equation of differences: where is the estimated penetration level at time . For the initial penetration is taken. function measures the “adoption rate” at time , given by where , and model’s time is given by: where is the current time (natural number without 0) of the observation, is the step between , and and (real numbers without 0).

Also, an equivalent mathematical expression in (9) that can be used is The continuous format of (9) is given by where, for , we take the initial diffusion level .

Regression analysis shows that, in many cases, the range of factor varies between values 2, 3. In the special case that , the solution of (13) gives: where represents the polylogarithm function of order and argument , defined by the infinite series In this paper, the discrete format of this model is used, described in (9) or (12), (10), and (11).
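The polylogarithm mentioned above can be approximated directly from its defining series, Li_s(z) = Σ_{k≥1} z^k / k^s, which converges for |z| < 1. The sketch below truncates the series; the function name `polylog` and the number of terms are assumptions for illustration. Two known closed forms serve as checks: Li_1(z) = −ln(1 − z) and Li_2(1/2) = π²/12 − (ln 2)²/2.

```python
import math

def polylog(order, z, terms=200):
    """Truncated series for the polylogarithm Li_order(z), valid for |z| < 1."""
    if abs(z) >= 1.0:
        raise ValueError("this truncated series is only used for |z| < 1")
    return sum(z ** k / k ** order for k in range(1, terms + 1))

li1_half = polylog(1, 0.5)   # compare with -ln(1 - 0.5) = ln 2
li2_half = polylog(2, 0.5)   # compare with pi^2/12 - (ln 2)^2 / 2
```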

3.2. Model’s Analysis

The empirical model uses the formalism of Planck’s model for black body emission to introduce the parameters in macroscopic level. The parameters of the adoption rate model into (10) are , , , and . It belongs to the dynamic diffusion type of models, meaning that the cumulative number of potential adopters is changing against time. It follows that the estimation of the saturation level is time dependant. Parameter is a scaling factor of the model that corresponds to the initial broadband market size (potential adopters).
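The general mechanics of a difference-equation model with a Planck-shaped adoption rate can be sketched as follows. This is not the paper's equation (10): the recursion Y_t = Y_{t−1} + f(t), the rate shape scale·t^exponent / (exp(t/width) − 1), and the names `planck_like_rate`, `cumulative_penetration`, `scale`, `exponent`, and `width` are all assumptions made only to illustrate how such a rate rises, peaks, and decays while the cumulative level keeps growing.

```python
import math

def planck_like_rate(t, scale, exponent, width):
    """Hypothetical Planck-shaped adoption rate (an assumption, not equation (10))."""
    return scale * t ** exponent / math.expm1(t / width)

def cumulative_penetration(rate, y0, periods):
    """Accumulate a per-period adoption rate: Y_t = Y_{t-1} + rate(t), Y_0 = y0."""
    levels = [y0]
    for t in range(1, periods + 1):
        levels.append(levels[-1] + rate(t))
    return levels

# Illustrative values only; they are not fitted to any dataset.
levels = cumulative_penetration(
    lambda t: planck_like_rate(t, scale=0.05, exponent=2.5, width=6.0),
    y0=0.1,
    periods=40,
)
```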

It should be noted that the penetration growth rate is proportional to the number of potential adopters and to the likelihood of the fact that total number of adopters is for time . This can be written in discrete format as where

Exponent relates to the behavior of the time function which describes the total number of potential adopters. According to the empirical model, is given by The factor ties up with the speed of the cumulative adoption and the saturation level of the diffusion model. Specifically, the factor is related to the probability distribution of the model. The discrete probability of total potential adopters at time is presented in

Equation (18) is the discrete resultant of the reasonable assumption that the rate of probability in time will be in negative proportion to the probability and factor [11]. The continuous format of the differential equation which describes this assumption is given by The normalized discrete probability function for time is given in The factor is the sum of the geometric series .

So, (20) is equivalent to Replacing (21) in (16) yields which is equivalent to (10).

Finally, by replacing (22) in (12), the following is obtained: The difference is a constant number and equal to . So, (23) becomes which is equivalent to (12).

Consequently, parameter relates to the time period of the growth as well as to the slope of the adoption rate curve (Figure 1). Figure 2 shows the influence of the model parameters on its performance.

4. Market Dataset

4.1. Market Overview

The National Telecommunications and Post Commission (EETT) is the regulatory authority in Greece. EETT's objective is the development of the broadband market in Greece, while at the same time enforcing fair competition and consumer protection [18]. Digital Subscriber Line (DSL) technology is predominant in the fixed broadband market. DSL coverage is reported as a percentage of the population. According to the European Commission, DSL availability covers 100% of urban areas, while in rural areas it is 50% [19]. According to the OECD Broadband Portal December 2008 data [20], 95% of fixed broadband connections in Greece are based on xDSL technologies. Broadband cable access is not available in Greece. Mobile broadband is at an early growth stage, lower than the EU average [19–24]. It should be noted that EETT's data refer only to fixed broadband connections, without taking into account the penetration of mobile broadband networks.

4.2. Dataset

The actual data used in this short analysis concern the quarterly total broadband connections and the number of broadband connections per 100 inhabitants in Greece, from December 2004 until June 2010 (see Figure 3) [25]. Figure 3 is based on data from the National Telecommunications and Post Commission (EETT) [21–23], as well as data from the Observatory for the Greek Information Society [24].

5. Fitting and Forecasting Method

Regression analysis is used to fit sampled data to a model. Curve fitting is done using the ordinary least squares (OLS) method [12]. The objective of the OLS method is to minimize the sum of squared errors (SSE) between the data points $y_i$ and the model-evaluated points $f(t_i; \boldsymbol{\theta})$:

$$\mathrm{SSE} = \sum_{i=1}^{n} \left[ y_i - f(t_i; \boldsymbol{\theta}) \right]^2,$$

where $f$ is a time function with a set of parameters $\boldsymbol{\theta}$, and $n$ is the number of known data points [12].
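The least squares objective can be illustrated with a small sketch. The data here are synthetic and noise-free, generated from a logistic curve with known parameters; they are not the EETT data. The crude one-dimensional grid search is an assumption made to keep the example dependency-free and is not the estimation procedure used in this study. The helper names `sse` and `logistic` are hypothetical.

```python
import math

def sse(model, params, times, observed):
    """Sum of squared errors between observations and model values."""
    return sum((y - model(t, *params)) ** 2 for t, y in zip(times, observed))

def logistic(t, s, a, b):
    """Logistic curve in an assumed common form, used only to generate test data."""
    return s / (1.0 + math.exp(-a - b * t))

# Synthetic observations from known parameters (illustrative, not real data).
true_params = (20.0, -4.0, 0.35)
times = list(range(1, 24))
observed = [logistic(t, *true_params) for t in times]

# Grid search over the speed parameter b only, holding s and a at their true values.
candidates = [0.20 + 0.01 * i for i in range(31)]
best_b = min(candidates, key=lambda b: sse(logistic, (20.0, -4.0, b), times, observed))
```

In practice all parameters are estimated jointly with a nonlinear least squares routine; the grid search only shows that the SSE is smallest at the generating value.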

In forecasting, parameter estimation is usually focused on the time interval near the last observed data points. Thus, the weighted least squares (WLS) method is used [12]. The WLS method minimizes the weighted sum of squared errors between the points of the dataset $y_i$ and the model-estimated points $f(t_i; \boldsymbol{\theta})$:

$$\mathrm{SSE}_{\mathrm{WLS}} = \sum_{i=1}^{n} w_i \left[ y_i - f(t_i; \boldsymbol{\theta}) \right]^2, \qquad w_i = \frac{1}{\sigma_i^2},$$

where $w_i$ is the weight and $\sigma_i^2$ is the variance of the $i$th observation [12].
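A minimal sketch of the weighted objective follows, assuming the weights are the inverse variances of the observations. The function name `weighted_sse` and the numbers are hypothetical. Assigning a smaller variance to a recent observation makes its error count more heavily, which is one way to focus the fit on the latest data.

```python
def weighted_sse(observed, fitted, variances):
    """Weighted sum of squared errors with weights w_i = 1 / variance_i."""
    return sum((y - f) ** 2 / v for y, f, v in zip(observed, fitted, variances))

# Hypothetical values: the last point has a smaller variance, so a larger weight.
observed = [1.0, 2.0, 3.0]
fitted = [1.5, 2.0, 2.0]
equal = weighted_sse(observed, fitted, [1.0, 1.0, 1.0])       # plain SSE
recent_heavy = weighted_sse(observed, fitted, [1.0, 1.0, 0.25])
```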

6. Results

6.1. Fitting Results

Table 1 sorts the fitted curves by the sum of squared errors (SSE).

The forecasting model shows the best fitting performance. It should be noted that the statistical indices concern the whole dataset. The fitting performance of the forecasting model improves on any subset of the dataset, especially on the latest data. The Mean Absolute Error (MAE), the Root Mean Squared Error (RMSE), and the Mean Absolute Percentage Error (MAPE) are statistics for evaluating the overall quality of a regression model [26]. The MAE is the average of the absolute values of the residuals. The MAE of the empirical model is the smallest among all the models, and this reduction is observed mainly in the latest data. The MAE is similar to the RMSE but less sensitive to large errors. The RMSE, which is the square root of the average squared distance of a data point from the fitted line, is also the smallest for the empirical model. The MAPE is the sum of the absolute errors divided by the sum of the actual observed values, expressed as a percentage. The MAPE indicator of the empirical model is also the smallest. Figure 4 presents the errors of the models over time, namely, the residuals per model.
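The three indicators can be computed as in the following sketch, which follows the definitions given in the text. In particular, MAPE is implemented as described here, that is, the summed absolute error divided by the summed actual values; this differs from the more common per-observation average of percentage errors. The function names and the sample numbers are hypothetical.

```python
import math

def mae(actual, fitted):
    """Mean absolute error: average of the absolute residuals."""
    return sum(abs(y - f) for y, f in zip(actual, fitted)) / len(actual)

def rmse(actual, fitted):
    """Root mean squared error: square root of the average squared residual."""
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(actual, fitted)) / len(actual))

def mape(actual, fitted):
    """MAPE as described in the text: 100 * sum|error| / sum|actual|."""
    total_error = sum(abs(y - f) for y, f in zip(actual, fitted))
    return 100.0 * total_error / sum(abs(y) for y in actual)

# Hypothetical values, not taken from Table 1.
actual = [10.0, 20.0, 30.0]
fitted = [12.0, 18.0, 33.0]
scores = (mae(actual, fitted), rmse(actual, fitted), mape(actual, fitted))
```

Because squaring emphasizes large residuals, the RMSE is never smaller than the MAE for the same residuals.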

It can be seen that the MAE of the forecasting model is the smallest. In addition, the Gompertz (with constant), Gompertz, and Richards models perform well. This performance improves on the time interval near the last observed data points. Figure 5 shows the fitted model curves over time, from December 2004 until June 2010.

The overall fitting performance of the models is satisfactory, as is the correlation coefficient between the fitted and actual data. This coefficient varies between 0.999285392 (forecasting model) and 0.998073643 (logistic model). When the parameters of the models are calculated by the OLS regression method, the R-squared (R2) indicator, which is the square of the correlation coefficient, depends on the number of parameters. The indicator ranges between 0.99897498 (forecasting model) and 0.99696321 (logistic model).

It can be observed that the models with the best behaviour are the forecasting model, the Gompertz (with constant), the Gompertz, Richards’s growth model, and the Box Cox-FLOG.

6.2. Forecasting Results

The parameters are estimated by means of regression analysis (WLS method). The dataset consists of 23 data points (December 2004 until June 2010). The forecasting period is one year, from June 2009 until June 2010 (after the 19th data point). The statistical indices concerning the forecasting performance of the models are presented in Table 2.
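The split between the estimation and forecasting periods can be reproduced with a short sketch. It builds the 23 quarterly labels from December 2004 to June 2010 and separates the first 19 points from the remaining ones. The function name `quarter_labels` and the label format are assumptions for illustration.

```python
def quarter_labels(start_year, start_month, count):
    """Return 'YYYY-MM' labels for consecutive quarters starting at the given month."""
    labels = []
    year, month = start_year, start_month
    for _ in range(count):
        labels.append(f"{year}-{month:02d}")
        month += 3
        if month > 12:          # roll over into the next calendar year
            month -= 12
            year += 1
    return labels

labels = quarter_labels(2004, 12, 23)
estimation_period = labels[:19]   # up to and including the 19th data point
forecast_period = labels[19:]     # held-out points used to score the forecast
```

The 19th label is June 2009, so the held-out year contains the four quarterly observations from September 2009 to June 2010.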

In the literature, MAPE is considered a reliable indicator for evaluating the prediction performance of models. The proposed model performs better than the others. Specifically, the empirical model achieves satisfactory indicators (MAPE, MAE, and MSE). In general, the models that achieve good indicators, especially MAPE, belong to the Richards and Gompertz families.

In Figure 6, the aforementioned performance is graphically presented, for a forecasting period of one year.

6.3. Future Trends

The performance of the models using the dataset (23 data points) is presented here. The WLS method was also applied here. The parameters of the forecasting model are presented in Table 3.

According to the MAPE indicator, the forecasting curves are sorted in Table 4.

According to the MAPE indicator, the forecasting model, Gompertz with constant, Richards, and Gompertz perform better than the others. According to SSEWLS, the models with the best performance are the forecasting model, FLOG, Logistic (with constant), and Bass. The estimate of our model for the broadband penetration in Greece in June 2012 (two years ahead) is approximately 22.09%. The most optimistic estimate comes from the Gompertz model (BB penetration 22.34%), and the most pessimistic from the Logistic model, with 19.46%. For a forecasting period of one year, the empirical model estimates the BB penetration at 20.73%, Gompertz at 20.85%, Gompertz (with constant) at 20.75%, and Richards at 20.34% (Figure 7).

7. Future Implementation of the Empirical Model

As noted earlier, the adoption of an innovative technology by society follows a sigmoid curve. The empirical model could therefore be applied to general-purpose forecasting of any time series that follows a sigmoid curve. Applying it to different markets would yield, through regression analysis, different parameter estimates.

The time horizon of the forecast depends on the density of the dataset. In general, the reliability of diffusion models depends on the number of time-series data points. This principle also governs the empirical model, so its application should be chosen accordingly.

Specifically, the coverage of FTTH (Fiber To The Home) technology is growing rapidly, according to the OECD [27]. A suggested future application of the model, once enough historical data are available, would be to forecast FTTH penetration in a country or in a geographical region (e.g., Europe, Asia, America).

8. Conclusion

In this paper, a new short-term forecasting model for overall broadband penetration was introduced. The forecasting model fitted the observed data well. The residuals, as well as the RMSE, MAE, and MAPE indicators of the empirical model, were satisfactory. The statistical indicators of the forecasting behaviour of the model, namely, MAE, RMSE, and MAPE, were also satisfactory.

Future studies should consider the performance of the empirical model for broadband penetration in different markets. Comparing the behaviour of the proposed model across markets can offer a better estimation of the model's parameters and of their correlation with financial and social indices.

Acknowledgment

The authors wish to express their gratitude to Professor Luis Carlos Rabelo, University of Central Florida, USA, for his constructive comments and suggestions, which helped to improve the quality of this paper.