#### Abstract

Accurate calculation of the power grid investment scale is an important task in power grid management and is essential to the efficient development of the grid. Because the data time series are short, the influencing factors are numerous, and power grid investment fluctuates widely, it is very difficult to calculate grid investment accurately. This paper first uses the hierarchical clustering analysis method to divide 23 provinces into four classes on the basis of fifteen power grid influencing factors and then uses Spearman’s rank-order correlation to identify five key influencing factors. It then uses the multiple linear regression (MLR) method to establish the regression relationship between the growth rate of the investment scale and the growth rates of GDP, permanent population, total social electricity consumption, installed power capacity of the operation area, and maximum power load, and the estimation error is corrected with an RBF neural network. Finally, the validity of the model is verified using data related to power grid investment. The calculation error indicates that the model is feasible and effective.

#### 1. Introduction

Power grid corporations are responsible for guaranteeing the sustainable development of the power grid and improving the reliability of power supply. Power grid investment is a kind of fixed asset investment and shares its general characteristics: it is affected by many factors, including GDP, interest rates, and other macroeconomic factors. However, because electricity cannot be stored, the power grid also has some specific characteristics, such as instantaneity and a network-based structure, so power grid investment must satisfy both transmission demand and network construction demand. As society develops, the power grid must also support the optimal allocation of energy resources, the grid connection of renewable resources, and environmental protection. As a result, power grid investment has many influencing factors and is difficult to forecast. Nevertheless, knowing the investment scale is very important for optimizing the planning and development work of a power grid corporation. In particular, China State Grid Corporation invested more than CNY (China Yuan) 2770 billion in power grid capital construction from 2005 to 2015, with an average growth rate of 14.29%. Provincial corporations’ investments differ greatly, ranging from CNY 1.12 billion to CNY 14.9 billion, and their average investment growth rates range from -1.2% to 19.38%. It is therefore very important to study forecasting models that can accurately calculate the investment of provincial corporations.

The remainder of this paper is organized as follows: the second part is the literature review, summarizing the existing research methods; the third part introduces the proposed method; the fourth part forecasts the provincial power grid corporations’ annual investments with the proposed method; the fifth part gives the main conclusions and policy suggestions.

#### 2. Literature Review

To study power grid investment forecasting models, forecasting models that are popular in other fields are useful references. Classical forecasting models such as the error correction model, the autoregressive integrated moving average model, and the Grey model are widely employed for forecasting in different fields. Investment prediction models have been studied extensively by researchers both domestically and internationally, with substantial results; the specific literature is summarized in Table 1.

#### 3. Model and Estimation Methods

##### 3.1. Research Route

The technical route is shown in Figure 1.

##### 3.2. Spearman’s Rank-Order Correlation

Spearman’s rank-order correlation is equivalent to Pearson’s product-moment correlation coefficient computed on the ranks of the data rather than the raw data; it is the nonparametric version of the Pearson product-moment correlation. Spearman’s correlation coefficient measures the strength of association between two ranked variables. Spearman’s correlation coefficient of two variables $x_i$ and $y_i$ ($i$ indexes the samples) can be calculated by formula (1) [17]:

$$\rho = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}}. \tag{1}$$

Here, $\rho$ is Spearman’s correlation, and $\bar{x}$ and $\bar{y}$ are the average values of the two variables $x_i$ and $y_i$. Spearman’s rank correlation yields a value $-1 \le \rho \le 1$. Higher absolute values of $\rho$ correspond to stronger correlations between the two variables. A positive value suggests a positive correlation, while a negative value represents a negative correlation.
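As an illustration of the definition above, the following minimal sketch computes Spearman’s correlation as a Pearson correlation on ranks, using only the Python standard library. The function names and the sample growth rates are hypothetical and are not taken from the paper’s data; tied values are given average ranks, which is a common convention assumed here.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical growth rates (%), for illustration only.
gdp_growth = [7.9, 7.8, 7.4, 7.0, 6.8]
investment_growth = [15.0, 14.2, 9.1, 10.3, 8.5]
rho = spearman(gdp_growth, investment_growth)
```

In a factor-screening setting such as the one in this paper, the same function would be applied to each candidate factor in turn, and the factors with the largest absolute coefficients would be retained.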

##### 3.3. Multiple Linear Regression Method

Multiple linear regression attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. Every value of the independent variable is associated with a value of the dependent variable. In general, the multiple regression model can be expressed as

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon, \tag{2}$$

where

$\beta_0$ = intercept,

$x_i$ = $i$th independent variable with effect on $y$,

$\beta_i$ = slope or regression coefficient of $x_i$, which can be obtained by the least-squares estimation method [18],

$\varepsilon$ = prediction error.
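The least-squares estimate mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors’ implementation: it solves the normal equations with a small Gaussian elimination routine from the standard library, the helper names (`solve_linear_system`, `fit_mlr`, `predict_mlr`) are hypothetical, and the data are synthetic with two explanatory variables rather than the five used in this paper.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_mlr(rows, targets):
    """Least-squares estimate of [beta_0, beta_1, ..., beta_n]."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 carries the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_mlr(beta, row):
    """Evaluate beta_0 + sum_i beta_i * x_i for one observation."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


# Synthetic data generated from y = 1 + 2*x1 - 0.5*x2 (no noise), for illustration.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0)]
targets = [2.0, 4.5, 4.5, 7.5, 7.0]
beta = fit_mlr(rows, targets)
```

Because the synthetic targets contain no noise, the fitted coefficients recover the generating values up to rounding; with real data the residual $\varepsilon$ would be nonzero.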

##### 3.4. RBF Neural Network

The radial basis function (RBF) neural network is a three-layer feedforward neural network model with good global approximation performance, and it is free from the local minima problem [19]. It is a multi-input, single-output system consisting of an input layer, a hidden layer, and an output layer. During data processing, the hidden layer performs nonlinear transforms for feature extraction, and the output layer gives a linear combination of the hidden-layer outputs using the output weights.

The transformation of the RBF neural network from the input space to the hidden layer space is nonlinear, and the transformation from the hidden layer space to the output layer space is linear. The RBF neural network has a simple structure, simple training, and fast learning convergence, and it can fit any nonlinear function, so it is widely used in time series analysis and forecasting [19–22].

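The RBF neural network commonly uses the Gaussian radial basis function as the activation function of the hidden layer neurons. The Gaussian radial basis function can be expressed as follows:

$$R(x_p - c_i) = \exp\left(-\frac{1}{2\sigma^2}\left\|x_p - c_i\right\|^2\right), \tag{3}$$

in which $x_p$ is the input to the hidden layer, $c_i$ is the center of the Gaussian function, $\|x_p - c_i\|$ is the distance between the input vector and the center of the Gaussian function, and $\sigma^2$ is the variance of the Gaussian function. The output of the network can be expressed as follows:

$$y_j = \sum_{i=1}^{h} w_{ij} \exp\left(-\frac{1}{2\sigma^2}\left\|x_p - c_i\right\|^2\right). \tag{4}$$

In the formula, $x_p$ is the $p$th input sample of the network, $p = 1, 2, \ldots, P$, where $P$ is the total number of samples; $i = 1, 2, \ldots, h$ indexes the hidden layer nodes, where $h$ is the number of hidden layer nodes; $w_{ij}$ is the weight from the $i$th hidden layer node to the $j$th output node; and $y_j$ is the actual output of the $j$th output node of the network corresponding to the input sample.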
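The forward pass described above can be sketched in a few lines. This is an illustrative sketch for a single output node with a shared width; the function names and the numeric values are hypothetical, and the weights are given by hand here rather than trained.

```python
import math


def gaussian_rbf(x, center, sigma):
    """Gaussian activation exp(-||x - c||^2 / (2 * sigma^2)) of one hidden node."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def rbf_output(x, centers, sigma, weights):
    """Weighted sum of the hidden activations for a single output node."""
    hidden = [gaussian_rbf(x, c, sigma) for c in centers]
    return sum(w * h for w, h in zip(weights, hidden))


# Hypothetical network with two hidden nodes and hand-picked weights.
centers = [[0.0, 0.0], [1.0, 0.0]]
weights = [2.0, 1.0]
y = rbf_output([0.0, 0.0], centers, sigma=1.0, weights=weights)
```

For the input placed exactly on the first center, the first hidden node outputs 1 and the second outputs $e^{-1/2}$, so the network output is $2 + e^{-1/2}$.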

There are three parameters that need to be solved in the RBF neural network model: the center $c_i$ of the radial basis function, the variance $\sigma^2$ of the radial basis function, and the weight $w_{ij}$ from the hidden layer to the output layer.

Depending on how the radial basis function centers are selected, there are many learning methods for the parameters $c_i$ and $\sigma$, such as the random center selection method, the self-organizing selection method, the supervised center selection method, and the orthogonal least-squares method [23]. The least-squares algorithm is also applied to train the output weight $w_{ij}$.
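One of the options listed above, random center selection followed by a least-squares fit of the output weights, can be sketched as follows. This is an assumption-laden illustration rather than the procedure used in this paper: the centers are drawn at random from the training samples, the shared width is set by the common heuristic $\sigma = d_{\max}/\sqrt{2h}$ (with $d_{\max}$ the largest distance between centers), and the weights come from the normal equations. All names and data are hypothetical, and at least two distinct centers are required so that $\sigma > 0$.

```python
import math
import random


def gaussian(x, center, sigma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def train_rbf(samples, targets, n_hidden, seed=0):
    """Random center selection, heuristic width, least-squares output weights."""
    rng = random.Random(seed)
    centers = rng.sample(samples, n_hidden)  # needs n_hidden >= 2 distinct points
    d_max = max(math.dist(p, q) for p in centers for q in centers)
    sigma = d_max / math.sqrt(2.0 * n_hidden)
    phi = [[gaussian(x, c, sigma) for c in centers] for x in samples]
    gram = [[sum(r[i] * r[j] for r in phi) for j in range(n_hidden)]
            for i in range(n_hidden)]
    rhs = [sum(r[i] * t for r, t in zip(phi, targets)) for i in range(n_hidden)]
    return centers, sigma, solve(gram, rhs)


def predict_rbf(x, centers, sigma, weights):
    return sum(w * gaussian(x, c, sigma) for w, c in zip(weights, centers))


# Hypothetical one-dimensional inputs and target errors, for illustration only.
inputs = [[0.0], [1.0], [2.0], [3.0]]
errors = [0.5, -0.2, 0.1, 0.4]
centers, sigma, weights = train_rbf(inputs, errors, n_hidden=4)
fitted = [predict_rbf(x, centers, sigma, weights) for x in inputs]
```

With as many hidden nodes as samples, as in this toy example, the network interpolates the targets; in practice fewer hidden nodes than samples would normally be used to avoid overfitting.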

##### 3.5. Hierarchical Clustering Analysis Method

Hierarchical clustering analysis method (also called hierarchical cluster analysis or HCA) is a method of cluster analysis which seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two types [24]:

*(1) Agglomerative (Bottom-Up) Clustering*

1. Start with each example in its own singleton cluster.
2. At each time-step, greedily merge the 2 most similar clusters.
3. Stop when there is a single cluster of all examples; otherwise go to step 2.

*(2) Divisive (Top-Down) Clustering*

1. Start with all examples in the same cluster.
2. At each time-step, remove the “outsiders” from the least cohesive cluster.
3. Stop when each example is in its own singleton cluster; otherwise go to step 2.

In this paper, we use the SPSS software to realize the clustering function.
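For readers without access to SPSS, the agglomerative strategy described above can be sketched in plain Python. This is only an illustration: the linkage criterion (average linkage with Euclidean distance) and the stopping rule (a target number of clusters instead of a single cluster) are assumptions made here, since the settings used in SPSS are not stated, and the points are hypothetical.

```python
import math


def agglomerative(points, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start from singleton clusters

    def linkage(a, b):
        # Mean pairwise Euclidean distance between the members of two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = [(p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        p, q = min(pairs, key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]  # greedy merge of the closest pair
        del clusters[q]
    return [sorted(c) for c in clusters]


# Hypothetical two-dimensional feature vectors, for illustration only.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.1), (10.0, 0.0)]
groups = agglomerative(points, n_clusters=3)
```

Each returned group lists the indices of the points it contains. In practice the feature vectors would usually be standardized first so that factors measured in different units contribute comparably to the distance.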

#### 4. Case Study

##### 4.1. Provincial Power Grid Investment Influencing Factors

Power grid investment demand is determined by various influencing factors, such as economic activity, social development, and the stability and reliability of power supply. In this paper, we divide the influencing factors into five categories: economic and social development, electricity transmission, power grid security, quality of electric energy, and generation grid connection. Each category consists of three specific factors that can be measured numerically, as depicted in Figure 2.

As shown in Figure 2, we choose 15 power grid investment influencing factors for analysis. In this section, we first collect provincial historical data on these influencing factors; the data are obtained from the China Statistical Yearbook, State Grid statistical information, and other sources. The descriptive statistics of these influencing factors for the 23 provinces from 2012 to 2014 are shown in Table 2.

Spearman’s rank-order correlation method is then employed to select the key influencing factors of power grid investment demand from the factors above. Because these key influencing factors have a significant impact on power grid investment demand, they are used to forecast the provincial power grid investment demand.

Here we calculate the Spearman rank correlation between each of the above influencing factors and power grid investment. The calculation results are shown in Table 3. They show that gross domestic product (GDP), population (POP), social electricity consumption (SEC), electric installed capacity (EIC), and peak load (PL) are the main influencing factors of electric grid investment (EGI). All of these key influencing factors are selected for power grid investment forecasting.

##### 4.2. Multiple Linear Regression (MLR) Method of 23 Provinces Power Grid Investment

Here we use the growth rates of power grid investment demand, GDP, POP, SEC, EIC, and PL to set up the multiple linear regression (MLR) models of the 23 provinces from 2004 to 2016. The independent variables of the MLR models are the growth rates of GDP, POP, SEC, EIC, and PL, while the dependent variable is the growth rate of power grid investment. The coefficients of the multiple linear regression models are shown in Table 4.
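The growth (increasing) rates that enter the regression can be derived from annual levels as year-over-year percentage changes. The short sketch below assumes that definition, which is not spelled out in the text, and uses hypothetical values.

```python
def growth_rates(series):
    """Year-over-year growth rates in percent for a list of annual levels."""
    return [(cur - prev) / prev * 100.0 for prev, cur in zip(series, series[1:])]


# Hypothetical annual investment levels, for illustration only.
investment = [100.0, 110.0, 99.0]
rates = growth_rates(investment)
```

Applying the same function to each factor and to the investment series gives the regressors and the response on a common, unit-free scale.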

##### 4.3. The Forecasting Error Correction by RBF Neural Network

In this part, we first collect the provincial power grid investment demand forecasting errors of the multiple linear regression (MLR) models. These errors are shown in Figure 3 (a total of 299 samples). All of these samples are used to train the RBF neural network model.

RBF neural network models are then employed to fit the provincial power grid investment demand forecasting errors of the MLR models. The inputs of the RBF models are GDP, POP, SEC, EIC, and PL, while the outputs of the RBF models are the corresponding forecasting errors of the MLR models.

Finally, the fitting results of the RBF models are used to correct the forecasting results of the MLR model. The MLR and MLR-RBF forecasting results, as well as the actual power grid investment of four typical provinces, are shown in Figures 4–7. From these figures we can conclude that the MLR-RBF model gives a small improvement over the single MLR model.

To further validate the accuracy of the proposed MLR-RBF model, the Grey model is used as a benchmark. The provincial power grid investment demand forecasting errors of the different forecasting models are shown in Table 5. The results show that the MLR-RBF model has the best forecasting performance and that the RBF error correction method improves the forecasting accuracy of the MLR model. In addition, both the MLR and MLR-RBF models outperform the Grey model in terms of average forecasting error. These results demonstrate that the proposed MLR-RBF model is a promising tool for provincial power grid investment demand forecasting.
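The correction and comparison steps described above can be sketched as follows. Two assumptions are made here that the text does not state: the MLR error is defined as the actual value minus the MLR forecast, so the RBF-fitted error is added back to the forecast, and accuracy is summarized by the mean absolute percentage error. The function names and numbers are hypothetical.

```python
def correct_forecasts(mlr_forecasts, predicted_errors):
    """Add the RBF-fitted error (actual minus MLR forecast) to each MLR forecast."""
    return [f + e for f, e in zip(mlr_forecasts, predicted_errors)]


def mean_abs_pct_error(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual) * 100.0


# Hypothetical values, for illustration only.
actual = [10.0, 12.0]
mlr = [9.0, 13.0]
rbf_errors = [0.8, -0.7]  # RBF estimates of (actual - MLR forecast)

mlr_rbf = correct_forecasts(mlr, rbf_errors)
mape_mlr = mean_abs_pct_error(actual, mlr)
mape_mlr_rbf = mean_abs_pct_error(actual, mlr_rbf)
```

If the opposite sign convention is used for the error, the predicted error would be subtracted instead of added.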

##### 4.4. The Power Grid Investment Forecasting Results and Analysis

In this section, we employ the proposed MLR-RBF model to forecast provincial power grid investment in 2018. Provincial historical data from 2004 to 2016 are used to estimate the model parameters. To forecast future provincial power grid investment, the values of the influencing factors (GDP, POP, SEC, EIC, and PL) for 2018 must first be determined, so ascertaining the input data of the MLR-RBF model is a key issue. In this paper, the provincial input variables for 2018 are obtained from the statistical bureaus and development plans of the relevant provinces. On this basis, three different scenarios are set to forecast the 2018 provincial power grid investment demand.

To simplify the forecasting process, we first employ the hierarchical clustering analysis (HCA) method to divide the 23 provinces into 4 categories; the clustering results are shown in Figure 8 and Table 6. Three different scenarios are then set for each category; the specific scenario settings are shown in Table 7. Finally, the 2018 provincial power grid investment demand forecasting results are obtained from the proposed MLR-RBF model and the scenario settings.
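One way to derive three scenarios per category is to take the minimum, mean, and maximum of each factor’s growth rate across the provinces in that category. The sketch below illustrates this idea only; the mapping of these statistics to "low", "base", and "high" scenarios, the function name, and the data are assumptions and do not reproduce Table 7.

```python
from statistics import mean


def scenario_inputs(growth_by_province, clusters):
    """For each cluster, return min/mean/max growth per factor across its provinces."""
    scenarios = {}
    for label, provinces in clusters.items():
        factors = growth_by_province[provinces[0]].keys()
        scenarios[label] = {
            factor: {
                "low": min(growth_by_province[p][factor] for p in provinces),
                "base": mean(growth_by_province[p][factor] for p in provinces),
                "high": max(growth_by_province[p][factor] for p in provinces),
            }
            for factor in factors
        }
    return scenarios


# Hypothetical growth rates (%) and cluster membership, for illustration only.
growth_by_province = {
    "A": {"GDP": 6.0, "PL": 4.0},
    "B": {"GDP": 8.0, "PL": 5.0},
    "C": {"GDP": 7.0, "PL": 3.0},
}
clusters = {"class_1": ["A", "B"], "class_2": ["C"]}
scenarios = scenario_inputs(growth_by_province, clusters)
```

Each scenario’s factor values would then be fed to the fitted forecasting model to obtain one forecast per scenario.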

The forecasting results of 2018 provincial power grid investment demand under the different scenarios are shown in Figure 9.

#### 5. Conclusion

This article uses Spearman’s rank-order correlation to identify the key influencing factors of power grid investment, namely GDP, POP, SEC, EIC, and PL. It then builds multiple linear regression models of power grid investment for the 23 provinces from the growth rates of power grid investment and of the five key influencing factors. RBF neural networks are then used to correct the forecasting errors. Finally, the 23 provinces are divided into four classes by the hierarchical clustering analysis method, considering the five key influencing factors. By accounting for regional differences in electric grid investment development, this paper quantifies the correlations between electric grid investment and its various influencing factors. The min, mean, and max values of the influencing factors enable scenario analysis, which makes the forecasting results more reasonable. Forecasting electric grid investment can help to improve project decision-making processes and to raise the level of management and decision-making for investment projects.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare no conflicts of interest.

#### Authors’ Contributions

Ersheng Pan contributed to the conception and design. Dong Peng collected and interpreted the data. Wangcheng Long and Yawei Xue analyzed the calculation results. Lang Zhao and Jinchao Li wrote and did the computation. All of the authors drafted and revised the manuscript together and approved its final publication.

#### Acknowledgments

This work has been supported by the National Key R&D Program of China (2016YFB0900100).