Abstract

Economic development in China requires a large amount of energy, but securing an adequate energy supply is a difficult problem. Meanwhile, environmental pollution caused by energy consumption urgently needs to be addressed. To adapt to China’s rapidly emerging economy, and building on existing policies, it is increasingly important to give more consideration to energy saving and environmental safety. Investigating China’s regional environmental efficiency and its determinants is therefore of key importance. To evaluate the environmental efficiency input in China, this study first selects several indexes of environmental efficiency and applies the Data Envelopment Analysis (DEA) method to measure the efficiency of input and output. Then, the relative indexes of environmental efficiency input are selected as the input variables and the efficiency value as the output variable. A backpropagation neural network is trained on these data to establish the prediction model, which achieves high prediction accuracy. The performance of the model can be improved by optimizing the indexes of environmental efficiency investment, adopting the latest data, and increasing the learning samples. This method is suitable not only for the evaluation of macro-environmental efficiency investment but also for enterprises in specific industries.

1. Introduction

Environmental efficiency investment is the material basis of scientific environmental efficiency and technological innovation. The quantity of environmental efficiency investment determines the scale of environmental efficiency activities [1]. It is an important goal of China’s environmental efficiency management in the new period to make better utilization of environmental efficiency resources, encourage the optimum allocation of environmental efficiency resources in the whole society, maximize the benefits of environmental efficiency investment, rely on scientific and technological progress to promote economic development, and realize the fundamental change of economic growth mode [2].

Environmental efficiency resources are scarce resources. The evaluation of environmental efficiency investment is conducive to optimizing the investment structure, saving environmental efficiency resources, and maximizing the benefits of limited environmental efficiency investment [3]. In recent years, the performance of environmental efficiency input and output has attracted growing attention. Song et al. [4] suggested a slack-based radial index to assess the environmental efficiency of different provinces in China over eight years and reported that the environmental efficiency of the eastern provinces is the highest, compared with the central region. Zhang and Choi [5] used DEA to investigate the environmental performance of different provinces in China, showing that the majority of the provinces have low energy efficiency and that environmental efficiency differs across regions. A hybrid model combining DEA and the Slacks-Based Measure (SBM) was presented in [6] to predict the environmental effects of different industries in China over four years; the authors concluded that the environmental efficiency of industries in China is low. Wu et al. [7] employed a two-phase DEA technique to assess environmental efficiency with unexpected output and completed research on 8 cities and 30 provinces of China. They validated their model against the obtained results and estimated the actual environmental situation in the eight cities. A model based on the meta-frontier nonradial Malmquist CO2 emission performance index (MNMCPI) was presented by Yao et al. [8]. They applied panel data of Chinese industries for three years (1998 to 2001) and predicted the change in the efficiency of CO2 emission in China and its important factors. They reported that the average emission of carbon dioxide industries in the eastern, western, and central regions dropped to 5.52% in total.

Castellet and Molinos-Senante [9] revealed that each waste pollutant produced by an industrial plant has a distinct impact on the environment, and they employed a weighted slacks-based measure model to evaluate the effectiveness of the plant. Giovanna and colleagues [10] applied the NDDF and analytic hierarchy process (AHP) methods to examine the environmental management performance of 95 sewage management plants. Feng and Chiu [11] utilized the DDF method to estimate the waste-water management competence of 29 cities and provinces in China from 2010 to 2015. They found that waste-water management was better in economically advanced regions and that waste-water management efficiency needs to be upgraded. Liu et al. [12] developed an improved DEA model to assess the biological efficiency of different cities in China; the outcomes show that economic development factors, such as environmental regulation, industrial organization, and technological innovation, have affected the environment of several regions. Zhang et al. [13] applied an improved SBM model to investigate the efficiency of environmental management of industrial water pollutants from 2012 to 2015. They found that the management efficiency for industrial water pollution varies, as does the degree of inequity between industrial output and industrial waste-water management performance.

All the existing environmental efficiency methods are based on the input and output of environmental efficiency. The evaluation index in these systems involves two kinds of indicators, input and output, which makes them a form of postevaluation. Therefore, even if poor performance is found, the result cannot be changed. In addition, many evaluation systems use absolute indicators, resulting in poor comparability across areas and units. To reduce the waste of environmental efficiency resources, we must find a technique that performs the evaluation at the stage of environmental efficiency investment. In this way, once problems are found, they can be adjusted in time, and postevaluation can be turned into precontrol. To evaluate and compare the input of environmental efficiency in different regions, this paper selects some indexes of environmental efficiency input and output in the provinces and cities of the Chinese mainland. We use data envelopment analysis to measure the efficiency value, then reselect the relative indexes that may affect efficiency as inputs and take the efficiency value as the output, use these as learning samples, and apply a BP neural network to train the model. On this basis, we predict the efficiency of environmental efficiency input and establish a new evaluation model for it.

The remaining sections of the paper are ordered as follows. In Section 2, the proposed environmental efficiency method and prediction model are presented. In Section 3, different variables and data are discussed. The results are explained in Section 4, and Section 5 concludes the manuscript.

2. Environmental Efficiency Method

2.1. Data Envelopment Analysis (DEA)

The DEA method is a system analysis method presented by Farrell [14]. It is based on the theory of relative efficiency evaluation and is an efficient method for evaluating the effectiveness of multi-index input and output. The actual input-output data points are multiplied by the calculated weights to obtain the efficiency scores. It is mainly used to evaluate the comparative effectiveness of similar units. The evaluation indexes can include unstructured factors from the humanities, economics, and social, psychological, and other fields. The dimensions of the indicators are generally different, and dimensionless indicators can also be used. DEA can estimate the efficient production frontier from a collection of input and output observations, which makes it a nonparametric statistical analysis. In addition, DEA can measure the suitability of the input scale of each decision-making unit (DMU) and provide the direction and degree by which each DMU should adjust its input scale.

DEA identifies which individual samples in a sample set are relatively efficient according to the sample data. When measuring the relative efficiency of several decision-making units, DEA focuses on the optimization of each decision-making unit, and the relative efficiency obtained is its maximum value, that is, the value most favorable to that decision-making unit. To assess the validity of a DMU, Charnes and Cooper [15] introduced the concept of the non-Archimedean infinitesimal, so that the simplex method of linear programming can be used to solve the model and judge the DMU in a single pass. When we use the constant returns to scale model to evaluate efficiency, we must assume that each DMU operates at the optimal production scale. Otherwise, the measured efficiency value includes the influence of the scale effect. To quantify the pure technical efficiency of a production unit, a variable returns to scale (BCC) model was proposed by Haidong et al. [16]. Under the variable returns to scale assumption, the production set is computed as follows:
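Assuming the standard variable returns to scale formulation (an assumption; the symbols $x_j$, $y_j$ for the input and output vectors of the $j$th of $n$ DMUs and $\lambda_j$ for the DMU weights are chosen here for illustration), the production set can be written as:

```latex
T_{VRS} = \Big\{ (x, y) \;:\; \sum_{j=1}^{n} \lambda_j x_j \le x,\;
                 \sum_{j=1}^{n} \lambda_j y_j \ge y,\;
                 \sum_{j=1}^{n} \lambda_j = 1,\; \lambda_j \ge 0 \Big\}
```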

The pure technical efficiency evaluation model, with slack variables SA and SB and the non-Archimedean infinitesimal ε, is as follows:
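In its standard input-oriented BCC form, such a model can be sketched as below. This is an assumption about the formulation: $(x_0, y_0)$ denotes the DMU under evaluation, $\lambda_j$ the weights on the $n$ DMUs, $e$ a vector of ones, and $S^{A}$ and $S^{B}$ are taken to be the input and output slacks, respectively.

```latex
\begin{aligned}
\min\ & \theta - \varepsilon \left( e^{T} S^{A} + e^{T} S^{B} \right) \\
\text{s.t.}\ & \sum_{j=1}^{n} \lambda_j x_j + S^{A} = \theta x_0, \\
& \sum_{j=1}^{n} \lambda_j y_j - S^{B} = y_0, \\
& \sum_{j=1}^{n} \lambda_j = 1, \\
& \lambda_j \ge 0,\ S^{A} \ge 0,\ S^{B} \ge 0 .
\end{aligned}
```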

If the optimal value of the problem is $\theta$, the results are as follows.
(1) If $\theta$ = 1 and SA = 0 and SB = 0, then DMU0 is efficient.
(2) If $\theta$ = 1 and SA or SB is nonzero, then DMU0 is weakly efficient.
(3) If $\theta$ < 1, then DMU0 is not efficient.
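The three cases can be expressed as a small decision rule. The sketch below is illustrative only: the function name `classify_dmu`, the numerical tolerance, and the representation of SA and SB as lists of slack values are assumptions, not part of the study.

```python
from typing import Sequence


def classify_dmu(theta: float, sa: Sequence[float], sb: Sequence[float],
                 tol: float = 1e-9) -> str:
    """Classify a DMU from its optimal efficiency score and slack values.

    theta : optimal efficiency score of the DMU (at most 1)
    sa, sb: slack values SA and SB at the optimum
    tol   : tolerance for floating-point comparisons (assumed value)
    """
    if theta < 1.0 - tol:
        return "not efficient"          # case (3)
    all_slacks_zero = all(abs(s) <= tol for s in list(sa) + list(sb))
    if all_slacks_zero:
        return "efficient"              # case (1)
    return "weakly efficient"           # case (2)
```

For instance, `classify_dmu(1.0, [0.3], [0.0])` falls into case (2), because the score is 1 but one slack is nonzero.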

The efficiency value calculated by the constant returns to scale (CRS) model comprises pure technical efficiency and scale efficiency, whereas the variable returns to scale (VRS) model examines only the pure technical efficiency of production units. Denoting the technical efficiency by $TE$, the pure technical efficiency by $PTE$, and the scale efficiency by $SE$, the relationship among these efficiencies is

$$TE = PTE \times SE$$

By executing the DEA models under CRS and VRS, respectively, the results $TE_{CRS}$ and $TE_{VRS}$ are obtained, and the scale efficiency follows as $SE = TE_{CRS} / TE_{VRS}$. When $TE_{CRS} = TE_{VRS}$, the scale efficiency of the production unit is 1, which shows that production is at the best scale. Otherwise, the production unit loses scale efficiency. There are two reasons for the loss of scale efficiency, namely, too large a scale and too small a scale. As calculated above, when $SE$ < 1, the two cases cannot be differentiated. That is, it is impossible to determine whether production is in the stage of increasing returns to scale or decreasing returns to scale, which reduces the usefulness of scale efficiency analysis. Therefore, Coellitj [17] proposed a nonincreasing returns to scale (NIRS) model (when the VRS constraint $\sum_j \lambda_j = 1$ on the DMU weights $\lambda_j$ is changed to $\sum_j \lambda_j \le 1$, it becomes the NIRS model).

When the production unit is scale inefficient ($SE$ < 1), $TE_{NIRS}$ and $TE_{VRS}$ can be compared to evaluate the stage of returns to scale:
(1) when $TE_{NIRS} = TE_{VRS}$, production is in the stage of diminishing returns to scale;
(2) when $TE_{NIRS} \ne TE_{VRS}$, production is in the stage of increasing returns to scale.
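To make the comparison of the three models concrete, the following sketch computes input-oriented CRS, VRS, and NIRS scores for the special case of one input and one output, where the frontier can be found by direct enumeration instead of a linear programming solver. The data set `DMUS` is made up for illustration, and all function names are hypothetical; the study itself uses multiple inputs and outputs and the DEAP software.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (input x, output y) of one DMU; x must be > 0

# Hypothetical single-input, single-output data, for illustration only.
DMUS: List[Point] = [(2.0, 1.0), (4.0, 4.0), (8.0, 6.0), (6.0, 3.0)]


def min_input_for_output(points: List[Point], y0: float) -> float:
    """Smallest input that yields at least y0 using convex combinations.

    With one input and one output, an optimal combination uses either a
    single DMU or two DMUs whose outputs bracket y0.
    """
    best = float("inf")
    for x, y in points:
        if y >= y0:
            best = min(best, x)
    for xa, ya in points:
        for xb, yb in points:
            if ya < y0 < yb:
                w = (y0 - ya) / (yb - ya)   # weight on the larger DMU
                best = min(best, xa + w * (xb - xa))
    return best


def efficiencies(points: List[Point], k: int) -> Tuple[float, float, float]:
    """Return (TE_CRS, TE_VRS, TE_NIRS) for the k-th DMU."""
    x0, y0 = points[k]
    te_crs = (y0 / x0) / max(y / x for x, y in points)
    te_vrs = min_input_for_output(points, y0) / x0
    # Weights summing to at most 1 are equivalent to adding the origin
    # as an extra "DMU" to the convex combinations.
    te_nirs = min_input_for_output(points + [(0.0, 0.0)], y0) / x0
    return te_crs, te_vrs, te_nirs


def returns_to_scale(te_crs: float, te_vrs: float, te_nirs: float,
                     tol: float = 1e-9) -> str:
    """Classify the returns-to-scale stage from the three scores."""
    if abs(te_crs - te_vrs) <= tol:
        return "constant"       # SE = 1, production is at the best scale
    if abs(te_nirs - te_vrs) <= tol:
        return "decreasing"     # scale too large
    return "increasing"         # scale too small
```

With this toy data, `efficiencies(DMUS, 2)` gives a CRS score of 0.75 and VRS and NIRS scores of 1, so the third unit is classified as operating under decreasing returns to scale with a scale efficiency of 0.75.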

2.2. Backward Propagation Neural Network

The backpropagation (BP) neural network is a feed-forward network comprising multiple layers. It is trained using the error backpropagation algorithm and is a widely used neural network algorithm. Its nonlinear adaptive information processing capability outperforms traditional artificial intelligence methods in intuitive cognition tasks [18,19], for example, in pattern recognition, speech recognition, and unstructured information processing. BP neural networks have the advantages of strong nonlinear approximation, parallel processing, self-learning, and fault tolerance, so they are widely used in the fields of neural expert systems, combinatorial optimization, and prediction. The BP network is a one-way propagation multi-layer forward network, which solves the problem of learning the connection weights of hidden units in a multi-layer network. The input data are forwarded from the input nodes to each hidden layer in turn, and then to the output nodes. The result of each layer's nodes affects the output of the next layer's nodes. To speed up the convergence of network training, the input vector can be standardized, and the initial value of each connection weight can be set. The neural network can approximate complex functions by composing simple nonlinear functions several times.

The basic mechanism of data processing in the BP network model is that the input signal “IP” is processed through the middle nodes (hidden layer points), and the output signal “OK” is generated through nonlinear transformation. Each training sample comprises the input vector “p” and the expected output “t.” The deviation between the network output “o” and the expected output “t” is reduced by modifying the connection weights between the input nodes and the hidden nodes, the connection weights “cw” between the hidden nodes and the output nodes, and the thresholds, so that the error decreases along the gradient direction.

After repeated learning and training, the network weights and thresholds that give the minimum error are determined, and the training is stopped. The trained BP neural network can then process input information within the input range by itself and produce, through nonlinear transformation, the output with the least error. The mathematical model of the BP neural network comprises the following components.

2.2.1. Transfer Function

It is a function representing the stimulation intensity that the input from the lower layer delivers to the node in the upper layer, also known as the stimulation (activation) function. Usually, it takes a sigmoid function with continuous values in (0,1):

$$f(x) = \frac{1}{1 + e^{-x}}$$

2.2.2. Error Computation

It is a function reflecting the error between the expected output and the output of the neural network. The output error of the jth node is computed as follows:

The total error is calculated using
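Under the usual squared-error convention (an assumption; the symbols $e_{kj}$, $E_k$, and $E$ are chosen here for illustration), with $t_{kj}$ the expected output and $o_{kj}$ the actual output of the $j$th output node for the $k$th sample, the node error and the total error can be written as:

```latex
e_{kj} = t_{kj} - o_{kj}, \qquad
E_k = \frac{1}{2} \sum_{j} e_{kj}^{2}, \qquad
E = \sum_{k} E_k
```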

The mathematical model of the middle layer node relates the output of the jth node, when the kth sample is presented to the middle layer, to the input of the jth node through the weights from the input layer to the middle layer. The mathematical model of the output node likewise relates the output of the jth node, when the kth sample is presented to the output layer, to the outputs of the middle layer through the weights from the middle layer.
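In a common textbook notation (an assumption; the symbols $p_{ki}$, $h_{kj}$, $o_{kj}$, $w_{ij}$, $v_{ij}$, $b_j$, and $c_j$ are illustrative), with $f$ the transfer function, $p_{ki}$ the $i$th input of the $k$th sample, $w_{ij}$ and $v_{ij}$ the input-to-middle and middle-to-output weights, and $b_j$, $c_j$ the thresholds, the two node models read:

```latex
\begin{aligned}
h_{kj} &= f\Big( \sum_{i} w_{ij}\, p_{ki} - b_j \Big)
  && \text{(middle layer node } j\text{)} \\
o_{kj} &= f\Big( \sum_{i} v_{ij}\, h_{ki} - c_j \Big)
  && \text{(output node } j\text{)}
\end{aligned}
```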

2.2.3. Weight Correction

The weight correction is computed as
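A standard gradient-descent form with a momentum term, given here as an assumption about the update rule, is shown below. The symbols are illustrative: $\eta$ is the learning rate, $\alpha$ the momentum coefficient, $n$ the iteration index, $t_{kj}$ and $o_{kj}$ the expected and actual outputs, $h_{kj}$ the middle layer outputs, $p_{ki}$ the inputs, and $w_{ij}$, $v_{ij}$ the input-to-middle and middle-to-output weights. The factors $o(1-o)$ and $h(1-h)$ are the derivatives of the sigmoid transfer function.

```latex
\begin{aligned}
\delta_{kj} &= (t_{kj} - o_{kj})\, o_{kj} (1 - o_{kj}), \\
\Delta v_{ij}(n+1) &= \eta\, \delta_{kj}\, h_{ki} + \alpha\, \Delta v_{ij}(n), \\
\delta^{h}_{kj} &= h_{kj} (1 - h_{kj}) \sum_{m} \delta_{km}\, v_{jm}, \\
\Delta w_{ij}(n+1) &= \eta\, \delta^{h}_{kj}\, p_{ki} + \alpha\, \Delta w_{ij}(n)
\end{aligned}
```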

The propagation process of the BP algorithm comprises two steps, namely, forward propagation and backward propagation. During forward propagation, the input samples are processed layer by layer from the input layer through the hidden units and are transmitted to the output layer after passing through all the hidden layers. In this layer-wise processing, the neurons of each layer affect only the state of the neurons in the next layer. In the output layer, the current output is compared with the expected output. If the current output differs from the expected output, the backpropagation process is started. In backpropagation, the error signal is transmitted back along the original forward propagation path, and the weight coefficients of the neurons in each hidden layer are altered to lessen the error.
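The two steps can be sketched in plain Python as follows. This is a minimal illustration, not the software used in the study: the training samples in `SAMPLES` are synthetic, the function names are hypothetical, thresholds are folded in as an extra weight on a constant input of 1, and the default sizes (8 hidden nodes, learning rate 0.1, momentum 0.1) are chosen only to mirror the settings reported in Section 4.2.

```python
import math
import random

# Synthetic samples (input vector p, expected output t), illustration only.
SAMPLES = [
    ([0.0, 0.0, 0.0, 0.0], [0.1]),
    ([1.0, 1.0, 1.0, 1.0], [0.9]),
    ([1.0, 0.0, 1.0, 0.0], [0.5]),
    ([0.0, 1.0, 0.0, 1.0], [0.5]),
]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    # One weight row per node; the last entry multiplies a constant 1.0.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward(layer, inputs):
    ext = inputs + [1.0]
    return [sigmoid(sum(w * v for w, v in zip(row, ext))) for row in layer]


def total_error(hidden, output, samples):
    # Sum over samples of half the squared output errors.
    err = 0.0
    for p, t in samples:
        o = forward(output, forward(hidden, p))
        err += 0.5 * sum((tj - oj) ** 2 for tj, oj in zip(t, o))
    return err


def train(samples, n_hidden=8, lr=0.1, momentum=0.1, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    hidden = init_layer(n_in, n_hidden, rng)
    output = init_layer(n_hidden, n_out, rng)
    prev_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    prev_o = [[0.0] * (n_hidden + 1) for _ in range(n_out)]
    for _ in range(epochs):
        for p, t in samples:
            # Forward propagation: input -> hidden -> output.
            h = forward(hidden, p)
            o = forward(output, h)
            # Backward propagation: output deltas, then hidden deltas
            # computed with the output weights before they are updated.
            d_o = [(tj - oj) * oj * (1 - oj) for tj, oj in zip(t, o)]
            d_h = [hj * (1 - hj) * sum(d_o[k] * output[k][j]
                                       for k in range(n_out))
                   for j, hj in enumerate(h)]
            for k in range(n_out):
                for j, v in enumerate(h + [1.0]):
                    step = lr * d_o[k] * v + momentum * prev_o[k][j]
                    output[k][j] += step
                    prev_o[k][j] = step
            for j in range(n_hidden):
                for i, v in enumerate(p + [1.0]):
                    step = lr * d_h[j] * v + momentum * prev_h[j][i]
                    hidden[j][i] += step
                    prev_h[j][i] = step
    return hidden, output
```

Calling `train(SAMPLES)` returns the hidden and output weight matrices; `total_error` can be evaluated before and after training (for example with `epochs=0` versus the default) to check that the error has decreased.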

3. Variables and Data

Because of the two-step analysis, the input-output variables used in this study are divided into two parts, as shown in Table 1. In the first part, the DEA analysis stage, scientific and technological funds, R&D funds, the input of scientists and engineers, and R&D scientists and engineers are selected as input variables, and technology market turnover, invention patents, utility model and design patents, SCI papers, EI papers, and ISTP papers are selected as output variables. It should be noted that, in the selection of output variables, the difficulty and potential economic impact of invention patents, utility models, and designs may differ, so the different kinds of patent data are kept separate. In addition, not all scientific and technological inputs are used to produce papers and patents; a considerable part is used for the construction of scientific environmental efficiency infrastructure, such as the manufacturing of large-scale equipment and instruments, network construction, experimental animal breeding, and scientific and technological literature procurement. The second part is the learning stage of the BP neural network. From the perspective of efficiency analysis, regions with low investment do not necessarily have low efficiency, so relative indicators must be used in the evaluation of environmental efficiency investment. Therefore, the number of scientific and technological personnel per 10,000 local residents, the proportion of R&D funds in environmental efficiency funds, the proportion of scientists and engineers in environmental efficiency personnel, the proportion of scientific and technological personnel, the proportion of scientific and technological personnel in environmental efficiency funds, and the proportion of scientific and technological personnel in environmental efficiency investment are selected. The proportion of R&D scientists and engineers in R&D personnel is also taken as an input variable [20].

The efficiency analysis value is taken as the output variable, and the BP neural network is then used for learning. Because there is a lag between the input and output of environmental efficiency, the lag must be considered carefully when selecting data. The authors in [21] adopted a four-year lag based on empirical estimation when calculating the degree of dependence on foreign technology. In time series analysis, lag variables usually span 1–3 periods. Based on the actual situation of environmental efficiency input and output, this study selected a lag of 3 periods according to empirical estimation. The input indexes are drawn from the China Environmental Efficiency data, and the output indexes from the 2006 environmental efficiency statistics of the national statistical information network. Tibet was omitted because some of its data were missing.
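The lag between input and output can be handled by pairing each output year with the inputs recorded a fixed number of periods earlier. The sketch below shows one way to do this; the function name `pair_with_lag` and the dictionary layout keyed by year are hypothetical, and the values in the usage note are placeholders, not data from the study.

```python
from typing import Dict, List, Tuple


def pair_with_lag(inputs_by_year: Dict[int, List[float]],
                  outputs_by_year: Dict[int, List[float]],
                  lag: int = 3) -> Dict[int, Tuple[List[float], List[float]]]:
    """Pair each output year with the inputs from `lag` periods earlier.

    Output years without a matching input year are skipped.
    """
    pairs = {}
    for year, out in sorted(outputs_by_year.items()):
        source_year = year - lag
        if source_year in inputs_by_year:
            pairs[year] = (inputs_by_year[source_year], out)
    return pairs
```

With a lag of 3, calling the function with inputs for 2003 and outputs for 2006 returns a single entry keyed by 2006 that holds the 2003 inputs next to the 2006 outputs.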

4. Empirical Results

4.1. Efficiency Analysis

DEA analysis was carried out using the DEAP 2.1 software, and the results are shown in Table 2. On the whole, the input-output efficiency of environmental efficiency in China is comparatively high, with an average value of 0.84. The environmental efficiency of the majority of the provinces has improved gradually in the context of rapid economic growth. Thirteen provinces and cities have achieved full efficiency, namely Beijing, Heilongjiang, Zhejiang, Tianjin, Hubei, Hunan, Shanghai, Hainan, Chongqing, Shaanxi, Gansu, Yunnan, and Xinjiang; they have obtained greater scientific and technological output with less scientific and technological input. This is owing to its special economic and political status in China. Five of these provinces are located in the eastern region, 3 in the central region, and 5 in the western region of China. There is no obvious gap among the eastern, central, and western regions in the input-output efficiency of environmental efficiency. The low efficiency of some areas indicates that their environmental efficiency input structure may not be reasonable and that they either do not produce the corresponding output or produce it slowly. Because efficiency is a relative measure, it reflects what can be achieved with the existing resources and institutional environment, so it has practical significance.

4.2. BP Neural Network Prediction

We employed the Alyuda NeuroIntelligence 2.2 neural network software in this study. The proposed BP neural network has four input nodes and only one output node, which represents the efficiency value. It is generally believed that the convergence property of the network is best when only 1 or 2 hidden layers are used. In this study, after a preliminary test, it was decided to use one hidden layer, that is, a three-layer network, to establish the nonlinear mapping between the environmental efficiency system and efficiency. In node selection, if the number of hidden layer nodes is too small, the capability of the network to extract information from the samples is weak, and it cannot adequately summarize and reflect the regularities in the training set. If there are too many hidden layer nodes, the network may memorize the irregular noise in the samples, which leads to overfitting and reduces its generalization ability. Because Alyuda NeuroIntelligence can suggest the number of hidden nodes, its default optimized value of 8 was selected. In the network training, the learning rate was set to 0.1 and the momentum to 0.1, and training was stable after 80000 iterations. To test the prediction accuracy of the model, the input data were fed back into the trained network as simulation values, and the BP neural network predictions were obtained. Only one province had an error above 5%; its error, 9.77%, was the maximum.

According to the neural network model, the efficiency of the environmental efficiency input structure in any region can be evaluated in advance. Once signs of low efficiency are found, we must immediately analyze the reasons and take measures to strengthen environmental efficiency management and financial supervision, in order to save resources and create more scientific environmental efficiency output.
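The accuracy check described above amounts to comparing each predicted efficiency value with its DEA value and flagging the cases that exceed a threshold. A minimal sketch follows, assuming the error is measured as a relative percentage; the function names and the region labels in the usage note are hypothetical and the numbers are not taken from Table 2.

```python
from typing import Dict, List


def relative_errors(actual: Dict[str, float],
                    predicted: Dict[str, float]) -> Dict[str, float]:
    """Absolute relative error in percent for each region.

    `actual` holds the DEA efficiency values (all nonzero) and
    `predicted` the network outputs for the same regions.
    """
    return {name: abs(predicted[name] - actual[name]) / actual[name] * 100.0
            for name in actual}


def flag_large_errors(errors: Dict[str, float],
                      threshold: float = 5.0) -> List[str]:
    """Names of regions whose error exceeds the threshold (in percent)."""
    return sorted(name for name, err in errors.items() if err > threshold)
```

For two placeholder regions with DEA values 1.0 and 0.8 and predictions 0.98 and 0.9, the errors are about 2% and 12.5%, so only the second region would be flagged at the 5% threshold.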

The evaluation of environmental efficiency input is a complex task. The method introduced in this study is only applicable to the macro evaluation of a region. In the specific application, the latest available data and historical data must be used to recalculate the efficiency value, and then use BP neural network for prediction. Due to the different characteristics of environmental efficiency input and output in different regions and the difficulty of quantitatively evaluating the external environmental efficiency output, this model needs to be further optimized by increasing some index data of environmental efficiency input and output and learning samples. The proposed method can only predict the possible output efficiency of regional environmental efficiency input from a macro perspective. Further analysis and diagnosis are needed to improve the environmental efficiency management and improve the environmental efficiency input-output performance. Since the present study is based on obtaining larger scientific and technological output with smaller scientific and technological input as the basic premise, it must be modified based on this method if social benefits are taken into account. This method can also be used to evaluate the enterprise’s environmental efficiency investment in a specific industry. As DEA is a relative analysis of efficiency, it does not examine the scale of environmental efficiency input and output of various provinces and cities. Therefore, its role must be treated dialectically, especially in the evaluation of microenterprises.

5. Conclusion

This study used the DEA model to evaluate the environmental efficiency of China from subregional, subprovincial, and overall perspectives. We evaluated the technical differences in regional environmental inefficiency. First, indexes of environmental efficiency are selected, and the DEA technique is applied to measure the efficiency of input and output. Then, the relative indexes of environmental efficiency input are used as the input variables and the efficiency value as the output variable. A machine learning technique, the backpropagation neural network, is used to develop the prediction model and achieve high prediction performance (error <8%). The performance of the model can be improved by enhancing the indexes of environmental efficiency investment, adopting the latest data, and increasing the learning samples. This method can be helpful in the evaluation of macro-environmental efficiency investment and of enterprises in specific industries.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 71673022), Beijing Social Science Foundation (No. 17LJB004), Fundamental Research Funds for the Central Universities (No. FRF-BR-19-006A), and Tsinghua University—Inditex Sustainable Development Fund (No. TISD201902).