Nature-Inspired Algorithms and Applications: Selected Papers from CIS2013
Research Article | Open Access
Research and Application of Improved AGP Algorithm for Structural Optimization Based on Feedforward Neural Networks
This paper improves the adaptive growing and pruning (AGP) algorithm. Network pruning is based on the sigmoidal activation value of each node and all the weights of its outgoing connections; insignificant nodes are pruned directly, but nodes that have internal relations are not removed. Network growing is based on the idea of variance: nodes with high correlation are copied directly. The resulting improved AGP algorithm (IAGP) improves network performance and efficiency. Simulation results show that, compared with the AGP algorithm, IAGP can predict traffic capacity quickly and accurately.
Artificial neural networks have been widely applied in data mining, web mining, multimedia data processing, and bioinformatics. The success of an artificial neural network is largely determined by its structure. Optimizing the network structure is usually a trial-and-error process that uses a growing or a pruning method. Many algorithms, such as AGP, instead employ a hybrid of the two to optimize the network structure.
Generally speaking, methods for optimizing neural network structure fall into three groups: growing methods, pruning methods, and hybrids of the two strategies. The first is also known as the constructive method: starting from a minimal network, it adds new hidden units while training the network on the data. For example, the grow-when-required (GWR) algorithm of Marsland adds hidden nodes based on the network performance requirements. The disadvantages of growing methods are that the small initial network can easily overfit and become trapped in local minima, and that growing may increase the training time.
The second method is called the destructive method; it deletes unimportant nodes or weights from an originally large network. Lauret et al. put forward an extended Fourier amplitude sensitivity algorithm to prune hidden neurons. This algorithm quantifies the relevance of the neurons in the hidden layer and ranks them; it then iteratively retains the most favorable neurons using this quantitative information and prunes the nodes that rank lowest. This method, however, assumes that the inputs and outputs of the hidden neurons are independent; when there are dependencies between them, the method is invalid. Xu and Ho describe a UB-OBS pruning algorithm that prunes the hidden units of a feedforward neural network. It uses an orthogonal decomposition method to determine the hidden nodes that need pruning and then recalculates the weights of the remaining nodes to maintain the network performance. The biggest drawback of pruning methods is the need to determine the size of the initial network.
Using a growing or a pruning algorithm alone leads to the problems above, so hybrid algorithms that combine growing and pruning have been proposed. A hybrid algorithm does not need to determine the size of the initial network in advance and avoids overfitting. It lets the two kinds of algorithms complement each other, amplifying their respective advantages and reducing their disadvantages. AGP is one such hybrid growing and pruning algorithm. In its structural design, the algorithm uses the sigmoidal activation value of each node to adjust the neural network by pruning neurons of little value, merging similar neurons, and adding neurons where needed, so it can adjust the network structure self-adaptively. In recent decades, structure optimization algorithms for neural networks have received extensive attention [10–17]. AGP can be applied to nonlinear function approximation problems, but it requires many iterations and complex calculations, and its thresholds and parameters must be set and adjusted frequently.
Therefore, structure optimization algorithms for feedforward neural networks still have much room for improvement, and this paper presents IAGP. Network pruning is based on the sigmoidal activation value of each node and all the weights of its outgoing connections. Network growing is based on the idea of variance: nodes with high correlation are copied directly. IAGP can optimize the network structure rapidly, accurately, and self-adaptively.
Finally, IAGP is applied to nonlinear function approximation and to the prediction of traffic capacity, and simulation results show the effectiveness of the improved AGP algorithm.
The AGP algorithm adjusts the network structure self-adaptively. First, it creates an initial feedforward neural network and trains it with the BP algorithm until it reaches the target error. If the target is not reached, it calculates the sigmoidal activation value of each node, prunes all the insignificant neurons, and merges similar neurons to simplify the network. If, after a certain amount of further training, the network still does not reach the target accuracy, nodes are added based on the idea of cell division, which ensures that the added node is the best candidate and preserves the correlation between the two nodes. The network is then retrained. If the classification accuracy of the network falls below an acceptable level, training stops; otherwise, training continues.
In order to improve network performance and efficiency, IAGP was presented in this paper. First, the algorithm creates an initial network based on the actual problem. Here we assume that the initial network is a fully connected multilayer feedforward neural network with layers, as shown in Figure 1.
In each th layer, let be the number of neurons where . Here we let the first layer 0 be an input layer, let the layers between 0 and be hidden layers, and let the last layer be an output layer. The th input neuron of 0th layer is , , and the th input neuron’s bias value is always equal to 1. Let be the number of patterns, in a dataset, and the value of the th input neurons of th pattern is . Among layers in a network, the th neuron of th hidden layer is , where and . The weight between input neuron and hidden neuron is , . The weight between a neuron and a neuron is , , and the initial weights generally take a random value between −1 and 1.
The activation value of neuron is , and the activation value of neuron is . Here let be the output of the th neuron in output layer , where . BP algorithm is adopted here, and and can be written aswhere , ; based on the above, we can get the output :where .
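As an illustration of the forward computation described above, the following is a minimal sketch for a network with one hidden layer. It assumes sigmoid activations in both the hidden and the output layer and a bias input fixed at 1; the names `sigmoid`, `forward`, `w_in`, and `w_out` are hypothetical and do not come from the paper.

```python
import math


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_in, w_out):
    """Forward pass of a single-hidden-layer feedforward network.

    x     : list of input values (bias not included)
    w_in  : w_in[j] holds the weights into hidden neuron j; the last
            entry multiplies the bias input, which is fixed at 1
    w_out : w_out[k] holds the weights from every hidden neuron
            into output neuron k
    Returns (hidden activations, output activations).
    Assumption: sigmoid activations are used in both layers.
    """
    x_b = list(x) + [1.0]  # append the constant bias input
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x_b))) for row in w_in]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, output
```

With all weights set to zero every neuron outputs 0.5, which is a quick sanity check of the wiring.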
Here the value mean squared error is ; we know the dataset with objects and the desired known target value ; and we can use BP algorithm to train the dataset. The total net value of the neuron can be written asThen the significance measure can be expressed aswhere , , and .
According to the above formula, the significance measure of a neuron is computed by aggregating, over all the patterns, its sigmoidal activation value combined with the weights of all its outgoing connections.
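The description above can be sketched in code under one explicit assumption: the total net value of a hidden neuron for a pattern is taken as its activation multiplied by each outgoing weight and summed, and the significance measure is the sum over all patterns of the sigmoid of that net value. The function name `significance` and the data layout are hypothetical; the exact formula used in the paper may differ.

```python
import math


def significance(activations, w_out):
    """Significance measure of each hidden neuron (illustrative sketch).

    activations : activations[p][j] is the activation of hidden
                  neuron j for pattern p
    w_out       : w_out[j][k] is the weight from hidden neuron j
                  to output neuron k
    Assumption: net value = activation * sum of outgoing weights,
    significance = sum over patterns of sigmoid(net value).
    """
    scores = []
    for j, outgoing in enumerate(w_out):
        total = 0.0
        for pattern in activations:
            net = sum(pattern[j] * w for w in outgoing)
            total += 1.0 / (1.0 + math.exp(-net))
        scores.append(total)
    return scores
```

Under this assumption a neuron whose outgoing weights are all zero receives 0.5 per pattern, so it is the comparison between neurons, not the absolute value, that carries information.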
In order to achieve the purpose of pruning neural network, we should combine similar neurons, and the weight of the new neurons can be expressed aswhere and are 2 initial neuron weights and is their similarity, where .
When the neural network needs pruning, network adjustment for hidden layer neurons is based on the following formula:where is the threshold value, , and is the neuronal contribution value; if it is less than the threshold, the neuron is meaningless; if it is more than the threshold, it is significant.
The process of identifying insignificant hidden neurons is shown in Figure 2.
Similarly, we can get the rule of pruning the input layer as follows:where , , and .
Here the threshold is obtained by averaging the contributions of all neurons, each computed from the sigmoidal activation value of the node and all the weights of its outgoing connections. Only neurons below the threshold are eliminated, which requires fewer iterations. Because the pruned network inherits the weights of the previous network, the number of pruning steps is reduced, and there is no need for complicated calculations, manually set thresholds, or parameter adjustment.
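The thresholding rule described above can be sketched as follows. The sketch takes the contribution values as given, uses their mean as the threshold, and keeps neurons at or above it; treating a value exactly equal to the threshold as significant is an assumption, as are the names `prune_below_mean` and `keep_rows`.

```python
def prune_below_mean(scores):
    """Return (threshold, indices of neurons to keep).

    The threshold is the average of all contribution values; neurons
    whose value is below it are treated as insignificant.
    Assumption: a value equal to the threshold is kept.
    """
    threshold = sum(scores) / len(scores)
    keep = [j for j, s in enumerate(scores) if s >= threshold]
    return threshold, keep


def keep_rows(weights, keep):
    """Keep only the weight rows of the surviving neurons, so the
    pruned network inherits the weights of the previous network."""
    return [weights[j][:] for j in keep]
```

For example, contributions of 1, 3, 2, and 6 give a threshold of 3, so only the second and fourth neurons survive.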
After the above steps, if it still cannot reach the target, here we assume that the algorithm cannot fully learn the sample. So we need to add nodes with the idea of inheritance and significance measure’s variance. We directly copy those nodes with high correlation (select the intensity broad point and then average them):where .
is significance measure’s variance, is the significance measure of the neuron, is the average value of the significance measure of all neurons from 1 to , is the smallest variance, is an intensity variance near , and is the node whose density is wide. Let the hidden neuron be a parent node, and copy it into parts. The input weight of the new node is and the output weight is , , .
and are, respectively, the input weights of old and new neuron and and are, respectively, the output weights of old and new neuron. Adding new nodes by direct “copy” retains the relevance between nodes, greatly reduces the error, prevents overfitting, and converges quickly with fewer iterations.
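One way to picture the copy step is the sketch below. It is an illustration under stated assumptions, not the paper's formula: each copy inherits the parent's input weights unchanged, and the parent's outgoing weights are divided equally among the copies so that the network output is unchanged immediately after growing. The name `split_neuron` and the parameter `parts` are hypothetical.

```python
def split_neuron(w_in, w_out, j, parts=2):
    """Replace hidden neuron j by `parts` identical copies.

    w_in  : w_in[i] holds the input weights of hidden neuron i
    w_out : w_out[i] holds the outgoing weights of hidden neuron i
    Assumption: copies inherit the parent's input weights and share
    its outgoing weights equally, which preserves the network output.
    Returns new (w_in, w_out); the inputs are not modified.
    """
    new_in = [row[:] for row in w_in]
    new_out = [row[:] for row in w_out]
    share = [w / parts for w in w_out[j]]
    new_out[j] = share[:]  # the parent keeps one share
    for _ in range(parts - 1):
        new_in.append(w_in[j][:])  # copy of the parent's input weights
        new_out.append(share[:])   # equal share of the outgoing weights
    return new_in, new_out
```

Because every copy produces the same activation as the parent, the summed contribution to each output neuron equals the parent's original contribution, and later training is free to make the copies diverge.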
2.3. The Algorithm
IAGP is based on the sigmoidal activation value of each node and all the weights of its outgoing connections. The neural network structure is optimized by adding or removing neurons, and the BP algorithm trains the network until it reaches the target error. In this way the target error can be reached quickly and effectively.
Compared with the AGP algorithm, the improved AGP algorithm has the following advantages.
(1) Because both the growing method and the pruning method are adopted, the training time is greatly reduced and the number of training steps is relatively small.
(2) Although the structure of the neural network optimized by the IAGP algorithm is simpler, it keeps the overall performance of the original network.
(3) Parameters do not need to be set in advance; they are obtained directly by calculation.
(4) IAGP has better fitting accuracy and generalization ability than the original algorithm.
(5) It meets the network performance requirements faster and better.
The pseudocode of IAGP is as follows.
Step 1. Create a small initial network, and then use the BP algorithm to train the network.
Step 3. Calculate the sigmoidal activation value of each node and combine a large number of neurons to simplify the network.
Step 5. If the network still does not reach the target accuracy after the above steps, apply the improved growing method and continue training on the dataset; if the network then meets the performance requirement, go to Step 6; otherwise, go to Step 2.
Step 6. End the neural network training.
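The control flow of the steps above can be sketched as a small driver loop. Because Steps 2 and 4 are not reproduced in the text, the sketch assumes that the target error is checked after each training phase and that pruning is followed by retraining; the function `iagp_loop`, its callback parameters, and `max_rounds` are hypothetical names introduced only for illustration.

```python
def iagp_loop(train, error, prune, grow, target_error, max_rounds=20):
    """Hypothetical driver for a train / prune / grow cycle.

    train, prune, grow : callables that modify the network in place
    error              : callable returning the current training error
    Assumption: the error is checked after every training phase, and
    the loop stops once it is at or below target_error.
    Returns the list of actions taken, ending in "done" or "max_rounds".
    """
    history = []
    for _ in range(max_rounds):
        train()                      # BP training (Step 1, or after growing)
        history.append("train")
        if error() <= target_error:  # assumed check of the target error
            return history + ["done"]
        prune()                      # simplify the network (Step 3)
        history.append("prune")
        train()                      # assumed retraining after pruning
        history.append("train")
        if error() <= target_error:
            return history + ["done"]
        grow()                       # improved growing method (Step 5)
        history.append("grow")
    return history + ["max_rounds"]
```

Passing the training, pruning, and growing routines as callables keeps the sketch independent of any particular network representation; `max_rounds` is a safeguard against a loop that never reaches the target.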
Our study indicates that IAGP adjusts the network structure quickly and accurately, greatly reduces the number of training steps, and improves efficiency.
3. Simulation Experiments
To evaluate the effectiveness of IAGP, this paper applies it to nonlinear function approximation and to the prediction of transportation capacity. The simulation results show that the algorithm is effective.
3.1. Approximation of the Nonlinear Function
Consider the following nonlinear function:where . There are 70 groups of experimental data as the training samples and 30 groups as test samples. There are 15 initial hidden neurons, and we use improved AGP algorithm to train the network. Then 7 hidden nodes are left.
Figure 3 shows the approximation of the nonlinear function by the neural network. Compared with AGP, IAGP approximates the function more accurately and faster. Figure 4 shows the training error.
3.2. Application for Transport Capacity
Transportation is known to be nonlinear, complex, and random. This paper adopts the IAGP algorithm to predict transportation capacity. To handle transport demand and predict the transportation capacity well, we need a set of parameters based on an analysis of the factors influencing freight volume. These parameters may include GDP, industrial output, the length of railway lines, the proportion of double-track mileage, the length of road transport routes, the proportion of graded highways, the number of railway trains, and the number of laden civilian vehicles. These parameters serve as the input vector of the artificial neural network, and the outputs are the total volume of cargo transportation, rail freight, and highway freight. The neural network structure of the experiment is 8-24-3, the model is shown in Figure 5, and the experimental data come from the China yearbook.
The statistical data from 2002 to 2009 are used as the training sample of the experiment and the statistical data from 2009 to 2013 as the test sample. The initial number of neurons in the hidden layer is set to 10 for both IAGP and AGP, and the network training error is 0.01.
Table 1 compares the performance of the two optimization algorithms. With IAGP, the number of neurons in the hidden layer is 6, the training error is 0.031, the number of training steps is 246, and the training time is 23.8. Compared with AGP, the improved algorithm shows an improvement in all four aspects, and it does not change the overall structure of the neural network, so the method is very practical.
Figure 6 shows the performance of the improved AGP algorithm and the AGP algorithm in traffic prediction. The results of the improved AGP algorithm are basically consistent with the actual situation. Although the AGP algorithm generally keeps up with the actual traffic, its forecast shows a small lag or gap. The training steps are shown in Figure 7: the network gradually stabilizes at around iteration 250, whereas in Figure 8 the network becomes stable only after about 610 iterations. IAGP is therefore faster than the AGP algorithm. The training error is shown in Figure 8.
Simulation results show that IAGP predicts the transportation capacity well: it follows the actual output closely and has little error. AGP approximates more slowly and may have a larger error.
As Figure 6 shows, the freight volume of China is increasing every year. The algorithm can play an important role in forecasting transport capacity for the economy and in reasonably optimizing the related traffic resources.
This paper studies and improves AGP. First, the BP algorithm is used to train the network. Neurons are then pruned based on the sigmoidal activation value of each node and all the weights of its outgoing connections, and neurons are grown based on the correlation of the variance of the significance measure. The neural network is then trained until it reaches the target accuracy. With IAGP, the network structure changes little, training takes few steps and a short time, and the resulting network structure is simpler. The experimental results show that the method improves the efficiency and accuracy of traffic prediction.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
This work was jointly supported by the Guangxi Key Laboratory Foundation of University and Guangxi Department of Education Foundation.
- Z. Zhang, The Study of Self-Organization Modular Neural Network Architecture Design, Beijing University of Technology, 2013.
- Z. Zhang, J. Qiao, and G. Yang, “An adaptive algorithm for designing optimal feed-forward neural network architecture,” CAAI Transactions on Intelligent Systems, vol. 6, no. 4, 2011.
- X. Yu, The structural optimization research for FNN controller based on the combination of pruning method and growth method [Master thesis], Southwest Jiaotong University, 2009.
- S. Marsland, J. Shapiro, and U. Nehmzow, “A self-organising network that grows when required,” Neural Networks, vol. 15, no. 8-9, pp. 1041–1058, 2002.
- J.-F. Qiao, M. Li, and J. Liu, “A fast pruning algorithm for neural network,” Acta Electronica Sinica, vol. 38, no. 4, pp. 830–834, 2010.
- P. Lauret, E. Fock, and T. A. Mara, “A node pruning algorithm based on a Fourier amplitude sensitivity test method,” IEEE Transactions on Neural Networks, vol. 17, no. 2, pp. 273–293, 2006.
- J. Xu and D. W. C. Ho, “A new training and pruning algorithm based on node dependence and Jacobian rank deficiency,” Neurocomputing, vol. 70, no. 1–3, pp. 544–558, 2006.
- H.-Z. Yang, W.-N. Wang, and F. Ding, “Two structure optimization algorithms for neural networks,” Information and Control, vol. 35, no. 6, pp. 700–704, 2006.
- M.-N. Zhang, H. Han, and J. Qiao, “Research on dynamic feed-forward neural network structure based on growing and pruning methods,” CAAI Transactions on Intelligent Systems, vol. 6, no. 2, 2011.
- M. Gethsiyal Augasta and T. Kathirvalavakumar, “A novel pruning algorithm for optimizing feedforward neural network of classification problems,” Neural Processing Letters, vol. 34, no. 3, pp. 241–258, 2011.
- J.-J. Tu, Y.-Z. Zhan, and F. Han, “Neural network correlation pruning optimization based on improved PSO algorithm,” Application Research of Computers, no. 9, pp. 3253–3255, 2010.
- Y. Wang and C. Dang, “An evolutionary algorithm for global optimization based on level-set evolution and Latin squares,” IEEE Transactions on Evolutionary Computation, vol. 11, no. 5, pp. 579–595, 2007.
- J. Qiao and Y. Zhang, “Fast unit pruning algorithm for multilayer feed-forward network design,” CAAI Transactions on Intelligent Systems, vol. 3, no. 2, 2008.
- H.-R. Yan and L.-J. Ma, “Design and realization of intelligent prediction model based on fuzzy neural network,” Modern Electronic Technique, no. 2, pp. 84–88, 2008.
- W. Wang and H. Yang, “Pruning algorithm for neural networks based on pseudo-entropy of weights,” Computer Simulation, vol. 23, no. 3, 2006.
- Y. Li, Y. Wang, P. Jiang, and Z. Zhang, “Multi-objective optimization integration of query interfaces for the Deep Web based on attribute constraints,” Data and Knowledge Engineering, vol. 86, pp. 38–60, 2013.
- Q.-K. Song and M. Hao, “Structural optimization of BP neural network based on correlation pruning algorithm,” Control Theory and Applications, vol. 12, 2006.
- X. Xu, “A forecast model of freight capacity based on RBF network,” Aeronautical Computing Technique, vol. 37, no. 5, 2007.
Copyright © 2015 Ruliang Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.