Abstract

To overcome the defects of the BP neural network, namely overfitting, randomly chosen initial weights, and oscillation of the fitting and generalization ability under subtle changes of the network parameters, a model and algorithm of the BP neural network optimized by an expanded multichain quantum optimization algorithm with super parallelism and ultra-high speed are proposed, based on an analysis of the current research status and defects of the BP neural network. The method optimizes the structure of the neural network effectively and overcomes a series of problems of the BP neural network optimized by the basic genetic algorithm, such as slow convergence, premature convergence, and poor computational stability. The performance of the BP neural network controller is further improved. Simulation results show that the model has good stability, high precision of the extracted parameters, and good real-time performance and adaptability in actual parameter extraction.

1. Introduction

An artificial neural network (ANN) is an information processing and computing system based on modern neuroscience research and formed by abstracting, simplifying, and simulating biological neural structure. Its features are as follows: an ANN can approximate any complex nonlinear relation; all quantitative or qualitative information is stored equipotentially across the neurons of the network, so the network has strong robustness and fault tolerance; and an ANN can emulate and adapt to unknown systems while handling quantitative and qualitative knowledge at the same time [1]. The main research question of ANNs is how to make a computer simulate and realize the self-learning and mathematical reasoning ability of humans in order to mine the inner relations of limited samples. By studying and storing the data relations, inference rules, and probability distributions of known samples, an ANN deduces and reveals the potential relations between variables in unknown data samples [2].

The BP neural network is currently the most widely applied neural network model [3]. The BP algorithm, proposed by Rumelhart and McClelland in 1986, solved the weight adjustment problem for nonlinear continuous functions in multilayer feedforward neural networks; it is the archetypal error back propagation algorithm [4, 5]. Since the emergence of the BP neural network, the selection of activation functions, the design of structure parameters, and remedies for network defects have been researched extensively. In 1973, Grossberg found that the sigmoid function closely resembles the working behavior of biological neurons and began to explore the relationship between the features of the sigmoid function and the stability of neural networks; his work made the function the most commonly used activation function of the BP neural network. Since then, many scholars have worked to improve on the limitations of the sigmoid function. To address the problem of obtaining sufficiently large gradient values over the whole domain, many scholars combine activation functions with different characteristics on different intervals so as to set a larger derivative value where it is needed, making up for the inadequacy of a single activation function [6]. In 1991, Kung and Hu [7] proposed the FARM approximate simplification method, which uses the Frobenius norm, following the ideas of global and gradual optimization, to delete hidden layer units and determine the number of hidden layer nodes. Fahlman [8] found that if the sigmoid function is adopted as the activation function, the derivative of the error with respect to the weights becomes very small when the output value of the BP network is close to 0 or 1. Salomon [9] created two BP neural networks with exactly the same initial parameters and structure and put forward a new network learning method: the learning rate is increased and decreased, respectively, to update the weights of the two networks, and the network with the faster error drop is taken as the starting point of the next update; the effect is obvious. Dan Foresee and Hagan [10] proposed the BFGS quasi-Newton method, which avoids calculating the second derivative while keeping the fast convergence of Newton's algorithm. Riedmiller and Braun [11] proposed the resilient BP (RPROP) method in 1993; RPROP introduces resilient update values to modify the weights directly, which reduces the influence of the network structure parameters over the whole learning process and avoids unforeseen gradient errors blurring the convergence of the network's performance.

At present, quantitative structure-activity relationship (QSAR) research based on ANNs has gradually been applied in various fields. Wang et al. [12] used a neural network to study the QSAR of angiotensin converting enzyme inhibitors. Zhao et al. [13] established a model of the antitumor activity of emodin derivatives on the basis of a neural network. Cui et al. [14] applied principal component analysis and neural network methods to study the QSAR of nitrobenzene and its homologues. González-Díaz et al. [15] established QSAR models of synthetic compounds by neural network and verified the reliability of the models, which can be used to design new drugs. Prado-Prado et al. [16] studied the QSAR of parasite drug resistance across different kinds of parasites with a neural network. Ramírez-Galicia et al. [17] established a QSAR model of amoebic drug resistance through multiple linear regression, stepwise regression analysis, and a neural network; the results show that the three-dimensional structure of the model is very important.

Although the BP algorithm has become the most widely used artificial neural network algorithm, the BP neural network has the following defects.

It falls into local minima easily: the BP neural network learning algorithm is based on gradient descent, so if the error function is not strictly convex and has multiple points with zero gradient, learning easily falls into local extremum points and saddle points. The network then cannot converge to the optimal solution of the problem, so the optimal connection weights and threshold parameters cannot be obtained [18].

The speed of error convergence is slow: the convergence speed of the BP neural network is determined by two factors: the learning rate and the size of the derivative of the activation function [19]. The size of the learning rate is usually not easy to choose. The learning rate should be a small positive number: if it is too large, oscillation or even nonconvergence occurs during training; if it is too small, the product of the learning rate and the negative gradient vector becomes small, which slows the updating of the weights and thresholds. In addition, the size of the derivative of the activation function also affects the rate of convergence. In flat areas of the error surface, the gradient of the error function is small, so the weights and thresholds update slowly; the network needs many iterations to escape the flat area, and convergence slows down.

The structure of the network is not easy to determine: determining the structure of a BP neural network usually means determining the number of hidden layers and the number of neurons in each hidden layer. Especially after Kolmogorov and others proved the approximation theorem for single-hidden-layer BP neural networks, how to select the number of neurons in a single hidden layer has been one of the hottest and most important problems. Generally, the numbers of neurons in the input and output layers are easy to determine from the identification objects. The number of neurons in the hidden layer is difficult to determine, and it directly affects the topology and performance of the BP neural network. It has been proved in theory that a three-layer BP neural network can approximate nonlinear functions to arbitrary precision, which settles the question of the number of hidden layers. If the number of hidden-layer neurons is too small, the network cannot meet the requirements of learning and approximation performance; if it is too large, adverse phenomena such as overfitting appear and both the hardware implementation and the software calculation become complicated. At present, there is no unified and complete theoretical framework for determining the structure of the BP neural network; the structure is usually estimated and adjusted by experience or through a large number of experiments.

In order to further improve the efficiency of the BP neural network and overcome its shortcomings, much research has been conducted. Sun et al. established an improved BP neural network prediction model and quantitatively studied the related parameters [20]; the accuracy of the model was improved to some extent. Xiao et al. proposed a BP neural network with rough sets for short-term load forecasting, further improving the prediction accuracy of the BP neural network [21].

The genetic algorithm imitates the evolutionary and genetic rules of biology; it is a mathematical algorithm for finding global optima. Owing to its strong macroscopic search ability and good global optimization performance, many scholars have tried to use the genetic algorithm to optimize the connection weights, structure, learning rules, and so forth of the BP neural network. Xiao et al. [22] applied the genetic algorithm to construct the GA-ANN method for optimization problems in complex engineering. The method not only takes advantage of the nonlinear mapping, network reasoning, and prediction functions of the neural network but also uses the global optimization characteristics of the genetic algorithm; it can be widely applied to complex engineering problems whose objective function is difficult to express as an explicit function of the decision variables. Yang et al. [23] applied the genetic algorithm to select parameters, which overcame the restriction of the symmetric weight matrix of the traditional fluid neural network and broadened the application fields of this intelligent exploration method. Ge [24] applied the genetic algorithm to optimize the controller parameters of the neural network structure and applied the controller to a plant with pure lag; experiments proved that the control system optimized by the genetic algorithm had good static and dynamic performance. Li et al. [25] combined the genetic algorithm with the traditional DBD algorithm to propose a new algorithm for optimizing the BP neural network, enabling large-scale networks to converge quickly and escape local minima. The algorithm is less sensitive to the selection of network parameters and achieves good results in missile comprehensive testing.

In order to further overcome the shortcomings of models that optimize the BP neural network with the genetic algorithm, a model and algorithm of the BP neural network based on expanded multichain quantum optimization are proposed. The structure of the neural network is effectively optimized. The model overcomes a series of problems of the basic genetic algorithm, such as slow convergence, premature convergence, and poor computational stability, and further improves the performance of the BP neural network controller.

2. BP Neural Network Model and Algorithm

The main idea of the BP algorithm is to divide the learning process into two stages: forward propagation of the signal and back propagation of the error. In the forward stage, the input information passes from the input layer through the hidden layer to the output layer [26], where the output signal is formed. The weights of the network are fixed during forward signal transmission, and the state of the neurons in each layer only affects the state of the neurons in the next layer. If the desired output cannot be achieved at the output layer, the error signal is back propagated. In the back propagation stage, the error signal that failed to meet the accuracy requirement spreads backward layer by layer, and the error is shared by all units of each layer. The connection weights are adjusted dynamically according to the error signal. The weights between neurons are corrected repeatedly through this cycle of forward and backward adjustment, and learning stops when the error of the output signal meets the precision requirement [27].

2.1. Topology Structure of BP Neutral Network

The simplest BP neural network has three layers, as shown in Figure 1: an input layer, a hidden layer, and an output layer. The number of nodes in the input layer equals the dimension $n$ of the input vector $X$. The number of nodes in the output layer equals the number $q$ of output pattern classes. The number of nodes in the hidden layer, denoted $p$, is associated with the specific application and is usually selected by test.

The connection weight matrix between the input layer nodes and the hidden layer nodes is $W = (w_{ij})_{n \times p}$, defined in formula (1). Similarly, the connection weight matrix between the hidden layer nodes and the output layer nodes is $V = (v_{jk})_{p \times q}$. The threshold vectors of the hidden layer and output layer nodes are $\theta = (\theta_1, \ldots, \theta_p)$ and $\gamma = (\gamma_1, \ldots, \gamma_q)$, respectively:

$$W = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1p} \\ w_{21} & w_{22} & \cdots & w_{2p} \\ \vdots & \vdots & & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{np} \end{pmatrix}. \tag{1}$$

2.2. Learning Algorithm of BP Network

The learning algorithm of the BP network can be divided into two stages: forward propagation and back propagation. It is a gradient descent algorithm that reduces the error attributable to each connection weight of the neural network. At the beginning of learning, small random numbers, for example, in the range $(-1, 1)$, are assigned to the connection weight matrices $W$ and $V$ and to the threshold vectors $\theta$ and $\gamma$ of the hidden layer and output layer nodes.

2.2.1. Forward Operation

First, enter a learning sample $(X, Y)$, where $X$ is the input vector of the learning sample and $Y$ is the corresponding output vector:

$$X = (x_1, x_2, \ldots, x_n)^T, \qquad Y = (y_1, y_2, \ldots, y_q)^T, \tag{2}$$

where $n$ is the number of nodes of the input layer and $q$ is the number of nodes of the output layer.

The data propagate forward from the input layer through the hidden layer to the output layer; the resulting output pattern classification values are the learning result. The following steps are included.

Step 1 (calculation of the output values of the hidden layer nodes). The input value of node $j$ in the hidden layer is

$$s_j = \sum_{i=1}^{n} w_{ij} x_i, \qquad j = 1, 2, \ldots, p, \tag{3}$$

where $n$ is the number of nodes of the input layer, $p$ is the number of nodes of the hidden layer, $w_{ij}$ is the connection weight, and $x_i$ is the $i$th component of the input vector.
The output value of node $j$ is

$$b_j = f(s_j - \theta_j), \tag{4}$$

where $\theta_j$ is the threshold of node $j$. The activation function used is the sigmoid function given by Rumelhart:

$$f(x) = \frac{1}{1 + e^{-x}}. \tag{5}$$

Step 2 (calculation of the output values of the nodes of the output layer). The input value of node $k$ of the output layer is

$$l_k = \sum_{j=1}^{p} v_{jk} b_j, \qquad k = 1, 2, \ldots, q. \tag{6}$$

The output value of node $k$ is

$$c_k = f(l_k - \gamma_k), \tag{7}$$

where $\gamma_k$ is the threshold of node $k$ of the output layer and $f$ is the activation function defined by formula (5).
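To make the forward operation concrete, the following is a minimal NumPy sketch of formulas (3)-(7). It is an illustration, not the authors' code: the names (`forward`, `sigmoid`) and argument conventions are assumed here, and the thresholds enter as (s - theta) following the reconstruction above.

```python
import numpy as np

def sigmoid(x):
    # Activation function of formula (5)
    return 1.0 / (1.0 + np.exp(-x))

def forward(X, W, V, theta, gamma):
    """Forward operation of the BP network, formulas (3)-(7).

    X:     input vector, shape (n,)
    W:     input-to-hidden weight matrix, shape (n, p)
    V:     hidden-to-output weight matrix, shape (p, q)
    theta: hidden-layer thresholds, shape (p,)
    gamma: output-layer thresholds, shape (q,)
    """
    s = X @ W                # formula (3): net input of the hidden nodes
    b = sigmoid(s - theta)   # formula (4): output of the hidden nodes
    l = b @ V                # formula (6): net input of the output nodes
    c = sigmoid(l - gamma)   # formula (7): output of the output nodes
    return b, c
```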

2.2.2. The Back Operation

Calculate the error between the output value of the output layer and the expected value. The error back propagates from the output layer through the hidden layer to the input layer, and the connection weights are modified. The steps are as follows.

Step 1 (calculate the output errors of the nodes of the output layer). The error between the learning value $c_k$ of node $k$ of the output layer and the output value $y_k$ of the learning sample is

$$e_k = y_k - c_k. \tag{8}$$

Step 2 (test the learning error). $\varepsilon$ is the allowed maximum learning error, set in advance by the user.
If $|e_k| \leq \varepsilon$ for every output node $k$, enter the next learning sample. Otherwise, adjust the weights of the network and reenter the original learning sample.
The learning process ends when all learning samples meet the aforementioned condition.

Step 3 (calculate the learning errors of the nodes of the output layer). The learning error of node $k$ of the output layer is

$$d_k = e_k\, c_k (1 - c_k) = (y_k - c_k)\, c_k (1 - c_k). \tag{9}$$

Step 4 (calculate the learning errors of the nodes of the hidden layer). The learning error of node $j$ of the hidden layer is

$$\delta_j = b_j (1 - b_j) \sum_{k=1}^{q} v_{jk} d_k. \tag{10}$$

Step 5 (revise the connection weight matrix $V$). Take the weight value at time $t + 1$ as the adjusted new weight value; then

$$v_{jk}(t+1) = v_{jk}(t) + \eta\, d_k b_j + \alpha \left[ v_{jk}(t) - v_{jk}(t-1) \right], \tag{11}$$

where $\eta$ is the learning rate and $\alpha$ is the momentum factor, both in the range $(0, 1)$. Using $\alpha$ accelerates learning and helps overcome the local minimum problem of the common BP algorithm.

Step 6 (revise the connection weight matrix $W$). Consider

$$w_{ij}(t+1) = w_{ij}(t) + \eta\, \delta_j x_i + \alpha \left[ w_{ij}(t) - w_{ij}(t-1) \right]. \tag{12}$$

Step 7 (revise the thresholds $\gamma$). The threshold of node $k$ of the output layer is updated as

$$\gamma_k(t+1) = \gamma_k(t) - \eta\, d_k + \alpha \left[ \gamma_k(t) - \gamma_k(t-1) \right]. \tag{13}$$

Step 8 (revise the thresholds $\theta$). The threshold of node $j$ of the hidden layer is updated as

$$\theta_j(t+1) = \theta_j(t) - \eta\, \delta_j + \alpha \left[ \theta_j(t) - \theta_j(t-1) \right]. \tag{14}$$
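Continuing the sketch above, the back operation of Steps 1-8 (formulas (8)-(14)) can be written as a single update. The `state` dictionary, an assumption of this sketch, holds the previous increments for the momentum terms and starts as zero arrays with the same shapes as W, V, theta, and gamma.

```python
def backward_update(X, Y, W, V, theta, gamma, state, eta=0.5, alpha=0.9):
    """One back-propagation step with momentum, formulas (8)-(14)."""
    b, c = forward(X, W, V, theta, gamma)
    e = Y - c                               # formula (8): output error
    d = e * c * (1.0 - c)                   # formula (9): output-layer learning error
    delta = b * (1.0 - b) * (V @ d)         # formula (10): hidden-layer learning error

    dV = eta * np.outer(b, d) + alpha * state["dV"]       # formula (11)
    dW = eta * np.outer(X, delta) + alpha * state["dW"]   # formula (12)
    dgamma = -eta * d + alpha * state["dgamma"]           # formula (13)
    dtheta = -eta * delta + alpha * state["dtheta"]       # formula (14)

    state.update(dV=dV, dW=dW, dgamma=dgamma, dtheta=dtheta)
    return W + dW, V + dV, theta + dtheta, gamma + dgamma
```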

2.3. Run the BP Neural Network

Run the BP neural network after learning and carry out pattern classification for the input vector $X$. Only the forward operation of the BP learning algorithm is used. The process is as follows.

Step 1. Assign the values revised by the BP learning algorithm to the connection weight matrices $W$ and $V$ and the threshold vectors $\theta$ and $\gamma$.

Step 2. Input the vector $X$ which needs to be recognized.

Step 3. Apply formulas (3), (4), and (5) to calculate the output of the hidden layer.

Step 4. Use formulas (5), (6), and (7) to calculate the output of the output layer, that is, the classification result of the input vector $X$.

3. The Quantum Optimal Model Based on Multichain Expanded Coding Scheme

3.1. Principle of Quantum Coding Based on Multichain Expanded Scheme

In quantum computation, the smallest unit of information is the quantum bit, whose state can be expressed as

$$|\psi\rangle = \alpha\,|0\rangle + \beta\,|1\rangle, \tag{15}$$

where $\alpha$ and $\beta$ satisfy the normalization condition

$$|\alpha|^2 + |\beta|^2 = 1. \tag{16}$$

The complex numbers $\alpha$ and $\beta$ satisfying formulas (15) and (16) are called the probability amplitudes of the quantum bit, so a quantum bit can also be expressed by its probability amplitudes as $[\alpha, \beta]^T$. According to the property of the probability amplitudes, a quantum bit can be depicted as in Figure 2.

Obviously, in Figure 2, $\alpha = \cos t$ and $\beta = \sin t$; therefore, the quantum bit can be represented as

$$[\cos t, \ \sin t]^T. \tag{17}$$
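As a small numerical check (an illustration added here, not from the original), any phase angle t produces amplitudes that satisfy the normalization condition (16):

```python
import numpy as np

t = 2.0 * np.pi * np.random.rand()          # random phase angle
qubit = np.array([np.cos(t), np.sin(t)])    # formula (17)
assert np.isclose(np.sum(qubit ** 2), 1.0)  # normalization, formula (16)
```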

The coding scheme is

$$p_i = \begin{bmatrix} \cos t_{i1} & \cos t_{i2} & \cdots & \cos t_{in} \\ \sin t_{i1} & \sin t_{i2} & \cdots & \sin t_{in} \end{bmatrix}, \tag{18}$$

where $t_{ij} = 2\pi \times \mathrm{rnd}$, $\mathrm{rnd}$ is a random number in $(0, 1)$; $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$; $m$ is the size of the population; $n$ is the number of quantum bits.

Obviously, in Figure 3, $\sin t$ can be decomposed into $\sin t \cos \varphi$ and $\sin t \sin \varphi$ in the same way as formula (17). Combining formulas (17) and (18), the quantum bit can be represented in the following form:

$$[\cos t, \ \sin t \cos \varphi, \ \sin t \sin \varphi]^T. \tag{19}$$

Formula (19) decomposes the $\sin t$ of formula (17) into two variables, so it also meets the condition of formula (16); moreover, it describes the quantum bit from the view of three-dimensional space, whereas a single angle variable is not conducive to describing the dynamic behavior of the quantum bit objectively, comprehensively, and vividly. Renaming the angle variables as $t^1 = t$ and $t^2 = \varphi$ ($\cos t^k$, $k = 1, 2$, is used for the component along each hypotenuse) according to the characteristics of formulas (17) and (18), formula (19) can be transformed to

$$[\cos t^{1}, \ \sin t^{1} \cos t^{2}, \ \sin t^{1} \sin t^{2}]^T. \tag{20}$$

Formula (20) is equivalent to translating the single angle variable of formula (17) into a quantum bit representation in three-dimensional space: the variable in two-dimensional space is translated to three-dimensional space. The encoding scheme based on three gene chains in three-dimensional space with two angle variables is thus formed:

$$p_i = \begin{bmatrix} \cos t_{i1}^{1} & \cdots & \cos t_{in}^{1} \\ \sin t_{i1}^{1} \cos t_{i1}^{2} & \cdots & \sin t_{in}^{1} \cos t_{in}^{2} \\ \sin t_{i1}^{1} \sin t_{i1}^{2} & \cdots & \sin t_{in}^{1} \sin t_{in}^{2} \end{bmatrix}. \tag{21}$$

Similarly, a third angle $t^3$ can be added to extend formula (21) and form the encoding scheme based on four chains in four-dimensional space:

$$p_i = \begin{bmatrix} \cos t_{ij}^{1} \\ \sin t_{ij}^{1} \cos t_{ij}^{2} \\ \sin t_{ij}^{1} \sin t_{ij}^{2} \cos t_{ij}^{3} \\ \sin t_{ij}^{1} \sin t_{ij}^{2} \sin t_{ij}^{3} \end{bmatrix}, \qquad j = 1, 2, \ldots, n. \tag{22}$$

Adding supporting angles continually in this way yields the encoding scheme based on $(h + 1)$ chains in $(h + 1)$-dimensional space, where $h$ is the number of angle variables of the multichain code.

It can be concluded from Figure 4 that the $(h + 1)$-chain coding scheme for each quantum bit is

$$\left[\cos t^{1},\ \sin t^{1}\cos t^{2},\ \ldots,\ \left(\prod_{l=1}^{h-1}\sin t^{l}\right)\cos t^{h},\ \prod_{l=1}^{h}\sin t^{l}\right]^T. \tag{23}$$
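The following sketch expands h angle variables into the h + 1 chain components of formula (23) under the spherical-coordinate reading reconstructed above; the function name and layout are assumptions of this illustration. The squared components always sum to 1, so every chain stays in the unit space.

```python
import numpy as np

def multichain_qubit(angles):
    """Expand h angle variables into an (h + 1)-chain quantum bit (formula (23)).

    Component g is sin(t1)...sin(t_{g-1}) * cos(t_g); the last component
    is the product of all the sines, so the squares sum to 1.
    """
    comps, prod_sin = [], 1.0
    for t in angles:
        comps.append(prod_sin * np.cos(t))
        prod_sin *= np.sin(t)
    comps.append(prod_sin)
    return np.array(comps)

q = multichain_qubit(2.0 * np.pi * np.random.rand(3))  # 3 angles -> 4 chains
assert np.isclose(np.sum(q ** 2), 1.0)                 # normalization holds
```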

3.2. Transform of Solution Space

The quantum genetic optimization based on multichain code works within the unit space $[-1, 1]^n$; therefore, a transformation between the unit space and the solution space of the optimization problem is required. Suppose that the $j$th solution variable of the optimization problem is $x_j \in [a_j, b_j]$ and the value of the $g$th chain component ($g = 1, 2, \ldots, h + 1$) of the $j$th quantum bit of the $i$th chromosome is $\alpha_{ij}^{g}$, where $h$ is the number of angle variables of the multichain code. The corresponding transformation formula to the solution space is

$$x_{ij}^{g} = \frac{1}{2}\left[a_j \left(1 - \alpha_{ij}^{g}\right) + b_j \left(1 + \alpha_{ij}^{g}\right)\right]. \tag{24}$$
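A one-line sketch of the transformation (24); the chain component alpha lies in [-1, 1] and is rescaled linearly onto the variable range [a, b]. The names are illustrative.

```python
def to_solution_space(alpha, a, b):
    # Formula (24): map a chain component in [-1, 1] onto [a, b]
    return 0.5 * (a * (1.0 - alpha) + b * (1.0 + alpha))

x = to_solution_space(0.0, -5.0, 5.0)   # midpoint of the range -> 0.0
```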

3.3. Update of Quantum Chromosome

Updating with the quantum rotation gate makes each chromosome in the current population approach the optimal chromosome; during this process, better chromosomes may be generated and the population evolves. However, different coding schemes need different quantum rotation gates to rotate each quantum bit by a corresponding angle. For a single angle variable the rotation gate takes the usual form

$$U(\Delta t) = \begin{bmatrix} \cos \Delta t & -\sin \Delta t \\ \sin \Delta t & \cos \Delta t \end{bmatrix}. \tag{25}$$

Let $(t_{0j}^{1}, \ldots, t_{0j}^{h})$ be the angle variables of the $j$th quantum bit of the contemporary optimal chromosome, and let $(t_{ij}^{1}, \ldots, t_{ij}^{h})$ be those of the $j$th quantum bit of the $i$th chromosome of the contemporary population.

The rule for determining the direction of the rotation angle of each angle variable $t^{l}$ ($l = 1, 2, \ldots, h$) is as follows: if $t_{0j}^{l} - t_{ij}^{l} \neq 0$, the direction is $\operatorname{sgn}(t_{0j}^{l} - t_{ij}^{l})$; if $t_{0j}^{l} - t_{ij}^{l} = 0$, the direction can be either positive or negative.

Let the maximal evolutionary generation be $G$; the size of the quantum rotation angle at the $g$th generation is taken as

$$\Delta t_g = \theta_0 \times \left(1 - \frac{g}{G}\right), \tag{26}$$

where $\theta_0$ is the maximum rotation angle. This form of the rotation angle ensures that the angle decreases as the generation increases, so the search rotates ever more tightly around the optimal solution.
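Putting the direction rule and the shrinking magnitude together, one consistent reading of the rotation update for a single angle variable is sketched below; the default maximum angle 0.05*pi and the linear decay are assumptions of this illustration.

```python
import numpy as np

def rotation_angle(t_best, t_cur, g, G, theta0=0.05 * np.pi):
    """Signed rotation step for one angle variable at generation g of G."""
    direction = np.sign(t_best - t_cur)
    if direction == 0.0:
        # the rule allows either direction when the angles coincide
        direction = np.random.choice([-1.0, 1.0])
    return direction * theta0 * (1.0 - g / G)   # magnitude shrinks with g
```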

3.4. Mutation of Quantum Chromosome

The mutation of a quantum chromosome is realized by the quantum Not gate $U_N$. Through the quantum Not gate, a quantum bit with the amplitude pair $[\cos t, \sin t]^T$ is changed to $[\sin t, \cos t]^T$; that is, the phase angle $t$ becomes $\pi/2 - t$. Consider

$$U_N \begin{bmatrix} \cos t \\ \sin t \end{bmatrix} = \begin{bmatrix} \sin t \\ \cos t \end{bmatrix} = \begin{bmatrix} \cos\left(\frac{\pi}{2} - t\right) \\ \sin\left(\frac{\pi}{2} - t\right) \end{bmatrix}, \qquad U_N = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}. \tag{27}$$

The specific form of $U_N$ can be determined by the method of undetermined coefficients. The quantum Not gate increases the diversity of the population, which prevents local convergence and avoids premature convergence in the process of evolution.
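In terms of the angle variables, the Not-gate mutation simply replaces t by pi/2 - t, which swaps cos t and sin t in every chain component that uses that angle. A sketch with an assumed per-angle mutation probability:

```python
import numpy as np

def mutate(angles, pm=0.05):
    # Quantum Not-gate mutation (formula (27)): t -> pi/2 - t with probability pm
    mask = np.random.rand(*np.shape(angles)) < pm
    return np.where(mask, 0.5 * np.pi - angles, angles)
```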

3.5. Algorithm Description

The main realization steps of the multichain quantum genetic algorithm are as follows.

Step 1 (population initialization). Randomly generate a population $P = \{p_1, p_2, \ldots, p_m\}$ of $m$ initial individuals according to the multichain coding scheme (23).

Step 2 (transformation of the solution space). Map the multiple approximate solutions carried by each chromosome from the unit space to the solution space of the optimization problem by formula (24) to get the solution set $X$.

Step 3. Calculate the fitness of each approximate solution in $X$ to obtain the contemporary optimal solution $X_{\mathrm{Best}}$ and the contemporary optimal chromosome $p_{\mathrm{Best}}$.

Step 4. Take $X_{\mathrm{Best}}$ as the global optimal solution $X^{*}$; take $p_{\mathrm{Best}}$ as the global optimal chromosome $p^{*}$.

Step 5. Enter the iteration cycle, $g \leftarrow g + 1$, obtaining a new population $P(g)$ by rotation update and mutation.

Step 6. Transform the solution space of $P(g)$ to get the solutions $X(g)$ of the optimization problem.

Step 7. Evaluate $X(g)$ to obtain the contemporary optimal solution $X_{\mathrm{Best}}$ and optimal chromosome $p_{\mathrm{Best}}$.

Step 8. If $X_{\mathrm{Best}}$ is better than $X^{*}$, update the global optimal solution $X^{*} = X_{\mathrm{Best}}$ and the global optimal chromosome $p^{*} = p_{\mathrm{Best}}$ at the same time; otherwise, set $X_{\mathrm{Best}} = X^{*}$ and $p_{\mathrm{Best}} = p^{*}$ to prevent the degradation phenomenon. Such control ensures convergence to the optimal value.

Step 9. Go back to Step 5 until the maximal generation $G$ is reached.
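Assembling the pieces sketched above (`multichain_qubit`, `to_solution_space`, `rotation_angle`, and `mutate`), Steps 1-9 take the following skeleton form. This is an illustrative reading, not the authors' implementation; in particular, only the first chain component of each qubit is decoded here for brevity, whereas the full algorithm evaluates every chain as a candidate solution.

```python
def mcqa(fitness, bounds, m=50, n=10, h=3, G=30):
    """Skeleton of the multichain quantum genetic algorithm (Steps 1-9).

    fitness: maps a solution vector to a scalar (larger is better)
    bounds:  list of (a_j, b_j) ranges, one per solution variable
    m, n, h, G: population size, qubits per chromosome, angle variables
                per qubit, and maximal generation (illustrative values)
    """
    angles = 2.0 * np.pi * np.random.rand(m, n, h)   # Step 1: initialization
    best_x, best_fit = None, -np.inf
    for g in range(G):                               # Steps 5-9: iteration
        pop = []
        for i in range(m):
            chains = np.array([multichain_qubit(angles[i, j]) for j in range(n)])
            x = np.array([to_solution_space(chains[j, 0], *bounds[j])
                          for j in range(n)])        # Steps 2/6: decode
            pop.append((fitness(x), x, i))           # Steps 3/7: evaluate
        gen_fit, gen_x, i_best = max(pop, key=lambda r: r[0])
        if gen_fit > best_fit:                       # Step 8: keep the best
            best_fit, best_x = gen_fit, gen_x
            best_angles = angles[i_best].copy()
        for i in range(m):                           # rotate toward the best
            for j in range(n):
                for l in range(h):
                    angles[i, j, l] += rotation_angle(
                        best_angles[j, l], angles[i, j, l], g, G)
        angles = mutate(angles)                      # Not-gate mutation
    return best_x, best_fit
```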

4. The Model of BP Neural Network Based on Multichain Code Quantum Optimization Algorithm

4.1. Code

Connect all the weights and thresholds of the BP network in order to form a long string, an array of real numbers. The string serves as a chromosome; the decoded value of an individual gives the corresponding weights and thresholds.

4.2. Fitness Function

The purpose of using the multichain quantum algorithm to optimize the BP network is to simplify the structure of the network and minimize the network error. Therefore, the fitness function is

$$F = \frac{1}{E}, \qquad E = \frac{1}{2} \sum_{s=1}^{N} \sum_{k=1}^{q} \left(y_{k}^{s} - c_{k}^{s}\right)^2, \tag{28}$$

where $E$ is the global error of the network, $N$ is the number of training samples, $q$ is the number of units of the output layer, $y_{k}^{s}$ is the desired output value, and $c_{k}^{s}$ is the actual output value.
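The sketch below decodes the real-coded chromosome of Section 4.1 back into W, V, theta, and gamma and evaluates the fitness of formula (28); it reuses `forward` from Section 2.2.1. For the 15-31-3 network of Section 5.1 the chromosome length is 592. The 1/E form and the small guard constant are assumptions of this illustration.

```python
def network_fitness(chromosome, samples, shape):
    """Fitness of one chromosome: reciprocal of the global error E (formula (28)).

    chromosome: flat array of all weights and thresholds, ordered W, V, theta, gamma
    samples:    iterable of (X, Y) training pairs
    shape:      (n, p, q) node counts of the three layers
    """
    n, p, q = shape
    flat = np.asarray(chromosome)
    W = flat[:n * p].reshape(n, p)
    V = flat[n * p:n * p + p * q].reshape(p, q)
    theta = flat[n * p + p * q:n * p + p * q + p]
    gamma = flat[n * p + p * q + p:]
    E = 0.0
    for X, Y in samples:
        _, c = forward(X, W, V, theta, gamma)
        E += 0.5 * np.sum((Y - c) ** 2)   # global error over all samples
    return 1.0 / (E + 1e-12)              # guard against division by zero
```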

The combination of the multichain quantum optimization algorithm and the neural network is realized by using the multichain quantum optimization algorithm to optimize the parameters of the neural network, in the following ways:
(1) use the algorithm to optimize the initial weights and thresholds of the neural network;
(2) use the algorithm to optimize the structure parameters of the BP network, mainly the number of units of the hidden layers;
(3) use the algorithm to optimize the selection of the learning rate and momentum factor of the BP network.

The first way is adopted to realize the optimization of the neural network in this paper. The specific implementation process is as follows.

The basic idea of using the multichain quantum optimization algorithm to optimize the initial weights and thresholds of the BP network is as follows: the multichain quantum optimization algorithm is applied to find appropriate initial weights and thresholds from which the BP algorithm can reach the optimal solution.

The specific implementation steps of the MCQA-BP (BP neural network based on multichain quantum optimization) algorithm are as follows:
(1) An initial population of $m$ individuals is produced by the multichain quantum optimization algorithm. Each individual in the population is a chromosome string composed of the weights, thresholds, and the number of units of the hidden layer of the BP network.
(2) Enter the neural network module. Apply the BP algorithm to carry out the learning process and apply the multichain quantum optimization algorithm to check the fitness. If the fitness is unqualified, update the chromosomes by rotation and mutation and recalculate the fitness, repeating until the individuals meet the accuracy requirement. Then end the algorithm.

5. Simulation Test

5.1. Test Case 1

Apply the MCQA-BP algorithm to diagnose the fault state of a machine. The normalized data samples of the machine status are shown in Table 1. The algorithm adopts a network with three layers: the activation function of the hidden layer is the tan-sigmoid function and that of the output layer is the log-sigmoid function. The formula $m = 2n + 1$ is used to calculate the number of neurons of the hidden layer, where $n$ is the number of neurons of the input layer and $m$ is the number of neurons of the hidden layer. Since there are 15 input parameters and 3 output parameters, the structure of the BP neural network is 15-31-3; that is, the number of nodes of the input layer is 15, the number of nodes of the hidden layer is 31, and the number of nodes of the output layer is 3. The number of weight values is $15 \times 31 + 31 \times 3 = 558$ and the number of thresholds is $31 + 3 = 34$, so the total number of parameters that need to be optimized is 592. The training epoch limit of the network is 1000 and the training goal is 0.01. The network error to be minimized defines the fitness function (formula (28)). The machine failure state is divided into 3 types, coded, respectively, as normal: $(1, 0, 0)$; crack: $(0, 1, 0)$; defect: $(0, 0, 1)$. In order to test the trained network, three sets of new data are presented in Table 2.
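The parameter count can be verified by a short calculation (the hidden-node rule m = 2n + 1 and the 15-31-3 structure are from the text):

```python
n_in, n_out = 15, 3
n_hidden = 2 * n_in + 1                            # m = 2n + 1 -> 31
n_weights = n_in * n_hidden + n_hidden * n_out     # 465 + 93 = 558
n_thresholds = n_hidden + n_out                    # 31 + 3 = 34
assert n_hidden == 31 and n_weights + n_thresholds == 592
```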

BP, GA-BP, DCQA-BP (BP neural network based on the double-chain quantum optimization algorithm), and MCQA-BP (BP neural network based on the multichain quantum optimization algorithm, with the number of chains being 4) are used for comparison. In GA-BP and DCQA-BP, the population size is set to 50 and the maximal evolutionary generation to 30. The performance of the algorithms, averaged over 10 runs, is shown in Table 3.

Apply the test of the difference of the means of two normal populations (t-test) to analyze the errors of BP, GA-BP, DCQA-BP, and MCQA-BP.

The error data of the four methods can be considered samples from normal populations with common variance. The sample means of the four methods are $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$, and $\bar{X}_4$, and the sample variances are $S_1^2$, $S_2^2$, $S_3^2$, and $S_4^2$. The $t$-statistic is introduced as the test statistic:

$$t = \frac{\bar{X} - \bar{Y}}{S_w \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad S_w^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}, \tag{29}$$

where $n_1$ and $n_2$ are the sample sizes of the two methods compared; here $n_1 = n_2 = 10$.

The form of the rejection region is $t \geq t_\alpha(n_1 + n_2 - 2)$.

Take $\alpha = 0.005$; then $t_{0.005}(18) = 2.8784$.

So the rejection region is $t \geq 2.8784$.

The hypotheses $H_0: \mu_1 \leq \mu_2$ and $H_1: \mu_1 > \mu_2$ need to be tested for each pair of methods.
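For reference, the test can be reproduced with the following sketch: a one-sided pooled two-sample t-test over 10 error values per method. The function name and the per-run error arrays are hypothetical; only the statistic and the critical value follow the formulas above.

```python
import numpy as np
from scipy import stats

def pooled_t(x, y, alpha=0.005):
    """One-sided pooled t-test of H0: mean(x) <= mean(y) vs H1: mean(x) > mean(y)."""
    n1, n2 = len(x), len(y)
    sw2 = ((n1 - 1) * np.var(x, ddof=1) + (n2 - 1) * np.var(y, ddof=1)) \
          / (n1 + n2 - 2)                               # pooled variance S_w^2
    t = (np.mean(x) - np.mean(y)) / np.sqrt(sw2 * (1.0 / n1 + 1.0 / n2))
    t_crit = stats.t.ppf(1.0 - alpha, n1 + n2 - 2)      # t_{0.005}(18) = 2.8784
    return t, t_crit, t >= t_crit                       # True -> reject H0
```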

The means and variances of Error 1 of BP, GA-BP, DCQA-BP, and MCQA-BP are computed from the 10 runs, and from them the $t$ values $t_1$, $t_2$, and $t_3$ are obtained for the three successive comparisons (BP versus GA-BP, GA-BP versus DCQA-BP, and DCQA-BP versus MCQA-BP).

For the comparison of BP and GA-BP, $t_1 \geq t_{0.005}(18)$, so $H_0$ is rejected; that is, Error 1 of BP is larger than that of GA-BP with probability of more than 99.5%.

For the comparison of GA-BP and DCQA-BP, $t_2 \geq t_{0.1}(18)$, so $H_0$ is rejected; that is, Error 1 of GA-BP is larger than that of DCQA-BP with probability of more than 90%.

For the comparison of DCQA-BP and MCQA-BP, $t_3 \geq t_{0.1}(18)$, so $H_0$ is rejected; that is, Error 1 of DCQA-BP is larger than that of MCQA-BP with probability of more than 90%.

Similarly, the means and variances of Error 2 of BP, GA-BP, DCQA-BP, and MCQA-BP are computed from the 10 runs, and the $t$ values $t_1$, $t_2$, and $t_3$ for the same three comparisons are obtained.

For the comparison of BP and GA-BP, $t_1 \geq t_{0.005}(18)$, so $H_0$ is rejected; that is, Error 2 of BP is larger than that of GA-BP with probability of more than 99.5%.

For the comparison of GA-BP and DCQA-BP, $t_2 \geq t_{0.1}(18)$, so $H_0$ is rejected; that is, Error 2 of GA-BP is larger than that of DCQA-BP with probability of more than 90%.

For the comparison of DCQA-BP and MCQA-BP, $t_3 \geq t_{0.005}(18)$, so $H_0$ is rejected; that is, Error 2 of DCQA-BP is larger than that of MCQA-BP with probability of more than 99.5%.

The simulation results show that MCQA-BP is significantly superior to BP, GA-BP, and DCQA-BP. The evolutionary process of BP, GA-BP, DCQA-BP, and MCQA-BP is shown in Figure 5.

It can be seen that the fitting results of the different models are roughly the same and follow the same trend, which shows that the method of this paper is scientific and effective. Moreover, the fitting error of the model of this paper is smaller than that of the other models, so the model in this paper outperforms them.

Because MCQA-BP, with its super parallelism and ultra-high speed, overcomes a series of problems existing in BP and GA-BP, such as slow convergence, premature convergence, and poor computational stability, and optimizes the structure of the neural network effectively, the computational cost of MCQA-BP is lower than that of BP, GA-BP, and DCQA-BP in test case 1 and test case 2.

5.2. Test Case 2

The training data come from numerical simulations of target board damage under kinetic energy rod collision for 1365 penetration cases. Fifteen destruction values predicted by the neural networks are randomly chosen and shown in Table 4. For the listed penetration value forecasts, the fitting results of the different models differ, as do the errors between the predicted values and the actual values. The fitting error of the MCQA-BP model is smaller than that of the other models, which further proves the efficiency of MCQA-BP.

It can be seen from the table that the prediction effect of MCQA-BP is better than that of the BP network and GA-BP on the same sample data. The error range of the MCQA-BP network is smaller than that of the BP and GA-BP neural networks under the same number of learning iterations, which fully illustrates the high efficiency of the MCQA-BP network model.

6. Conclusions

A super parallel, ultra-fast BP neural network model and algorithm based on the multichain quantum optimization algorithm are proposed. The algorithm makes full use of the local time-domain features of the BP network and the global optimization search capability of the multichain quantum optimization algorithm to enhance the intelligent search ability of the network. It overcomes the disadvantages of the BP network, improving the effectiveness of the optimization and accelerating the search efficiency and convergence speed. The model controller based on the MCQA-BP network evaluates the damage effect of kinetic energy penetration of a target board well, and its antijamming ability is good. The actual simulation results show that the MCQA-BP network controller is effective.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.