Using Feed Forward Neural Network to Solve Eigenvalue Problems
The aim of this paper is to present a parallel processor technique for solving eigenvalue problems for ordinary differential equations using artificial neural networks. The proposed network is trained by back propagation with different training algorithms: quasi-Newton, Levenberg-Marquardt, and Bayesian regularization. A further objective is to compare the performance of these algorithms with regard to predictive ability.
Many mathematical procedures are now automated. There is a strong need for software that solves differential equations (DEs), as many problems in science and engineering reduce to differential equations through mathematical modeling. Although model equations based on physical laws can be constructed, analytical tools are frequently inadequate for obtaining closed-form solutions, and numerical methods must usually be used instead.
The application of neural networks for solving differential equations can be regarded as a mesh-free numerical method. It has been proved that feed forward neural networks with one hidden layer are capable of universal approximation, for problems of interpolation and approximation of scattered data.
2. Related Work
Neural networks have found application in many disciplines: neuroscience, mathematics, statistics, physics, computer science, and engineering. In the context of the numerical solution of differential equations, high-order derivatives are undesirable in general because they can introduce large approximation errors. The use of higher order conventional Lagrange polynomials is not guaranteed to yield a better quality (smoothness) of approximation. Many methods have been developed so far for solving differential equations; some of them produce a solution in the form of an array that contains the value of the solution at a selected group of points. Others use basis functions to represent the solution in analytic form and usually transform the original problem to a system of algebraic equations. Most of the previous work on solving differential equations using artificial neural networks (Ann) is restricted to solving the linear systems of algebraic equations which result from the discretisation of the domain. The minimization of the network's energy function provides the solution to the system of equations. Lagaris et al. employed two networks, a multilayer perceptron and a radial basis function network, to solve partial differential equations (PDE) with boundary conditions (Dirichlet or Neumann) defined on boundaries with complex geometry. Mc Fall and Mahan compared weight reuse for two existing methods of defining the network error function; weight reuse is shown to accelerate training of ODE, but the second method fails unpredictably when weight reuse is applied to accelerate solution of the diffusion equation.
Tawfiq proposed a radial basis function neural network (RBFNN) and a Hopfield neural network (unsupervised training network) as designed networks to solve ODE and PDE and compared them. Malek and Shekari Beidokhti reported a novel hybrid method based on optimization techniques and neural network methods for the solution of high order ODE, which used a three-layered perceptron network. Akca et al. discussed different approaches to using wavelets in the solution of boundary value problems (BVP) for ODE, introduced convenient wavelet representations for the derivatives of certain functions, and discussed a wavelet network algorithm. Mc Fall presented multilayer perceptron networks to solve BVP of PDE on arbitrary irregular domains, using the logsig transfer function in the hidden layer, purelin in the output layer, and the gradient descent training algorithm; he also used RBFNN for solving this problem and compared the two. Junaid et al. used Ann with a genetic training algorithm and log sigmoid function for solving first order ODE; Zahoor et al. used an evolutionary technique for the solution of nonlinear Riccati differential equations of fractional order, in which the unknown parameters of the neural network are learned with hybrid intelligent algorithms mainly based on the genetic algorithm (GA). Abdul Samath et al. suggested the solution of the matrix Riccati differential equation (MRDE) for nonlinear singular systems using Ann. Ibraheem and Khalaf proposed a shooting neural networks algorithm for solving two-point second order BVP in ODEs, which reduces the equation to a system of two first order equations. Hoda and Nagla described a numerical solution with neural networks for solving PDE with mixed boundary conditions.
Majidzadeh suggested a new approach for reducing the inverse problem for a domain to an equivalent problem in a variational setting using a radial basis function neural network; he also used a cascade feed forward network to solve the two-dimensional Poisson equation with back propagation and the Levenberg-Marquardt training algorithm, with a three-layer architecture of 12 input nodes, 18 tansig transfer function nodes in the hidden layer, and 3 linear nodes in the output layer. Oraibi designed feed forward neural networks (FFNN) for solving IVP of ODE. Ali designed fast FFNN to solve two-point BVP. This paper proposed FFNN to solve two-point singular boundary value problems (TPSBVP) with the back propagation (BP) training algorithm. Tawfiq and Hussein suggested multilayer FFNN to solve singular boundary value problems.
3. What Is Artificial Neural Network?
Ann is a simplified mathematical model of the human brain; it can be implemented by both electric elements and computer software. It is a parallel distributed processor with a large number of connections, an information processing system that has certain performance characteristics in common with biological neural networks. The arriving signals, called inputs, are multiplied by the (adjustable) connection weights, summed, and then passed through a transfer function to produce the output for that neuron. The activation (transfer) function acts on the weighted sum of the neuron’s inputs; the most commonly used transfer function is the sigmoid function (tansig).
There are two main connection types: feedback (recurrent) and feed forward. In a feedback connection the output of one layer routes back to the input of a previous layer, or to the same layer. A feed forward network (FFNN) has no connection back from the output to the input neurons.
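The neuron computation described above can be sketched in a few lines of Python. This is only an illustration: the function name `neuron_output`, the optional bias term, and the numeric values are assumptions, and `math.tanh` is used because tansig is mathematically the hyperbolic tangent.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs (plus an assumed bias), passed through tanh."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return math.tanh(weighted_sum)


# Hypothetical inputs and weights, chosen only for illustration.
example_output = neuron_output([0.5, -1.0], [0.8, 0.2], bias=0.1)
```

The output always lies in the interval (-1, 1), which is the range of the tansig function.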
There are many different training algorithms, but the most often used is the Delta-rule or back propagation (BP) rule. A neural network is trained to map a set of input data by iterative adjustment of the weights. Information from inputs is fed forward through the network to optimize the weights between neurons. Optimization of the weights is made by backward propagation of the error during training phase.
The Ann reads the input and output values in the training data set and changes the values of the weighted links to reduce the difference between the predicted and target (observed) values. The prediction error is minimized across many training cycles until the network reaches a specified level of accuracy. A complete round of forward-backward passes and weight adjustments using all input-output pairs in the data set is called an epoch or iteration. If a network is left to train for too long, however, it will be overtrained and will lose the ability to generalize.
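As a minimal sketch of training by epochs, the following applies the delta rule to a single linear neuron. The model y = w*x + b, the learning rate, the stopping threshold, and the data are all assumptions made for illustration; they are not the network or settings used in this paper.

```python
def train_delta_rule(samples, learning_rate=0.1, max_epochs=1000, target_mse=1e-10):
    """Train y = w*x + b with the delta rule; return (w, b, epochs_used, mse)."""
    w, b = 0.0, 0.0
    mse = float("inf")
    for epoch in range(1, max_epochs + 1):
        # One epoch: a full pass over all input-output pairs.
        for x, target in samples:
            error = target - (w * x + b)    # target minus prediction
            w += learning_rate * error * x  # delta-rule weight update
            b += learning_rate * error      # delta-rule bias update
        mse = sum((t - (w * x + b)) ** 2 for x, t in samples) / len(samples)
        if mse <= target_mse:               # stop once the accuracy goal is met
            return w, b, epoch, mse
    return w, b, max_epochs, mse


# Hypothetical training set sampled from y = 2x + 1.
data = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0), (1.5, 4.0)]
w, b, epochs, mse = train_delta_rule(data)
```

Stopping at a target error rather than running a fixed, large number of epochs is one simple way to limit the overtraining mentioned above.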
In this paper, we focused on the training situation known as supervised training, in which a set of input/output data patterns is available. Thus, the Ann has to be trained to produce the desired output according to the examples.
In order to perform supervised training we need a way of evaluating the error between the actual and the expected output of the Ann. A popular measure is the mean squared error (MSE) or the root mean squared error (RMSE).
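For reference, the two error measures can be computed as follows. The function names are illustrative choices, and the definitions are the standard ones: MSE is the average of the squared differences, and RMSE is its square root.

```python
import math


def mean_squared_error(outputs, targets):
    """Average of the squared differences between outputs and targets."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)


def root_mean_squared_error(outputs, targets):
    """Square root of the MSE, expressed in the same units as the outputs."""
    return math.sqrt(mean_squared_error(outputs, targets))
```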
4. Proposed Design
System design is the process of breaking a complex topic or substance into smaller parts to gain a better understanding of it. We try to design the EVP Solver using block diagrams.
The following are the actors of this application.
(1) End user: one who interacts with the system.
(2) System: receives commands and actions from the end user and performs required operations.
FFNNs allow a conversion of a function from low-dimensional space to high-dimensional space (e.g., 1D–3D) in which the function will be expressed as a linear combination of ridge basis functions.
The EVP of the differential equation, along with the boundary conditions, is provided as input through the GUI. Based on the EVP, the system generates the data points and determines the centers with respect to the generated data points. The data points and eigenvalue should be within the solution space; if they are not, the boundary conditions are changed and the data points and eigenvalue are computed again.
5. Description of the Method
In the proposed approach the model function is expressed as the sum of two terms. The first term satisfies the boundary conditions (BC) and contains no adjustable parameters. The second term is found by using an FFNN which is trained so as to satisfy the differential equation; this technique is called the collocation neural network method.
In this section we will illustrate how our approach can be used to find the approximate solution of the general form a 2nd order EVP: where a subject to certain BC’s and , denotes the domain, and is the solution to be computed.
If denotes a trial solution with adjustable parameters , the problem is transformed to a discretized form: subject to the constraints imposed by the BC’s.
In our proposed approach, the trial solution employs a FFNN and the parameters correspond to the weights and biases of the neural architecture. We choose a form for the trial function such that it satisfies the BC’s. This is achieved by writing it as a sum of two terms: where is a single-output FFNN with parameters and input units fed with the input vector . The term contains no adjustable parameters and satisfies the BC’s. The second term is constructed so as not to contribute to the BC’s, since satisfy them. This term can be formed by using a FFNN whose weights and biases are to be adjusted in order to deal with the minimization problem.
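The two-term construction can be illustrated for Dirichlet conditions on an interval. The sketch below assumes a specific form that is common for this kind of method: a straight line through the boundary values plus a factor that vanishes at both endpoints times the network output. The names `ffnn`, `trial_solution`, `left`, `right`, `ya`, and `yb` are hypothetical, as are the parameter values.

```python
import math


def ffnn(x, w, b, v):
    """Single-input, single-output network: sum_i v[i] * tanh(w[i]*x + b[i])."""
    return sum(vi * math.tanh(wi * x + bi) for wi, bi, vi in zip(w, b, v))


def trial_solution(x, params, left=0.0, right=1.0, ya=0.0, yb=0.0):
    """Boundary term plus (x-left)*(right-x)*N(x); meets y(left)=ya, y(right)=yb."""
    w, b, v = params
    width = right - left
    # First term: no adjustable parameters, satisfies the boundary conditions.
    bc_term = ya * (right - x) / width + yb * (x - left) / width
    # Second term: vanishes at both endpoints, so it cannot disturb the BC.
    return bc_term + (x - left) * (right - x) * ffnn(x, w, b, v)


# Arbitrary (untrained) parameters: the boundary values hold regardless.
params = ([1.3, -0.7], [0.2, 0.5], [2.0, -1.5])
left_value = trial_solution(0.0, params, ya=2.0, yb=-1.0)
right_value = trial_solution(1.0, params, ya=2.0, yb=-1.0)
```

Because the boundary conditions hold for any weights and biases, training can be treated as an unconstrained minimization over the network parameters.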
6. Computation of the Gradient
An efficient minimization of (2) can be considered as a procedure of training the FFNN, where the error corresponding to each input is the value which has to be forced near zero. Computation of this error value involves not only the FFNN output but also the derivatives of the output with respect to any of its inputs.
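To make the dependence on input derivatives concrete, the following is a sketch for a network with one hidden layer of sigmoid units and a linear output. The symbols $N$, $v_i$, $w_{ij}$, $b_i$, $z_i$, $n$, and $H$ are notation assumed here for illustration; the last line specializes to $\sigma = \tanh$ (tansig).

```latex
\begin{align}
N(x) &= \sum_{i=1}^{H} v_i\,\sigma(z_i), \qquad z_i = \sum_{j=1}^{n} w_{ij}\,x_j + b_i \\
\frac{\partial^{k} N}{\partial x_j^{k}} &= \sum_{i=1}^{H} v_i\, w_{ij}^{k}\, \sigma^{(k)}(z_i) \\
\sigma(z) = \tanh(z):\quad \sigma'(z) &= 1 - \sigma(z)^2, \qquad \sigma''(z) = -2\,\sigma(z)\,\bigl(1 - \sigma(z)^2\bigr)
\end{align}
```

Each derivative of the output with respect to an input has the same structure as the network itself, with the transfer function replaced by its $k$-th derivative and each hidden unit scaled by $w_{ij}^{k}$.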
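Therefore, in computing the gradient of the error with respect to the network weights, consider a multilayer FFNN with $n$ input units (where $n$ is the dimension of the domain), one hidden layer with $H$ sigmoid units, and a linear output unit.
For a given input vector $x = (x_1, \ldots, x_n)$ the output of the FFNN is $N = \sum_{i=1}^{H} v_i \sigma(z_i)$ with $z_i = \sum_{j=1}^{n} w_{ij} x_j + b_i$, where $w_{ij}$ denotes the weight connecting the input unit $j$ to the hidden unit $i$, $v_i$ denotes the weight connecting the hidden unit $i$ to the output unit, $b_i$ denotes the bias of hidden unit $i$, and $\sigma(z)$ is the sigmoid transfer function (tansig).
The gradient of FFNN, with respect to the parameters of the FFNN, can be easily obtained as $\partial N / \partial v_i = \sigma(z_i)$, $\partial N / \partial b_i = v_i \sigma'(z_i)$, and $\partial N / \partial w_{ij} = v_i \sigma'(z_i) x_j$.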
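A small sketch of these parameter gradients follows, assuming a network of the form N = sum_i v[i] * tanh(sum_j W[i][j]*x[j] + b[i]). The function names, the list-of-lists layout for the weights, and the numeric values are illustrative assumptions.

```python
import math


def forward(x, W, b, v):
    """Return the network output N and the hidden pre-activations z."""
    z = [sum(wij * xj for wij, xj in zip(row, x)) + bi for row, bi in zip(W, b)]
    return sum(vi * math.tanh(zi) for vi, zi in zip(v, z)), z


def parameter_gradients(x, W, b, v):
    """Analytic gradients of N with respect to W, b, and v."""
    _, z = forward(x, W, b, v)
    s = [math.tanh(zi) for zi in z]
    ds = [1.0 - si * si for si in s]                        # tanh'(z_i)
    grad_v = s                                              # dN/dv_i = tanh(z_i)
    grad_b = [vi * dsi for vi, dsi in zip(v, ds)]           # dN/db_i
    grad_W = [[vi * dsi * xj for xj in x] for vi, dsi in zip(v, ds)]  # dN/dw_ij
    return grad_W, grad_b, grad_v


# Hypothetical two-input network with two hidden units.
x = [0.6, 0.9]
W = [[0.3, -0.2], [0.5, 0.1]]
b = [0.1, -0.4]
v = [1.5, -0.7]
grad_W, grad_b, grad_v = parameter_gradients(x, W, b, v)
```

A quick way to gain confidence in such formulas is to compare one entry against a central finite difference of `forward` with respect to the same parameter.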
Once the derivative of the error with respect to the network parameters has been defined, then it is straightforward to employ any minimization technique. It must also be noted that the batch mode of weight updates may be employed.
7. Illustration of the Method
In this section we describe solution of EVP using FFNN. To illustrate the method, we will consider the 2nd order EVP: where and the BC: , (Dirichlet case) or , (Neumann case) or , (Mixed case). A trial solution can be written as where is the output of a FFNN with one input unit for and weights .
Note. satisfies the BC by construction. The error quantity to be minimized is given by where the . Since it is straightforward to compute the gradient of the error with respect to the parameters using (5). The same holds for all subsequent model problems.
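The error quantity can be sketched for a concrete model problem. Everything specific here is an assumption made for illustration and is not taken from the examples in this paper: the model equation y'' + lam*y = 0 on [0, 1] with y(0) = y(1) = 0, the trial form y_t(x) = x(1 - x)N(x), and the function names. Here `lam` is simply passed in as an argument.

```python
import math


def net(x, w, b, v):
    """N, N', N'' for N(x) = sum_i v[i] * tanh(w[i]*x + b[i])."""
    n = dn = d2n = 0.0
    for wi, bi, vi in zip(w, b, v):
        s = math.tanh(wi * x + bi)
        ds = 1.0 - s * s                       # tanh'(z)
        n += vi * s
        dn += vi * wi * ds
        d2n += vi * wi * wi * (-2.0 * s * ds)  # uses tanh''(z) = -2 tanh(z) tanh'(z)
    return n, dn, d2n


def trial(x, w, b, v):
    """y_t and y_t'' for y_t(x) = x*(1-x)*N(x), which vanishes at x=0 and x=1."""
    n, dn, d2n = net(x, w, b, v)
    g, dg, d2g = x * (1.0 - x), 1.0 - 2.0 * x, -2.0
    y = g * n
    d2y = d2g * n + 2.0 * dg * dn + g * d2n    # product rule applied twice
    return y, d2y


def collocation_error(w, b, v, lam, points):
    """Sum over the training points of the squared residual y_t'' + lam*y_t."""
    total = 0.0
    for x in points:
        y, d2y = trial(x, w, b, v)
        total += (d2y + lam * y) ** 2
    return total


points = [i / 9 for i in range(10)]            # ten equidistant points in [0, 1]
```

Training then amounts to passing `collocation_error` to a minimizer; the residual is built from analytic derivatives of the network, so no finite differences are needed inside the error function.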
In this section we report numerical results, using a multilayer FFNN having one hidden layer with 5 hidden units (neurons) and one linear output unit. The sigmoid activation of each hidden unit is tansig. The analytic solution was known in advance; therefore we test the accuracy of the obtained solutions by computing the deviation:
In order to illustrate the characteristics of the solutions provided by the neural network method, we provide figures displaying the corresponding deviation both at the few points (training points) that were used for training and at many other points (test points) of the domain of the equation. The latter kind of figure is of major importance since it shows the interpolation capability of the neural solution, which is expected to be superior to that of solutions obtained by other methods. Moreover, we can consider points outside the training interval in order to obtain an estimate of the extrapolation performance of the obtained numerical solution.
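The accuracy check can be sketched as follows, taking the deviation to be the absolute difference between the approximate and analytic solutions (an assumption, since other definitions are possible). The functions `exact` and `approx` are hypothetical stand-ins for an analytic solution and a trained network; they are not results from this paper.

```python
import math


def equidistant_points(a, b, count):
    """Return `count` equally spaced points from a to b inclusive."""
    step = (b - a) / (count - 1)
    return [a + i * step for i in range(count)]


def deviations(approx, exact, points):
    """Absolute difference between approximate and analytic values."""
    return [abs(approx(x) - exact(x)) for x in points]


train_pts = equidistant_points(0.0, 1.0, 10)   # points used for training
test_pts = equidistant_points(0.0, 1.0, 101)   # denser grid to probe interpolation


def exact(x):
    """Stand-in analytic solution (hypothetical)."""
    return math.sin(math.pi * x)


def approx(x):
    """Stand-in for a trained network: exact plus a small artificial error."""
    return math.sin(math.pi * x) + 1e-4 * x * (1.0 - x)


max_train_dev = max(deviations(approx, exact, train_pts))
max_test_dev = max(deviations(approx, exact, test_pts))
```

Comparing `max_test_dev` with `max_train_dev` gives a rough indication of how well the approximation behaves between the training points.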
Example 1. Consider the following 2nd order EVP:
With BC (Dirichlet case), , .
The analytic solution is ; according to (8) the trial neural form of the solution is taken to be
The FFNN trained using a grid of ten equidistant points in gave ; Figure 1 displays the analytic and neural solutions with different training algorithms. The neural results for the training algorithms Levenberg-Marquardt (trainlm), quasi-Newton (trainbfg), and Bayesian regularization (trainbr) are given in Table 1 and their errors in Table 2; Table 3 gives the training performance in terms of epochs and time, and Table 4 gives the weights and biases of the designed network.
Example 2. Consider the following 2nd order EVP:
with BC (Dirichlet case), , .
The analytic solution is ; according to (7) the trial neural form of the solution is
The FFNN trained using a grid of ten equidistant points in gave . Figure 2 displays the analytic and neural solutions with different training algorithms. The neural results for the training algorithms Levenberg-Marquardt (trainlm), quasi-Newton (trainbfg), and Bayesian regularization (trainbr) are given in Table 5 and their errors in Table 6; Table 7 gives the training performance in terms of epochs and time, and Table 8 gives the weights and biases of the designed network.
Rasheed solved this problem using a semianalytic technique and the results are given in Table 9.
From the above problems it is clear that the proposed network can handle EVP effectively and provide an accurate approximate solution throughout the whole domain and not only at the training points. As is evident from the tables, the results of the proposed network are more precise than those of the method suggested in .
In general, the practical results show that, for FFNNs containing up to a few hundred weights, the Levenberg-Marquardt training algorithm (trainlm) converges fastest, followed by trainbfg and then trainbr. However, trainbr does not perform well on function approximation problems. The performance of the various algorithms can be affected by the accuracy required of the approximation.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
K. M. Mohammed, On solution of two point second order boundary value problems by using semi-analytic method [M.S. thesis], University of Baghdad, College of Education-Ibn-Al- Haitham, Baghdad, Iraq, 2009.
R. J. LeVeque, Finite Difference Methods for Differential Equations, University of Washington, AMath 585, Winter Quarter, Seattle, Wash, USA, 2006.
S. A. Hoda and H. A. Nagla, “On neural network methods for mixed boundary value problems,” International Journal of Nonlinear Science, vol. 11, no. 3, pp. 312–316, 2011.
K. S. Mc Fall and J. R. Mahan, “Investigation of weight reuse in multi-layer perceptron networks for accelerating the solution of differential equations,” in Proceedings of the IEEE International Conference on Neural Networks, vol. 14, pp. 109–114, 2004.
L. N. M. Tawfiq, Design and training artificial neural networks for solving differential equations [Ph.D. thesis], University of Baghdad, College of Education-Ibn-Al-Haitham, Baghdad, Iraq, 2004.
H. Akca, M. H. Al-Lail, and V. Covachev, “Survey on wavelet transform and application in ODE and wavelet networks,” Advances in Dynamical Systems and Applications, vol. 1, no. 2, pp. 129–162, 2006.
K. S. Mc Fall, An artificial neural network method for solving boundary value problems with arbitrary irregular boundaries [Ph.D. thesis], Georgia Institute of Technology, Atlanta, Ga, USA, 2006.
A. Junaid, M. A. Z. Raja, and I. M. Qureshi, “Evolutionary computing approach for the solution of initial value problems in ordinary differential equations,” World Academy of Science, Engineering and Technology, vol. 55, pp. 578–581, 2009.
R. M. A. Zahoor, J. A. Khan, and I. M. Qureshi, “Evolutionary computation technique for solving Riccati differential equation of arbitrary order,” World Academy of Science, Engineering and Technology, vol. 58, pp. 303–308, 2009.
J. Abdul Samath, P. S. Kumar, and A. Begum, “Solution of linear electrical circuit problem using neural networks,” International Journal of Computer Applications, vol. 2, no. 1, pp. 6–13, 2010.
K. I. Ibraheem and B. M. Khalaf, “Shooting neural networks algorithm for solving boundary value problems in ODEs,” Applications and Applied Mathematics, vol. 6, no. 11, pp. 1927–1941, 2011.
Y. A. Oraibi, Design feed forward neural networks for solving ordinary initial value problem [M.S. thesis], University of Baghdad, College of Education-Ibn-Al-Haitham, Baghdad, Iraq, 2011.
M. H. Ali, Design fast feed forward neural networks to solve two point boundary value problems [M.S. thesis], University of Baghdad, College of Education-Ibn-Al-Haitham, Baghdad, Iraq, 2012.
I. A. Galushkin, Neural Networks Theory, Springer, Berlin, Germany, 2007.
K. Mehrotra, C. K. Mohan, and S. Ranka, Elements of Artificial Neural Networks, Springer, New York, NY, USA, 1996.
A. Ghaffari, H. Abdollahi, M. R. Khoshayand, I. S. Bozchalooi, A. Dadgar, and M. Rafiee-Tehrani, “Performance comparison of neural network training algorithms in modeling of bimodal drug delivery,” International Journal of Pharmaceutics, vol. 327, no. 1-2, pp. 126–138, 2006.
H. W. Rasheed, Efficient semi-analytic technique for solving second order singular ordinary boundary value problems [M.S. thesis], University of Baghdad, College of Education-Ibn-Al-Haitham, Baghdad, Iraq, 2011.