Comparison of Artificial Neural Network Architecture in Solving Ordinary Differential Equations

Mall, Susmita; Chakraverty, S.

doi:https://doi.org/10.1155/2013/181895

Advances in Artificial Neural Systems

Research Article | Open Access

Volume 2013 | Article ID 181895 | https://doi.org/10.1155/2013/181895

Academic Editor: Ping Feng Pai

Received 08 Aug 2013

Revised 31 Oct 2013

Accepted 31 Oct 2013

Published 15 Dec 2013

Abstract

This paper investigates the solution of ordinary differential equations (ODEs) with initial conditions using a regression-based algorithm (RBA) and compares the results obtained with arbitrary and regression-based initial weights for different numbers of nodes in the hidden layer. We use a feed forward neural network and the error back propagation method to minimize the error function and to modify the parameters (weights and biases). Initial weights are taken as random values, as values generated by the proposed regression-based model, or as a combination of the two. We present the method for solving a variety of problems and compare the results. The number of nodes in the hidden layer is fixed according to the degree of the polynomial in the regression fitting: the input and output data are first fitted with polynomials of various degrees using regression analysis, and the coefficients involved are taken as initial weights to start the neural training. For the example problems, the analytical results have been compared with neural results with arbitrary and regression-based weights with four, five, and six nodes in the hidden layer and are found to be in good agreement.

1. Introduction

Differential equations play a vital role in various fields of engineering and science. An exact solution of a differential equation may not always be possible [1]. So various well-known numerical methods, such as Euler, Runge-Kutta, predictor-corrector, finite element, and finite difference methods, are used for solving these equations. Although these numerical methods provide good approximations to the solution, they may be challenging for higher dimensional problems. In recent years, many researchers have tried to find new methods for solving differential equations. As such, Artificial Neural Network (ANN) based models are used here to solve ordinary differential equations with initial conditions.

Lee and Kang [2] first introduced a method to solve first order differential equations using Hopfield neural network models. Then, Meade and Fernandez [3, 4] proposed another approach for both linear and nonlinear differential equations using $B_1$-splines and feed forward neural networks. Artificial neural networks based on the Broyden-Fletcher-Goldfarb-Shanno (BFGS) optimization technique for solving ordinary and partial differential equations have been presented by Lagaris et al. [5]. Lagaris et al. [6] also investigated neural network methods for boundary value problems with irregular boundaries. Parisi et al. [7] presented an unsupervised feed forward neural network for the solution of differential equations. The potential of a hybrid neural network and optimization technique to deal with differential equations of lower as well as higher order has been presented by Malek and Shekari Beidokhti [8]. Choi and Lee [9] compared the generalization ability of back propagation and reformulated radial basis function networks in solving differential equations. Yazdi et al. [10] used an unsupervised kernel least mean square algorithm for solving ordinary differential equations. A new algorithm for solving matrix Riccati differential equations has been developed by Selvaraju and Abdul Samant [11]. He et al. [12] investigated a class of partial differential equations using multilayer neural networks. Kumar and Yadav [13] surveyed multilayer perceptron and radial basis function neural network methods for the solution of differential equations. Tsoulos et al. [14] solved differential equations with neural networks using a scheme based on grammatical evolution. Numerical solution of elliptic partial differential equations using radial basis function neural networks has been presented by Jianyu et al. [15]. Shirvany et al. [16] proposed multilayer perceptron and radial basis function (RBF) neural networks with a new unsupervised training method for numerical solution of partial differential equations. Mai-Duy and Tran-Cong [17] discussed numerical solution of differential equations using multiquadric radial basis function networks. A fuzzy linguistic model in a neural network to solve differential equations is presented by Leephakpreeda [18]. Franke and Schaback [19] solved partial differential equations by collocation using radial basis functions. Smaoui and Al-Enezi [20] presented the dynamics of two nonlinear partial differential equations using artificial neural networks. Differential equations with genetic programming have been analyzed by Tsoulos and Lagaris [21]. McFall and Mahan [22] used an artificial neural network for the solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. Hoda and Nagla [23] solved mixed boundary value problems using a multilayer perceptron neural network method.

The literature review reveals that authors have taken the parameters (weights/biases) as arbitrary (random) and have chosen the number of nodes in the hidden layer by trial and error. In this paper, we propose a method for solving ordinary differential equations using a feed forward neural network as the basic approximation element and the error back propagation algorithm [24, 25], fixing the hidden nodes as per the required accuracy. The trial solution of the model is generated by training the algorithm. The approximate solution by ANN has several benefits compared with traditional numerical methods. The ANN trial solution is written as a sum of two terms: the first satisfies the initial/boundary conditions and the second involves a regression-based neural network with adjustable parameters. The computational complexity does not increase considerably with the number of sampling points. The method is general, so it can be applied to solve linear and nonlinear ordinary and partial differential equations. The parameters are modified without direct use of an optimization technique; this requires computing the gradient of the error with respect to the network parameters. A regression-based artificial neural network with combinations of initial weights (arbitrary and regression based) in the connections was first proposed by Chakraverty et al. [26] and then by Singh et al. [27]. Here, the number of nodes in the hidden layer may be fixed according to the degree of polynomial required for the accuracy. We have considered a first order problem and an application problem, namely a damped free vibration problem, to compare the different ANN models. Mall and Chakraverty [28] proposed a regression-based neural network model for solving ordinary differential equations.

The rest of the paper is organized as follows. In Section 2, we describe the general formulation of the proposed approach and the computation of the gradient of the error function. Section 3 gives details of the problem formulation and the construction of the appropriate form of trial solution. The proposed regression-based artificial neural network method is presented in Section 4. Numerical examples and their results are presented in Section 5, where we compare results for arbitrary and regression-based weights and show them graphically. Section 6 contains the discussion and analysis. Lastly, the conclusion is outlined in Section 7.

2. General Formulation for Differential Equations

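Let us consider the following general differential equation, which represents both ordinary and partial differential equations [4]:

$$G\bigl(x, \psi(x), \nabla\psi(x), \nabla^{2}\psi(x), \ldots\bigr) = 0, \quad x \in D,$$

subject to some initial or boundary conditions, where $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$, $D \subset \mathbb{R}^n$ denotes the domain, and $\psi(x)$ is the solution to be computed. Here, $G$ is the function which defines the structure of the differential equation and $\nabla$ is a differential operator. For the solution of the differential equation, a discretized domain $\hat{D}$ over a finite set of points in $\mathbb{R}^n$ is considered. Thus, the problem is transformed into the system of equations

$$G\bigl(x_j, \psi(x_j), \nabla\psi(x_j), \nabla^{2}\psi(x_j), \ldots\bigr) = 0, \quad x_j \in \hat{D}.$$

Let $\psi_t(x, p)$ denote the trial solution with adjustable parameters (weights, biases) $p$; then the problem may be formulated as

$$\min_{p} \sum_{x_j \in \hat{D}} \Bigl( G\bigl(x_j, \psi_t(x_j, p), \nabla\psi_t(x_j, p), \nabla^{2}\psi_t(x_j, p)\bigr) \Bigr)^{2}.$$

The corresponding error function with respect to every input data is written as

$$E(p) = \sum_{x_j \in \hat{D}} \Bigl( G\bigl(x_j, \psi_t(x_j, p), \nabla\psi_t(x_j, p), \nabla^{2}\psi_t(x_j, p)\bigr) \Bigr)^{2}.$$

Now, $\psi_t(x, p)$ may be written as the sum of two terms,

$$\psi_t(x, p) = A(x) + F\bigl(x, N(x, p)\bigr),$$

where $A(x)$ satisfies the initial or boundary conditions and contains no adjustable parameters, whereas $N(x, p)$ is the output of a feed forward neural network with parameters $p$ and input data $x$. The second term $F\bigl(x, N(x, p)\bigr)$ makes no contribution to the initial or boundary conditions; it employs a neural network model whose weights and biases are adjusted to minimize the error function.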

2.1. Computation of the Gradient

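The error computation involves not only the network output but also the derivatives of the network output with respect to its inputs. So, it requires the gradient, with respect to the network parameters, of the derivatives of the network with respect to its inputs. Let us now consider a multilayered perceptron with $n$ input units (a single input node, $n = 1$, for the ordinary differential equations treated here), a hidden layer with $m$ nodes (a fixed number of nodes, as proposed), and one output unit. For the given inputs $x = (x_1, \ldots, x_n)$, the output is given by

$$N(x, p) = \sum_{j=1}^{m} v_j\, \sigma(z_j), \qquad z_j = \sum_{i=1}^{n} w_{ji} x_i + u_j,$$

where $w_{ji}$ denotes the weight from input unit $i$ to hidden unit $j$, $v_j$ denotes the weight from hidden unit $j$ to the output unit, $u_j$ denotes the biases, and $\sigma(z_j)$ is the sigmoid activation function.

The derivatives of $N(x, p)$ with respect to the input $x_i$ are

$$\frac{\partial^{k} N}{\partial x_i^{k}} = \sum_{j=1}^{m} v_j\, w_{ji}^{k}\, \sigma_j^{(k)},$$

where $\sigma_j = \sigma(z_j)$ and $\sigma^{(k)}$ denotes the $k$th order derivative of the sigmoid function.

Let $N_\delta$ denote the derivative of the network with respect to its inputs, of order $\lambda_i$ in $x_i$; then we have the following relation [4]:

$$N_\delta = \frac{\partial^{\lambda_1}}{\partial x_1^{\lambda_1}} \cdots \frac{\partial^{\lambda_n}}{\partial x_n^{\lambda_n}} N = \sum_{j=1}^{m} v_j\, P_j\, \sigma_j^{(\Lambda)},$$

where

$$P_j = \prod_{k=1}^{n} w_{jk}^{\lambda_k}, \qquad \Lambda = \sum_{i=1}^{n} \lambda_i.$$

The derivatives of $N_\delta$ with respect to the other parameters may be obtained as

$$\frac{\partial N_\delta}{\partial v_j} = P_j\, \sigma_j^{(\Lambda)}, \qquad \frac{\partial N_\delta}{\partial u_j} = v_j\, P_j\, \sigma_j^{(\Lambda+1)},$$

$$\frac{\partial N_\delta}{\partial w_{ji}} = x_i\, v_j\, P_j\, \sigma_j^{(\Lambda+1)} + v_j\, \lambda_i\, w_{ji}^{\lambda_i - 1} \Bigl( \prod_{k \neq i} w_{jk}^{\lambda_k} \Bigr) \sigma_j^{(\Lambda)}.$$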
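As an illustration of the derivative computation in this section, the following minimal sketch evaluates the output of a one-input sigmoid network and its first derivative with respect to the input, and compares the result with a central finite difference. The names `network`, `network_dx`, `w`, `v`, and `u`, the random parameter values, and the choice of six hidden nodes are assumptions made for the example only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def network(x, w, v, u):
    """N(x) = sum_j v_j * sigmoid(w_j * x + u_j) for a one-input network."""
    z = np.outer(np.atleast_1d(x), w) + u      # shape (points, hidden nodes)
    return sigmoid(z) @ v

def network_dx(x, w, v, u):
    """dN/dx = sum_j v_j * w_j * sigmoid'(z_j), with sigmoid' = s * (1 - s)."""
    z = np.outer(np.atleast_1d(x), w) + u
    s = sigmoid(z)
    return (s * (1.0 - s)) @ (v * w)

# Illustrative random parameters for six hidden nodes (not values from the paper).
rng = np.random.default_rng(0)
w, v, u = rng.normal(size=(3, 6))

x = np.linspace(0.0, 1.0, 5)
h = 1e-6
finite_difference = (network(x + h, w, v, u) - network(x - h, w, v, u)) / (2 * h)
max_gap = np.max(np.abs(finite_difference - network_dx(x, w, v, u)))
print("largest gap between analytic and finite-difference derivative:", max_gap)
```

The closed-form derivative avoids repeated network evaluations, which matters because the error function is built from derivatives of the network at every training point.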

3. Formulation of First Order Ordinary Differential Equation

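Let us consider a first order ordinary differential equation

$$\frac{d\psi}{dx} = f(x, \psi), \quad x \in [a, b],$$

with initial condition $\psi(a) = A$.

In this case, the ANN trial solution may be written as

$$\psi_t(x, p) = A + (x - a)\, N(x, p),$$

where $N(x, p)$ is the neural output of the feed forward network with one input $x$ and parameters $p$. The trial solution satisfies the initial condition. We differentiate the trial solution to get

$$\frac{d\psi_t(x, p)}{dx} = N(x, p) + (x - a)\, \frac{dN(x, p)}{dx}.$$

For evaluating the derivative term $dN/dx$ on the right-hand side, we use the network derivative relations of Section 2.1.

The error function for this case may be formulated as

$$E(p) = \sum_{i} \left( \frac{d\psi_t(x_i, p)}{dx} - f\bigl(x_i, \psi_t(x_i, p)\bigr) \right)^{2}.$$

The weights from input to hidden are modified according to the rule

$$w_j^{k+1} = w_j^{k} + \Delta w_j^{k},$$

where

$$\Delta w_j^{k} = -\eta\, \frac{\partial E}{\partial w_j^{k}}.$$

Here, $\eta$ is the learning rate and $k$ is the iteration step. The weights from the hidden to the output layer are updated in the same way as those from input to hidden.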
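The training loop for a first order problem can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the test equation $d\psi/dx = -\psi$ with $\psi(0) = 1$, the learning rate, the iteration count, and the random initial parameters are all hypothetical, and the gradient is approximated by central differences rather than by the back propagation formulas.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def trial_and_derivative(x, p, A=1.0):
    """Return psi_t = A + x*N and d(psi_t)/dx = N + x*dN/dx (initial point a = 0)."""
    w, v, u = p.reshape(3, -1)
    s = sigmoid(np.outer(x, w) + u)
    N = s @ v
    dN = (s * (1.0 - s)) @ (v * w)
    return A + x * N, N + x * dN

def error(p, x, f):
    """Sum of squared ODE residuals over the training points."""
    psi, dpsi = trial_and_derivative(x, p)
    return np.sum((dpsi - f(x, psi)) ** 2)

def numerical_gradient(p, x, f, h=1e-6):
    """Central-difference gradient; a stand-in for the back propagation formulas."""
    g = np.zeros_like(p)
    for k in range(p.size):
        step = np.zeros_like(p)
        step[k] = h
        g[k] = (error(p + step, x, f) - error(p - step, x, f)) / (2 * h)
    return g

f = lambda x, psi: -psi                  # hypothetical test problem, exact solution exp(-x)
x = np.linspace(0.0, 1.0, 20)            # 20 equidistant training points
rng = np.random.default_rng(0)
p = rng.normal(scale=0.5, size=18)       # six hidden nodes: w, v, u stacked in one vector
eta = 1e-3                               # learning rate (illustrative value)

initial_error = error(p, x, f)
for _ in range(1000):
    p = p - eta * numerical_gradient(p, x, f)   # update rule: p <- p - eta * dE/dp
final_error = error(p, x, f)
psi_at_zero = trial_and_derivative(np.array([0.0]), p)[0][0]
print(initial_error, final_error, psi_at_zero)
```

Because the trial solution multiplies the network output by $x$, the initial condition holds for every parameter value, so the training only has to reduce the residual of the differential equation.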

3.1. Formulation of Second Order Ordinary Differential Equation

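In this case, the second order ordinary differential equation may be written in general as

$$\frac{d^{2}\psi}{dx^{2}} = f\!\left(x, \psi, \frac{d\psi}{dx}\right), \quad x \in [a, b],$$

with initial conditions $\psi(a) = A$, $\psi'(a) = A'$.

The ANN trial solution may be written as

$$\psi_t(x, p) = A + A'(x - a) + (x - a)^{2}\, N(x, p),$$

where $N(x, p)$ is the neural output of the feed forward network with one input $x$ and parameters $p$, and the trial solution satisfies the initial conditions.

The error function to be minimized for the second order ordinary differential equation is

$$E(p) = \sum_{i} \left( \frac{d^{2}\psi_t(x_i, p)}{dx^{2}} - f\!\left(x_i, \psi_t(x_i, p), \frac{d\psi_t(x_i, p)}{dx}\right) \right)^{2}.$$

Next, the following weight updating rule is applied for the weights from input to hidden connections:

$$w_j^{k+1} = w_j^{k} + \Delta w_j^{k},$$

where

$$\Delta w_j^{k} = -\eta\, \frac{\partial E}{\partial w_j^{k}}.$$

Again, the weights from the hidden to the output layer are updated as discussed for input to hidden.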
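The sketch below illustrates a trial solution of the usual second order form, $\psi_t = A + A'(x - a) + (x - a)^2 N$, together with its first and second derivatives, and checks that both initial conditions hold for arbitrary network parameters. The initial values, the random parameters, and the right-hand side `f` (a damped oscillator with made-up coefficients) are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def net_derivs(x, w, v, u):
    """Return N, dN/dx, d2N/dx2 for a one-input sigmoid network."""
    s = sigmoid(np.outer(x, w) + u)
    s1 = s * (1.0 - s)                  # first derivative of the sigmoid
    s2 = s1 * (1.0 - 2.0 * s)           # second derivative of the sigmoid
    return s @ v, s1 @ (v * w), s2 @ (v * w**2)

def trial(x, w, v, u, a=0.0, A=1.0, A1=0.5):
    """Return psi_t, its first derivative, and its second derivative."""
    N, dN, d2N = net_derivs(x, w, v, u)
    t = x - a
    psi = A + A1 * t + t**2 * N
    dpsi = A1 + 2.0 * t * N + t**2 * dN
    d2psi = 2.0 * N + 4.0 * t * dN + t**2 * d2N
    return psi, dpsi, d2psi

rng = np.random.default_rng(1)
w, v, u = rng.normal(size=(3, 6))       # arbitrary parameters, six hidden nodes

# Both initial conditions hold regardless of the parameter values.
psi0, dpsi0, _ = trial(np.array([0.0]), w, v, u)

# Residual of a hypothetical equation psi'' = f(x, psi, psi') on 40 points.
f = lambda x, y, dy: -4.0 * dy - 4.0 * y    # made-up damped oscillator coefficients
x = np.linspace(0.0, 1.0, 40)
psi, dpsi, d2psi = trial(x, w, v, u)
E = np.sum((d2psi - f(x, psi, dpsi)) ** 2)
print(psi0[0], dpsi0[0], E)
```

The factor $(x - a)^2$ removes the network's influence on both the value and the slope at $x = a$, which is why no penalty term for the initial conditions appears in the error function.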

4. Proposed Regression-Based Algorithm

A three-layer ANN architecture is considered for the present problem. Usually, the number of nodes in the hidden layer is chosen by trial and error. Here, we fix the number of nodes in the hidden layer by using regression-based weight generation [24, 25]. Figure 1 shows the proposed model, in which the input layer consists of a single input unit and the output layer consists of one output unit. The number of nodes in the hidden layer is fixed according to the degree of the polynomial considered. If an $n$th degree polynomial is considered, then the number of nodes in the hidden layer is $n + 1$, and the coefficients (constants) of the polynomial may be taken as initial weights from input to hidden as well as from hidden to output layers, or in any combination of random and regression-based weights. The network architecture for a fifth degree polynomial is shown in Figure 1: the six coefficients (constants) of the polynomial are taken as initial weights in two stages, from input to hidden and from hidden to output layer, and six nodes, one for each constant, are considered in the hidden layer.
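The weight generation step can be sketched with an ordinary least-squares polynomial fit. The helper name `regression_initial_weights` and the sample data (values of $e^{-x}$ on 20 points) are hypothetical; the text does not specify which input and output data are fitted, so this only illustrates how a degree $n$ fit yields $n + 1$ coefficients, one per hidden node.

```python
import numpy as np

def regression_initial_weights(x, y, degree):
    """Least-squares polynomial fit; returns degree + 1 coefficients, constant term first."""
    return np.polynomial.polynomial.polyfit(x, y, degree)

# Hypothetical input/output data used only to demonstrate the fit.
x = np.linspace(0.0, 1.0, 20)
y = np.exp(-x)

coeffs = regression_initial_weights(x, y, degree=5)
hidden_nodes = coeffs.size              # fifth degree polynomial -> six hidden nodes

# The same coefficients seed both stages of connections.
w_initial = coeffs.copy()               # input-to-hidden weights
v_initial = coeffs.copy()               # hidden-to-output weights
print(hidden_nodes, w_initial)
```

A combination of random and regression-based weights, as mentioned above, would replace one of the two copies with random values.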

5. Numerical Examples

In this section, we present solution of two example problems as mentioned earlier. In all cases, we have used error back propagation algorithm and one hidden layer. The weights are taken as arbitrary and regression based for comparison of the training method. Sigmoid function is considered as an activation function for hidden unit.

Example 1. Let us consider the first order ordinary differential equation as follows: with initial condition .
The trial solution takes the first order form of Section 3. We have trained the network for 20 equidistant points in [0, 1] and compared analytical and neural results with arbitrary and regression-based weights for four, five, and six nodes fixed in the hidden layer. This comparison is given in Table 1. Analytical results are given in the second column. Neural results for arbitrary weights (from input to hidden layer and from hidden to output layer) with four, five, and six nodes are given in the third, fifth, and seventh columns, respectively. Similarly, neural results with regression-based weights (from input to hidden layer and from hidden to output layer) with four, five, and six nodes are given in the fourth, sixth, and ninth columns, respectively.

Analytical and neural results with arbitrary and regression-based weights for six nodes in the hidden layer are compared in Figures 2 and 3. The error plot is shown in Figure 4. Absolute deviations in percent are listed in Table 1; the maximum deviation of the neural results with six hidden nodes is 3.67 for arbitrary weights (eighth column) and 1.47 for regression-based weights (tenth column). From Figures 2 and 3, one may see that the results from the regression-based weights agree closely with the analytical results at all points, whereas the results with arbitrary weights do not. Thus, the neural results with regression-based weights are more accurate.

Increasing the number of nodes in the hidden layer from four to six improves the results. The authors also increased the number of nodes in the hidden layer beyond six, but the results did not improve further.

The first problem has also been solved by the well-known Euler and Runge-Kutta numerical methods. Table 2 compares the neural results (with six hidden nodes) with these numerical results.
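For reference, the two classical schemes used in this comparison can be sketched as below. The test equation $dy/dx = -y$, $y(0) = 1$ and the step count are hypothetical choices for illustration and are not the problem of Example 1; the Runge-Kutta variant shown is the standard fourth order scheme, which is an assumption since the text does not state the order used.

```python
import math

def euler(f, x0, y0, x_end, n):
    """Explicit Euler method with n equal steps from x0 to x_end."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y

def rk4(f, x0, y0, x_end, n):
    """Classical fourth order Runge-Kutta method with n equal steps."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

f = lambda x, y: -y                      # hypothetical test problem, exact solution exp(-x)
exact = math.exp(-1.0)
euler_error = abs(euler(f, 0.0, 1.0, 1.0, 10) - exact)
rk4_error = abs(rk4(f, 0.0, 1.0, 1.0, 10) - exact)
print(euler_error, rk4_error)
```

Both schemes return values only at the grid points fixed by the step size, so a value between grid points requires a new run or interpolation.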

Example 2. Let us consider the following second order damped free vibration equation: With initial conditions , .
As discussed above, the trial solution takes the second order form of Section 3.1. Then, the network is trained for 40 equidistant points in the given domain with four, five, and six hidden nodes according to the arbitrary and regression-based algorithms. In Table 3, we compare the analytical solutions with neural solutions taking arbitrary and regression-based weights for four, five, and six nodes in the hidden layer. Analytical results are given in the second column of Table 3. Neural results for arbitrary weights (from input to hidden layer and from hidden to output layer) with four, five, and six nodes are shown in the third, fifth, and seventh columns, respectively. Neural results with regression-based weights (from input to hidden layer and from hidden to output layer) with four, five, and six nodes are given in the fourth, sixth, and eighth columns, respectively.
Analytical and neural results obtained for random initial weights are depicted in Figure 5. Figure 6 compares analytical and neural results for regression-based initial weights with six hidden nodes. Finally, the error plot between analytical and RBNN results is shown in Figure 7.

Example 3. Now we consider an initial value problem as follows: subject to .
The ANN trial solution takes the first order form of Section 3. The network is trained for ten equidistant points in the given domain with four, five, and six hidden nodes according to the arbitrary and regression-based algorithms. Analytical and neural results with arbitrary and regression-based weights are compared in Table 4. Other numerical results, namely, Euler and Runge-Kutta results, are also compared with the neural results in this table.
Analytical and traditional neural results obtained using random initial weights with six nodes are depicted in Figure 8. Similarly, Figure 9 compares analytical and neural results with regression-based initial weights for six hidden nodes. Finally, the error plot between analytical and RBNN results is shown in Figure 10.

Example 4. Here, we consider a standard differential equation which represents exponential growth as follows: with initial condition .
Here represents time constant or characteristic time.
Analytic result may be found as Considering , we have the analytical solution as .
The ANN trial solution in this case takes the first order form of Section 3. The network is trained for ten equidistant points in the domain [0, 1] with four, five, and six hidden nodes according to the arbitrary and regression-based algorithms. Analytical and neural results with arbitrary and regression-based weights are compared in Table 5. Analytical and traditional neural results obtained using random initial weights with six nodes are shown in Figure 11. Figure 12 compares analytical and neural results with regression-based initial weights for six hidden nodes. The error plot between analytical and RBNN results is shown in Figure 13.

6. Discussion and Analysis

In a traditional artificial neural network, the parameters (weights/biases) are usually taken as arbitrary (random) and the number of nodes in the hidden layer is chosen by trial and error. Also, a few authors have used optimization techniques to minimize the error. In this investigation, a regression-based artificial neural network with combinations of initial weights (arbitrary and regression based) in the connections is considered. We have fixed the number of nodes in the hidden layer according to the degree of the polynomial in the regression fitting. The initial weights from input to hidden and from hidden to output layer are taken by using regression-based weight generation. The back propagation algorithm has been employed for modification of the parameters without use of any optimization technique. Also, the computation time is less than that of the traditional artificial neural architecture. Table 6 shows the training time in hours with four, five, and six hidden nodes.

It is well known that the other numerical methods are usually iterative in nature, with the step size fixed before the start of the computation. After the solution is obtained, if we want to know the solution in between steps, the procedure has to be repeated from the initial stage. ANN offers a way to avoid this repetition of iterations. The authors are not claiming that the method presented is the most accurate. As the comparison in Tables 2 and 4 shows, the Runge-Kutta method gives better results, but it has to be repeated for each step size. Here, after obtaining the converged ANN, we may use it as a black box to get numerical results at any arbitrary point in the domain.

Here, we have considered polynomials of degree three, four, and five for regression fitting. One may consider higher degree polynomials in the simulation, but it has been seen that increasing the degree of the polynomial does not usually increase the accuracy. A methodology is still needed for deciding what degree of polynomial one should use to get a result with acceptable accuracy. This is, however, beyond the scope of this paper; the authors are working in this direction and hope to communicate the findings in the future.

7. Conclusion

This paper presents a new approach to solve ordinary differential equations by using a regression-based artificial neural network model. The accuracy of the proposed method has been examined by solving first order problems and a second order damped free vibration problem. The main value of the paper is that the number of nodes in the hidden layer is fixed according to the degree of the polynomial in the regression. Accordingly, different neural architectures corresponding to different regression models are compared. Moreover, the algorithm is unsupervised, and the error back propagation algorithm is used to minimize the error function. The corresponding initial weights from input to hidden and from hidden to output are all obtained by the proposed procedure. The trial solution is closed and differentiable. One may see from the tables and graphs that the initial weights generated by the regression model make the results more accurate. Lastly, it may be mentioned that the implemented Regression Based Neural Network (RBNN) algorithm is simple, computationally efficient, and straightforward.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The first author is thankful to the Department of Science and Technology (DST), Government of India for the financial support under Women Scientist Scheme-A.

References

[1] H. J. Ricardo, A Modern Introduction to Differential Equations, Elsevier, 2nd edition, 2009.
[2] H. Lee and I. S. Kang, “Neural algorithm for solving differential equations,” Journal of Computational Physics, vol. 91, no. 1, pp. 110–131, 1990.
[3] A. J. Meade Jr. and A. A. Fernandez, “The numerical solution of linear ordinary differential equations by feedforward neural networks,” Mathematical and Computer Modelling, vol. 19, no. 12, pp. 1–25, 1994.
[4] A. J. Meade Jr. and A. A. Fernandez, “Solution of nonlinear ordinary differential equations by feedforward neural networks,” Mathematical and Computer Modelling, vol. 20, no. 9, pp. 19–44, 1994.
[5] I. E. Lagaris, A. Likas, and D. I. Fotiadis, “Artificial neural networks for solving ordinary and partial differential equations,” IEEE Transactions on Neural Networks, vol. 9, no. 5, pp. 987–1000, 1998.
[6] I. E. Lagaris, A. C. Likas, and D. G. Papageorgiou, “Neural-network methods for boundary value problems with irregular boundaries,” IEEE Transactions on Neural Networks, vol. 11, no. 5, pp. 1041–1049, 2000.
[7] D. R. Parisi, M. C. Mariani, and M. A. Laborde, “Solving differential equations with unsupervised neural networks,” Chemical Engineering and Processing, vol. 42, no. 8-9, pp. 715–721, 2003.
[8] A. Malek and R. Shekari Beidokhti, “Numerical solution for high order differential equations using a hybrid neural network-Optimization method,” Applied Mathematics and Computation, vol. 183, no. 1, pp. 260–271, 2006.
[9] B. Choi and J.-H. Lee, “Comparison of generalization ability on solving differential equations using backpropagation and reformulated radial basis function networks,” Neurocomputing, vol. 73, no. 1–3, pp. 115–118, 2009.
[10] H. S. Yazdi, M. Pakdaman, and H. Modaghegh, “Unsupervised kernel least mean square algorithm for solving ordinary differential equations,” Neurocomputing, vol. 74, no. 12-13, pp. 2062–2071, 2011.
[11] N. Selvaraju and J. Abdul Samant, “Solution of matrix Riccati differential equation for nonlinear singular system using neural networks,” International Journal of Computer Applications, vol. 29, pp. 48–54, 2010.
[12] S. He, K. Reif, and R. Unbehauen, “Multilayer neural networks for solving a class of partial differential equations,” Neural Networks, vol. 13, no. 3, pp. 385–396, 2000.
[13] M. Kumar and N. Yadav, “Multilayer perceptrons and radial basis function neural network methods for the solution of differential equations: a survey,” Computers and Mathematics with Applications, vol. 62, no. 10, pp. 3796–3811, 2011.
[14] I. G. Tsoulos, D. Gavrilis, and E. Glavas, “Solving differential equations with constructed neural networks,” Neurocomputing, vol. 72, no. 10–12, pp. 2385–2391, 2009.
[15] L. Jianyu, L. Siwei, Q. Yingjian, and H. Yaping, “Numerical solution of elliptic partial differential equation using radial basis function neural networks,” Neural Networks, vol. 16, no. 5-6, pp. 729–734, 2003.
[16] Y. Shirvany, M. Hayati, and R. Moradian, “Multilayer perceptron neural networks with novel unsupervised training method for numerical solution of the partial differential equations,” Applied Soft Computing Journal, vol. 9, no. 1, pp. 20–29, 2009.
[17] N. Mai-Duy and T. Tran-Cong, “Numerical solution of differential equations using multiquadric radial basis function networks,” Neural Networks, vol. 14, no. 2, pp. 185–199, 2001.
[18] T. Leephakpreeda, “Novel determination of differential-equation solutions: universal approximation method,” Journal of Computational and Applied Mathematics, vol. 146, no. 2, pp. 443–457, 2002.
[19] C. Franke and R. Schaback, “Solving partial differential equations by collocation using radial basis functions,” Applied Mathematics and Computation, vol. 93, no. 1, pp. 73–82, 1998.
[20] N. Smaoui and S. Al-Enezi, “Modelling the dynamics of nonlinear partial differential equations using neural networks,” Journal of Computational and Applied Mathematics, vol. 170, no. 1, pp. 27–58, 2004.
[21] I. G. Tsoulos and I. E. Lagaris, “Solving differential equations with genetic programming,” Genetic Programming and Evolvable Machines, vol. 7, no. 1, pp. 33–54, 2006.
[22] K. S. McFall and J. R. Mahan, “Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions,” IEEE Transactions on Neural Networks, vol. 20, no. 8, pp. 1221–1233, 2009.
[23] S. A. Hoda and H. A. Nagla, “Neural network methods for mixed boundary value problems,” International Journal of Nonlinear Science, vol. 11, pp. 312–316, 2011.
[24] J. M. Zurada, Introduction to Artificial Neural Network, West Publishing, 1994.
[25] S. Haykin, Neural Networks a Comprehensive Foundation, Prentice Hall, New York, NY, USA, 1999.
[26] S. Chakraverty, V. P. Singh, and R. K. Sharma, “Regression based weight generation algorithm in neural network for estimation of frequencies of vibrating plates,” Computer Methods in Applied Mechanics and Engineering, vol. 195, no. 33–36, pp. 4194–4202, 2006.
[27] V. P. Singh, S. Chakraverty, R. K. Sharma, and G. K. Sharma, “Modeling vibration frequencies of annular plates by regression based neural network,” Applied Soft Computing Journal, vol. 9, no. 1, pp. 439–447, 2009.
[28] S. Mall and S. Chakraverty, “Regression Based Neural network training for the solution of ordinary differential equations,” International Journal of Mathematical Modelling and Numerical Optimisation, vol. 4, pp. 136–149, 2013.

Copyright

Copyright © 2013 Susmita Mall and S. Chakraverty. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
