Computational Intelligence and Neuroscience
Volume 2015, Article ID 818243, 13 pages
http://dx.doi.org/10.1155/2015/818243
Review Article

On Training Efficiency and Computational Costs of a Feed Forward Neural Network: A Review

Department of Engineering, Roma Tre University, Via Vito Volterra 62, 00146 Rome, Italy

Received 7 May 2015; Revised 16 August 2015; Accepted 17 August 2015

Academic Editor: Saeid Sanei

Copyright © 2015 Antonino Laudani et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
