Computational Intelligence and Neuroscience
Volume 2014, Article ID 746376, 6 pages
http://dx.doi.org/10.1155/2014/746376
Research Article

A Novel Single Neuron Perceptron with Universal Approximation and XOR Computation Properties

Ehsan Lotfi1 and M.-R. Akbarzadeh-T.2

1Department of Computer Engineering, Torbat-e-Jam Branch, Islamic Azad University, Torbat-e-Jam, Iran
2Electrical and Computer Engineering Departments, Center of Excellence on Soft Computing and Intelligent Information Processing, Ferdowsi University of Mashhad, Iran

Received 9 February 2014; Accepted 7 April 2014; Published 28 April 2014

Academic Editor: Cheng-Jian Lin

Copyright © 2014 Ehsan Lotfi and M.-R. Akbarzadeh-T. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

We propose a biologically motivated, brain-inspired single neuron perceptron (SNP) with universal approximation and XOR computation properties. This computational model extends the input pattern and is based on excitatory and inhibitory learning rules inspired by neural connections in the human brain's nervous system. The resulting SNP architecture can be trained by supervised excitatory and inhibitory online learning rules. The main features of the proposed single-layer perceptron are its universal approximation property and low computational complexity. The method is tested on 6 UCI (University of California, Irvine) pattern recognition and classification datasets. Various comparisons with a multilayer perceptron (MLP) trained by the gradient descent backpropagation (GDBP) learning algorithm indicate the superiority of the approach in terms of higher accuracy, lower time and space complexity, and faster training. Hence, we believe the proposed approach can be generally applied to various problems such as pattern recognition and classification.

1. Introduction

In various computer applications such as pattern recognition, classification, and prediction, a learning module can be implemented by various approaches, including statistical, structural, and neural ones. Among these methods, artificial neural networks (ANNs) are inspired by the physiological workings of the brain. They are based on the mathematical model of a single neural cell (neuron), named the single neuron perceptron (SNP), and try to resemble the actual networks of neurons in the brain. As a computational model, the SNP has particular characteristics such as the ability to learn and generalize. Although the multilayer perceptron (MLP) can approximate any function [1, 2], the traditional SNP is not a universal approximator. The MLP can learn through the error backpropagation algorithm (EBP), whereby the error of the output units is propagated back to adjust the connecting weights within the network. In the MLP architecture, increasing the number of neurons in the input layer, the output layer, or the hidden layer(s) significantly increases the number of learning parameters and the computational complexity of the algorithm. This problem is usually referred to as the curse of dimensionality [3, 4]. Hence, many researchers have tried to propose more powerful single-layer architectures and faster algorithms, such as functional link networks (FLNs) and Levenberg-Marquardt (LM) and its modified and extended versions [5–20].

In contrast to the MLP, SNPs and FLNs do not impose high computational complexity and are far from the curse of dimensionality. However, because they lack the universal approximation property, SNPs and FLNs are not very popular in applications. In contrast to this previous understanding of the SNP, this paper proposes a novel SNP model that can solve the XOR problem, and we show that it can be a universal approximator. The proposed SNP can solve the XOR problem only if an additional nonlinear operator is used. As illustrated in the next section, the SNP universal approximation property can simply be achieved by extending the input pattern using the nonlinear operator max. Like functional link networks (FLNs) [21], the proposed SNP does not include hidden units; but unlike FLNs, which enlarge the input vector through functional expansion, it appends only a single element to the input and still guarantees universal approximation. FLNs are single-layer neural networks that can be considered an alternative approach in data mining to overcome the complexities associated with the MLP [22], but they do not guarantee universal approximation.

The paper is organized as follows. The proposed SNP and its universal approximation theorem are presented in Section 2. Section 3 presents the numerical results, where the proposed SNP is compared with the backpropagation MLP. There are various versions of the backpropagation algorithm; in the classification problems, we compare with gradient descent backpropagation (GDBP) [23], that is, the standard basic algorithm. Finally, conclusions are drawn in Section 4.

2. Proposed Single Neuron Perceptron

Figure 1 shows the proposed SNP. In the figure, the model is presented as an n-input, single-output architecture. The vector p = [p_1, p_2, ..., p_n] is the input pattern and t is the related target applied in the learning process (3). Let us extend the input pattern as follows:

p_{n+1} = max(p_1, p_2, ..., p_n). (1)

Figure 1: Proposed SNP.

Actually, the max operation increases the input dimension from n to n + 1.

So, the new input pattern has n + 1 elements. In Figure 1, the input pattern is illustrated by the vector p = [p_1, ..., p_n, p_{n+1}], and the final output y is calculated by the following formula:

y = f(Σ_{i=1}^{n+1} w_i p_i + b), (2)

where f is the activation function and w_1, ..., w_{n+1} and b are adjustable weights. So, the error can be achieved as follows:

e = t − y, (3)

and the learning weights can be adjusted by the following excitatory learning rule:

w_i = w_i + η p_i max(0, e), (4)

and then by the following inhibitory rule:

w_i = w_i − η p_i max(0, −e), (5)

where t is the target, y is the output of the network, e is the related error, and η is the learning rate. Also, b can be trained by

b = b + η e. (6)
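The model above can be sketched in a few lines of Python. The update equations here are one plausible delta-rule reading of the excitatory (weight-increasing) and inhibitory (weight-decreasing) rules, so the exact coefficients are assumptions rather than the paper's verbatim algorithm:

```python
import math

def tansig(x):
    # Hyperbolic tangent sigmoid activation (as in Algorithm 1).
    return math.tanh(x)

class SNP:
    """Single neuron perceptron on the max-extended input pattern."""

    def __init__(self, n, eta=0.1):
        self.n = n
        self.w = [0.0] * (n + 1)  # n inputs plus the max-extension input
        self.b = 0.0
        self.eta = eta            # learning rate

    def extend(self, p):
        # Input extension: append max(p_1, ..., p_n) as element p_{n+1}.
        return list(p) + [max(p)]

    def forward(self, p):
        q = self.extend(p)
        return tansig(sum(wi * qi for wi, qi in zip(self.w, q)) + self.b)

    def train_step(self, p, t):
        # Online supervised update for one (pattern, target) pair.
        q = self.extend(p)
        e = t - self.forward(p)
        for i in range(self.n + 1):
            # Excitatory step: strengthen weights on positive error;
            # inhibitory step: weaken them on negative error
            # (together these amount to a delta rule).
            self.w[i] += self.eta * q[i] * max(0.0, e)
            self.w[i] -= self.eta * q[i] * max(0.0, -e)
        self.b += self.eta * e
        return e
```

Repeated calls to `train_step` on a pattern drive the output toward its target, which is all the online learning phase of Algorithm 1 requires.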

It should be added that the max operation applied to the input pattern, and also in the learning phase, has been motivated by computational models of the limbic system in the brain [24–26]. The limbic system is an emotional processor in the mammalian brain [27–29]. In these models [24–26], the max operator prepares the output and input of the main parts of the limbic system.

In summary, the feedforward computation and the backward learning algorithm of the proposed SNP, in an online form and with the tansig activation function, are given in Algorithm 1.

Algorithm 1: Proposed SNP algorithm.

In the algorithm, the learning rate η can be picked empirically or changed adaptively during the learning process according to adaptive learning [30, 31].

The proposed SNP solves the XOR problem. Consider a 2-1 architecture with the hardlim activation function and the following weights: w_1 = −1, w_2 = −1, w_3 = 2, and b = −0.5; thus,

y = hardlim(−p_1 − p_2 + 2 max(p_1, p_2) − 0.5), (7)

where hardlim is calculated by the following formula:

hardlim(x) = 1 if x ≥ 0, and hardlim(x) = 0 otherwise. (8)
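The XOR claim above is easy to check numerically. The weight values below are one assignment that makes a max-extended single neuron reproduce the XOR truth table; they are illustrative and not necessarily the exact values used in the paper:

```python
def hardlim(x):
    # hardlim(x) = 1 if x >= 0, else 0.
    return 1 if x >= 0 else 0

def snp_xor(p1, p2):
    # Extended input: p3 = max(p1, p2); weights chosen so that the
    # weighted sum is positive exactly when p1 != p2 (assumed values).
    w1, w2, w3, b = -1.0, -1.0, 2.0, -0.5
    p3 = max(p1, p2)
    return hardlim(w1 * p1 + w2 * p2 + w3 * p3 + b)

truth_table = {(a, c): snp_xor(a, c) for a in (0, 1) for c in (0, 1)}
print(truth_table)
```

No hidden layer is involved: the single max element supplies the nonlinearity that the classical perceptron lacks.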

Since this network is in the form of (2), SNP can, based on it, approximate the XOR function. The proposed model has a lower computational complexity than other methods, such as spiking neural networks [32], that have solved the XOR problem. The computational complexity of the proposed SNP is O(n), while it profits from very simple equations for adjusting the weights.

In the next section, we prove that SNP is a universal approximator and can approximate all real continuous functions.

2.1. Universal Approximation Theorem

Let us ignore the activation function in the model and rewrite (2) as follows:

E(p) = Σ_{i=1}^{n} w_i p_i + w_{n+1} max(p_1, ..., p_n) + b. (9)

Consider Y as the set of all equations in form (9) and d as the sup-metric; then (Y, d) is a metric space [33]. The following theorem shows that Y is dense in C[U], where C[U] is the set of all real continuous functions defined on the compact set U.

SNP Universal Approximation Theorem. For any given real continuous function g on the compact set U ⊂ R^n and an arbitrary ε > 0, there exists E ∈ Y such that sup_{p ∈ U} |E(p) − g(p)| < ε. We use the following Stone-Weierstrass theorem to prove the theorem.

Stone-Weierstrass Theorem (see [33, 34]). Let Z be a set of real continuous functions on a compact set U. If (1) Z is an algebra, that is, the set Z is closed under scalar multiplication (closure under addition and multiplication is not necessary for real continuous functions [34]); (2) Z separates points on U, that is, for every p, q ∈ U such that p ≠ q, there exists E ∈ Z such that E(p) ≠ E(q); and (3) Z vanishes at no point of U, that is, for each p ∈ U, there exists E ∈ Z such that E(p) ≠ 0; then the uniform closure of Z consists of all real continuous functions on U; that is, Z is dense in C[U].

SNP Universal Approximation Proof. First, we prove that Y is an algebra. Let E ∈ Y; then, for an arbitrary scalar c:

c E(p) = Σ_{i=1}^{n} (c w_i) p_i + (c w_{n+1}) max(p_1, ..., p_n) + c b,

which is again given in form (9). Thus, c E ∈ Y, and Y is an algebra.

Next, we prove that Y separates the points on U. We prove this by constructing a required E: for arbitrary p, q ∈ U such that p ≠ q, there is an index j such that p_j ≠ q_j, and we choose w_j = 1, w_i = 0 for i ≠ j, and b = 0.

Thus, E(p) = p_j ≠ q_j = E(q). Therefore, Y separates the points on U.

Finally, we prove that Y vanishes at no point of U. We choose w_i = 0, for i = 1, ..., n + 1, and b = 1.

Since, for all p ∈ U, E(p) = b = 1 ≠ 0, it follows that, for all p ∈ U, there exists E ∈ Y such that E(p) ≠ 0.

So, the SNP, independently of the activation function, is a universal approximator.

3. Numerical Results

One parameter related to the computational complexity of a learning method is the number of learning weights in each epoch. A lower number of learning weights implies a lower number of computations and a lower computational complexity. To evaluate the number of learning weights of the proposed SNP with respect to the MLP, we propose a measure named the reducing ratio of the number of weights (Rw), as follows:

Rw = (W_MLP − W_SNP) / W_MLP × 100%, (10)

where W_MLP and W_SNP are the numbers of learning weights of the MLP and the SNP, respectively.

Rw is a measure that can be used to compare the computational complexity of the proposed SNP and the MLP. A higher Rw shows that the SNP has a lower number of learning weights; thus, it has a lower number of computations and so a lower computational complexity. Additionally, in classification problems, accuracy can be a proper performance measure to evaluate the algorithms. This measure is generally expressed as follows:

Accuracy = (number of correctly classified samples) / (total number of samples) × 100%. (11)
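As a sketch of how these two measures could be computed, assuming a fully connected n-h-m MLP with biases and the percentage form of Rw (both are readings of the surrounding text, not formulas quoted from the paper):

```python
def mlp_num_weights(n, h, m=1):
    # Fully connected n-h-m MLP with biases: (n+1)*h + (h+1)*m parameters.
    return (n + 1) * h + (h + 1) * m

def snp_num_weights(n):
    # n input weights + 1 weight for the max-extension input + 1 bias.
    return n + 2

def rw(n, h, m=1):
    # Reducing ratio of the number of weights, in percent.
    w_mlp = mlp_num_weights(n, h, m)
    return 100.0 * (w_mlp - snp_num_weights(n)) / w_mlp

def accuracy(predicted, actual):
    # Percentage of correctly classified samples.
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)
```

For example, with n = 8 attributes and the smallest hidden layer h = 2, the MLP has 21 parameters while the SNP has 10, giving an Rw of about 52%, consistent with the roughly 50% reduction reported in Table 1.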

For all learning scenarios listed below, the training set contained 70% of the data, the testing set contained 15%, and the remaining 15% was used for the validation set. Input patterns have been normalized to [0, 1]. Output targets are binary digits (i.e., a single class is labeled by the digits “1” and “0,” two classes are labeled as “01” and “10,” and three classes are labeled as “001,” “010,” and “100”). Also, the initial weights were selected randomly.
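The preprocessing described above could be implemented as follows; the column-wise min-max scaling and the seeded shuffle are assumptions, since the paper does not spell these details out:

```python
import random

def min_max_normalize(rows):
    # Scale each attribute (column) to [0, 1] using its min and max.
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)]
            for row in rows]

def split_70_15_15(samples, seed=0):
    # 70% training, 15% testing, remainder (about 15%) for validation.
    s = list(samples)
    random.Random(seed).shuffle(s)
    n = len(s)
    n_train = (70 * n) // 100
    n_test = (15 * n) // 100
    return (s[:n_train],
            s[n_train:n_train + n_test],
            s[n_train + n_test:])
```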

Here, and prior to entering the comparative numerical studies, let us analyze the computational complexity. Regarding the proposed learning algorithm, the algorithm adjusts n + 2 weights for each learning sample, where n is the number of input attributes. In contrast, the computational time is O(nh) for the MLP, where h is the number of hidden neurons (the lowest h is 2). Additionally, the GDBP MLP compared here is based on derivative computations, which impose high complexity, while the proposed method is derivative free. So, the proposed method has lower computational complexity and higher efficiency with respect to the MLP. This improved computing efficiency can be important for online predictions, especially when the time interval between observations is small.

To test and assess the SNP in classification, 6 single-class datasets have been downloaded from the UCI (University of California, Irvine) Data Center. In all datasets, the target labeling was binary. Table 1 shows the information related to the datasets, including the number of attributes and instances. Additionally, the SNP and MLP architectures, the number of learning weights, and Rw are presented in the table as well. As illustrated in Table 1, the SNP reduces the number of learning weights by approximately 50% for each dataset.

Table 1: Datasets and related learning information.

In the proposed SNP algorithm, the learning parameter values are shown in Table 1. The activation function was tansig, and the stop criterion in the learning process was the maximum number of epochs. The maximum and minimum values of each dataset were determined, and the scaled data (between 0 and 1) were used to adjust the weights. The training was repeated 10 times, and the average accuracy on the test set was recorded. Figure 2 presents the average accuracy and the confidence interval obtained from the SNP and the MLP. It is obvious that the SNP is more accurate than the MLP with the GDBP algorithm on some datasets. The results indicated in Figure 2 are based on Student's t-test with 95% confidence.
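A minimal sketch of the reported statistic, assuming the confidence interval is the mean plus or minus a Student's t half-width computed from the 10 repetitions (t = 2.262 is the two-sided 95% critical value for 9 degrees of freedom):

```python
import math

def mean_ci95_10runs(values):
    # Mean and 95% confidence half-width for exactly 10 repeated runs.
    n = len(values)
    assert n == 10, "the critical value below is for 9 degrees of freedom"
    t_crit = 2.262  # Student's t, two-sided 95%, df = 9
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / (n - 1)  # sample variance
    return m, t_crit * math.sqrt(var / n)
```

Each bar and interval in Figure 2 would then correspond to the mean and half-width returned here.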

Figure 2: Comparisons between SNP and MLP.

Although, according to Figure 2, it seems that GDBP is better in some cases, what is very important in the results is the number of learning epochs. Table 2 shows the learning epoch comparisons. According to Table 2, the MLP needs many epochs to reach the results of the SNP. This is the main feature of the proposed SNP: fast learning with lower computational complexity, which makes it suitable for use in various applications and especially in online problems.

Table 2: Comparison of the number of learning epochs.

4. Conclusion

In this paper, we prove that a single neuron perceptron (SNP) can solve the XOR problem and can be a universal approximator. These features are achieved by extending the input pattern and by using the max operator. The SNP with this extension ability is a novel computational model of the neural cell that is trained by excitatory and inhibitory rules. This new SNP architecture works with a lower number of learning weights. Specifically, it only generates n + 2 learning weights and only requires O(n) operations during each training iteration, where n is the size of the input vector. Furthermore, the universal approximation property is theoretically proved for this architecture. The source code of the proposed algorithm is accessible from http://www.bitools.ir/projects.html. In the numerical studies, the SNP was utilized to classify 6 UCI datasets. The comparisons between the proposed SNP and the backpropagation MLP support the following conclusions. Firstly, the number of learning parameters of the SNP is much lower with respect to the standard MLP. Secondly, in classification problems, the performance of the supervised excitatory and inhibitory learning algorithm is higher than that of gradient descent backpropagation (GDBP). Thirdly, the lower computational complexity resulting from the fewer learning parameters and the faster training of the proposed SNP make it suitable for real-time classification. In short, the SNP is a universal approximator with a simple structure and is motivated by neurophysiological knowledge of the human brain. We believe, based on the multiple case studies as well as the theoretical results in this report, that the SNP can be effectively used in pattern recognition and classification problems.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

  1. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature, vol. 323, no. 6088, pp. 533–536, 1986.
  2. K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Networks, vol. 2, no. 5, pp. 359–366, 1989.
  3. S. Bengio and Y. Bengio, “Taking on the curse of dimensionality in joint distributions using neural networks,” IEEE Transactions on Neural Networks, vol. 11, no. 3, pp. 550–557, 2000.
  4. M. Verleysen and D. François, “The curse of dimensionality in data mining and time series prediction,” in Computational Intelligence and Bioinspired Systems, vol. 3512, pp. 758–770, Springer, Berlin, Germany, 2005.
  5. B. Verma, “Fast training of multilayer perceptrons,” IEEE Transactions on Neural Networks, vol. 8, no. 6, pp. 1314–1320, 1997.
  6. M. T. Hagan and M. B. Menhaj, “Training feedforward networks with the Marquardt algorithm,” IEEE Transactions on Neural Networks, vol. 5, no. 6, pp. 989–993, 1994.
  7. J.-X. Peng, K. Li, and G. W. Irwin, “A new Jacobian matrix for optimal learning of single-layer neural networks,” IEEE Transactions on Neural Networks, vol. 19, no. 1, pp. 119–129, 2008.
  8. J.-M. Wu, “Multilayer Potts perceptrons with Levenberg-Marquardt learning,” IEEE Transactions on Neural Networks, vol. 19, no. 12, pp. 2032–2043, 2008.
  9. B. M. Wilamowski and H. Yu, “Improved computation for Levenberg-Marquardt training,” IEEE Transactions on Neural Networks, vol. 21, no. 6, pp. 930–937, 2010.
  10. B. M. Wilamowski and H. Yu, “Neural network learning without backpropagation,” IEEE Transactions on Neural Networks, vol. 21, no. 11, pp. 1793–1803, 2010.
  11. K. Y. Huang, L. C. Shen, K. J. Chen, and M. C. Huang, “Multilayer perceptron learning with particle swarm optimization for well log data inversion,” in Proceedings of the International Joint Conference on Neural Networks (IJCNN '12), pp. 1–6, Brisbane, Australia, June 2012.
  12. N. Ampazis and S. J. Perantonis, “Two highly efficient second-order algorithms for training feedforward networks,” IEEE Transactions on Neural Networks, vol. 13, no. 5, pp. 1064–1074, 2002.
  13. C.-T. Kim and J.-J. Lee, “Training two-layered feedforward networks with variable projection method,” IEEE Transactions on Neural Networks, vol. 19, no. 2, pp. 371–375, 2008.
  14. S. Ferrari and M. Jensenius, “A constrained optimization approach to preserving prior knowledge during incremental training,” IEEE Transactions on Neural Networks, vol. 19, no. 6, pp. 996–1009, 2008.
  15. Y. Liu, J. A. Starzyk, and Z. Zhu, “Optimized approximation algorithm in neural networks without overfitting,” IEEE Transactions on Neural Networks, vol. 19, no. 6, pp. 983–995, 2008.
  16. G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, “Real-time learning capability of neural networks,” IEEE Transactions on Neural Networks, vol. 17, no. 4, pp. 863–878, 2006.
  17. A. Bortoletti, C. di Fiore, S. Fanelli, and P. Zellini, “A new class of quasi-Newtonian methods for optimal learning in MLP-networks,” IEEE Transactions on Neural Networks, vol. 14, no. 2, pp. 263–273, 2003.
  18. V. V. Phansalkar and P. S. Sastry, “Analysis of the back-propagation algorithm with momentum,” IEEE Transactions on Neural Networks, vol. 5, no. 3, pp. 505–506, 1994.
  19. A. Khashman, “A modified backpropagation learning algorithm with added emotional coefficients,” IEEE Transactions on Neural Networks, vol. 19, no. 11, pp. 1896–1909, 2008.
  20. A. Khashman, “Modeling cognitive and emotional processes: a novel neural network architecture,” Neural Networks, vol. 23, no. 10, pp. 1155–1163, 2010.
  21. A. Sierra, J. A. Macias, and F. Corbacho, “Evolution of functional link networks,” IEEE Transactions on Evolutionary Computation, vol. 5, no. 1, pp. 54–65, 2001.
  22. B. B. Misra and S. Dehuri, “Functional link artificial neural network for classification task in data mining,” Journal of Computer Science, vol. 3, no. 12, pp. 948–955, 2007.
  23. M. T. Hagan, H. B. Demuth, and M. H. Beale, Neural Network Design, PWS, Boston, Mass, USA, 1996.
  24. E. Lotfi and M.-R. Akbarzadeh-T, “Adaptive brain emotional decayed learning for online prediction of geomagnetic activity indices,” Neurocomputing, vol. 126, pp. 188–196, 2014.
  25. E. Lotfi and M.-R. Akbarzadeh-T, “Brain emotional learning based pattern recognizer,” Cybernetics and Systems, vol. 44, no. 5, pp. 402–421, 2013.
  26. E. Lotfi and M.-R. Akbarzadeh-T, “Supervised brain emotional learning,” in Proceedings of the International Joint Conference on Neural Networks (IJCNN '12), pp. 1–6, Brisbane, Australia, June 2012.
  27. J. E. LeDoux, “Emotion circuits in the brain,” Annual Review of Neuroscience, vol. 23, pp. 155–184, 2000.
  28. J. LeDoux, The Emotional Brain, Simon and Schuster, New York, NY, USA, 1996.
  29. D. Goleman, Emotional Intelligence: Why It Can Matter More Than IQ, Bantam, New York, NY, USA, 2006.
  30. B. Widrow and M. E. Hoff, “Adaptive switching circuits,” in 1960 IRE WESCON Convention Record, pp. 96–104, IRE, New York, NY, USA, 1960.
  31. B. Widrow and S. D. Sterns, Adaptive Signal Processing, Prentice-Hall, New York, NY, USA, 1985.
  32. S. Ferrari, B. Mehta, G. di Muro, A. M. J. VanDongen, and C. Henriquez, “Biologically realizable reward-modulated hebbian training for spiking neural networks,” in Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN '08), pp. 1780–1786, Hong Kong, June 2008.
  33. L.-X. Wang and J. M. Mendel, “Fuzzy basis functions, universal approximation, and orthogonal least-squares learning,” IEEE Transactions on Neural Networks, vol. 3, no. 5, pp. 807–814, 1992.
  34. W. Rudin, Principles of Mathematical Analysis, McGraw-Hill, New York, NY, USA, 1976.