The Role of Nonpolynomiality in Uniform Approximation by RBF Networks of Hankel Translates

Marrero, Isabel

doi:https://doi.org/10.1155/2019/1845491

Journal of Function Spaces

Research Article | Open Access

Volume 2019 | Article ID 1845491 | https://doi.org/10.1155/2019/1845491

Isabel Marrero¹

Academic Editor: Yoshihiro Sawano

Received 22 Sept 2018

Revised 20 Feb 2019

Accepted 28 Feb 2019

Published 01 Apr 2019

Abstract

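Given $\mu > -1/2$ and $c \in I = ]0, \infty[$, let the space $C_{\mu,c}$ (respectively, $C_\mu$) consist of all those continuous functions $u$ on $]0, c]$ (respectively, $I$) such that the limit $\lim_{z \to 0+} z^{-\mu-1/2} u(z)$ exists and is finite; $C_{\mu,c}$ is endowed with the uniform norm $\|u\|_{\mu,\infty,c} = \sup_{z \in ]0,c]} |z^{-\mu-1/2} u(z)|$ $(u \in C_{\mu,c})$. Assume $\phi \in C_\mu$ defines an absolutely regular Hankel-transformable distribution. Then, the linear span of dilates and Hankel translates of $\phi$ is dense in $C_{\mu,c}$ for all $c \in I$ if, and only if, $\phi \notin \pi_\mu$, where $\pi_\mu = \operatorname{span}\{t^{2n+\mu+1/2} : n \in \mathbb{Z}_+\}$.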

Dedicated to Professor Fernando Pérez González on the occasion of his retirement

1. Introduction and Motivation

1.1. RBFNNs

The radial basis function (RBF) method is nowadays one of the primary tools for interpolating multidimensional scattered data. Its simple form and ability to accurately approximate an underlying function have made the method increasingly popular in several different types of applications, some of which include cartography, medical imaging, the numerical solution of partial differential equations, and neural networks (see, e.g., [1] and references therein).

Radial basis function neural networks (RBFNNs) as such were introduced in the 1980s by Broomhead and Lowe [2] and soon applied to problems of supervised learning such as regression, classification, and time series prediction [3, 4]. This type of network falls within the general class of nonlinear, single hidden layer feedforward neural networks. Given $d, m \in \mathbb{N}$, the family of RBFNNs consists of all those functions of the form
$$F(x) = \sum_{i=1}^{N} w_i\, K\!\left(\frac{x - z_i}{\sigma_i}\right) = \sum_{i=1}^{N} w_i\, \phi\!\left(\frac{\|x - z_i\|}{\sigma_i}\right), \tag{1}$$
where
(i) $N \in \mathbb{N}$ is the number of kernel nodes in the hidden layer
(ii) $w_i \in \mathbb{R}^m$ is the vector of weights from the $i$th kernel node to the output nodes
(iii) $x \in \mathbb{R}^d$ is an input vector
(iv) $K : \mathbb{R}^d \to \mathbb{R}$ is a radially symmetric kernel function of a unit in the hidden layer
(v) $z_i \in \mathbb{R}^d$ and $\sigma_i > 0$ are the centroid and smoothing factor (or width) of the $i$th kernel node $(i = 1, \ldots, N)$, respectively
(vi) $\phi : [0, \infty[ \to \mathbb{R}$ is the so-called activation function, which characterizes the kernel shape, often a Gaussian
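A network of the form (1) can be evaluated in a few lines. The following is a minimal sketch, assuming a Gaussian activation $\phi(r) = e^{-r^2}$; the function names `gaussian` and `rbfnn` and the sample centroids, widths, and weights are illustrative choices, not taken from the paper.

```python
import math

def gaussian(r):
    """Assumed activation phi(r) = exp(-r^2); any radial profile could be used."""
    return math.exp(-r * r)

def rbfnn(x, centroids, widths, weights, phi=gaussian):
    """Evaluate F(x) = sum_i w_i * phi(||x - z_i|| / sigma_i).

    x         : input vector in R^d (sequence of floats)
    centroids : list of N centroids z_i in R^d
    widths    : list of N smoothing factors sigma_i > 0
    weights   : list of N weight vectors w_i in R^m
    """
    m = len(weights[0])
    out = [0.0] * m
    for z, sigma, w in zip(centroids, widths, weights):
        activation = phi(math.dist(x, z) / sigma)  # radial dependence only
        for j in range(m):
            out[j] += w[j] * activation
    return out

# Hypothetical two-node network with d = 2 inputs and m = 2 outputs.
centroids = [(0.0, 0.0), (1.0, 0.0)]
widths = [1.0, 2.0]
weights = [(1.0, 0.0), (0.0, 2.0)]
print(rbfnn((0.0, 0.0), centroids, widths, weights))
```

Using the same value for every entry of `widths` gives the class with a common smoothing factor; letting them differ gives the class with varying smoothing factors.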

The smoothing factors may be the same in all kernel nodes of an RBFNN or may vary across them. Park and Sandberg [5, 6] proved that under mild conditions on the kernel $K$ (or the activation function $\phi$) both classes of RBFNNs (with either the same or varying smoothing factors across nodes) have the universal approximation property, meaning that they are dense in suitable spaces of continuous or integrable functions. Chen and Chen [7] considered RBFNNs with a continuous activation function $\phi$ in the hidden layer defining a tempered distribution in $\mathbb{R}$ and proved that the necessary and sufficient condition for such networks to uniformly approximate every continuous function on compacta is that $\phi$ is not an even polynomial. Nonpolynomiality is straightforwardly seen to be a necessary condition for these approximations and has been found necessary and sufficient for other types of networks to possess the universal approximation property as well, cf. [8–12]. In this paper we aim to extend the result in [7] to RBFNNs of Hankel translates. The precise meaning of this extension will be clarified in due course.

1.2. The Hankel Transformation and the Hankel Translation

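Let $I = ]0, \infty[$. The Hankel integral transformation is usually defined by
$$(h_\mu \varphi)(x) = \int_0^\infty \varphi(t)\, \mathcal{J}_\mu(xt)\, dt \quad (x \in I),$$
where $\mathcal{J}_\mu(z) = z^{1/2} J_\mu(z)$ $(z \in I)$ and $J_\mu$ denotes the Bessel function of the first kind and order $\mu$.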
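The Hankel transformation can be explored numerically. The sketch below assumes the Zemanian-type normalization $(h_\mu\varphi)(x) = \int_0^\infty \varphi(t)\sqrt{xt}\,J_\mu(xt)\,dt$ and checks it on the function $t^{\mu+1/2}e^{-t^2/2}$, which is its own Hankel transform under that normalization. The helper names (`bessel_j`, `simpson`, `hankel_transform`, `gaussian_type`), the truncation point, and the quadrature rule are illustrative assumptions; the power series for $J_\mu$ is only adequate for moderate arguments.

```python
import math

def bessel_j(mu, z, terms=80):
    """Power series of the Bessel function J_mu(z), for moderate z > 0."""
    half = z / 2.0
    term = half ** mu / math.gamma(mu + 1.0)
    total = term
    for k in range(terms):
        # Ratio of consecutive series terms: -(z/2)^2 / ((k+1)(k+1+mu)).
        term *= -half * half / ((k + 1) * (k + 1 + mu))
        total += term
    return total

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def hankel_transform(phi, x, mu, upper=12.0):
    """Approximate int_0^upper phi(t) * sqrt(x t) * J_mu(x t) dt.

    Truncating at `upper` is reasonable only for rapidly decaying phi.
    """
    def integrand(t):
        if t == 0.0:
            return 0.0  # the integrand vanishes at 0 for the test function below
        return phi(t) * math.sqrt(x * t) * bessel_j(mu, x * t)
    return simpson(integrand, 0.0, upper)

def gaussian_type(t, mu):
    """t^(mu + 1/2) * exp(-t^2 / 2), a fixed point of h_mu."""
    return t ** (mu + 0.5) * math.exp(-0.5 * t * t)

mu, x = 0.5, 1.3
approx = hankel_transform(lambda t: gaussian_type(t, mu), x, mu)
exact = gaussian_type(x, mu)
print(abs(approx - exact))  # discrepancy between quadrature and closed form
```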

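Aiming to obtain a distributional extension of $h_\mu$, Zemanian introduced new spaces of test and generalized functions. The space $\mathcal{H}_\mu$ [13, 14] consists of all those smooth, complex-valued functions $\varphi = \varphi(x)$ $(x \in I)$ such that
$$\nu_{\mu,r}(\varphi) = \max_{0 \le k \le r} \sup_{x \in I} \left| (1 + x^2)^r (x^{-1} D)^k x^{-\mu-1/2} \varphi(x) \right| < \infty \quad (r \in \mathbb{Z}_+).$$
When topologized by the family of norms $\{\nu_{\mu,r}\}_{r \in \mathbb{Z}_+}$, $\mathcal{H}_\mu$ becomes a Fréchet space where $h_\mu$ is an automorphism provided that $\mu \ge -1/2$. Then the generalized Hankel transformation $h'_\mu$, defined by transposition on the dual $\mathcal{H}'_\mu$ of $\mathcal{H}_\mu$, is an automorphism of $\mathcal{H}'_\mu$ when this latter space is endowed with either its weak* or its strong topology. For $\mu \in \mathbb{R}$ and $a \in I$, Zemanian [15] also introduced the space $\mathcal{B}_{\mu,a}$ of all those smooth functions $\varphi = \varphi(x)$ $(x \in I)$ such that $\varphi(x) = 0$ $(x > a)$ and
$$\delta_{\mu,r}(\varphi) = \sup_{x \in I} \left| (x^{-1} D)^r x^{-\mu-1/2} \varphi(x) \right| < \infty \quad (r \in \mathbb{Z}_+).$$
Endowed with the topology generated by the family of seminorms $\{\delta_{\mu,r}\}_{r \in \mathbb{Z}_+}$, $\mathcal{B}_{\mu,a}$ becomes a Fréchet space. The strict inductive limit $\mathcal{B}_\mu$ of the family $\{\mathcal{B}_{\mu,a}\}_{a \in I}$ satisfies $\mathcal{B}_\mu \subset \mathcal{H}_\mu$, with continuous embedding. Since $\mathcal{B}_\mu$ is dense in $\mathcal{H}_\mu$, it turns out that $\mathcal{H}'_\mu$ can be regarded as a subspace of $\mathcal{B}'_\mu$, the dual of $\mathcal{B}_\mu$.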

The study of the Hankel #-convolution in spaces of generalized functions was initiated by Sousa Pinto [16], only on compactly-supported distributions and for . In a series of papers [17–19], Betancor and the author investigated systematically the generalized #-convolution in wider spaces of distributions, allowing . In this context, the Hankel convolution of is defined as the function where the Hankel translate of is given by Here, for , is the so-called Delsarte kernel. Note that , , is symmetric in , and where . Therefore, for any we have The formula and the exchange formula hold pointwise. The Hankel translation is defined on by transposition. The Hankel convolution of and is defined [19, Definition 3.1] by The formulas and hold in the sense of equality in (cf. [19, Proposition 3.5]).
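For one particular order the Hankel translation can be computed by hand, which gives a convenient numerical check. The sketch below makes several assumptions that are not spelled out in the text above: the Delsarte kernel is taken to be $D_\mu(x,y,z) = \int_0^\infty t^{-\mu-1/2}\mathcal{J}_\mu(xt)\mathcal{J}_\mu(yt)\mathcal{J}_\mu(zt)\,dt$ with $\mathcal{J}_\mu(z) = \sqrt{z}\,J_\mu(z)$, and the Hankel translate is $(\tau_x\phi)(y) = \int_0^\infty \phi(z) D_\mu(x,y,z)\,dz$. Under these assumptions and for $\mu = 1/2$, where $\mathcal{J}_{1/2}(z) = \sqrt{2/\pi}\sin z$, the kernel equals the constant $1/\sqrt{2\pi}$ on $|x-y| < z < x+y$ and vanishes elsewhere. The code compares that kernel-side computation with the transform-side integral $\int_0^\infty t^{-1}\mathcal{J}_{1/2}(xt)\mathcal{J}_{1/2}(yt)(h_{1/2}\phi)(t)\,dt$ for the test function $\phi(t) = t e^{-t^2/2}$, which equals its own Hankel transform of order $1/2$. All function names are illustrative.

```python
import math

SQRT_2_OVER_PI = math.sqrt(2.0 / math.pi)

def cal_j(z):
    """sqrt(z) * J_{1/2}(z), which equals sqrt(2/pi) * sin(z)."""
    return SQRT_2_OVER_PI * math.sin(z)

def phi(t):
    """Test function t * exp(-t^2 / 2); for mu = 1/2 it is its own Hankel transform."""
    return t * math.exp(-0.5 * t * t)

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def translate_via_kernel(x, y):
    """Integrate phi against the (assumed) Delsarte kernel for mu = 1/2:
    the constant 1/sqrt(2*pi) on |x - y| < z < x + y, zero elsewhere."""
    return simpson(phi, abs(x - y), x + y) / math.sqrt(2.0 * math.pi)

def translate_via_transform(x, y, upper=12.0):
    """Integrate t^(-1) * cal_j(x t) * cal_j(y t) * (h phi)(t), using h phi = phi."""
    def integrand(t):
        if t == 0.0:
            return 0.0  # the integrand tends to 0 as t -> 0+
        return cal_j(x * t) * cal_j(y * t) * phi(t) / t
    return simpson(integrand, 0.0, upper)

x, y = 1.0, 2.0
print(translate_via_kernel(x, y), translate_via_transform(x, y))
```

The two routes agree up to quadrature error, and swapping `x` and `y` leaves the result unchanged, reflecting the symmetry of the kernel.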

The space of all those smooth functions with the property that to every there corresponds satisfying was characterized as the space of multipliers of and [20, Theorems 2.3 and 2.9].

The generalized Hankel transformation establishes an isomorphism between and the subspace of consisting of the Hankel convolution operators on and , which is a homeomorphism under the natural topologies of and [19, Propositions 4.2 and 5.2]. The distribution given by satisfies and , cf. [19, Proposition 4.7] and [21, Proposition 3].

For the operational rules of the Hankel transformation and further properties of the Hankel translation and Hankel convolution that will be required, in particular those involving the Bessel differential operator the reader is mainly referred to [14, 17, 19]. Here we will highlight the following [14, Equation 5.5(8)]:

If and (a.e. ) is an integrable radial function, then its -dimensional Fourier transform is also radial and becomes a 1-dimensional Hankel transform of order [22, Theorem IV.3.3]: Actually, since it turns out that, on radial univariate—even—functions, the Fourier transformation, which reduces to a Fourier-cosine transformation, coincides with the Hankel transform of order ; similarly, the Hankel translation and Hankel convolution of order can be seen to coincide (modulo a multiplicative constant) with the usual translation and convolution on (cf. [23, Example 3.2]). Thus for the Hankel translation and the Hankel convolution provide strict generalizations of the usual translation and convolution operators, inasmuch as arbitrary orders are allowed.
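The coincidence with the Fourier-cosine transformation at order $-1/2$ can be made explicit. Assuming the normalization $(h_\mu\varphi)(x) = \int_0^\infty \varphi(t)\sqrt{xt}\,J_\mu(xt)\,dt$, the elementary identity $J_{-1/2}(z) = \sqrt{2/(\pi z)}\cos z$ gives the following worked step:

```latex
\sqrt{z}\,J_{-1/2}(z) = \sqrt{\tfrac{2}{\pi}}\,\cos z
\quad\Longrightarrow\quad
(h_{-1/2}\varphi)(x)
  = \int_0^\infty \varphi(t)\,\sqrt{xt}\,J_{-1/2}(xt)\,dt
  = \sqrt{\tfrac{2}{\pi}} \int_0^\infty \varphi(t)\cos(xt)\,dt .
```

The right-hand side is the Fourier-cosine transform of $\varphi$, that is, the Fourier transform of its even extension to the real line up to the choice of normalizing constant.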

1.3. RBFNNs of Hankel Translates

Motivated by the fact that the Hankel transformation is best adapted to deal with radial functions, Arteaga and the author [24–27] have proved that the Hankel transformation and the Hankel convolution are suitable tools for the description and analysis of a RBF interpolation scheme by functions of the form where , is a complex function defined on (the so-called basis function), is a Müntz monomial, denotes the Hankel translation operator of order , and are complex coefficients.

In analogy to the standard case (1), we set the family of RBFNNs of Hankel translates of order to consist of all those functions which can be represented as where is the number of kernel nodes in the hidden layer, for , , is the weight from the $i$th kernel node to the output node, and are, respectively, the centroid and the smoothing factor of the $i$th kernel node. Further, is a kernel function of a unit in the hidden layer which, in this case, coincides with the activation function and, as above, denotes the Hankel translation operator, while is a dilation operator. Note that, for and , (23) becomes (1).

An investigation on the universal approximation capabilities of a closely related class of RBFNNs defined on the nonnegative real axis has been carried out in several papers by Arteaga and the author [28–30]. It should be remarked that the results in the present paper can be derived neither from [24–27], where only the interpolation problem is addressed, nor from [28–30], where RBFNNs are constructed using the Bessel-Kingman hypergroup translation (or Delsarte translation) instead of the Hankel one, and where the universal approximation property, which is studied mainly in spaces of integrable functions, requires in turn integrability of the basis function.

1.4. Objectives

In the sequel we assume and consider the following spaces:
(i) Given , will denote the linear space of all those continuous functions on such that the limit exists and is finite. When endowed with the norm becomes a Banach space. In fact, the map is an isometry from onto , the space of all continuous functions on with the uniform norm.
(ii) The linear space consists of all those continuous functions on such that the limit (24) exists and is finite. Endowed with the topology generated by the family of seminorms , where becomes a Fréchet space. Note that sequential convergence in is equivalent to convergence in for all .
(iii) The space consists of all those smooth functions on such that the limits exist and are finite. Endowed with the topology generated by the family of seminorms , where becomes a Fréchet space.

Our aim here is to find necessary and sufficient conditions on the basis function for the family of RBFNNs to have the universal approximation property. More precisely, the above mentioned result in [7] is extended to the Hankel setting in the following way. Given , a necessary and sufficient condition for to be dense in is nonmembership in the class of Müntz polynomials generated by . This is the content of Theorem 9 in Section 3. In Section 2 we introduce the concept and give a characterization of zero-supported -distributions (Theorem 5), which is used in the proof of Theorem 9 and might be interesting in its own right.

2. Zero-Supported Hankel Distributions

Definition 1. Suppose . We say that the support of is , in symbols , if for all with , equivalently, if for all such that , for some .

Proposition 2. Assume and satisfies , for some . Then .

Proof. Let . Since , necessarily Therefore, This yields the desired conclusion.

Recall that if, and only if, the restrictions of to every are continuous. By (4), this means that to each there corresponds and such that

Definition 3. If, in (33), one will do for all (not necessarily with the same ), then the smallest such is called the order of . Otherwise, is said to have infinite order.

Remark 4. Note that every with has finite order. Indeed, fix with , and choose such that . By Proposition 2, . Now (33) yields and satisfying On the other hand, the Leibniz formula gives such that Thus as asserted.

Theorem 5. Assume , , and has order . Then there are constants such that Here is the functional defined by with (cf. (8)). Conversely, every distribution of this form has for its support, unless .

Proof. It is clear that . This establishes the converse.
To prove the nontrivial half of the theorem, consider a that satisfies Our objective is to prove that . Since given , there exists such that implies . The mean value theorem yields such that If , , an induction process then shows that, for some , Choose such that for some , and define Fix and . By the Leibniz formula, On the other hand [14, Equation 5.2(6)], for suitable , with . Consequently, Since has order , there is a constant such that And since , from Proposition 2 we infer The arbitrariness of shows that . Hence vanishes on the intersection of the null spaces of the functionals , and the desired representation follows from [31, Lemma 3.9].

Remark 6. Note that the functionals (38) can be written (modulo constant factors) as derivatives of the identity for the Hankel convolution (16). In fact, we have

3. Nonpolynomiality of the Activation Function

We begin by establishing two auxiliary results.

Lemma 7. Let .
(i) The dilation operator , defined by , is continuous.
(ii) The translation operator , defined by is continuous.
(iii) If , then .

Proof. Given , it is apparent that ; indeed, the function is clearly continuous, and the limit exists and is finite. Similarly, therefore, for any , Equation (53) proves continuity of and establishes (i).
To prove (ii), first of all we pick and show that is well defined. In fact, using (8) we may write Next, we want to see that is continuous on . To this end, fix . Since given there exists such that and imply Therefore, for we obtain Since the function is continuous at , given there exists such that and imply . Moreover, if and with , then . Again by (8), for we thus have Consequently, , thus proving that . Now, the continuity of the translation operator can be deduced from (54): given , Finally, part (iii) derives immediately from (i) and (ii). The proof is complete.

In what follows, will mean that the function defines an absolutely regular -distribution (cf. [32, Theorem 3.7]). In other words, is an integrable function on whenever , and

Lemma 8. Assume and let . The following holds:

Proof. The identity in (61) may be easily verified by means of the operational rule (18). Equation (62) is a consequence of the fact that commutes with Hankel translations on (cf. [33]). Finally, (63) derives from (23) and the identity which can be checked through (13).

Theorem 9. Let and . Then is dense in if, and only if, .

Proof. First suppose is such that , so that . A combination of (61) and (62) gives Bearing in mind (63) and [24, Theorem 2.19], it turns out that, for every , one has and . The subspace formed by such functions being finite-dimensional (hence closed), it cannot be dense in , which prevents from being dense in .
For the converse, suppose that is not dense in . By the Hahn-Banach and Riesz representation theorems, there is a nonzero Radon measure , with and , such that The Hankel-Stieltjes transform of the measure gives rise to a multiplier of . Indeed, Hence By (15), this proves that or, equivalently, .
Given , we want to show that and . To this end, choose so that [32, Theorem 3.7], and pick an arbitrary . Then At this point we apply the following special case of Peetre's inequality (cf. [34, Lemma 5.2]) to infer Now If , then If , then A combination of (71), (73), (74), (75), and (76) finally yields where denotes a suitable positive constant.
Consequently, and the Fubini theorem, along with [21, Definition 2 and Proposition 3], can be applied to write thus showing that .
Invoking [19, Definition 4.4] and taking into account (66) lead to or, equivalently from (14), Since , there exist and such that Given , let . Then Choose any with and set Clearly, and Equation (80) allows us to conclude In other words, . From Remark 4 and Theorem 5 we get that , which means [24, Theorem 2.19] that .
The proof is now complete.
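The necessity half of Theorem 9 can be illustrated in a special case. The sketch below rests on assumptions that should be read as such: the order is fixed at $\mu = 1/2$, the Müntz monomials are taken to be $t^{2n+\mu+1/2} = t^{2n+1}$, and the Hankel translate is computed by integrating against a Delsarte kernel equal to the constant $1/\sqrt{2\pi}$ on $|x-y| < z < x+y$ (the value obtained for $\mu = 1/2$ from the triple Bessel integral, whereas the paper defines the translation of distributions by transposition). Under these assumptions the translate of $t^{2n+1}$ is an odd polynomial in $y$ of degree at most $2n+1$, so translating (and dilating) a Müntz polynomial never leaves a fixed finite-dimensional space. The function names are hypothetical.

```python
import math

KERNEL_HALF = 1.0 / math.sqrt(2.0 * math.pi)  # assumed kernel value for mu = 1/2

def translate_monomial(n, x, y):
    """Translate of t^(2n+1) at y: KERNEL_HALF * integral of z^(2n+1) over [|x-y|, x+y]."""
    p = 2 * n + 2
    return KERNEL_HALF * ((x + y) ** p - abs(x - y) ** p) / p

def muntz_coefficients(n, x):
    """Coefficients c_0, ..., c_n such that the translate equals sum_k c_k * y^(2k+1).

    They come from the binomial expansion of (x+y)^p - (x-y)^p, in which only
    odd powers of y survive.
    """
    p = 2 * n + 2
    return [KERNEL_HALF * 2.0 * math.comb(p, 2 * k + 1) * x ** (p - 2 * k - 1) / p
            for k in range(n + 1)]

def evaluate(coeffs, y):
    """Evaluate sum_k coeffs[k] * y^(2k+1)."""
    return sum(c * y ** (2 * k + 1) for k, c in enumerate(coeffs))

n, x, y = 2, 1.5, 0.7
print(translate_monomial(n, x, y), evaluate(muntz_coefficients(n, x), y))
```

Both printed values agree, confirming in this special case that the translate stays in the span of $y, y^3, \ldots, y^{2n+1}$.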

4. Final Remarks

(i) RBFNNs of Hankel translates, as defined in this paper, admit only one-dimensional inputs. In order to allow for multidimensional inputs one should consider the multidimensional Hankel translation, defined by iteration of the one-dimensional translation operator with respect to each of the variables while the others are kept fixed (see, e.g., [35] and references therein). The proof of the above results for the multidimensional case could well be the subject of a forthcoming paper.
(ii) According to Theorem 9 and [32, Theorem 3.7 and Corollary 3.10], any continuous function for which there exists so that is bounded on , or is integrable on , can be used as an activation function yielding universal approximation. A paradigmatic example is the Gaussian
(iii) By considering RBFNNs of Hankel translates, a new parameter $\mu$ is introduced which in practice leaves a greater variety of manageable kernels at our disposal. This could be useful in handling mathematical models built upon a class of radial basis functions depending on the order $\mu$ whose performance might be improved by finely tuning $\mu$, without increasing the number of centroids [36, 37].

Data Availability

This research is not based on any experimental data.

Conflicts of Interest

The author declares that she has no conflicts of interest.

References

[1] M. D. Buhmann, Radial Basis Functions: Theory and Implementations, vol. 12, Cambridge Monographs on Applied and Computational Mathematics, Cambridge University Press, 2003.
[2] D. S. Broomhead and D. Lowe, “Multivariable functional interpolation and adaptive networks,” Complex Systems, vol. 2, no. 3, pp. 321–355, 1988.
[3] R. P. Lippmann, “Pattern classification using neural networks,” IEEE Communications Magazine, vol. 27, no. 11, pp. 47–64, 1989.
[4] S. Renals and R. Rohwer, “Phoneme classification experiments using radial basis functions,” in Proceedings of the International Joint Conference on Neural Networks I, pp. 461–467, Washington, DC, USA, June 1989.
[5] J. Park and I. W. Sandberg, “Universal approximation using radial basis function networks,” Neural Computation, vol. 3, no. 2, pp. 246–257, 1991.
[6] J. Park and I. W. Sandberg, “Approximation and radial-basis-function networks,” Neural Computation, vol. 5, no. 2, pp. 305–316, 1993.
[7] T. Chen and H. Chen, “Approximation capability to functions of several variables, nonlinear functionals, and operators by radial basis function neural networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 6, no. 4, pp. 904–910, 1995.
[8] T. Chen and H. Chen, “Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems,” IEEE Transactions on Neural Networks and Learning Systems, vol. 6, no. 4, pp. 911–917, 1995.
[9] M. Leshno, V. Y. Lin, A. Pinkus, and S. Schocken, “Multilayer feedforward networks with a nonpolynomial activation function can approximate any function,” Neural Networks, vol. 6, no. 6, pp. 861–867, 1993.
[10] H. N. Mhaskar and C. A. Micchelli, “Approximation by superposition of sigmoidal and radial basis functions,” Advances in Applied Mathematics, vol. 13, no. 3, pp. 350–373, 1992.
[11] A. Pinkus, “TDI-subspaces of $C(\mathbb{R}^d)$ and some density problems from neural networks,” Journal of Approximation Theory, vol. 85, pp. 269–287, 1996.
[12] S. Sonoda and N. Murata, “Neural network with unbounded activation functions is universal approximator,” Applied and Computational Harmonic Analysis, vol. 43, no. 2, pp. 233–268, 2017.
[13] A. H. Zemanian, “A distributional Hankel transformation,” SIAM Journal on Applied Mathematics, vol. 14, pp. 561–576, 1966.
[14] A. H. Zemanian, Generalized Integral Transformations, Interscience Publishers, 1968.
[15] A. H. Zemanian, “The Hankel transformation of certain distributions of rapid growth,” SIAM Journal on Applied Mathematics, vol. 14, pp. 678–690, 1966.
[16] J. de Sousa Pinto, “A generalised Hankel convolution,” SIAM Journal on Mathematical Analysis, vol. 16, no. 6, pp. 1335–1346, 1985.
[17] J. Betancor and I. Marrero, “The Hankel convolution and the Zemanian spaces $\beta_\mu$ and $\beta'_\mu$,” Mathematische Nachrichten, vol. 160, pp. 277–298, 1993.
[18] J. J. Betancor and I. Marrero, “Structure and convergence in certain spaces of distributions and the generalized Hankel convolution,” Mathematica Japonica, vol. 38, no. 6, pp. 1141–1155, 1993.
[19] I. Marrero and J. J. Betancor, “Hankel convolution of generalized functions,” Rendiconti di Matematica e delle sue Applicazioni, Serie VII, vol. 15, no. 3, pp. 351–380, 1995.
[20] J. J. Betancor and I. Marrero, “Multipliers of Hankel transformable generalized functions,” Commentationes Mathematicae, vol. 33, no. 3, pp. 389–401, 1992.
[21] J. J. Betancor and I. Marrero, “On the topology of the space of Hankel convolution operators,” Journal of Mathematical Analysis and Applications, vol. 201, no. 3, pp. 994–1001, 1996.
[22] E. M. Stein and G. Weiss, Introduction to Fourier Analysis on Euclidean Spaces, Princeton University Press, 1971.
[23] G. Gigante, “Transference for hypergroups,” Collectanea Mathematica, vol. 52, no. 2, pp. 127–155, 2001.
[24] C. Arteaga and I. Marrero, “A scheme for interpolation by Hankel translates of a basis function,” Journal of Approximation Theory, vol. 164, no. 12, pp. 1540–1576, 2012.
[25] C. Arteaga and I. Marrero, “Density in spaces of interpolation by Hankel translates of a basis function,” Journal of Function Spaces and Applications, vol. 2013, Article ID 813502, 9 pages, 2013.
[26] C. Arteaga and I. Marrero, “Direct form seminorms arising in the theory of interpolation by Hankel translates of a basis function,” Advances in Computational Mathematics, vol. 40, no. 1, pp. 167–183, 2014.
[27] C. Arteaga and I. Marrero, “Interpolation by Hankel translates of a basis function: inversion formulas and polynomial bounds,” The Scientific World Journal, vol. 2014, Article ID 242750, 13 pages, 2014.
[28] C. Arteaga and I. Marrero, “Universal approximation by radial basis function networks of Delsarte translates,” Neural Networks, vol. 46, pp. 299–305, 2013.
[29] C. Arteaga and I. Marrero, “Approximation in weighted p-mean by RBF networks of Delsarte translates,” Journal of Mathematical Analysis and Applications, vol. 414, no. 1, pp. 450–460, 2014.
[30] C. Arteaga and I. Marrero, “Wiener's tauberian theorems for the Fourier-Bessel transformation and uniform approximation by RBF networks of Delsarte translates,” Journal of Mathematical Analysis and Applications, vol. 431, no. 1, pp. 482–493, 2015.
[31] W. Rudin, Functional Analysis, McGraw-Hill, 2nd edition, 1991.
[32] I. Marrero, “Regular and absolutely regular Hankel-transformable distributions,” Mathematische Nachrichten, vol. 263/264, pp. 154–170, 2004.
[33] J. J. Betancor, “A new characterization of the bounded operators commuting with Hankel translation,” Archiv der Mathematik, vol. 69, no. 5, pp. 403–408, 1997.
[34] J. Barros-Neto, An Introduction to The Theory of Distributions, Krieger, 1981.
[35] J. Dziubański, M. Preisner, and B. Wróbel, “Multivariate Hörmander-type multiplier theorem for the Hankel transform,” Journal of Fourier Analysis and Applications, vol. 19, pp. 417–437, 2013.
[36] H. Corrada, K. Leeb, B. Klein et al., “Examining the relative influence of familial, genetic and environmental covariate information in flexible risk models,” Proceedings of the National Academy of Sciences of the United States of America, vol. 106, no. 20, pp. 8128–8133, 2009.
[37] S. H. Javaran, N. Khaji, and A. Noorzad, “First kind Bessel function (J-Bessel) as radial basis function for plane dynamic analysis using dual reciprocity boundary element method,” Acta Mechanica, vol. 218, no. 3-4, pp. 247–258, 2011.

Copyright

Copyright © 2019 Isabel Marrero. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
