Abstract

We study the convergence behavior of regularized regression based on reproducing kernel Banach spaces (RKBSs). The convex inequality of uniformly convex Banach spaces is used to show the robustness of the optimal solution with respect to the distribution. The learning rates are derived in terms of the covering number and the $K$-functional.

1. Introduction

Recently, there has been increasing research interest in learning with abstract function spaces, and considerable work has been done in [1–3] and elsewhere.

Let $B$ be a normed vector space consisting of real functions on a compact metric space $X$, and let $M$ be a given positive number. Let $\mathbf{z} = \{(x_i, y_i)\}_{i=1}^{m}$ be a finite set of samples drawn independently and identically (i.i.d.) according to a distribution $\rho$ on $Z = X \times Y$ with $Y = [-M, M]$. Then the regularized learning scheme associated with a given hypothesis space $B$ and the least square loss is

where is a given real number. The unknown Borel probability distribution $\rho$ can be decomposed into $\rho(y \mid x)$ and $\rho_X$, where $\rho(y \mid x)$ is the conditional probability of $y$ at $x$ and $\rho_X$ is the marginal probability on $X$.

The regression function corresponding to the least square loss is

$$f_\rho(x) = \int_Y y \, d\rho(y \mid x), \qquad x \in X,$$

which satisfies

When the hypothesis space $B$ in (1) is a reproducing kernel Banach space, we call (1) RKBS-based regularized regression learning; RKBSs were introduced recently in [4, 5]. The representer theorem, which is closely related to regularized learning, has been studied in the case that $B$ is an RKBS, and the discussion was extended to generalized semi-inner-product RKBSs in [6].
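To make the regularized scheme concrete, the following minimal sketch treats only the Hilbert-space special case, where the representer theorem reduces the problem to a finite linear system. It rests on assumptions that are not taken from the text: the penalty is taken to be $\lambda \|f\|_K^2$, the kernel is a Gaussian with a hypothetical width, and the helper names are illustrative. Under these assumptions the minimizer has the form $f = \sum_i c_i K(x_i, \cdot)$ with $(G + \lambda m I)c = y$, where $G$ is the Gram matrix.

```python
import math


def gaussian_kernel(s, t, width=0.2):
    """Gaussian kernel on the real line (the width is an illustrative assumption)."""
    return math.exp(-((s - t) ** 2) / (2.0 * width ** 2))


def gram_matrix(xs, kernel=gaussian_kernel):
    """Gram matrix G[i][j] = K(x_i, x_j)."""
    return [[kernel(a, b) for b in xs] for a in xs]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_regularized_least_squares(xs, ys, lam, kernel=gaussian_kernel):
    """Minimize (1/m) sum (f(x_i) - y_i)^2 + lam * ||f||_K^2 (assumed form of the scheme)."""
    m = len(xs)
    gram = gram_matrix(xs, kernel)
    system = [[gram[i][j] + (lam * m if i == j else 0.0) for j in range(m)]
              for i in range(m)]
    coeffs = solve_linear_system(system, ys)

    def predictor(x):
        return sum(c * kernel(x, xi) for c, xi in zip(coeffs, xs))

    return predictor, coeffs


def training_mse(predictor, xs, ys):
    """Empirical least square error of a predictor on the sample."""
    return sum((predictor(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy sample: noiseless values of sin(2 pi x) on a grid in [0, 1].
xs = [i / 10 for i in range(11)]
ys = [math.sin(2 * math.pi * x) for x in xs]
f_small, c_small = fit_regularized_least_squares(xs, ys, lam=1e-3)
f_large, c_large = fit_regularized_least_squares(xs, ys, lam=10.0)
```

A larger regularization parameter shrinks the fitted function toward zero and increases the empirical error, which is the trade-off that the sample error and the approximation error quantify in the Banach-space setting.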

In the present paper, we investigate the learning rates of scheme (1) when $B$ is an RKBS with uniform convexity. The paper is organized as follows. In Section 2, we state the main results of the present paper. The robustness is studied in Section 3, and the sample errors are bounded in Section 4. The approximation error boils down to a $K$-functional. The learning rates are bounded in Section 5.

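For a given real number $p \ge 1$, we denote by $L^p(\rho_X)$ the class of $\rho_X$-measurable functions $f$ satisfying $\|f\|_{L^p(\rho_X)} = \left(\int_X |f(x)|^p \, d\rho_X\right)^{1/p} < +\infty$.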

We say $a = O(b)$ if there is a constant $C > 0$ such that $a \le C b$. We say $a \sim b$ if both $a = O(b)$ and $b = O(a)$.

2. Notions and Results

To state the results of the present paper, we first introduce some notions as follows.

2.1. The RKBSs

We denote by $B$ the Banach space with dual space $B^*$ and norm $\|\cdot\|_B$. For $f \in B$ and $f^* \in B^*$, we write $(f, f^*)_B = f^*(f)$.

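A reproducing kernel Banach space (RKBS) on $X$ is a reflexive Banach space $B$ of real functions on $X$ whose dual space $B^*$ is isometric to a Banach space $B^{\#}$ of functions on $X$, and for which the point evaluations are continuous functionals on both $B$ and $B^{\#}$. It was shown in Theorem 2 of [4] that if $B$ is an RKBS on $X$, then there exists a unique function $K : X \times X \to \mathbb{R}$, called the reproducing kernel of $B$, satisfying the following:

(i) $K(\cdot, x) \in B^*$ and $f(x) = (f, K(\cdot, x))_B$ for all $x \in X$, $f \in B$;

(ii) $K(x, \cdot) \in B$ and $f^*(x) = (K(x, \cdot), f^*)_B$ for all $x \in X$, $f^* \in B^*$;

(iii) the linear span of $\{K(x, \cdot) : x \in X\}$ is dense in $B$, namely,

$$\overline{\operatorname{span}}\{K(x, \cdot) : x \in X\} = B;$$

(iv) the linear span of $\{K(\cdot, x) : x \in X\}$ is dense in $B^*$, namely,

$$\overline{\operatorname{span}}\{K(\cdot, x) : x \in X\} = B^*;$$

(v) for all $x, y \in X$ there holds $K(x, y) = (K(x, \cdot), K(\cdot, y))_B$.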

When $B$ is an RKHS, $K$ is indeed the reproducing kernel in the usual sense (see [7]).

Since $B$ is a reflexive Banach space, we have

A way of producing reproducing kernel spaces in $L^p$ spaces by idempotent integral operators was provided in [8]. In the present paper, we provide a method to construct RKBSs by orthogonal function series.

Example 1. Letbe a given closed interval and let,be a sequence of continuous functions onsatisfying the following:(i)for;(ii)andare orthonormal (in) when;(iii)is dense infor.
Letbe a given positive real number sequence satisfying . Define
and the functional classonby
where. We define the spaceforin an analogous way.

We have the following proposition.

Proposition 2. Define a bivariate operation onandby
Then, is a reproducing kernel Banach space with reproducing kernel

Proof. Let andbe defined in an analogous way. Then, bothandare Banach spaces andand.

By (9) we knowandare isometric isomorphisms. Therefore,are Banach spaces.

Since , we have for that

By the same way, we have for any that ; that is, the reproducing property holds.

2.2. The Uniform Convexity

In this subsection, we focus on some notions in convex analysis and Banach geometry theory.

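Let $F : B \to \mathbb{R}$ be a convex function. For $f \in B$, define

$$\partial F(f) = \{\xi \in B^* : F(g) - F(f) \ge \xi(g - f) \ \text{for all } g \in B\}.$$

We call $\partial F(f)$ the subdifferential of $F$ at $f$. If $\xi \in \partial F(f)$, then we call $\xi$ a subgradient of $F$ at $f$.

A well-known result is that $f$ is a minimizer of a convex function $F$ on $B$ if and only if $0 \in \partial F(f)$ (see [9]).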

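A Banach space $B$ is called $q$-uniformly convex if there are constants $q \ge 2$ and $c > 0$ such that the modulus of convexity defined by

$$\delta_B(\varepsilon) = \inf\left\{1 - \left\|\frac{f + g}{2}\right\|_B : \|f\|_B = \|g\|_B = 1,\ \|f - g\|_B \ge \varepsilon\right\}, \qquad 0 < \varepsilon \le 2,$$

satisfies $\delta_B(\varepsilon) \ge c\,\varepsilon^q$. In particular, any Hilbert space is a 2-uniformly convex Banach space.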
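The Hilbert-space case can be checked numerically. The sketch below assumes the standard modulus of convexity $\delta(\varepsilon) = \inf\{1 - \|(x+y)/2\| : \|x\| = \|y\| = 1, \|x - y\| \ge \varepsilon\}$ and works in the Euclidean plane as a stand-in for a general Hilbert space; the function names are illustrative. By the parallelogram law, every pair of unit vectors at distance $\varepsilon$ gives the same gap $1 - \sqrt{1 - \varepsilon^2/4}$, which is at least $\varepsilon^2/8$. The sup norm on the plane is included as a contrast where the gap can vanish.

```python
import math


def euclidean_midpoint_gap(eps):
    """1 - ||(x + y)/2|| for unit vectors x, y in R^2 with ||x - y|| = eps."""
    # Place x and y symmetrically about the first axis; then ||x - y|| = 2 sin(theta).
    theta = math.asin(eps / 2.0)
    x = (math.cos(theta), math.sin(theta))
    y = (math.cos(theta), -math.sin(theta))
    mid = ((x[0] + y[0]) / 2.0, (x[1] + y[1]) / 2.0)
    return 1.0 - math.hypot(mid[0], mid[1])


def hilbert_modulus(eps):
    """Closed form of the modulus of convexity of a Hilbert space."""
    return 1.0 - math.sqrt(1.0 - eps ** 2 / 4.0)


def sup_norm(v):
    return max(abs(c) for c in v)


# Contrast: in the sup norm, x = (1, 1) and y = (1, -1) are unit vectors at
# distance 2 whose midpoint (1, 0) still has norm 1, so the gap is 0.
x_sup, y_sup = (1.0, 1.0), (1.0, -1.0)
mid_sup = tuple((a + b) / 2.0 for a, b in zip(x_sup, y_sup))
sup_norm_gap = 1.0 - sup_norm(mid_sup)

sample_eps = (0.1, 0.5, 1.0, 1.5, 2.0)
gaps = [euclidean_midpoint_gap(e) for e in sample_eps]
```

The zero gap for the sup norm illustrates why uniform convexity is a genuine restriction on the geometry of the hypothesis space.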

Define Then, by (28) in Corollary  1 of [10] we knowis-uniform convex if and only if there is a positive constantsuch that for all and all there holds

It is known from [11–14] that, for a given $p > 1$, the sequence space $l^p$, the Lebesgue spaces $L^p$, and the Sobolev spaces $W^{k,p}$ are $\max\{2, p\}$-uniformly convex. Also, for the spaces constructed in Section 2.1, the isometric isomorphisms established there show that the constructed space is 2-uniformly convex if $1 < p \le 2$ and $p$-uniformly convex if $p > 2$. Therefore, it is a $q$-uniformly convex Banach space, where $q$ is 2 if $1 < p \le 2$ and its value is $p$ if $p > 2$.

2.3. Main Results

Let $S$ be a metric space and $\eta > 0$. The covering number $\mathcal{N}(S, \eta)$ is defined to be the minimal positive integer $l$ such that there exist $l$ disks in $S$ with radius $\eta$ covering $S$.
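The definition can be illustrated on a finite metric space. The sketch below is only an illustration under stated assumptions: it uses a greedy rule (not a method from the text) that picks centers among the points themselves, so the number of centers it returns is an upper bound on the covering number rather than its exact value. The function names and the integer grid are hypothetical.

```python
def greedy_cover(points, radius, dist):
    """Pick centers among `points` so that closed balls of `radius` cover all points."""
    centers = []
    for p in points:
        # A point becomes a new center only if no existing ball contains it.
        if all(dist(p, c) > radius for c in centers):
            centers.append(p)
    return centers


def is_cover(points, centers, radius, dist):
    """Check that every point lies within `radius` of at least one center."""
    return all(any(dist(p, c) <= radius for c in centers) for p in points)


def abs_dist(a, b):
    return abs(a - b)


# Integer grid 0, 1, ..., 100 with the usual distance; integers avoid rounding issues.
grid = list(range(101))
centers = greedy_cover(grid, 10, abs_dist)
```

Shrinking the radius forces more centers, and the growth of this count as the radius tends to zero is what a complexity exponent measures.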

We say a compact subset $E$ in a metric space has logarithmic complexity exponent $s \ge 0$ if there is a constant $c_s > 0$ such that the closed ball of radius $R$ centered at the origin, that is, $B_R = \{f \in E : \|f\| \le R\}$, satisfies

Now we are in a position to present the main results of this paper.

Theorem 3. Letbe an RKBS with-uniform convexity and a reproducing kernel which is uniform continuous onin terms of the norm, that is, . is a uniform continuous function on, and there is a constantsuch thatholds for all. Letbe the unique minimizer of scheme (1). If, then for anythere holds
where
is a $K$-functional, and

The covering number involved in (16) has been studied widely (see [15–19]). In this paper, we assume that the underlying set has logarithmic complexity.

Theorem 4. Under the conditions of Theorem 3, if and has logarithmic complexity with exponent , then for any , with confidence , there holds
whereis defined in (15).

We now give some remarks on Theorems 3 and 4.

(i) In Theorem 3, we require that the kernel $K$ be uniformly continuous and uniformly bounded on $X$. In fact, a large class of real bivariate functions satisfies these conditions. For example, if the function sequence defined in Example 1 is uniformly bounded, that is, holds for all and all , then the kernel is continuous on , which implies that is uniformly continuous on . Therefore, shows that is uniformly continuous and bounded with norm .

(ii) By the definition of , we know that if then . It is bounded if .

(iii) If is a reproducing kernel Hilbert space, then , . Moreover, if , then we have by (19) that

(iv) We can show a way of bounding the decay rates of for . Let . Then, we have the following Fourier expansion: Define an operator sequence by Then, for a given positive integer we have and where we have used the generalized Bessel inequality (see [20]): Also, By (25) and (23) we know holds for all positive integers and, in this case, One can choose a suitable such that it depends upon the sample number and obtain the decay rates when There are many choices for the type of operator (22). For example, the Bernstein-Durrmeyer operators (see, e.g., [21–23]) and the de la Vallée-Poussin sum operators are of this type (see [24]). This method was first provided by [25] and was extended in [26, 27].

(v) We know from [19] that RKHSs with logarithmic complexity with exponent exist. By Corollary 4.1 and Theorem 2.1 of [16] we know that if satisfy then the covering number of may attain the decay of complexity exponent . In a recent paper (see [28]), Guntuboyina and Sen showed that the set of all uniformly bounded convex functions defined on $[a, b]^d$ has the logarithmic complexity exponent $d/2$ in the $L^p$-metric.

3. Robustness

Robustness describes quantitatively how the solutions depend on the underlying distributions.

Define the-control integral regularized model corresponding to (1) by

whereis defined in (3). Then,is influenced by the distributions. For any bounded-measurable functionon, we define the empirical measureas follows:

Then,We give the following theorem.

Theorem 5. Letbe an RKBS with-uniform convexity and the reproducing kernel, and let andbe the solutions of scheme (27) with respect to distributionsand,   respectively. Then,
whereis the constant defined in (14).

Theorem 5 shows how the distribution $\rho$ influences the unique solution.

To prove Theorem 5, we need the following lemmas.

Lemma 6. Under the conditions of Theorem 5, there holds
where the pointinmeansfor any.

Proof. We restate the following statement.
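Let $B$ be a Banach space and let $F : B \to \mathbb{R}$ be a real function. We say $F$ is Gâteaux differentiable at $f \in B$ if there is a $\xi \in B^*$ such that for any $g \in B$ there holds
$$\lim_{t \to 0} \frac{F(f + t g) - F(f)}{t} = \xi(g),$$
and write $F'(f) = \xi$. By [29] we know that if $F$ is convex on $B$ and is Gâteaux differentiable at $f$, then
$$\partial F(f) = \{F'(f)\}.$$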
By equality
we have for any that
Sinceis a convex function on, we know (30) holds.
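The two facts used in this proof, that the Gâteaux derivative is a limit of difference quotients and that a convex differentiable function lies above its linearization, can be checked numerically. The sketch below is a finite-dimensional stand-in chosen for illustration only: it assumes $F(f) = \|f\|_2^2$ on $\mathbb{R}^3$, whose derivative at $f$ in direction $g$ is $2\langle f, g\rangle$, and all names and sample vectors are hypothetical.

```python
def squared_norm(f):
    """F(f) = ||f||^2 on R^n, a convex and Gateaux differentiable function."""
    return sum(v * v for v in f)


def difference_quotient(F, f, g, t=1e-6):
    """(F(f + t g) - F(f)) / t, which approximates the Gateaux derivative at f along g."""
    shifted = [fi + t * gi for fi, gi in zip(f, g)]
    return (F(shifted) - F(f)) / t


def derivative_pairing(f, g):
    """Exact Gateaux derivative of the squared norm: F'(f)(g) = 2 <f, g>."""
    return 2.0 * sum(fi * gi for fi, gi in zip(f, g))


def linearization_gap(f, g):
    """F(g) - F(f) - F'(f)(g - f); nonnegative because F is convex."""
    direction = [gi - fi for fi, gi in zip(f, g)]
    return squared_norm(g) - squared_norm(f) - derivative_pairing(f, direction)


f = [1.0, -2.0, 0.5]
g = [0.3, 0.7, -1.2]
quotient = difference_quotient(squared_norm, f, g)
exact = derivative_pairing(f, g)
gap = linearization_gap(f, g)
```

For the squared norm the gap equals $\|g - f\|^2$, so the subgradient inequality holds with room to spare; uniform convexity is what provides a comparable quantitative gap in a Banach space.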

Lemma 7. Take. Then, under the conditions of Theorem 5, there hold the following.(i)There exists uniquely a minimizerof the problem (27) and (ii)There is asuch that

Proof. The uniqueness of the minimizer follows from the fact that (27) is a strictly convex optimization problem. By the definition of we have
We then have (34).

Proof of (35). Sinceis the unique solution of (27), we have
Notice that bothandare convex functions abouton. We have
By (30), we know that (37) leads to
Therefore, there issuch that (35) holds.

Lemma 8. Letbe an RKBS satisfying the conditions of Theorem 3. Then,

Proof. The reproducing property and (16) give
Then, the factgives (40).

Lemma 9. Let be the reproducing kernel of ,   and is uniform continuous about on in norm, be a given real number. Then, the ball is a compact subset of .

Proof. Sinceis a compact distance space, so is . Sinceis uniform continuous aboutin norm, we know that for anythere is asuch that for all with , we have
and for anyholds
By (43), we know that is a closed, bounded, and equicontinuous set. Therefore, is a compact set of .

Proof of Theorem 5. By the definition ofand (30) we know
Also, by (44) and the definitions of and we have
Sinceis-uniform convex, we have by (14) and the definition ofthat
Combining (46) with (45), we have
It follows that
We then have (29).

4. Sample Error

We give the following sample error bounds.

Theorem 10. Letbe an RKBS satisfying the conditions of Theorem 3.is the solution of scheme (27) with respect toandis the solution of (1). Then, for all there hold
where

To show Theorem 10, we first give a lemma.

Lemma 11 (see [15]). Letbe a family of functions from a probability spacetoand a distance on. Letbe of full measure and constantssuch that(i) for all and all (ii) for all and all , where
Then, for all,

Proof of Theorem 10. Take into (29). Then,
By (7) and the reproducing property, we have
Since
and (40), we have
Define
Then,
By (52), we have for allthat
By (53), (56), and (59), we know
which gives
It follows that
That is,
We then have (49).

5. Learning Rates

Proof of Theorem 3. We know from [30] that for anythere holds
Sinceis a compact set, we have by (40) thatTherefore,
By (65) we have
which gives for any that
By (49) and above inequality we have
or
Since,we know
By (69) and above inequality we have (16).

To show Theorem 4, we need two lemmas.

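Lemma 12 (see [31]). Let $c_1, c_2 > 0$ and $s > q > 0$. Then, the equation
$$x^s - c_1 x^q - c_2 = 0$$
has a unique positive zero $x^*$. In addition,
$$x^* \le \max\left\{(2c_1)^{1/(s-q)},\ (2c_2)^{1/s}\right\}.$$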
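A lemma of this type can be explored numerically. The sketch below assumes the classical form of the statement, namely that for $c_1, c_2 > 0$ and $s > q > 0$ the equation $x^s - c_1 x^q - c_2 = 0$ has a unique positive zero $x^*$ with $x^* \le \max\{(2c_1)^{1/(s-q)}, (2c_2)^{1/s}\}$; that this is the exact form intended here is an assumption. The bisection routine and its names are illustrative.

```python
def lemma_polynomial(x, c1, c2, s, q):
    """phi(x) = x^s - c1 * x^q - c2 (assumed form of the equation in the lemma)."""
    return x ** s - c1 * x ** q - c2


def positive_zero(c1, c2, s, q, tol=1e-12):
    """Locate the positive zero of phi by bisection."""
    lo, hi = 0.0, 1.0
    # phi(0) = -c2 < 0, so enlarge hi until phi changes sign.
    while lemma_polynomial(hi, c1, c2, s, q) <= 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if lemma_polynomial(mid, c1, c2, s, q) > 0.0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


def zero_upper_bound(c1, c2, s, q):
    """Upper bound max{(2 c1)^(1/(s-q)), (2 c2)^(1/s)} from the assumed statement."""
    return max((2.0 * c1) ** (1.0 / (s - q)), (2.0 * c2) ** (1.0 / s))


cubic_root = positive_zero(2.0, 3.0, 3.0, 1.0)       # x^3 - 2x - 3 = 0
quadratic_root = positive_zero(0.5, 0.1, 2.0, 1.0)   # x^2 - 0.5x - 0.1 = 0
```

In the proof that follows, a bound of this kind turns an implicit inequality for the error into an explicit rate in the sample size.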

Proof of Theorem 4. Sincehas logarithmic complexity exponent, we have by (15) a constantsuch that
Then, by (16) we have
Take
Then,
By Lemma 12, we know that the unique solutionof (75) satisfies By (74) and (77), we have (19).

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant nos. 10871226, 61179041, and 11271199. The authors thank the reviewers for their many valuable suggestions and comments, which helped improve the presentation of the paper.