Journal of Probability and Statistics
Volume 2012 (2012), Article ID 593036, 18 pages
http://dx.doi.org/10.1155/2012/593036
Research Article

A Criterion for the Fuzzy Set Estimation of the Regression Function

Jesús A. Fajardo

Departamento de Matemáticas, Universidad de Oriente, Cumaná 6101, Venezuela

Received 1 May 2012; Accepted 30 June 2012

Academic Editor: A. Thavaneswaran

Copyright © 2012 Jesús A. Fajardo. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

We propose a criterion to estimate the regression function by means of a nonparametric fuzzy set estimator of the Nadaraya-Watson type for independent pairs of data, obtaining a reduction of the integrated mean square error of the fuzzy set estimator relative to the integrated mean square error of the classic kernel estimators. This reduction shows that the fuzzy set estimator performs better than the classic kernel estimators. Also, the convergence rate of the optimal scaling factor is computed, and it coincides with the convergence rate of classic kernel estimation. Finally, these theoretical findings are illustrated with a numerical example.

1. Introduction

The methods of kernel estimation are among the nonparametric methods commonly used to estimate the regression function $r$ with independent pairs of data. Nevertheless, through the theory of point processes (see, e.g., Reiss [1]) we can obtain a new nonparametric estimation method, based on defining a nonparametric estimator of the Nadaraya-Watson type for the regression function, for independent pairs of data, by means of a fuzzy set estimator of the density function. The method of fuzzy set estimation introduced by Falk and Liese [2] is based on defining a fuzzy set estimator of the density function by means of thinned point processes (see, e.g., Reiss [1], Section 2.4), a construction framed within the theory of point processes, which is given by
\[\hat\theta_n = \frac{1}{na_n}\sum_{i=1}^{n}U_i, \tag{1.1}\]
where $a_n>0$ is a scaling factor (or bandwidth) such that $a_n\to0$ as $n\to\infty$, and the random variables $U_i$, $1\le i\le n$, are independent with values in $\{0,1\}$ and decide whether $X_i$ belongs to the neighborhood of $x_0$ or not. Here $x_0$ is the point of estimation (for more details, see Falk and Liese [2]). On the other hand, we observe that the random variables that define the estimator $\hat\theta_n$ do not possess precise functional characteristics with respect to the point of estimation. This absence of functional characteristics complicates the evaluation of the estimator $\hat\theta_n$ from a sample, as well as the evaluation of the fuzzy set estimator of the regression function if it is defined in terms of $\hat\theta_n$.

The method of fuzzy set estimation of the regression function introduced by Fajardo et al. [3] is based on defining a fuzzy set estimator of the Nadaraya-Watson type, for independent pairs of data, in terms of the fuzzy set estimator of the density function introduced in Fajardo et al. [4]. Moreover, the regression function is estimated by means of an average fuzzy set estimator considering pairs of fixed data, which is a particular case if we consider independent pairs of nonfixed data. Note that the statements made in Section 4 of Fajardo et al. [3] are satisfied if independent pairs of nonfixed data are considered; this last observation is omitted in Fajardo et al. [3]. It is important to emphasize that the fuzzy set estimator introduced in Fajardo et al. [4], a particular case of the estimator introduced by Falk and Liese [2] that is easy to implement in practice, will allow us to overcome the difficulties presented by the estimator $\hat\theta_n$ and to satisfy the almost sure, in law, and uniform convergence properties over compact subsets of $\mathbb{R}$.

In this paper we estimate the regression function by means of the nonparametric fuzzy set estimator of the Nadaraya-Watson type, for independent pairs of data, introduced by Fajardo et al. [3], obtaining a significant reduction of the integrated mean square error of the fuzzy set estimator relative to the integrated mean square error of the classic kernel estimators. This reduction is obtained through the conditions imposed on the thinning function, the function that allows one to define the estimator proposed by Fajardo et al. [4], and it implies that the fuzzy set estimator performs better than the kernel estimators. This reduction is not obtained in Fajardo et al. [3]. Also, the convergence rate of the optimal scaling factor is computed, and it coincides with the convergence rate in classic kernel estimation of the regression function. Moreover, the function that minimizes the integrated mean square error of the fuzzy set estimator is obtained. Finally, these theoretical findings are illustrated with a numerical example in which a regression function is estimated with the fuzzy set estimator and the classic kernel estimators.

On the other hand, it is important to emphasize that, along with the reduction of the integrated mean square error, the thinning function, introduced through the thinned point processes, can be used to select points of the sample with different probabilities, in contrast to the kernel estimator, which assigns equal weight to all points of the sample.

This paper is organized as follows. In Section 2, we define the fuzzy set estimator of the regression function and present its convergence properties. In Section 3, we obtain the mean square error of the fuzzy set estimator of the regression function, Theorem 3.1, as well as the optimal scaling factor and the integrated mean square error; moreover, we establish the conditions under which the constants that control the bias and the asymptotic variance are reduced relative to the classic kernel estimators, and we obtain the function that minimizes the integrated mean square error of the fuzzy set estimator. In Section 4, a simulation study is conducted to compare the performance of the fuzzy set estimator with that of the classical Nadaraya-Watson estimators. Section 5 contains the proof of the theorem in Section 3.

2. Fuzzy Set Estimator of the Regression Function and Its Convergence Properties

In this section we define, by means of the fuzzy set estimator of the density function introduced in Fajardo et al. [4], a nonparametric fuzzy set estimator of the regression function of the Nadaraya-Watson type for independent pairs of data. Moreover, we present its convergence properties.

Next, we present the fuzzy set estimator of the density function introduced by Fajardo et al. [4], which is a particular case of the estimator proposed in Falk and Liese [2] and satisfies the almost sure, in law, and uniform convergence properties over compact subsets of $\mathbb{R}$.

Definition 2.1. Let $X_1,\dots,X_n$ be an independent random sample of a real random variable $X$ with density function $f$. Let $V_1,\dots,V_n$ be independent random variables uniformly distributed on $[0,1]$ and independent of $X_1,\dots,X_n$. Let $\varphi$ be such that $0<\int\varphi(x)\,dx<\infty$ and $a_n = b_n\int\varphi(x)\,dx$, $b_n>0$. Then the fuzzy set estimator of the density function $f$ at the point $x_0$ is defined as
\[\hat\vartheta_n\left(x_0\right) = \frac{1}{na_n}\sum_{i=1}^{n}U_{x_0,b_n}\left(X_i,V_i\right) = \frac{\tau_n\left(x_0\right)}{na_n}, \tag{2.1}\]
where
\[U_{x_0,b_n}\left(X_i,V_i\right) = \mathbb{1}_{\left[0,\,\varphi\left(\left(X_i-x_0\right)/b_n\right)\right]}\left(V_i\right). \tag{2.2}\]
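To illustrate the definition, the following minimal sketch (in Python with NumPy; the standard normal sample, the thinning function $\varphi(x)=(1-x^2)\mathbb{1}_{[-1,1]}(x)$, and the parameter values are illustrative assumptions, not choices made in this paper) evaluates (2.1)-(2.2): the observation $X_i$ is selected exactly when $V_i\le\varphi((X_i-x_0)/b_n)$.

```python
import numpy as np

def fuzzy_density_estimate(x0, X, V, phi, int_phi, b_n):
    """Fuzzy set density estimator (2.1) at the point x0.
    X: sample X_1,...,X_n; V: independent Uniform[0,1] variables;
    phi: thinning function; int_phi: the value of int phi(x) dx;
    b_n: scaling factor, so that a_n = b_n * int_phi."""
    a_n = b_n * int_phi
    # U_i = 1_{[0, phi((X_i - x0)/b_n)]}(V_i), cf. (2.2)
    U = (V <= phi((X - x0) / b_n)).astype(float)
    return U.sum() / (len(X) * a_n)   # tau_n(x0) / (n a_n), cf. (2.1)

# Illustrative thinning function with values in [0, 1]: phi(x) = (1 - x^2)_+
phi = lambda x: np.clip(1.0 - x**2, 0.0, 1.0)

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=n)          # sample from N(0, 1), for illustration only
V = rng.uniform(size=n)         # thinning variables
f_hat = fuzzy_density_estimate(0.0, X, V, phi, int_phi=4/3, b_n=n**(-1/5))
print(f_hat)                    # near the N(0,1) density at 0 (about 0.40)
```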

Remark 2.2. The events $\{X_i=x\}$, $x\in\mathbb{R}$, can be described in a neighborhood of $x_0$ through the thinned point process
\[N_n^{\varphi_n}(\cdot) = \sum_{i=1}^{n}U_{x_0,b_n}\left(X_i,V_i\right)\varepsilon_{X_i}(\cdot), \tag{2.3}\]
where
\[\varphi_n(x) = \varphi\left(\frac{x-x_0}{b_n}\right) = \mathbb{P}\left(U_{x_0,b_n}\left(X_i,V_i\right)=1\mid X_i=x\right), \tag{2.4}\]
and $U_{x_0,b_n}(X_i,V_i)$ decides whether $X_i$ belongs to the neighborhood of $x_0$ or not. Precisely, $\varphi_n(x)$ is the probability that the observation $X_i=x$ belongs to the neighborhood of $x_0$. Note that this neighborhood is not explicitly defined; it is actually a fuzzy set in the sense of Zadeh [5], with membership function $\varphi_n$. The thinned process $N_n^{\varphi_n}$ is therefore a fuzzy set representation of the data (see Falk and Liese [2], Section 2). Moreover, we can observe that $N_n^{\varphi_n}(\mathbb{R}) = na_n\hat\vartheta_n(x_0)$, and the random variable $\tau_n(x_0)$ is binomial $(n,\alpha_n(x_0))$ distributed, with
\[\alpha_n\left(x_0\right) = \mathbb{E}\left[U_{x_0,b_n}\left(X_i,V_i\right)\right] = \mathbb{P}\left(U_{x_0,b_n}\left(X_i,V_i\right)=1\right) = \mathbb{E}\left[\varphi_n(X)\right]. \tag{2.5}\]
In what follows we assume that $\alpha_n(x_0)\in(0,1)$.

Now, we present the fuzzy set estimator of the regression function introduced in Fajardo et al. [3], which is defined in terms of $\hat\vartheta_n(x_0)$.

Definition 2.3. Let $((X_1,Y_1),V_1),\dots,((X_n,Y_n),V_n)$ be independent copies of a random vector $((X,Y),V)$, where $V_1,\dots,V_n$ are independent random variables uniformly distributed on $[0,1]$ and independent of $(X_1,Y_1),\dots,(X_n,Y_n)$. The fuzzy set estimator of the regression function $r(x)=\mathbb{E}[Y\mid X=x]$ at the point $x_0$ is defined as
\[\hat r_n\left(x_0\right) = \begin{cases}\dfrac{\sum_{i=1}^{n}Y_iU_{x_0,b_n}\left(X_i,V_i\right)}{\tau_n\left(x_0\right)} & \text{if }\tau_n\left(x_0\right)\neq0,\\[1ex] 0 & \text{if }\tau_n\left(x_0\right)=0.\end{cases} \tag{2.6}\]
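A corresponding sketch for (2.6), under the same illustrative assumptions as the previous block, is the following; note that the normalization $na_n$ cancels between numerator and denominator, and the estimator returns 0 on the event $\tau_n(x_0)=0$.

```python
import numpy as np

def fuzzy_regression_estimate(x0, X, Y, V, phi, b_n):
    """Fuzzy set regression estimator (2.6) at the point x0."""
    # U_i decides whether (X_i, Y_i) enters the fuzzy neighborhood of x0.
    U = (V <= phi((X - x0) / b_n)).astype(float)
    tau = U.sum()                          # tau_n(x0)
    # The factor 1/(n a_n) cancels in the ratio, so a_n is not needed;
    # return 0 on the event tau_n(x0) = 0.
    return (Y * U).sum() / tau if tau > 0 else 0.0
```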

Remark 2.4. The fact that $U(x,v)=\mathbb{1}_{[0,\varphi(x)]}(v)$, $x\in\mathbb{R}$, $v\in[0,1]$, is a kernel when $\varphi(x)$ is a density does not guarantee that $\hat r_n(x_0)$ is equivalent to the Nadaraya-Watson kernel estimator. With this observation, the statement made in Remark 2 of Fajardo et al. [3] is corrected. Moreover, the fuzzy set representation of the data $(X_i,Y_i)=(x,y)$ is defined over the window $I_{x_0}\times\mathbb{R}$ with thinning function $\psi_n(x,y)=\varphi((x-x_0)/b_n)\mathbb{1}_{\mathbb{R}}(y)$, where $I_{x_0}$ denotes the neighborhood of $x_0$. In the particular case $|Y|\le M$, $M>0$, the fuzzy set representation of the data $(X_i,Y_i)=(x,y)$ is given by $\psi_n(x,y)=\varphi((x-x_0)/b_n)\mathbb{1}_{[-M,M]}(y)$.

Consider the following conditions.

(C1) The functions $f$ and $r$ are at least twice continuously differentiable in a neighborhood of $x_0$.
(C2) $f(x_0)>0$.
(C3) The sequence $b_n$ satisfies $b_n\to0$ and $nb_n/\log(n)\to\infty$ as $n\to\infty$.
(C4) The function $\varphi$ is symmetric about zero, has compact support on $[-B,B]$, $B>0$, and is continuous at $x=0$ with $\varphi(0)>0$.
(C5) There exists $M>0$ such that $|Y|<M$ a.s.
(C6) The function $\phi(u)=\mathbb{E}[Y^2\mid X=u]$ is at least twice continuously differentiable in a neighborhood of $x_0$.
(C7) $nb_n^5\to0$ as $n\to\infty$.
(C8) The function $\varphi(\cdot)$ is monotone on the positive half-line.
(C9) $b_n\to0$ and $nb_n^2/\log(n)\to\infty$ as $n\to\infty$.
(C10) The functions $f$ and $r$ are at least twice continuously differentiable on the compact set $[-B,B]$.
(C11) There exists $\lambda>0$ such that $\inf_{x\in[-B,B]}f(x)>\lambda$.

Next, we present the convergence properties obtained in Fajardo et al. [3].

Theorem 2.5. Under conditions (C1)–(C5), one has
\[\hat r_n\left(x_0\right)\longrightarrow r\left(x_0\right)\quad\text{a.s.} \tag{2.7}\]

Theorem 2.6. Under conditions (C1)–(C7), one has
\[\sqrt{na_n}\left(\hat r_n\left(x_0\right)-r\left(x_0\right)\right)\rightsquigarrow N\left(0,\frac{\mathrm{Var}\left[Y\mid X=x_0\right]}{f\left(x_0\right)}\right). \tag{2.8}\]
The symbol "$\rightsquigarrow$" denotes convergence in law.

Theorem 2.7. Under conditions (C4)–(C5) and (C8)–(C11), one has
\[\sup_{x\in[-B,B]}\left|\hat r_n(x)-r(x)\right| = o(1). \tag{2.9}\]

Remark 2.8. The estimator $\hat r_n$ has a limit distribution whose asymptotic variance depends only on the point of estimation, which does not occur with kernel regression estimators. Moreover, since $a_n=o\left(n^{-1/5}\right)$, we see that the same restrictions are imposed on the smoothing parameter as in kernel regression estimation.

3. Statistical Methodology

In this section we obtain the mean square error of $\hat r_n$, as well as the optimal scaling factor and the integrated mean square error. Moreover, we establish the conditions under which the constants that control the bias and the asymptotic variance are reduced relative to the classic kernel estimators. The function that minimizes the integrated mean square error of $\hat r_n$ is also obtained.

The following theorem provides the asymptotic representation of the mean square error (MSE) of $\hat r_n$. Its proof is deferred to Section 5.

Theorem 3.1. Under conditions (C1)–(C6), one has
\[\mathbb{E}\left[\left(\hat r_n(x)-r(x)\right)^2\right] = \frac{1}{nb_n}V_F(x) + b_n^4B_F^2(x) + o\left(a_n^4+\frac{1}{na_n}\right), \tag{3.1}\]
where
\[V_F(x) = \frac{\phi(x)-r^2(x)}{f(x)}\left(\int\varphi(x)\,dx\right)^{-1} = c_1(x)\left(\int\varphi(x)\,dx\right)^{-1},\qquad B_F(x) = \frac{1}{2}\,\frac{g^{(2)}(x)-f^{(2)}(x)r(x)}{f(x)}\,\frac{\int u^2\varphi(u)\,du}{\int\varphi(u)\,du} = \frac{c_2(x)}{2}\,\frac{\int u^2\varphi(u)\,du}{\int\varphi(u)\,du}, \tag{3.2}\]
with
\[a_n = b_n\int\varphi(x)\,dx. \tag{3.3}\]

Next, we calculate the formula for the asymptotically optimal scaling factor $b_n$. The integrated mean square error (IMSE) of $\hat r_n$ is given by
\[\mathrm{IMSE}\left[\hat r_n\right] = \frac{1}{nb_n}\int V_F(x)\,dx + b_n^4\int B_F^2(x)\,dx. \tag{3.4}\]
From the above equality, we obtain the following formula for the optimal asymptotic scaling factor:
\[b_n^{\varphi} = \left(\frac{\int\varphi(u)\,du\,\int c_1(u)\,du}{n\left(\int u^2\varphi(u)\,du\right)^2\int c_2^2(u)\,du}\right)^{1/5}. \tag{3.5}\]
We obtain a scaling factor of order $n^{-1/5}$, which implies an optimal convergence rate for $\mathrm{IMSE}[\hat r_n]$ of order $n^{-4/5}$. We observe that the order of the optimal scaling factor for the fuzzy set estimation method coincides with the order for the classic kernel estimate. Moreover,
\[\mathrm{IMSE}\left[\hat r_n\right] = n^{-4/5}\,C_{\varphi}, \tag{3.6}\]
where
\[C_{\varphi} = \frac{5}{4}\left(\frac{\left(\int c_1(u)\,du\right)^4\left(\int u^2\psi(u)\,du\right)^2\int c_2^2(u)\,du}{\left(\int\varphi(u)\,du\right)^4}\right)^{1/5}, \tag{3.7}\]
with
\[\psi(x) = \frac{\varphi(x)}{\int\varphi(u)\,du}. \tag{3.8}\]
Next, we will establish the conditions under which the constants that control the bias and the asymptotic variance are reduced relative to the classic kernel estimators. For this, we consider the usual Nadaraya-Watson kernel estimator
\[\hat r_{NW_K}(x) = \frac{\sum_{i=1}^{n}Y_iK\left(\left(X_i-x\right)/b_n\right)}{\sum_{i=1}^{n}K\left(\left(X_i-x\right)/b_n\right)}, \tag{3.9}\]
which has mean square error (see, e.g., Ferraty et al. [6], Theorem 2.4.1)
\[\mathbb{E}\left[\left(\hat r_{NW_K}(x)-r(x)\right)^2\right] = \frac{1}{nb_n}V_K(x) + b_n^4B_K^2(x) + o\left(b_n^4+\frac{1}{nb_n}\right), \tag{3.10}\]
where
\[V_K(x) = c_1(x)\int K^2(u)\,du,\qquad B_K(x) = \frac{c_2(x)}{2}\int u^2K(u)\,du. \tag{3.11}\]
Moreover, the IMSE of $\hat r_{NW_K}$ is given by
\[\mathrm{IMSE}\left[\hat r_{NW_K}\right] = \frac{1}{nb_n}\int V_K(x)\,dx + b_n^4\int B_K^2(x)\,dx. \tag{3.12}\]
From the above equality, we obtain the following formula for the optimal asymptotic bandwidth:
\[b_n^{NW_K} = \left(\frac{\int K^2(u)\,du\,\int c_1(u)\,du}{n\left(\int u^2K(u)\,du\right)^2\int c_2^2(u)\,du}\right)^{1/5}. \tag{3.13}\]
Moreover,
\[\mathrm{IMSE}\left[\hat r_{NW_K}\right] = n^{-4/5}\,C_K, \tag{3.14}\]
where
\[C_K = \frac{5}{4}\left(\left(\int c_1(u)\,du\right)^4\left(\int K^2(u)\,du\right)^4\left(\int u^2K(u)\,du\right)^2\int c_2^2(u)\,du\right)^{1/5}. \tag{3.15}\]
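Both (3.5) and (3.13) arise from minimizing an expression of the form $A/(nb_n)+Bb_n^4$ in $b_n$, whose minimizer is $b_n=(A/(4nB))^{1/5}$. The sketch below (Python with SciPy; the function names and the idea of passing $\int c_1(u)\,du$ and $\int c_2^2(u)\,du$ as plug-in constants are assumptions for illustration, since these integrals depend on the unknown $f$, $r$, and $\phi$) evaluates the two formulas.

```python
from scipy.integrate import quad

def optimal_bn_fuzzy(n, phi, support, c1_int, c2sq_int):
    """Optimal scaling factor (3.5) for the fuzzy set estimator.
    c1_int = int c_1(u) du and c2sq_int = int c_2(u)^2 du are plug-in
    values for the unknown population quantities (assumed inputs)."""
    lo, hi = support
    int_phi = quad(phi, lo, hi)[0]                     # int phi(u) du
    int_u2phi = quad(lambda u: u**2 * phi(u), lo, hi)[0]
    return (int_phi * c1_int / (n * int_u2phi**2 * c2sq_int)) ** 0.2

def optimal_bn_kernel(n, K, support, c1_int, c2sq_int):
    """Optimal bandwidth (3.13) for the Nadaraya-Watson estimator."""
    lo, hi = support
    int_K2 = quad(lambda u: K(u)**2, lo, hi)[0]        # int K^2(u) du
    int_u2K = quad(lambda u: u**2 * K(u), lo, hi)[0]   # int u^2 K(u) du
    return (int_K2 * c1_int / (n * int_u2K**2 * c2sq_int)) ** 0.2

# Example with the thinning function (3.29) and hypothetical plug-in values:
phi_opt = lambda u: max(0.0, 1 - (16 * u / 25) ** 2)
b_fuzzy = optimal_bn_fuzzy(n=500, phi=phi_opt, support=(-25/16, 25/16),
                           c1_int=1.0, c2sq_int=1.0)   # c-values assumed
```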

The reduction of the constants that control the bias and the asymptotic variance, relative to the classic kernel estimators, is obtained if, for every kernel $K$,
\[\int\varphi(u)\,du \ge \left(\int K^2(u)\,du\right)^{-1},\qquad \int u^2\psi(u)\,du \le \int u^2K(u)\,du. \tag{3.16}\]

Remark 3.2. The conditions on $\varphi$ allow us to obtain a value of $B$ such that
\[\int_{-B}^{B}\varphi(u)\,du > \left(\int K^2(u)\,du\right)^{-1}. \tag{3.17}\]
Moreover, to guarantee that
\[\int u^2\psi(u)\,du \le \int u^2K(u)\,du, \tag{3.18}\]
we define the function
\[\psi(x) = \frac{\varphi(x)}{\int\varphi(u)\,du}, \tag{3.19}\]
with compact support on $[-B^*,B^*]\subseteq[-B,B]$. Next, we guarantee the existence of $B^*$. As
\[1 < \int K^2(u)\,du\int\varphi(u)\,du,\qquad \varphi(x)\in[0,1], \tag{3.20}\]
we have
\[x^2\psi(x) \le x^2\int K^2(u)\,du. \tag{3.21}\]
Observe that for each $C\in\left(0,\int u^2K(u)\,du\right]$ there exists
\[B^* = \sqrt[3]{\frac{3C}{2\int K^2(u)\,du}}, \tag{3.22}\]
such that
\[C = \int_{-B^*}^{B^*}x^2\int K^2(u)\,du\,dx \le \int u^2K(u)\,du. \tag{3.23}\]
Combining (3.21) and (3.23), we obtain
\[\int_{-B^*}^{B^*}u^2\psi(u)\,du \le \int u^2K(u)\,du. \tag{3.24}\]
In our case we take $B^*\le B$.

On the other hand, the criterion that we implement to minimize (3.6), and to obtain a reduction of the constants that control the bias and the asymptotic variance relative to the classic kernel estimation, is the following:
\[\text{Maximize}\quad\int\varphi(u)\,du, \tag{3.25}\]
subject to the conditions
\[\int\varphi^2(u)\,du = \frac{5}{3};\qquad\int u\,\varphi(u)\,du = 0;\qquad\int u^2\varphi(u)\,du - v = 0, \tag{3.26}\]
with $u\in[-B,B]$, $\varphi(u)\in[0,1]$, $\varphi(0)>0$, and $v\le\int u^2K_E(u)\,du$, where $K_E$ is the Epanechnikov kernel
\[K_E(x) = \frac{3}{4}\left(1-x^2\right)\mathbb{1}_{[-1,1]}(x). \tag{3.27}\]
The Euler-Lagrange equation with these constraints is
\[\frac{\partial}{\partial\varphi}\left[\varphi + a\varphi^2 + bx\varphi + c\left(x^2\varphi - v\right)\right] = 0, \tag{3.28}\]
where $a$, $b$, and $c$ are the three multipliers corresponding to the three constraints. This yields
\[\varphi(x) = \left(1-\left(\frac{16x}{25}\right)^2\right)\mathbb{1}_{[-25/16,\,25/16]}(x). \tag{3.29}\]
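The constraints in (3.26) and condition (3.17) can be checked numerically for the solution (3.29). The following sketch (a verification device using SciPy, not part of the paper) confirms that $\int\varphi^2(u)\,du=5/3$ and $\int u\,\varphi(u)\,du=0$, and that $\int\varphi(u)\,du=25/12$ exceeds $\left(\int K_E^2(u)\,du\right)^{-1}=5/3$, which is condition (3.17) for the Epanechnikov kernel.

```python
from scipy.integrate import quad

s = 25 / 16                                  # support endpoint in (3.29)
phi = lambda x: (1 - (16 * x / 25) ** 2) * (abs(x) <= s)
K_E = lambda x: 0.75 * (1 - x ** 2) * (abs(x) <= 1)   # Epanechnikov (3.27)

int_phi  = quad(phi, -s, s)[0]                    # = 25/12 ~ 2.0833
int_phi2 = quad(lambda x: phi(x) ** 2, -s, s)[0]  # = 5/3, first constraint (3.26)
int_uphi = quad(lambda x: x * phi(x), -s, s)[0]   # = 0, second constraint (3.26)
int_KE2  = quad(lambda x: K_E(x) ** 2, -1, 1)[0]  # = 3/5

print(int_phi2, int_uphi)              # 1.6667, 0.0
print(int_phi > 1 / int_KE2)           # True: (3.17) holds, 25/12 > 5/3
```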

The new conditions on $\varphi$ allow us to affirm that, for every kernel $K$,
\[\mathrm{IMSE}\left[\hat r_n\right] \le \mathrm{IMSE}\left[\hat r_{NW_K}\right]. \tag{3.30}\]
Thus, the fuzzy set estimator has the best performance.

4. Simulations

A simulation study was conducted to compare the performance of the fuzzy set estimator with that of the classical Nadaraya-Watson estimators. For the simulation, we used the regression function given by Härdle [7]:
\[Y_i = 1 - X_i + e^{-200\left(X_i-0.5\right)^2} + \varepsilon_i, \tag{4.1}\]
where the $X_i$ were drawn from a uniform distribution on the interval $[0,1]$, and each $\varepsilon_i$ has a normal distribution with mean 0 and variance 0.1. In this way, we generated samples of sizes 100, 250, and 500. The bandwidths were computed using (3.5) and (3.13). The fuzzy set estimator and the kernel estimates were computed using (3.29), and the Epanechnikov and Gaussian kernel functions, respectively. The IMSE values of the fuzzy set estimator and the kernel estimators are given in Table 1.
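A self-contained sketch of this experiment is given below (Python with NumPy). The random seed, the evaluation grid, and the single fixed bandwidth are illustrative assumptions; the study itself uses the plug-in bandwidths (3.5) and (3.13), and the Gaussian kernel variant is analogous.

```python
import numpy as np

rng = np.random.default_rng(1)

def r_true(x):                       # regression function in (4.1) (Härdle)
    return 1 - x + np.exp(-200 * (x - 0.5) ** 2)

n = 500
X = rng.uniform(size=n)
Y = r_true(X) + rng.normal(0.0, np.sqrt(0.1), size=n)   # variance 0.1
V = rng.uniform(size=n)              # thinning variables, fuzzy estimator only

s = 25 / 16                          # optimal thinning function (3.29)
phi = lambda u: np.clip(1 - (16 * u / 25) ** 2, 0, 1)
K_E = lambda u: np.clip(0.75 * (1 - u ** 2), 0, None)   # Epanechnikov kernel

def r_fuzzy(x0, b):                  # fuzzy set estimator (2.6)
    U = (V <= phi((X - x0) / b)).astype(float)
    t = U.sum()
    return (Y * U).sum() / t if t > 0 else 0.0

def r_nw(x0, b):                     # Nadaraya-Watson estimator (3.9)
    w = K_E((X - x0) / b)
    sw = w.sum()
    return (Y * w).sum() / sw if sw > 0 else 0.0

grid = np.linspace(0.05, 0.95, 100)  # 100 evaluation points
b = 0.05                             # illustrative bandwidth (an assumption)
ise_f  = np.mean([(r_fuzzy(x, b) - r_true(x)) ** 2 for x in grid])
ise_nw = np.mean([(r_nw(x, b) - r_true(x)) ** 2 for x in grid])
print(ise_f, ise_nw)                 # empirical squared-error comparison
```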

Table 1: IMSE values of the estimates for the fuzzy set estimator and the kernel estimators.

As seen from Table 1, for all sample sizes the fuzzy set estimator has smaller IMSE values than the kernel estimators, each estimator computed with its own bandwidth. In each case, the fuzzy set estimator shows the best performance. Moreover, the kernel estimate computed with the Epanechnikov kernel performs better than the one computed with the Gaussian kernel.

The graphs of the true regression function and of the estimated regression functions, computed over a sample of size 500 using 100 points and $v=0.2$, are shown in Figures 1 and 2.

Figure 1: Estimation of $r$ with $\hat r_n$ and $\hat r_{NW_{K_E}}$.
Figure 2: Estimation of $r$ with $\hat r_n$ and $\hat r_{NW_{K_G}}$.

5. Proof of Theorem 3.1

Proof. Throughout this proof, $C$ will represent a positive real constant, which can vary from one line to another, and to simplify the notation we will write $U_i$ instead of $U_{x,b_n}(X_i,V_i)$. Let us consider the following decomposition:
\[\mathbb{E}\left[\left(\hat r_n(x)-r(x)\right)^2\right] = \mathrm{Var}\left[\hat r_n(x)\right] + \left(\mathbb{E}\left[\hat r_n(x)\right]-r(x)\right)^2. \tag{5.1}\]
Next, we present equivalent expressions for the two terms on the right-hand side of the above decomposition. First, we obtain an equivalent expression for the expectation. We consider the following decomposition (see, e.g., Ferraty et al. [6]):
\[\hat r_n(x) = \frac{\hat g_n(x)}{\mathbb{E}\left[\hat\vartheta_n(x)\right]}\left(1-\frac{\hat\vartheta_n(x)-\mathbb{E}\left[\hat\vartheta_n(x)\right]}{\mathbb{E}\left[\hat\vartheta_n(x)\right]}\right) + \frac{\left(\hat\vartheta_n(x)-\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^2}{\left(\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^2}\,\hat r_n(x). \tag{5.2}\]
Taking the expectation, we obtain
\[\mathbb{E}\left[\hat r_n(x)\right] = \frac{\mathbb{E}\left[\hat g_n(x)\right]}{\mathbb{E}\left[\hat\vartheta_n(x)\right]} - \frac{A_1}{\left(\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^2} + \frac{A_2}{\left(\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^2}, \tag{5.3}\]
where
\[A_1 = \mathbb{E}\left[\hat g_n(x)\left(\hat\vartheta_n(x)-\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)\right],\qquad A_2 = \mathbb{E}\left[\left(\hat\vartheta_n(x)-\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^2\hat r_n(x)\right]. \tag{5.4}\]
The hypotheses of Theorem 3.1 allow us to obtain the following particular expressions for $\mathbb{E}[\hat g_n(x)]$ and $\mathbb{E}[\hat\vartheta_n(x)]$, which are calculated in the proof of Theorem 1 in Fajardo et al. [3]:
\[\mathbb{E}\left[\hat g_n(x)\right] = \mathbb{E}\left[\frac{YU}{a_n}\right] = g(x)+O\left(a_n^2\right),\qquad \mathbb{E}\left[\hat\vartheta_n(x)\right] = \mathbb{E}\left[\frac{U}{a_n}\right] = f(x)+O\left(a_n^2\right). \tag{5.5}\]
Combining the fact that the $((X_i,Y_i),V_i)$, $1\le i\le n$, are identically distributed with condition (C3), we have
\[A_1 = \mathrm{Cov}\left(\hat g_n(x),\hat\vartheta_n(x)\right) = \frac{1}{na_n}\mathbb{E}\left[\frac{YU}{a_n}\right] - \frac{1}{n}\mathbb{E}\left[\frac{YU}{a_n}\right]\mathbb{E}\left[\frac{U}{a_n}\right] = \frac{1}{na_n}\left[g(x)+o(1)\right] - \frac{1}{n}\left[g(x)+o(1)\right]\left[f(x)+o(1)\right] = \frac{1}{na_n}g(x) + o\left(\frac{1}{na_n}\right). \tag{5.6}\]
On the other hand, by condition (C5) there exists $C>0$ such that $|\hat r_n(x)|\le C$. Thus, we can write
\[\left|A_2\right| \le C\,\mathbb{E}\left[\left(\hat\vartheta_n(x)-\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^2\right] = \frac{C}{na_n^2}\left(\mathbb{E}\left[U^2\right]-\left(\mathbb{E}\left[U\right]\right)^2\right) = \frac{C}{na_n}\,\frac{\mathbb{E}\left[U\right]}{a_n}\left(1-\mathbb{E}\left[U\right]\right). \tag{5.7}\]
Note that
\[\frac{\alpha_n(x)}{a_n} = \mathbb{E}\left[\hat\vartheta_n(x)\right] = f(x)+O\left(a_n^2\right). \tag{5.8}\]
Thus, we can write
\[\left|A_2\right| \le \frac{C}{na_n}\left(f(x)+O\left(a_n^2\right)\right)\left(1-\mathbb{E}\left[U\right]\right). \tag{5.9}\]
Note that by condition (C1) the density $f$ is bounded in a neighborhood of $x$. Moreover, condition (C3) allows us to suppose, without loss of generality, that $b_n<1$, and by (2.5) we can bound $1-\mathbb{E}[U]$. Therefore,
\[A_2 = O\left(\frac{1}{na_n}\right). \tag{5.10}\]
Now, we can write
\[\frac{A_1}{\left(\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^2} = \left(\frac{1}{f^2(x)}+o(1)\right)\left(\frac{1}{na_n}g(x)+o\left(\frac{1}{na_n}\right)\right) = O\left(\frac{1}{na_n}\right),\qquad \frac{A_2}{\left(\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^2} = \left(\frac{1}{f^2(x)}+o(1)\right)O\left(\frac{1}{na_n}\right) = O\left(\frac{1}{na_n}\right). \tag{5.11}\]
The above equalities imply that
\[\mathbb{E}\left[\hat r_n(x)\right] = \frac{\mathbb{E}\left[\hat g_n(x)\right]}{\mathbb{E}\left[\hat\vartheta_n(x)\right]} + O\left(\frac{1}{na_n}\right). \tag{5.12}\]
Once more, the hypotheses of Theorem 3.1 allow us to obtain the following general expressions for $\mathbb{E}[\hat\vartheta_n(x)]$ and $\mathbb{E}[\hat g_n(x)]$, which are calculated in the proofs of Theorem 1 in Fajardo et al. [3, 4], respectively:
\[\mathbb{E}\left[\hat\vartheta_n(x)\right] = f(x) + \frac{a_n^2}{2\left(\int\varphi(u)\,du\right)^3}f^{(2)}(x)\int u^2\varphi(u)\,du + \frac{a_n^2}{2\left(\int\varphi(u)\,du\right)^3}\int u^2\varphi(u)\left[f^{(2)}\left(x+\beta ub_n\right)-f^{(2)}(x)\right]du, \tag{5.13}\]
\[\mathbb{E}\left[\hat g_n(x)\right] = g(x) + \frac{a_n^2}{2\left(\int\varphi(u)\,du\right)^3}g^{(2)}(x)\int u^2\varphi(u)\,du + \frac{a_n^2}{2\left(\int\varphi(u)\,du\right)^3}\int u^2\varphi(u)\left[g^{(2)}\left(x+\beta ub_n\right)-g^{(2)}(x)\right]du. \tag{5.14}\]
By conditions (C1) and (C4), we have that
\[\int u^2\varphi(u)\left[g^{(2)}\left(x+\beta ub_n\right)-g^{(2)}(x)\right]du = o(1),\qquad \int u^2\varphi(u)\left[f^{(2)}\left(x+\beta ub_n\right)-f^{(2)}(x)\right]du = o(1). \tag{5.15}\]
Then
\[\mathbb{E}\left[\hat r_n(x)\right] = \frac{g(x)+\dfrac{b_n^2}{2\int\varphi(u)\,du}\,g^{(2)}(x)\displaystyle\int u^2\varphi(u)\,du}{f(x)+\dfrac{b_n^2}{2\int\varphi(u)\,du}\,f^{(2)}(x)\displaystyle\int u^2\varphi(u)\,du} + O\left(\frac{1}{na_n}\right) = H_n(x)+O\left(\frac{1}{na_n}\right). \tag{5.16}\]
Next, we obtain an equivalent expression for $H_n(x)$. Taking the conjugate, we have
\[H_n(x) = \frac{1}{D_n(x)}\left[g(x)f(x) + \frac{b_n^2\int u^2\varphi(u)\,du}{2\int\varphi(u)\,du}\left(g^{(2)}(x)f(x)-f^{(2)}(x)g(x)\right) - \left(\frac{b_n^2\int u^2\varphi(u)\,du}{2\int\varphi(u)\,du}\right)^2f^{(2)}(x)g^{(2)}(x)\right] = \frac{1}{D_n(x)}\left[g(x)f(x) + \frac{b_n^2\int u^2\varphi(u)\,du}{2\int\varphi(u)\,du}\left(g^{(2)}(x)f(x)-f^{(2)}(x)g(x)\right) + o\left(a_n^2\right)\right], \tag{5.17}\]
where
\[D_n(x) = f^2(x) - \left(\frac{b_n^2\int u^2\varphi(u)\,du}{2\int\varphi(u)\,du}\right)^2\left(f^{(2)}(x)\right)^2. \tag{5.18}\]
By condition (C3), we have
\[\frac{1}{D_n(x)} = \frac{1}{f^2(x)}+o(1). \tag{5.19}\]
So that
\[H_n(x) = \left(\frac{1}{f^2(x)}+o(1)\right)\left[g(x)f(x)+\frac{b_n^2\int u^2\varphi(u)\,du}{2\int\varphi(u)\,du}\left(g^{(2)}(x)f(x)-f^{(2)}(x)g(x)\right)+o\left(a_n^2\right)\right] = r(x) + \frac{b_n^2\int u^2\varphi(u)\,du}{2\int\varphi(u)\,du}\,\frac{g^{(2)}(x)-f^{(2)}(x)r(x)}{f(x)} + o\left(a_n^2\right). \tag{5.20}\]
Now, we can write
\[\mathbb{E}\left[\hat r_n(x)\right] = r(x) + \frac{b_n^2\int u^2\varphi(u)\,du}{2\int\varphi(u)\,du}\,\frac{g^{(2)}(x)-f^{(2)}(x)r(x)}{f(x)} + o\left(a_n^2\right) + O\left(\frac{1}{na_n}\right). \tag{5.21}\]
By condition (C3), we have
\[\mathbb{E}\left[\hat r_n(x)\right] - r(x) = \frac{b_n^2\int u^2\varphi(u)\,du}{2\int\varphi(u)\,du}\,\frac{g^{(2)}(x)-f^{(2)}(x)r(x)}{f(x)} + o\left(a_n^2\right) = b_n^2B_F(x)+o\left(a_n^2\right), \tag{5.22}\]
where
\[B_F(x) = \frac{g^{(2)}(x)-f^{(2)}(x)r(x)}{f(x)}\,\frac{\int u^2\varphi(u)\,du}{2\int\varphi(u)\,du}. \tag{5.23}\]
Therefore,
\[\left(\mathbb{E}\left[\hat r_n(x)\right]-r(x)\right)^2 = b_n^4B_F^2(x) + 2b_n^2B_F(x)\,o\left(a_n^2\right) + o\left(a_n^4\right) = b_n^4B_F^2(x) + o\left(a_n^4\right). \tag{5.24}\]
Next, we obtain an expression for the variance in (5.1). For this, we use the following expression (see, e.g., Stuart and Ord [8]):
\[\mathrm{Var}\left[\frac{\hat g_n(x)}{\hat\vartheta_n(x)}\right] = \frac{\mathrm{Var}\left[\hat g_n(x)\right]}{\left(\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^2} + \frac{\left(\mathbb{E}\left[\hat g_n(x)\right]\right)^2}{\left(\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^4}\,\mathrm{Var}\left[\hat\vartheta_n(x)\right] - 2\,\frac{\mathbb{E}\left[\hat g_n(x)\right]}{\left(\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^3}\,\mathrm{Cov}\left(\hat g_n(x),\hat\vartheta_n(x)\right). \tag{5.25}\]
Since the $((X_i,Y_i),V_i)$ are i.i.d. and the $(X_i,V_i)$ are i.i.d., $1\le i\le n$, we have
\[\mathrm{Var}\left[\hat g_n(x)\right] = \frac{1}{na_n^2}\mathrm{Var}\left(YU\right) = \frac{1}{na_n}\mathbb{E}\left[\frac{Y^2U}{a_n}\right] - \frac{1}{n}\left(\mathbb{E}\left[\frac{YU}{a_n}\right]\right)^2, \tag{5.26}\]
\[\mathrm{Var}\left[\hat\vartheta_n(x)\right] = \frac{1}{\left(na_n\right)^2}\mathrm{Var}\left(\sum_{i=1}^{n}U_i\right) = \frac{1}{\left(na_n\right)^2}\,n\,\alpha_n(x)\left(1-\alpha_n(x)\right), \tag{5.27}\]
the last equality because $\sum_{i=1}^{n}U_i$ is binomial $(n,\alpha_n(x))$ distributed. Remember that
\[\mathbb{E}\left[\frac{YU}{a_n}\right] = g(x)+O\left(a_n^2\right). \tag{5.28}\]
Moreover, the hypotheses of Theorem 3.1 allow us to obtain the expression
\[\mathbb{E}\left[\frac{Y_i^2U_i}{a_n}\right] = \phi(x)f(x)+O\left(a_n^2\right), \tag{5.29}\]
which is calculated in the proof of Lemma 1 in Fajardo et al. [3]. By condition (C3), we have
\[\mathrm{Var}\left[\hat g_n(x)\right] = \frac{1}{na_n}\left(\phi(x)f(x)+o(1)\right) - \frac{1}{n}\left(g(x)+o(1)\right)^2 = \frac{1}{na_n}\phi(x)f(x) + o\left(\frac{1}{na_n}\right). \tag{5.30}\]
Remember that
\[\mathbb{E}\left[\hat\vartheta_n(x)\right] = \frac{1}{a_n}\mathbb{E}\left[U\right] = \frac{\alpha_n(x)}{a_n} = f(x)+o(1). \tag{5.31}\]
Thus,
\[\mathrm{Var}\left[\hat\vartheta_n(x)\right] = \frac{1}{na_n}\,\frac{\alpha_n(x)}{a_n} - \frac{1}{n}\left(\frac{\alpha_n(x)}{a_n}\right)^2 = \frac{1}{na_n}\left(f(x)+o(1)\right) - \frac{1}{n}\left(f(x)+o(1)\right)^2 = \frac{1}{na_n}f(x)+o\left(\frac{1}{na_n}\right),\qquad \frac{1}{\left(\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^k} = \frac{1}{f^k(x)}+o(1), \tag{5.32}\]
for $k=2,3,4$. Finally, we saw that
\[\mathrm{Cov}\left(\hat g_n(x),\hat\vartheta_n(x)\right) = \frac{1}{na_n}g(x)+o\left(\frac{1}{na_n}\right). \tag{5.33}\]
Therefore,
\[\frac{\mathrm{Var}\left[\hat g_n(x)\right]}{\left(\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^2} = \left(\frac{1}{f^2(x)}+o(1)\right)\left(\frac{1}{na_n}\phi(x)f(x)+o\left(\frac{1}{na_n}\right)\right) = \frac{1}{na_n}\,\frac{\phi(x)}{f(x)}+o\left(\frac{1}{na_n}\right), \tag{5.34}\]
\[\frac{\left(\mathbb{E}\left[\hat g_n(x)\right]\right)^2}{\left(\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^4}\,\mathrm{Var}\left[\hat\vartheta_n(x)\right] = \left(\frac{1}{f^4(x)}+o(1)\right)\left(g^2(x)+o(1)\right)\left(\frac{1}{na_n}f(x)+o\left(\frac{1}{na_n}\right)\right) = \frac{1}{na_n}\,\frac{g^2(x)}{f^3(x)}+o\left(\frac{1}{na_n}\right), \tag{5.35}\]
\[2\,\frac{\mathbb{E}\left[\hat g_n(x)\right]}{\left(\mathbb{E}\left[\hat\vartheta_n(x)\right]\right)^3}\,\mathrm{Cov}\left(\hat g_n(x),\hat\vartheta_n(x)\right) = 2\left(\frac{1}{f^3(x)}+o(1)\right)\left(g(x)+o(1)\right)\left(\frac{1}{na_n}g(x)+o\left(\frac{1}{na_n}\right)\right) = \frac{2}{na_n}\,\frac{g^2(x)}{f^3(x)}+o\left(\frac{1}{na_n}\right). \tag{5.36}\]
Thus,
\[\mathrm{Var}\left[\hat r_n(x)\right] = \frac{1}{nb_n}V_F(x)+o\left(\frac{1}{na_n}\right), \tag{5.37}\]
where
\[V_F(x) = \frac{\phi(x)-r^2(x)}{f(x)}\left(\int\varphi(x)\,dx\right)^{-1}. \tag{5.38}\]
We can conclude that
\[\mathbb{E}\left[\left(\hat r_n(x)-r(x)\right)^2\right] = \frac{1}{nb_n}V_F(x)+b_n^4B_F^2(x)+o\left(\frac{1}{na_n}\right)+o\left(a_n^4\right) = \frac{1}{nb_n}V_F(x)+b_n^4B_F^2(x)+o\left(a_n^4+\frac{1}{na_n}\right), \tag{5.39}\]
where
\[B_F(x) = \frac{\int u^2\varphi(u)\,du}{2\int\varphi(u)\,du}\,\frac{g^{(2)}(x)-f^{(2)}(x)r(x)}{f(x)}. \tag{5.40}\]

Acknowledgment

The author wants to especially thank the referees for their valuable suggestions and revisions. He also thanks Henrry Lezama for proofreading and editing the English text.

References

  1. R.-D. Reiss, A Course on Point Processes, Springer Series in Statistics, Springer, New York, NY, USA, 1993.
  2. M. Falk and F. Liese, “LAN of thinned empirical processes with an application to fuzzy set density estimation,” Extremes, vol. 1, no. 3, pp. 323–349, 1999.
  3. J. Fajardo, R. Ríos, and L. Rodríguez, “Properties of convergence of a fuzzy set estimator of the regression function,” Journal of Statistics, vol. 3, no. 2, pp. 79–112, 2010.
  4. J. Fajardo, R. Ríos, and L. Rodríguez, “Properties of convergence of a fuzzy set estimator of the density function,” Brazilian Journal of Probability and Statistics, vol. 26, no. 2, pp. 208–217, 2012.
  5. L. A. Zadeh, “Fuzzy sets,” Information and Control, vol. 8, pp. 338–353, 1965.
  6. F. Ferraty, V. Núñez-Antón, and P. Vieu, Regresión No Paramétrica: Desde la Dimensión Uno Hasta la Dimensión Infinita, Servicio Editorial de la Universidad del País Vasco, Bilbao, Spain, 2001.
  7. W. Härdle, Applied Nonparametric Regression, Cambridge University Press, Cambridge, UK, 1990.
  8. A. Stuart and J. K. Ord, Kendall's Advanced Theory of Statistics, vol. 1, Oxford University Press, New York, NY, USA, 1987.