Abstract

In this article, we propose a generalized class of exponential factor type estimators for estimating the finite population distribution function (DF) using auxiliary information in the form of the mean and rank of the auxiliary variable. The expressions for the bias and mean square error of the estimators are derived up to the first order of approximation. The proposed estimators attain a smaller mean square error than all other considered estimators. Three real data sets are used to check the performance of the proposed estimators, and simulation studies are also carried out. The proposed estimators are shown, numerically as well as theoretically, to be more efficient than all competing estimators.

1. Introduction

In survey sampling, it is a well-known fact that the suitable use of auxiliary information improves the precision of estimators by taking advantage of the correlation between the study variable and the auxiliary variable. Several estimators exist in the literature for estimating various population parameters, including the mean, median, and total, but little attention has been paid to estimating the distribution function (DF). Some important references on estimating the population mean using auxiliary information include [1–21]. In these works, several authors have suggested improved ratio, product, and regression type estimators for estimating the finite population mean.

The problem of estimating a finite population distribution function (DF) arises when we are interested in the proportion of values that are less than or equal to a threshold value. In some situations, estimating the DF is essential. For example, a nutritionist may want to know the proportion of the population that consumes 25 percent or more of its calorie intake from saturated fat. Similarly, a social scientist may be interested in knowing the proportion of people in a developing country living below the poverty line. Some important work on DF estimation is as follows. Reference [22] suggested an estimator of the DF that requires information on both the study variable and the auxiliary variable. Along similar lines, [23] proposed ratio and difference-type estimators for estimating the DF using the auxiliary information. Ahmad and Abu-Dayyeh [24] estimated the DF using the information on multiple auxiliary variables. Rueda et al. [25] used a calibration approach for estimating the DF. Singh et al. [26] considered the problem of estimating the DF and quantiles with the use of auxiliary information at the estimation stage. Reference [27] considered a generalized class of estimators for estimating the DF in the presence of nonresponse, and [28] suggested finite population distribution function estimation with dual use of auxiliary information under simple and stratified random sampling. Moreover, two new estimators were proposed for estimating the DF in simple and stratified sampling using the auxiliary variable and rank of the auxiliary variable.

The rest of the article is organized as follows. In Section 2, some notations and symbols are given. In Section 3, the existing estimators of the DF are reviewed. In Section 4, we define two new generalized classes of exponential factor type estimators. Section 5 compares the adapted and proposed estimators theoretically. Section 6 discusses the numerical study of the proposed class of estimators. A simulation study supporting the proposed generalized family of estimators is conducted in Section 7. Section 8 gives the concluding remarks.

2. Notations and Symbols

A finite population of distinct and identified units is considered. To estimate the finite population distribution function (DF), a sample of size units is drawn from a population using simple random sampling without replacement (SRSWOR). Let , and be the values of the study variable , the auxiliary variable and rank of the auxiliary variable , respectively. Let and be the indicator variables based on and , respectively.
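The quantities in this paragraph can be illustrated with a minimal sketch. It assumes that the distribution function at a threshold is the mean of the indicator of "value less than or equal to the threshold", and that the sample is drawn by SRSWOR. The names `y_population`, `t_y`, `distribution_function`, and `srswor_indices`, and the toy data, are hypothetical and are not taken from the article.

```python
import random

def indicator(value, threshold):
    """Return 1 if value <= threshold, else 0."""
    return 1 if value <= threshold else 0

def distribution_function(values, threshold):
    """Proportion of values <= threshold, i.e. the mean of the indicators."""
    return sum(indicator(v, threshold) for v in values) / len(values)

def srswor_indices(population_size, sample_size, seed=None):
    """Draw unit indices by simple random sampling without replacement."""
    rng = random.Random(seed)
    return rng.sample(range(population_size), sample_size)

# Hypothetical toy population of the study variable (not from the article).
y_population = [12, 7, 25, 18, 9, 30, 14, 21, 5, 16]
t_y = 15  # assumed threshold

# Population DF: five of the ten values are <= 15.
F_ty = distribution_function(y_population, t_y)

# Sample DF from an SRSWOR sample of size 4.
idx = srswor_indices(len(y_population), 4, seed=1)
y_sample = [y_population[i] for i in idx]
F_hat_ty = distribution_function(y_sample, t_y)

print(F_ty, F_hat_ty)
```

The same two functions apply unchanged to the auxiliary variable with its own threshold.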

Let and be the sampled distribution functions corresponding to the population distribution function and , respectively. Let , , and be the sample means corresponding to population means , , and , respectively. Let , , , be the population variances of , , and , respectively. Let , , , be the coefficients of variations of ,, , and , respectively. Let , ,

be the population covariances between , , , and , respectively. Let , , , , , be the correlation coefficients between and , and , and , and , and , respectively. Let , be the population coefficients of multiple determination of on and , on and , respectively.
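As a rough illustration of the population quantities listed above, the following sketch computes variances, coefficients of variation, covariances, correlation coefficients, and ranks from a list of population values. It assumes the usual finite population convention of dividing by N - 1; the function names are hypothetical, and ties in the ranks are given their average rank, which is a choice made here and not stated in the article.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(a, b):
    """Finite population covariance with divisor N - 1 (assumed convention)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def variance(a):
    return covariance(a, a)

def coefficient_of_variation(a):
    return math.sqrt(variance(a)) / mean(a)

def correlation(a, b):
    return covariance(a, b) / math.sqrt(variance(a) * variance(b))

def ranks(values):
    """Ranks 1..N of the values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result

# Hypothetical auxiliary values (not from the article).
x_population = [10, 30, 20, 20]
print(ranks(x_population))  # [1.0, 4.0, 2.5, 2.5]
```

Applying `correlation` to the indicator lists of the study and auxiliary variables, or to an indicator list and the rank list, gives the kind of correlation coefficients that the notation refers to.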

To obtain the biases and mean squared errors (MSEs) of the adapted and proposed estimators of , we consider the following relative error terms. Let

Such that for .

3. Existing Estimators

In this section, we briefly review some existing estimators of .

(1) The conventional unbiased mean per unit estimator of is
The variance of is given by the following equation:

(2) The traditional ratio estimator of is
The bias and MSE of , to the first order of approximation, respectively, are given by the following equation:
The ratio estimator performs better than , in terms of MSE, if .

(3) Reference [29] suggested the usual product estimator of :
The bias and MSE of , to the first order of approximation are given by the following equation:
The product estimator is better than , in terms of MSE, if .

(4) The conventional difference estimator of is
where is an unknown constant. The minimum variance of at the optimum value of , is given by the following equation:

(5) Reference [4] suggested an improved difference-type estimator of , given by the following equation:
where and are unknown constants.
The optimum values of and are
The bias and minimum MSE of , to the first order of approximation, respectively, are given by the following equation:

(6) Reference [30] suggested the exponential ratio-type and product-type estimators, given by the following equation:
The biases and MSEs of and , to the first order of approximation, respectively, are given by the following equation:

(7) Reference [14] suggested a generalized class of ratio-type exponential estimators as follows:
where and are unknown constants.

The optimum values of and , determined by minimizing (26) are given by the following equation:

The bias of is given by the following equation:

The minimum MSE of at the optimum values of and is given by the following equation:
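To make the first estimators in this list concrete, here is a minimal sketch of the textbook ratio, product, difference, and exponential ratio-type and product-type forms, adapted to a distribution function. These are the standard forms from the survey sampling literature and are assumed here; the article's own expressions are not reproduced in this text. The argument names `F_hat_y`, `F_hat_x`, `F_x`, and `w` are hypothetical labels for the sample DF of the study variable, the sample DF of the auxiliary variable, the known population DF of the auxiliary variable, and the difference-estimator constant.

```python
import math

def ratio_estimator(F_hat_y, F_hat_x, F_x):
    """Ratio form: scale the sample DF by (known DF) / (sample DF) of X."""
    return F_hat_y * F_x / F_hat_x

def product_estimator(F_hat_y, F_hat_x, F_x):
    """Product form: scale by (sample DF) / (known DF) of X."""
    return F_hat_y * F_hat_x / F_x

def difference_estimator(F_hat_y, F_hat_x, F_x, w):
    """Difference form with a constant w chosen by the user."""
    return F_hat_y + w * (F_x - F_hat_x)

def exp_ratio_estimator(F_hat_y, F_hat_x, F_x):
    """Exponential ratio-type form."""
    return F_hat_y * math.exp((F_x - F_hat_x) / (F_x + F_hat_x))

def exp_product_estimator(F_hat_y, F_hat_x, F_x):
    """Exponential product-type form."""
    return F_hat_y * math.exp((F_hat_x - F_x) / (F_x + F_hat_x))

# Hypothetical values: sample DFs 0.4 and 0.5, known population DF of X 0.6.
print(ratio_estimator(0.4, 0.5, 0.6))
print(exp_ratio_estimator(0.4, 0.5, 0.6))
```

When the sample DF of the auxiliary variable equals its population value, every form above reduces to the plain sample DF of the study variable, which is a quick sanity check on any implementation.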

4. Proposed Estimator

Using appropriate auxiliary information at the design stage or at the estimation stage improves an estimator’s efficiency. The sample distribution function of the auxiliary variable has already been employed to increase the efficiency of estimators. The study of [20] suggested using the rank of the auxiliary variable as an additional auxiliary variable to improve the precision of a population distribution function estimator. In this article, we use two auxiliary variables to estimate the finite population distribution function: in addition to the sample distribution function of the study variable, we use the sample mean and the sample distribution function of the auxiliary variable. In the literature, auxiliary information in the form of the distribution function has rarely been used, which motivates this work. The principal advantage of the proposed generalized class of estimators is that it is more flexible and efficient than the existing estimators.

4.1. First Proposed Estimator

On the lines of [31], we suggest a generalized class of exponential factor type estimators which contains many stable and efficient estimators. By combining the idea of [30, 31], the first estimator is given by the following equation:where

Substituting different values of (i = 1, 2, 3, 4) in Equation (32), we can generate many more types of estimators from our general proposed class of estimators, given in Table 1.

By solving given in (32) up to the first order of approximation, we have the following equation:where

Or

Using (39), the bias and MSE of are given by the following equation:

Differentiating (40) with respect to and , we get the optimum values of and , i.e.,

Substituting the optimum values of and in Equation (40), we get the minimum MSE of , which is given by the following equation:where

is the coefficient of multiple determination of on and .
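The coefficient of multiple determination of one variable on two others can be computed from the three pairwise correlation coefficients by a standard identity. The sketch below assumes that identity; the labels 1, 2, 3 (variable 1 regressed on variables 2 and 3) and the function name are hypothetical and need not match the article's subscripts.

```python
def multiple_determination(r12, r13, r23):
    """R^2 of variable 1 on variables 2 and 3 from pairwise correlations.

    Uses the standard identity
        R^2 = (r12^2 + r13^2 - 2*r12*r13*r23) / (1 - r23^2),
    which requires that variables 2 and 3 are not perfectly correlated.
    """
    denominator = 1.0 - r23 ** 2
    if denominator <= 0.0:
        raise ValueError("variables 2 and 3 must not be perfectly correlated")
    return (r12 ** 2 + r13 ** 2 - 2.0 * r12 * r13 * r23) / denominator

# With uncorrelated predictors, R^2 is the sum of the squared correlations.
print(multiple_determination(0.6, 0.5, 0.0))
```

Because this coefficient is never smaller than either squared pairwise correlation, a second auxiliary variable cannot reduce the proportion of variation explained, which is the intuition behind using two auxiliary variables.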

4.2. Second Proposed Estimator

To increase the efficiency of the estimators both at the design stage as well as at the estimation stage, we utilize the auxiliary information. When there exists a correlation between the study variable and the auxiliary variable, then the rank of the auxiliary variable is also correlated with the study variable. The rank of the auxiliary variable can be treated as a new auxiliary variable, and this information may also be used to increase the precision of the estimators. Based on the idea of rank, we propose a second new class of factor type estimators of the finite population distribution function. The estimator is given by the following equation:where

Substituting different values of (i = 1, 2, 3, 4) in Equation (45), we can generate many more different types of estimators from our general proposed class of estimators, given in Table 2.

Solving given in (45) in terms of errors, we have the following equation:where

With first order approximation, we have the following equation:

Using (52) the bias and MSE of are given by the following equation:

Differentiating Equation (54) with respect to and and minimizing , we get the optimum values of and ,

Substituting the optimum values of and in Equation (54), we get the minimum MSE of , which is given by the following equation:where

is the coefficient of multiple determination of on and .

5. Theoretical Comparison

In this section, the adapted and proposed estimators of F(t_y) are compared in terms of the minimum mean square error.

(1) From (7) and (43), if . or if .
(2) From (10) and (43), if . or if .
(3) From (13) and (43), if . or if .
(4) From (15) and (43), if . or if .
(5) From (19) and (43), if . or if .
(6) From (23) and (43), if . or if .
(7) From (25) and (43), if . or if .
(8) From (31) and (43), if . or if .
(9) From (7) and (56), if . or if .
(10) From (10) and (56), if . or if .
(11) From (13) and (56), if . or if .
(12) From (15) and (56), if . or if .
(13) From (19) and (56), if . or if .
(14) From (23) and (56), if . or if .
(15) From (25) and (56), if . or if .
(16) From (31) and (56), if . or if .

6. Numerical Study

In this section, we conduct a numerical study to investigate the performance of the adapted and proposed DF estimators. For this purpose, three populations are considered. The summary statistics of these populations are reported in Tables 3–5. The percentage relative efficiency (PRE) of the estimator with respect to is given by the following equation:

where .
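A percentage relative efficiency of this kind is commonly computed as 100 times the variance of the reference estimator divided by the MSE of the estimator under study. The sketch below assumes that form and takes the mean per unit (sample DF) estimator under SRSWOR as the reference, with its exact variance (1/n - 1/N) times the population variance of the indicators. The function names and the toy indicator list are hypothetical.

```python
def srswor_variance_of_sample_df(indicators, n):
    """Variance of the sample DF under SRSWOR for a population of 0/1 indicators."""
    N = len(indicators)
    F = sum(indicators) / N
    s2 = sum((i - F) ** 2 for i in indicators) / (N - 1)  # divisor N - 1
    return (1.0 / n - 1.0 / N) * s2

def percentage_relative_efficiency(reference_variance, estimator_mse):
    """PRE of an estimator relative to the reference; above 100 means a gain."""
    return 100.0 * reference_variance / estimator_mse

# Hypothetical population of indicators and sample size (not from the article).
population_indicators = [1, 0, 1, 0]
reference = srswor_variance_of_sample_df(population_indicators, n=2)

# An estimator whose MSE is half the reference variance has PRE = 200.
print(percentage_relative_efficiency(reference, reference / 2))
```

A PRE above 100 indicates that the estimator is more efficient than the reference, which is how the tables in this section should be read.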

The PREs of the distribution function estimators, computed from three populations, are given in Tables 6–9.

Population 1. (Source: [32]) Y: number of teachers and X: number of students.

Population 2. (Source: [32]) Y: number of teachers and X: number of schools.

Population 3. (Source: [6]) Y: the estimated number of fish caught by marine recreational fishermen in the year 1995 and X: the estimated number of fish caught by marine recreational fishermen in the year 1994.
In Tables 6–9 we use and for indicator functions of and , respectively. Here, when we used as indicator function of and as indicator function of X, we get the PRE in Table 6, and when we used as indicator function of and as indicator function of X, we get the PRE given in Table 7. Similarly, when we used as indicator function of and as indicator function of X, we get the PRE in Table 8, and when we used as indicator function of and as indicator function of X, we get the PRE given in Table 9.

In Tables 6–9, we observe that, for all three data sets, the proposed class of estimators is more precise than the existing estimators in terms of PRE.

7. Simulation Study

A simulation study is conducted to assess the efficiency of the suggested estimators under simple random sampling when the auxiliary variable and the rank of the auxiliary variable are used. We generated three populations of size 1,000 from a multivariate normal distribution with different covariance matrices. The populations differ in their correlation structure: Population I is negatively correlated, Population II is positively correlated, and Population III has a strong positive correlation between X and Y. The population means and covariance matrices are given as follows.

7.1. Population I

7.2. Population II

7.3. Population III

The Percentage Relative Efficiency (PRE) is calculated as follows:

The simulation procedure is applied to each generated population, and the resulting PREs are given in Tables 10–13.
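The general shape of such a simulation can be sketched as follows. This is only an illustration under stated assumptions: the population means, variances, and correlation are hypothetical, because the article's mean vectors and covariance matrices are not reproduced in this text. The thresholds are taken near the population medians, and a simple difference estimator with its optimum constant stands in for the estimators under study; it is not the proposed class.

```python
import math
import random

def simulate_pre(rho=0.9, N=1000, n=100, replications=2000, seed=2024):
    """Empirical PRE of a difference estimator relative to the sample DF."""
    rng = random.Random(seed)

    # Hypothetical bivariate normal population (parameters are assumptions).
    x, y = [], []
    for _ in range(N):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x.append(5 + 2 * z1)
        y.append(10 + 3 * (rho * z1 + math.sqrt(1 - rho ** 2) * z2))

    # Thresholds near the population medians (an assumed choice).
    t_x, t_y = sorted(x)[N // 2], sorted(y)[N // 2]
    ix = [1 if v <= t_x else 0 for v in x]
    iy = [1 if v <= t_y else 0 for v in y]
    F_x, F_y = sum(ix) / N, sum(iy) / N

    # Optimum constant of the difference estimator: covariance / variance.
    s12 = sum((a - F_y) * (b - F_x) for a, b in zip(iy, ix)) / (N - 1)
    s22 = sum((b - F_x) ** 2 for b in ix) / (N - 1)
    w = s12 / s22

    squared_error_srs = squared_error_diff = 0.0
    for _ in range(replications):
        idx = rng.sample(range(N), n)  # SRSWOR
        f_y = sum(iy[i] for i in idx) / n
        f_x = sum(ix[i] for i in idx) / n
        squared_error_srs += (f_y - F_y) ** 2
        squared_error_diff += (f_y + w * (F_x - f_x) - F_y) ** 2

    # Ratio of empirical MSEs, expressed as a percentage.
    return 100.0 * squared_error_srs / squared_error_diff

pre_value = simulate_pre()
print(round(pre_value, 1))
```

Replacing the difference estimator inside the loop with any other estimator of the DF gives the corresponding empirical PRE under the same design.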

In Tables 10–13, it can be seen that the proposed estimators perform better than all existing estimators. The PREs show that the second proposed family of estimators under simple random sampling yields the best results when the variables X and Y have a positive correlation. Overall, we can conclude that the suggested family of estimators performs better than all existing estimators.

8. Concluding Remarks

In this article, we proposed a generalized class of exponential factor type estimators that uses supplementary information in the form of the mean and rank of the auxiliary variable for estimating the finite population distribution function. The expressions for the biases and mean squared errors of the proposed generalized class of estimators are derived up to the first order of approximation. The two proposed estimators are compared with all existing estimators numerically and theoretically. Based on the simulation studies as well as on the real data sets, the proposed class of estimators performed better than its existing counterparts and is therefore preferable to the existing estimators available in the literature.

Data Availability

The data used to support the findings of this study are included within the text.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.