Abstract

In this paper, we propose two new families of estimators of the population distribution function under nonresponse in simple random sampling, using supplementary information on the auxiliary variable together with the exponential function. Estimation is carried out in two nonresponse scenarios: nonresponse on the study variable only and nonresponse on both the study and auxiliary variables. The first family uses the mean of the auxiliary variable as additional auxiliary information, while the second uses its ranks. Expressions for the biases and mean squared errors of the proposed and existing estimators are obtained up to the first order of approximation. The performances of the proposed and existing estimators are compared theoretically, and these comparisons show that the proposed families of estimators perform better than the existing estimators available in the literature under the obtained conditions. These theoretical findings are supported numerically by an empirical study reporting the percentage relative efficiencies of the proposed families of estimators.

1. Introduction

It is well known that auxiliary information, when available in a sample survey, yields more efficient estimates of population parameters, e.g., the population mean and the population distribution function, under certain conditions. This auxiliary information may also be used when drawing a random sample by SRSWR or SRSWOR. Moreover, simple random sampling can be improved upon by using the following sampling methods.

Stratified, systematic, nonresponse, and probability proportional to size sampling schemes are used for estimating population parameters. At the estimation stage, auxiliary information is exploited through the ratio, product, regression, and other methods. In practice, one of the important issues in surveys is nonresponse, a common problem that can creep into any sample survey. Nonresponse can arise in many ways, for example, through linguistic problems, illness, nonacceptance, a misdirected return address, or receipt of the questionnaire by another person. Research has shown that different types of nonresponse may have different effects on estimators. Many authors have worked on the estimation of the population mean under nonresponse in order to control the nonresponse bias and to increase the efficiency of the estimators. Nonresponse is more prevalent in mail surveys than in personal interview surveys. Hansen and Hurwitz [1] proposed recontacting a subsample of the initial nonrespondents by a more expensive method: the first attempt is made by mail questionnaire and the second by personal interview. However, Hansen and Hurwitz [1] did not use any supplementary information to increase the efficiency of the estimator. The author of [2] was the first to use auxiliary information for estimating the population mean. Cochran [3] used auxiliary information for estimating the population mean under nonresponse. The work on nonresponse was then extended by many authors (cf. [4–7]), who recommended various types of estimators of the population mean and distribution function using secondary information under nonresponse. Okafor and Lee [8] presented ratio and regression estimation with partial sampling of the nonrespondents for estimating the population mean. Furthermore, the authors of [9, 10] proposed estimators of the population mean using multiauxiliary information in different directions, and Zhao et al. [11] studied robust estimation of the distribution function and quantiles with nonignorable missing data.

The authors of [12–15] have also made significant contributions to estimating the population mean under the two-phase sampling strategy in the presence of nonresponse. Diana and Perri [16] suggested a class of estimators in two-phase sampling with subsampling of nonrespondents for estimating the finite population mean. In this paper, we introduce the use of the sample distribution functions of the study variable and auxiliary variable, along with the mean of the auxiliary variable and the ranks of the auxiliary variable, for estimating the population distribution function.

An extensive literature exists on estimation of the population mean under nonresponse; however, little effort has been dedicated to the development of efficient methods for the population cumulative distribution function. In survey sampling, statisticians are often interested in a proportion defined by the study variable, i.e., the proportion of units in the population with values less than or equal to a specified threshold; for instance, we may be interested to know the proportion of the population in which 31% or more people are educated.
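The proportion described above is simply the mean of an indicator variable, which the following minimal Python sketch illustrates under full response. The function name `empirical_cdf` and the literacy figures are hypothetical and are not taken from the paper.

```python
import numpy as np

def empirical_cdf(values, threshold):
    """Proportion of units whose value is <= threshold.

    This is the mean of the indicator I(Y <= y) over the observed units.
    """
    values = np.asarray(values, dtype=float)
    indicator = values <= threshold  # True where the unit is at or below the threshold
    return float(indicator.mean())

# Hypothetical data: percentage of educated people in ten districts.
literacy = [12, 25, 31, 40, 18, 55, 31, 29, 60, 35]

# Share of districts at or below 30%, and its complement (more than 30%).
share_at_most_30 = empirical_cdf(literacy, 30)
share_above_30 = 1.0 - share_at_most_30
```

Because the distribution function at a point is a mean of indicators, estimators developed for the population mean can be adapted to it, which is the route taken in the rest of the paper.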

Motivated by , , and average of and , two new families of estimators are proposed for estimating the distribution function in the presence of nonresponse. Using numerical results, we show that the proposed families of estimators are more precise than the existing estimators.

The paper is organized as follows. In Section 2, some notations are introduced. In Section 3, the existing estimators are reviewed briefly. Two new families of estimators are introduced in Section 4. The existing and proposed estimators are compared theoretically in Section 5 and numerically in Section 6. Section 7 gives the concluding remarks.

2. Notations

Consider a finite population of distinct units, which is partitioned into respondent and nonrespondent groups with sizes and , respectively, for estimating the CDF, where . A sample of size has been drawn from this population by simple random sampling without replacement (SRSWOR), out of which units respond and do not respond. It is assumed that the sample size is drawn from the response group of and is drawn from the nonresponse group of . Moreover, a sample of size is drawn by simple random sampling (SRSWOR) from , and this time response is obtained from all units. Let and be the study and auxiliary variables, respectively. Let be used for the ranks of the and and be the indicator variables based on and . Furthermore, and and and are the population and sample distribution functions of and , respectively. Similarly, let and and and be the population and sample means of and , respectively. Furthermore, and are the population distribution functions of and for the nonresponse group and and are the population means of and for the nonresponse group, respectively.

Here, ( and ) and ( and ), where and are the population means of . Similarly, and are the population second quartiles of , respectively.

To obtain the bias and MSE of the proposed estimator, we consider the following error terms. Let

Here, , , and and are the notations used for the CDFs, mean, and mean of ranks when nonresponse occurs on both the study and auxiliary variables. Similarly, , , and are the notations used for the CDF, mean, and mean of ranks when nonresponse occurs on the study variable only, as shown in Table 1.

Let for and for , where is the mathematical expectation of . Letwhere . Here,

Here, where it is the coefficient of multiple determination of I(Y ≤ y) on I(X ≤ x) and X under situation-I. Also, is the coefficient of multiple determination of I(Y ≤ y) on I(X ≤ x) and X under situation-II. And, is the coefficient of multiple determination of I(Y ≤ y) on I(X ≤ x) and Z under situation-I. Finally, is the coefficient of multiple determination of I(Y ≤ y) on I(X ≤ x) and Z under situation-II. Here, , , , , , and are the population variances of , , , and for the response group, respectively.

Similarly, , , , and are the population variances of , , , and for the nonresponse group, respectively.

, , , and are the population coefficients of variation for the response group, and , , , and are the population coefficients of variation for the nonresponse group.

, , , , and are the population covariances for the response group.

, , , , and are the population covariances for the nonresponse group.

Similarly, , , , , and are the population correlation coefficients for the response group, respectively.

, , , , and are the population correlation coefficients for the nonresponse group.

Let , where and for . Also, denotes the sample distribution function of responding units out of units and denotes the sample distribution function of responding units out of nonresponse units.

The existing Hansen and Hurwitz [1] unbiased estimator of with its variance is

Similarly, the unbiased estimators for , , and and their corresponding variances are
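As a rough illustration of the Hansen and Hurwitz [1] subsampling idea applied to a distribution function, the sketch below combines the first-attempt respondents with the followed-up subsample of nonrespondents. It assumes the standard form of the estimator and of its variance; the function and argument names (`hansen_hurwitz_cdf`, `W2`, `k`, and so on) are hypothetical labels chosen here, not the paper's notation.

```python
import numpy as np

def hansen_hurwitz_cdf(y_respondents, y_followup, n2, threshold):
    """Hansen-Hurwitz-type estimate of F(y) = P(Y <= threshold).

    y_respondents : values from the n1 units that answered the first attempt
    y_followup    : values from the r units subsampled from the nonrespondents
    n2            : number of first-attempt nonrespondents in the sample
    """
    y1 = np.asarray(y_respondents, dtype=float)
    y2r = np.asarray(y_followup, dtype=float)
    n1 = y1.size
    n = n1 + n2
    f1 = np.mean(y1 <= threshold)    # sample CDF among first-attempt respondents
    f2r = np.mean(y2r <= threshold)  # sample CDF among followed-up nonrespondents
    # Weight each group by its share of the original sample.
    return float((n1 * f1 + n2 * f2r) / n)

def hansen_hurwitz_variance(s2, s2_nonresp, N, n, W2, k):
    """Standard variance form: lambda * S^2 + theta * S^2_(2).

    s2         : population variance of the indicator I(Y <= y)
    s2_nonresp : the same variance within the nonresponse group
    W2         : population proportion of nonrespondents (N2 / N)
    k          : inverse subsampling rate (n2 / r), assumed > 1
    """
    lam = 1.0 / n - 1.0 / N
    theta = W2 * (k - 1.0) / n
    return lam * s2 + theta * s2_nonresp

# Hypothetical sample: 6 respondents, 4 nonrespondents, 2 of them followed up.
estimate = hansen_hurwitz_cdf([1, 2, 3, 4, 5, 6], [2, 8], n2=4, threshold=2)
variance = hansen_hurwitz_variance(0.25, 0.2, N=100, n=10, W2=0.25, k=2)
```

The second variance term is the price paid for following up only a fraction of the nonrespondents; it vanishes when every nonrespondent is recontacted (k = 1).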

In practice, three situations can occur under nonresponse, but here we consider the two that occur most often, namely, nonresponse on both the study variable and the auxiliary variable (say situation-I) and nonresponse on the study variable only (say situation-II). For notational convenience, we follow the notations given in Table 1.

3. Existing Estimators

In this section, some existing estimators of the finite population mean are adapted for estimating the finite population CDF under nonresponse; the biases and MSEs of these adapted estimators are derived to the first order of approximation.

(1) Cochran’s [17] existing ratio estimator of is

The bias and MSE of , to the first order of approximation, are

(2) Murthy’s [18] existing product estimator of is

The bias and MSE of , to the first order of approximation, are

(3) The existing regression estimator of is

where is an unknown constant. Here, is an unbiased estimator of . The minimum variance of at the optimum value is

Here, (15) may be written as

(4) Rao’s [19] existing difference-type estimator of is

where and are unknown constants. The bias and MSE of , to the first order of approximation, are

The optimum values of and , determined by minimizing (18), are

The minimum MSE of at the optimum values of and is

Here, (20) may be written as

(5) Grover and Kaur’s [20] existing generalized class of ratio-type exponential estimator of is

where and are unknown constants. The bias and MSE of , to the first order of approximation, are

The optimum values of and , determined by minimizing (15), are

The simplified minimum MSE of at the optimum values of and is

Here, (25) may be written aswhich shows that is more precise than .
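To make the adapted estimators more concrete, the following sketch writes the ratio, product, and regression-type estimators in their textbook forms, with distribution-function estimates in place of means. This is an assumption-laden illustration: `f_y_hat` stands for the Hansen and Hurwitz-type estimate for the study variable, `f_x_hat` for the corresponding estimate for the auxiliary variable (adjusted for nonresponse under situation-I, the ordinary sample value under situation-II), `F_x` for the known population value, and `slope` for the unknown constant. All of these names are hypothetical, and the Rao [19] and Grover and Kaur [20] forms are not reproduced here.

```python
def ratio_estimator(f_y_hat, f_x_hat, F_x):
    """Ratio-type estimate: scale f_y_hat by the known-to-estimated ratio for X."""
    return f_y_hat * F_x / f_x_hat

def product_estimator(f_y_hat, f_x_hat, F_x):
    """Product-type estimate: useful when Y and X are negatively correlated."""
    return f_y_hat * f_x_hat / F_x

def regression_estimator(f_y_hat, f_x_hat, F_x, slope):
    """Regression-type estimate: shift f_y_hat by slope times the error in X."""
    return f_y_hat + slope * (F_x - f_x_hat)

# Hypothetical inputs: the sample underestimates F_x (0.50 against a known 0.55).
adjusted_ratio = ratio_estimator(0.40, 0.50, 0.55)
adjusted_product = product_estimator(0.40, 0.50, 0.55)
adjusted_regression = regression_estimator(0.40, 0.50, 0.55, slope=0.8)
```

In this example the ratio and regression forms both revise the estimate upwards, since the auxiliary estimate falls short of its known population value and the two variables are taken to be positively related.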

4. Proposed Estimators

On the lines of , , and average of and , the first proposed family of estimators for estimating is given bywhere , , and are unknown constants and and are either two real numbers or functions of known population parameters of , such as , (coefficient of kurtosis), and .

The estimator can also be written as

Simplifying (28) and keeping terms only up to the second power of s, we can write

The bias and MSE of , to the first order of approximation, respectively, are

The optimum values of , , and , determined by minimizing (29), are

The simplified minimum MSE of at the optimum values of , , and iswhere .

It can be seen that is more precise than .

On similar lines, the second proposed family of estimators for estimating is given bywhere , , and are unknown constants and and are either two real numbers or functions of known population parameters of , such as , (coefficient of kurtosis), and .

The estimator can also be written as

Simplifying (34) and keeping terms only up to the second power of s, we can write

The bias and MSE of , to the first order of approximation, are

The optimum values of , , and , determined by minimizing (36), are

The simplified minimum MSE of at the optimum values of , , and iswhere .

It can be seen that is more precise than .

Table 2 lists some members of the Grover and Kaur [20] family and of the proposed families of estimators for selected choices of and .

5. Efficiency Comparisons

In this section, the adapted and proposed estimators of are compared in terms of the minimum MSEs.

(i) From (8) and (32),

(ii) From (11) and (32),

(iii) From (13) and (32),

(iv) From (16) and (32),

(v) From (21) and (32),

(vi) From (26) and (32),

(vii) From (8) and (38),

(viii) From (11) and (38),

(ix) From (13) and (38),

(x) From (16) and (38),

(xi) From (21) and (38),

(xii) From (26) and (38),

The proposed families of estimators are always more precise than the adapted estimators as conditions (i)–(xii) are always true.

6. Empirical Study

In this section, we conduct a numerical study to assess the performance of the existing and proposed distribution function estimators. For this purpose, three populations are considered. The summary statistics of these populations are reported in Tables 3–5. The percentage relative efficiency (PRE) of an estimator with respect to is where .
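The PRE computation can be sketched as follows, assuming the usual definition in which the variance of a baseline estimator (taken here to be the Hansen and Hurwitz [1] estimator) is divided by the MSE of the competing estimator and multiplied by 100. The function name and the numerical inputs are hypothetical and are not values from the paper's tables.

```python
def percentage_relative_efficiency(var_baseline, mse_estimator):
    """PRE = 100 * Var(baseline) / MSE(estimator).

    Values above 100 mean the estimator is more precise than the baseline.
    """
    if mse_estimator <= 0:
        raise ValueError("mse_estimator must be positive")
    return 100.0 * var_baseline / mse_estimator

# Hypothetical values: baseline variance 0.0275, competitor MSE 0.0220.
pre = percentage_relative_efficiency(0.0275, 0.0220)
```

A PRE of exactly 100 for the baseline against itself provides a quick sanity check when tabulating results.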

The PREs of the distribution function estimators, computed from the three populations, are given in Tables 6 and 7.

Population I (source: [21]). Y: duration of sleep of persons aged more than 50 years; X: the age of persons in years. The last 25% of the units in the population are treated as the nonresponse units.

Population II (source: [22]). Y: the eggs produced in 1990 (millions); X: the price per dozen (cents) in 1990. The last 25% of the units in the population are treated as the nonresponse units.

Population III (source: [22]). Y: the eggs produced in 1990 (millions); X: the price per dozen (cents) in 1991. The last 25% of the units in the population are treated as the nonresponse units.

From the numerical results, presented in Tables 6 and 7, it is observed that the PREs of all families of estimators change with the choices of and . It is further noted that the proposed families of estimators are more precise than the existing distribution function estimators of Hansen and Hurwitz [1]; Cochran [17]; Murthy [18]; Rao [19]; and Grover and Kaur [20], in terms of PRE under both situations.

7. Concluding Remarks

In this paper, we have proposed two new families of estimators for estimating the finite population distribution function. The proposed estimators require supplementary information on the mean and ranks of the auxiliary variable. The biases and mean squared errors of the proposed families of estimators were derived to the first order of approximation. Based on theoretical as well as numerical comparisons, we conclude that the proposed families of estimators are more precise than their existing counterparts under situation-I and situation-II. We therefore recommend using the mean and ranks of the auxiliary variable with the proposed families of estimators for estimating the finite population distribution function.

Data Availability

The data used to support the numerical findings of this study are available from the corresponding author upon request. The data can also be obtained upon searching the given sources of data.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The first three authors acknowledge the support of the National Social Science Fund of China (17BTJ010) and the National Fund for Shanxi “1331 Project” Key Innovative Research Team.