Abstract

In survey sampling, information on auxiliary variables related to the study variable is often available in practice. Since the mid-twentieth century, researchers have taken a keen interest in auxiliary information because of its usefulness in estimation. In this article, our main objective is to address the problem of estimating the finite population distribution function using known auxiliary information, namely the sample distribution function and the rank of the auxiliary variable. A new family of estimators of the finite population distribution function is proposed under the stratified sampling scheme. Mathematical expressions for the bias and mean square error are obtained for each proposed estimator, along with the efficiency conditions. Besides the theoretical efficiency comparison, an empirical study and a simulation study are conducted to assess the performance of the estimators. The implementation of the proposed sampling scheme is illustrated by a practical example.

1. Introduction

In survey sampling, the use of an auxiliary variable can, in certain situations, increase the precision of estimators of the population parameters of interest. Researchers have already shown how to obtain estimators with desirable statistical properties for unknown population parameters such as the mean, median, variance, and standard deviation. For that purpose, a representative part of the population is needed. (i) When the population is homogeneous, simple random sampling (SRS) is preferable. (ii) When the population is heterogeneous, stratified random sampling is useful.

When the study and the auxiliary variables are correlated, the rank, distribution function, median, etc. of the auxiliary variable are also correlated with the study variable. When the correlation between the study and the auxiliary variables is positive, ratio estimators can improve the precision of estimates; when it is negative, product estimators can do so. For studies that make use of auxiliary information, see Ahmad and Shabbir [1]; Kadilar and Cingi [2]; Grover and Kaur [3]; Haq et al. [4]; and Al-Marzouki et al. [5].

The dual use of auxiliary information for estimating the finite population distribution function has rarely been considered. The problem of estimating the finite population distribution function arises when interest lies in the proportion of values of the study variable that are less than or equal to some threshold. In certain situations, the cumulative distribution function is the quantity of primary interest. Many authors have estimated the distribution function using information on one or more auxiliary variables. Chambers and Dunstan [6] first suggested a procedure for estimating the finite population distribution function. Kuk [7] presented a classical as well as a prediction approach to estimating the distribution function from survey data. For further work on the distribution function (DF), see Chambers et al. [8]; Dorfman [9]; Diana [10]; Rao [11]; Diana and Perri [12]; Ahmad and Abu-Dayyeh [13]; Rueda and Arcos [14]; Singh and Kumar [15]; Dorfman [16]; and Hussain et al. [17]. Hussain et al. [17] proposed two new estimators for estimating the finite population distribution function based on simple and stratified random sampling schemes using supplementary information.
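As a small illustration of the quantity being estimated, the sketch below computes the distribution function of a finite population at a threshold as the proportion of values not exceeding it. The function name `population_df` and the data are hypothetical and are not taken from the paper.

```python
def population_df(values, t):
    """Proportion of values that are less than or equal to the threshold t."""
    return sum(1 for v in values if v <= t) / len(values)

# Hypothetical study-variable values for a small population of 8 units.
y = [12, 7, 25, 18, 7, 30, 15, 9]

# Five of the eight values (12, 7, 7, 15, 9) are <= 15.
print(population_df(y, 15))  # 0.625
```

In a survey, only a sample of the values is observed, so this proportion has to be estimated; the estimators discussed in the following sections differ in how they use the auxiliary variable to do that.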

Taking motivation from , , and average of and , we propose a new family of estimators for estimating the finite population distribution function under the stratified sampling scheme.

The rest of the article is organized as follows. Section 2 presents the notation and symbols of stratified random sampling. Section 3 reviews the existing estimators. Section 4 proposes a new family of estimators for estimating the finite population distribution function under stratified random sampling. An empirical study is conducted in Section 5, and a simulation study supporting the proposed family of estimators under stratified random sampling is presented in Section 6. Finally, Section 7 concludes the paper.

2. Notations in Stratified Random Sampling

Let the finite population consist of $N$ units, divided into $L$ homogeneous strata, where the size of the $h$th stratum is $N_h$, for $h = 1, 2, \ldots, L$, such that $\sum_{h=1}^{L} N_h = N$. Let and be the characteristics of the study variable (Y) and the auxiliary variable (X), respectively, where and . A sample of size $n_h$ is drawn from the $h$th stratum such that $\sum_{h=1}^{L} n_h = n$, where $n$ is the total sample size.
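The sampling design described above can be sketched as follows. This is only an illustration under stated assumptions: units are drawn by simple random sampling without replacement within each stratum, and the total sample size is split by proportional allocation, which the text does not specify. The helper names `proportional_allocation` and `stratified_sample` and the generated data are hypothetical.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Split a total sample size n across strata in proportion to their sizes (assumed rule)."""
    N = sum(stratum_sizes)
    sizes = [max(1, round(n * N_h / N)) for N_h in stratum_sizes]
    # Rounding can miss n by a unit or two; absorb the difference in the largest stratum.
    sizes[sizes.index(max(sizes))] += n - sum(sizes)
    return sizes

def stratified_sample(strata, sample_sizes, rng):
    """Draw a simple random sample without replacement from each stratum."""
    return [rng.sample(units, n_h) for units, n_h in zip(strata, sample_sizes)]

rng = random.Random(1)
# Hypothetical population: three strata of (y, x) pairs with sizes 50, 30, and 20.
strata = [[(rng.gauss(10 * h, 2), rng.gauss(5 * h, 1)) for _ in range(N_h)]
          for h, N_h in enumerate([50, 30, 20], start=1)]
sample_sizes = proportional_allocation([len(s) for s in strata], n=20)
sample = stratified_sample(strata, sample_sizes, rng)
print(sample_sizes)  # [10, 6, 4]
```

Other allocation rules (equal or Neyman allocation, for example) would only change `proportional_allocation`; the per-stratum drawing step stays the same.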

Let and , and be the population and sample distribution function, respectively, of and under stratified random sampling, where , , , , and . Let and be the population and sample means of under stratified random sampling, respectively, where , .

Let , , , , , , , , , , , and . Similarly, let be the coefficient of multiple determination of on and .

To find the properties of the existing and proposed estimators of , we consider the following relative error terms under stratified random sampling.

Let , , and , such that for , where is the mathematical expectation of .

Let where Here,where .

3. Existing Estimators

In this section, some estimators of finite population mean are adapted for estimating the finite CDF under stratified random sampling.

(1) The traditional unbiased estimator of is
The variance of is

(2) Cochran [18] ratio estimator of is given by
The bias and MSE of are given by

(3) The usual product estimator of is
The bias and MSE of are given by

(4) Following Bahl and Tuteja [19], exponential estimators of , respectively, are
The bias and MSE of and are given by

(5) The regression type estimator of is given by
where is constant. Here, is an unbiased estimator of . The minimum variance of at the optimum value is:
We can also write (11) as

(6) The usual difference estimator of is
where and are unknown constants. The bias and MSE of are given as
The optimum values of and , determined by minimizing (14), are
The minimum MSE of at the optimum values of and is
Equation (16) may also be written as

(7) Singh et al. [20] generalized ratio type exponential estimator of as
where and are known constants. The properties of are given as
where .

(8) Grover and Kaur [21] generalized class of ratio type exponential estimator of as
where and are constants. The bias and MSE of , to the first order of approximation, are

The optimum values of and , determined by minimizing (21), are given as

The minimum MSE of at the optimum values of and is given as

Here, (23) may be written as

which shows that is more precise than .
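To make the first two estimators in this section concrete, the sketch below computes the usual stratified estimator of the distribution function as a weighted sum of per-stratum sample proportions, and a ratio-type version that rescales it by the known population distribution function of the auxiliary variable. The ratio form used here is the standard mean ratio estimator carried over to the distribution function, which is an assumption about the adapted form; the function names, thresholds, weights, and data are hypothetical.

```python
def sample_df(values, t):
    """Sample distribution function: share of values <= t."""
    return sum(1 for v in values if v <= t) / len(values)

def stratified_df(samples, weights, t):
    """Usual stratified estimator: sum over strata of W_h times the stratum sample DF."""
    return sum(W_h * sample_df(s, t) for s, W_h in zip(samples, weights))

def ratio_df(y_samples, x_samples, weights, t_y, t_x, F_x_pop):
    """Ratio-type estimator (assumed form): F_y_hat * F_x / F_x_hat.

    Undefined when no sampled x value is <= t_x, since F_x_hat is then zero.
    """
    F_y_hat = stratified_df(y_samples, weights, t_y)
    F_x_hat = stratified_df(x_samples, weights, t_x)
    return F_y_hat * F_x_pop / F_x_hat

# Hypothetical samples from two strata with weights W_1 = 0.4 and W_2 = 0.6.
y_samples = [[3, 8, 5, 10], [12, 20, 15, 9]]
x_samples = [[2, 6, 4, 7], [9, 14, 7, 8]]
weights = [0.4, 0.6]

F_y_hat = stratified_df(y_samples, weights, t=10)   # 0.4 * 1.0 + 0.6 * 0.25
F_r = ratio_df(y_samples, x_samples, weights, t_y=10, t_x=8, F_x_pop=0.5)
print(round(F_y_hat, 4), round(F_r, 4))
```

Here the sample overestimates the auxiliary distribution function (about 0.7 against a known value of 0.5), so the ratio-type estimator pulls the estimate for the study variable downward.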

4. Proposed Family of Estimators

The use of auxiliary variables may enhance the accuracy of an estimator at either the design stage or the estimation stage. When a correlation exists between the study variable and the auxiliary variable, the rank of the auxiliary variable is also correlated with the study variable. Thus, the rank of the auxiliary variable can be treated as a new auxiliary variable, which helps to increase the efficiency of an estimator. On the lines of , , and average of and , we propose a new family of estimators, say :

where , , and are unknown constants, and are either real numbers or functions of known population parameters of , which may be , (coefficient of kurtosis), , etc.
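The idea of treating the rank of the auxiliary variable as a second auxiliary variable can be illustrated with a short sketch. It assigns average ranks to tied values (an assumption, since the tie-handling rule is not stated) and checks that the ranks of X remain strongly correlated with Y. The helper names and data are hypothetical.

```python
def ranks(values):
    """1-based ranks; tied values share the average of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

# Hypothetical auxiliary (x) and study (y) values for five units.
x = [10, 40, 20, 40, 30]
y = [5, 19, 9, 22, 13]

r_x = ranks(x)
print(r_x)  # [1.0, 4.5, 2.0, 4.5, 3.0]
print(round(pearson(y, r_x), 3))
```

Because the ranks are a monotone transformation of X, they carry much of the same ordering information, which is why they can serve as an additional auxiliary variable.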

The estimator can also be written as

Expressing (26) up to first order of approximation,

The properties of are given as

The optimal values of , , and are given by

The minimum MSE of at the optimum values of , , and is

where .

Table 1 shows some members of the proposed family of estimators for different choices of and .

5. Empirical Study

To show the dominance of the proposed estimators over the existing estimators, we conduct a numerical study of the performance of the existing and proposed CDF estimators. For this purpose, three populations are considered. The datasets are given in Tables 2–4. We use the following expression to find the percentage relative efficiency (PRE):

where  =  , , , , , , .

Population I (source: Koyuncu and Kadilar [22]):
Y: number of teachers.
X: number of teachers in 2007 for 923 districts.

Population II (source: Koyuncu and Kadilar [22]):
Y: number of teachers.
X: number of classes in 2007 for 923 districts.

Population III (source: Kadilar and Cingi [23]):
Y: apple production in 1999.
X: number of apples in 1999.

The numerical results in Tables 5–8 show that the proposed family of estimators, for the different values of and considered, is more efficient than the existing estimators.

6. Simulation Study

We have generated three populations of size 1,000 from a multivariate normal distribution with different covariance matrices, so that the populations have different correlations. Population I is negatively correlated, population II is positively correlated, and population III has a strong positive correlation between the X and Y variables. The population means and covariance matrices are given below:

Population I: and .
, , .

Population II:
, , .

Population III:

and .
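A population of this kind can be generated with the standard library alone. The sketch below builds correlated (y, x) pairs from independent standard normals using the Cholesky factor of a 2x2 covariance matrix. It treats the populations as bivariate for simplicity, and the means, variances, and correlations (-0.5, 0.5, 0.9) are hypothetical placeholders rather than the values used in the study.

```python
import random

def bivariate_normal(n, mean_y, mean_x, var_y, var_x, cov_yx, rng):
    """Generate n (y, x) pairs with the given means, variances, and covariance."""
    a = var_y ** 0.5                   # Cholesky factor entries of the covariance matrix
    b = cov_yx / a
    c = (var_x - b * b) ** 0.5
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        pairs.append((mean_y + a * z1, mean_x + b * z1 + c * z2))
    return pairs

def correlation(pairs):
    """Pearson correlation between the two coordinates of the pairs."""
    n = len(pairs)
    my = sum(p[0] for p in pairs) / n
    mx = sum(p[1] for p in pairs) / n
    cov = sum((p[0] - my) * (p[1] - mx) for p in pairs)
    vy = sum((p[0] - my) ** 2 for p in pairs)
    vx = sum((p[1] - mx) ** 2 for p in pairs)
    return cov / (vy * vx) ** 0.5

rng = random.Random(7)
# Hypothetical target correlations for the three populations.
settings = {"I": -0.5, "II": 0.5, "III": 0.9}
populations = {
    name: bivariate_normal(1000, 5.0, 5.0, 4.0, 9.0, rho * 2.0 * 3.0, rng)
    for name, rho in settings.items()
}
for name, pairs in populations.items():
    print(name, round(correlation(pairs), 2))
```

The realized correlation of each generated population differs slightly from its target because the population itself is a finite random draw.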

The percentage relative efficiency (PRE) is calculated as

where  = stR; stP; stBT,R; stBT,P; stReg; stR,D; stGK.
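A simulation of this type can be sketched as a Monte Carlo loop that repeatedly draws stratified samples, evaluates each estimator, and averages the squared errors. The sketch below compares the usual stratified estimator with a ratio-type estimator only, and it assumes the common convention PRE = 100 × MSE(reference) / MSE(candidate) with the usual estimator as reference. The strata, thresholds (near the population medians), sample sizes, and number of replications are all hypothetical.

```python
import random

def sample_df(values, t):
    """Share of values that are <= t."""
    return sum(1 for v in values if v <= t) / len(values)

def make_stratum(size, mean, sd, rho, rng):
    """Hypothetical stratum of (y, x) pairs from a bivariate normal with correlation rho."""
    units = []
    for _ in range(size):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        y = mean + sd * z1
        x = mean + sd * (rho * z1 + (1 - rho ** 2) ** 0.5 * z2)
        units.append((y, x))
    return units

def simulate_pre(strata, sample_sizes, reps, rng):
    """Monte Carlo MSEs of the usual and ratio-type DF estimators, plus the PRE."""
    population = [u for s in strata for u in s]
    N = len(population)
    weights = [len(s) / N for s in strata]
    ys = sorted(u[0] for u in population)
    xs = sorted(u[1] for u in population)
    t_y, t_x = ys[N // 2], xs[N // 2]      # thresholds near the medians (assumed)
    F_y, F_x = sample_df(ys, t_y), sample_df(xs, t_x)
    se_usual = se_ratio = 0.0
    for _ in range(reps):
        Fy_hat = Fx_hat = 0.0
        for units, n_h, W_h in zip(strata, sample_sizes, weights):
            s = rng.sample(units, n_h)     # SRS without replacement within the stratum
            Fy_hat += W_h * sample_df([u[0] for u in s], t_y)
            Fx_hat += W_h * sample_df([u[1] for u in s], t_x)
        ratio = Fy_hat * F_x / Fx_hat      # ratio-type estimator (assumed form)
        se_usual += (Fy_hat - F_y) ** 2
        se_ratio += (ratio - F_y) ** 2
    mse_usual, mse_ratio = se_usual / reps, se_ratio / reps
    return mse_usual, mse_ratio, 100 * mse_usual / mse_ratio

rng = random.Random(2024)
strata = [make_stratum(500, 5.0, 2.0, 0.9, rng),
          make_stratum(500, 8.0, 2.0, 0.9, rng)]
mse_usual, mse_ratio, pre = simulate_pre(strata, [50, 50], 2000, rng)
print(f"MSE usual = {mse_usual:.5f}, MSE ratio = {mse_ratio:.5f}, PRE = {pre:.1f}")
```

A PRE above 100 indicates that the candidate estimator has a smaller simulated MSE than the reference. The same loop extends to further estimators by accumulating one squared-error total per estimator.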

The MSE and PRE results are given in Tables 9 and 10. Here we can only point out the best results of MSEs and PREs in these tables when if .

7. Conclusion

In this paper, we have proposed a new family of estimators for the finite population distribution function (DF). Based on the real datasets and the simulation study, the proposed estimators perform better than all the existing estimators considered. Therefore, we recommend the use of the proposed estimators in future studies. The percentage relative efficiency shows that the proposed family of estimators in stratified random sampling gives the best results when the X and Y variables have a strong positive correlation [24–29].

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The first author wishes to thank Mr. Sardar Hussain for his contributions.