Abstract

In this article, we propose an improved estimator of the finite population variance under stratified random sampling that uses the auxiliary variable as well as the rank of the auxiliary variable. Expressions for the bias and mean square error of the estimators are derived up to the first order of approximation. Four real data sets are used to assess the performance of the estimators, and a simulation study is conducted to examine the efficiency of the proposed variance estimator. The theoretical and numerical results show that the proposed estimator is more efficient than the existing estimators under stratified random sampling.

1. Introduction

The value of supplementary information in survey sampling has long been acknowledged: it yields efficient estimators of population parameters such as the mean, median, mode, quartiles, interquartile range, percentiles, coefficient of variation, and proportion. A wide variety of methods for incorporating supplementary information is described in the survey sampling literature. The ratio, product, and regression-type estimators take advantage of the correlation between the study variable and the auxiliary variable. These estimators perform well when such a correlation exists and are often used to increase precision. When the study variable is correlated with the supplementary information, the rank of the supplementary information is also correlated with the study variable, and this rank can therefore be used as an additional factor for increasing the accuracy of an estimator. When the population variance of the supplementary information is known in advance, researchers have developed ratio and product-type estimators for variance estimation. Estimating the finite population variance is of particular interest in fields such as agriculture, medicine, biology, and industry, where populations are likely to be skewed. Variation arises in daily life in many areas, including environmental, genetic, and economic studies. For example, an agriculturist needs an adequate understanding of the variation in weather factors, from time to time and from place to place, to plan where, how, and when to plant crops. A physician needs a full understanding of the variation in human blood pressure, body temperature, and pulse rate to prescribe adequately. An industrialist needs continual knowledge of the variation in people’s response to a product to decide whether to reduce or increase its price or improve its quality.

The problem of estimating the population variance has been widely discussed by various authors. Das and Tripathi [1] considered the use of the auxiliary variable in estimating the finite population variance. Isaki [2] discussed variance estimation using ratio and regression-type estimators. Other important references on variance estimation include Garcia and Cebrian [3], Arcos et al. [4], Shabbir and Gupta [5], Singh and Solanki [6], Adichwal et al. [7, 8], Ahmad et al. [9], Shabbir and Gupta [10], Singh and Khalid [11, 12], and Zaman and Bulut [13].

In stratified random sampling, a population is divided into a number of nonoverlapping groups called strata. The units within each stratum are as homogeneous as possible, and a sample is drawn independently from each stratum. Stratification increases precision when the variance among the strata is much larger than the variances within the strata. Some important references on stratified random sampling are Rao and Shao [14], Kadilar and Cingi [15, 16], Singh and Vishwakarma [17], Koyuncu and Kadilar [18], Shabbir and Gupta [19], Ozel et al. [20], Sidelel et al. [21], Ahmad and Shabbir [22], Hussain et al. [23], Shehzad et al. [24], Singh et al. [25], Zaman [26], Zaman and Bulut [27], and Ahmad et al. [28].
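The precision gain from stratification described above can be illustrated with a minimal sketch. It assumes a toy population of three strata with well-separated means and proportional allocation, and it ignores finite population corrections. The data and the names `strata`, `var_srs_mean`, and `var_stratified_mean` are illustrative and do not come from the article.

```python
from statistics import pvariance

# Hypothetical population: three strata with well-separated means.
strata = {
    "A": [10, 11, 12, 13, 14],
    "B": [50, 51, 52, 53, 54],
    "C": [90, 91, 92, 93, 94],
}

def var_srs_mean(strata, n):
    """Variance of the sample mean under simple random sampling: S^2 / n.

    Uses the divisor-N variance and no finite population correction.
    """
    pooled = [v for units in strata.values() for v in units]
    return pvariance(pooled) / n

def var_stratified_mean(strata, n):
    """Variance of the stratified mean under proportional allocation:
    sum over strata of W_h^2 * S_h^2 / n_h, with n_h = n * W_h (no fpc).
    """
    N = sum(len(units) for units in strata.values())
    total = 0.0
    for units in strata.values():
        w = len(units) / N          # stratum weight W_h
        n_h = n * w                 # proportional allocation
        total += w ** 2 * pvariance(units) / n_h
    return total

n = 6
v_srs = var_srs_mean(strata, n)
v_str = var_stratified_mean(strata, n)
print(f"SRS: {v_srs:.3f}  stratified: {v_str:.3f}")
```

Because almost all of the variability in this toy population lies between the strata, the stratified variance is a small fraction of the simple random sampling variance.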

In Section 2, we discuss some notations and symbols of population variance under stratified random sampling. In Section 3, we review some adopted existing estimators. A proposed estimator is given in Section 4. The efficiency comparisons are given in Section 5. In Section 6, we discuss the numerical investigation. The simulation study is given in Section 7. The discussion of the article is discussed in Section 8. The conclusion of the paper is given in Section 9.

2. Notations and Symbols

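Consider a finite population of N units divided into L strata, where stratum h contains $N_h$ units. Let $Y_h$, $X_h$, and $R_h$ be the characteristics of the study variable (y), the auxiliary variable (x), and the rank of the auxiliary variable (r), respectively, in stratum h, such that $\sum_{h=1}^{L} N_h = N$.

We draw a random sample of size $n_h$ from the $h$th stratum such that $\sum_{h=1}^{L} n_h = n$. Let $\bar{Y}_h$, $\bar{X}_h$, and $\bar{R}_h$ be the population means corresponding to the sample means $\bar{y}_h$, $\bar{x}_h$, and $\bar{r}_h$, respectively, in each stratum.

For y, x, and the rank of x, let $y_{hi}$, $x_{hi}$, and $r_{hi}$ denote the values for the $i$th unit of the $h$th stratum.

Let $s_{yh}^2$, $s_{xh}^2$, and $s_{rh}^2$ be the sample variances corresponding to the population variances $S_{yh}^2$, $S_{xh}^2$, and $S_{rh}^2$.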
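The stratum-level quantities defined above (means and variances of y, x, and the rank of x) can be computed as in the following minimal sketch. It assumes that ranks are assigned within a stratum in ascending order and that ties receive their average rank; the article does not state its tie rule, so that choice, the function names `ranks` and `stratum_summary`, and the data are all illustrative.

```python
from statistics import mean, variance

def ranks(values):
    """Ranks 1..n in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1       # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def stratum_summary(y, x):
    """Means and variances (divisor n - 1) of y, x, and rank(x) for one stratum."""
    r = ranks(x)
    return {
        "mean_y": mean(y), "mean_x": mean(x), "mean_r": mean(r),
        "var_y": variance(y), "var_x": variance(x), "var_r": variance(r),
    }

# Hypothetical stratum data (illustrative only).
y_h = [12.0, 15.0, 11.0, 20.0]
x_h = [30.0, 42.0, 30.0, 55.0]
summary = stratum_summary(y_h, x_h)
print(ranks(x_h))   # the two tied values of 30.0 share rank 1.5
```

The same function applies to a full stratum or to a sample drawn from it, which gives the population and sample versions of each quantity.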

To derive the bias and mean square error, we define the following error terms:where .where
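As a point of reference, the relative-error terms commonly used in first-order derivations for variance estimators can be sketched as follows. The symbols $e_0$, $e_1$, $e_2$, $\gamma_h$, $\lambda_{rst(h)}$, and $\mu_{rst(h)}$ are assumptions taken from the standard literature on variance estimation and may differ from the notation of this article. Here $s^2$ and $S^2$ denote the sample and population variances of y, x, and the rank of x in stratum $h$.

```latex
\begin{align*}
e_0 &= \frac{s_{yh}^2 - S_{yh}^2}{S_{yh}^2}, \qquad
e_1 = \frac{s_{xh}^2 - S_{xh}^2}{S_{xh}^2}, \qquad
e_2 = \frac{s_{rh}^2 - S_{rh}^2}{S_{rh}^2}, \\
E(e_0) &= E(e_1) = E(e_2) = 0, \\
E(e_0^2) &\approx \gamma_h\,(\lambda_{400(h)} - 1), \quad
E(e_1^2) \approx \gamma_h\,(\lambda_{040(h)} - 1), \quad
E(e_2^2) \approx \gamma_h\,(\lambda_{004(h)} - 1), \\
E(e_0 e_1) &\approx \gamma_h\,(\lambda_{220(h)} - 1), \quad
E(e_0 e_2) \approx \gamma_h\,(\lambda_{202(h)} - 1), \quad
E(e_1 e_2) \approx \gamma_h\,(\lambda_{022(h)} - 1), \\
\gamma_h &= \frac{1}{n_h} - \frac{1}{N_h}, \qquad
\lambda_{rst(h)} = \frac{\mu_{rst(h)}}{\mu_{200(h)}^{r/2}\,\mu_{020(h)}^{s/2}\,\mu_{002(h)}^{t/2}}, \\
\mu_{rst(h)} &= \frac{1}{N_h - 1}\sum_{i=1}^{N_h}
(y_{hi} - \bar{Y}_h)^r (x_{hi} - \bar{X}_h)^s (r_{hi} - \bar{R}_h)^t .
\end{align*}
```

The second-order moments hold to the first order of approximation, which is the order at which the bias and MSE expressions in this article are derived.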

3. Existing Estimators

In this section, we adopt some variance estimators under stratified random sampling, which are available in the literature.

(i) The traditional unbiased estimator is given by

The variance of is given by

(ii) Isaki [2] suggested a ratio estimator which is given by

The bias and MSE of are given by

(iii) The usual difference-type estimator is given by

where is the unknown constant. The optimum value of is given by

The minimum variance at the optimal value of is given by

where

(iv) Rao [29] proposed a difference-type estimator which is given by

where and are the unknown constants and are given by

The minimum bias and minimum MSE at the optimal values of and are given by

(v) Singh et al. [30] proposed exponential ratio and product-type estimators which are given by

The bias and MSE of and are given by

(vi) Shabbir and Gupta [5] proposed an estimator in stratified random sampling which is given by

where and are suitably chosen constants having optimum values

The minimum MSE of at the optimum values of and is given by
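Several of the estimator families listed above have well-known textbook forms for a single stratum, which the following minimal sketch implements. It is an illustration under assumptions and not the article's stratified formulas: how the strata are combined, and the Rao and Shabbir and Gupta estimators with their optimal constants, are not reproduced. The function names and the constant `d` are hypothetical. Here `s2_y` and `s2_x` are sample variances and `S2_x` is the known population variance of the auxiliary variable.

```python
from math import exp

def unbiased(s2_y):
    """Usual unbiased estimator: the sample variance of y itself."""
    return s2_y

def ratio(s2_y, s2_x, S2_x):
    """Ratio-type estimator (Isaki-style): s_y^2 * S_x^2 / s_x^2."""
    return s2_y * S2_x / s2_x

def difference(s2_y, s2_x, S2_x, d):
    """Difference-type estimator: s_y^2 + d * (S_x^2 - s_x^2).

    d is a constant chosen by the analyst (its optimal value is not derived here).
    """
    return s2_y + d * (S2_x - s2_x)

def exp_ratio(s2_y, s2_x, S2_x):
    """Exponential ratio-type estimator: s_y^2 * exp((S_x^2 - s_x^2) / (S_x^2 + s_x^2))."""
    return s2_y * exp((S2_x - s2_x) / (S2_x + s2_x))

def exp_product(s2_y, s2_x, S2_x):
    """Exponential product-type estimator: s_y^2 * exp((s_x^2 - S_x^2) / (s_x^2 + S_x^2))."""
    return s2_y * exp((s2_x - S2_x) / (s2_x + S2_x))

# Hypothetical values: the sample understates the known variance of x,
# so the ratio-type estimators adjust the estimate of S_y^2 upward.
print(ratio(4.0, 2.0, 3.0), difference(4.0, 2.0, 3.0, 0.5), exp_ratio(4.0, 2.0, 3.0))
```

When the sample variance of x equals its known population value, every adjustment vanishes and all five functions return the plain sample variance of y.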

4. Proposed Estimator in Stratified Random Sampling

In survey sampling, the estimation of the finite population variance under stratified random sampling has received very little attention. The use of supplementary information can improve the performance of an estimator. When the study variable and the supplementary information are correlated, the rank of the supplementary information is also correlated with the study variable. As a result, the rank of the supplementary information can be treated as additional supplementary information that may help improve the efficiency of estimators. The main advantage of the proposed variance estimator under stratified random sampling is that it is more flexible and efficient than the existing estimators. Taking motivation from Shabbir and Gupta [10], we propose an improved variance estimator under stratified random sampling given by

After expanding (23), we have

The bias of is given by

where , , and are the unknown constants. The optimum values are

The minimum mean square error at the optimum values of , , and is given by

where

5. Efficiency Comparison

We compare the proposed estimator with its existing counterparts.

(1) By taking equations (5) and (27), we get

(2) By taking (7) and (27), we get

(3) By taking (10) and (27), we obtain

(4) By taking (14) and (27), we obtain

(5) By taking (17) and (27), we obtain

(6) By taking (18) and (27), we obtain

(7) By taking (21) and (27), we obtain

6. Numerical Study

To show the performance of our proposed estimator, we conduct a numerical study using four real data sets. We compare the performance of the proposed variance estimator with that of its existing counterparts in terms of percentage relative efficiency. The summary statistics are given in Tables 1-8. The conditional values are given in Table 9. The biases are given in Table 10. The MSEs and efficiency values are given in Tables 11 and 12. To obtain the percentage relative efficiency (PRE), we used the following expression:

where the index runs over the estimators being compared.

Population 1 (Source: Murthy [31]): Y = Production of a factory, X = Number of employees, together with the rank of the X variable.

Population 2 (Source: Singh and Chaudhary [32]): Y = Area under wheat in the region in 1974, X = Area under wheat in the region in 1973, together with the rank of the X variable.

Population 3 (Source: Turkey [33]): Y = Apple production in 1999, X = Number of apple trees in 1999, together with the rank of the X variable.

Population 4 (Source: Koyuncu and Kadilar [18]): Y = The number of teachers, X = The number of classes in both primary and secondary schools in Turkey in 2007 for 923 districts in six regions, together with the rank of the X variable.
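The percentage relative efficiency used in this comparison can be sketched as below. The sketch assumes the common convention that PRE is 100 times the variance (or MSE) of a baseline estimator divided by the MSE of the estimator under study, with the usual unbiased estimator as the baseline. The function name `pre` and the MSE values are hypothetical and are not taken from the article's tables.

```python
def pre(mse_baseline, mse_estimator):
    """Percentage relative efficiency of an estimator against a baseline.

    Values above 100 mean the estimator has a smaller MSE than the baseline.
    """
    if mse_estimator <= 0:
        raise ValueError("MSE must be positive")
    return 100.0 * mse_baseline / mse_estimator

# Hypothetical MSE values (illustrative only, not from Tables 11 and 12).
mse = {"unbiased": 8.0, "ratio": 5.0, "proposed": 2.0}
table = {name: pre(mse["unbiased"], value) for name, value in mse.items()}
for name, value in table.items():
    print(f"{name:10s} PRE = {value:.1f}")
```

Under this convention the baseline always has a PRE of 100, so the other entries read directly as percentage gains or losses in efficiency.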

7. Simulation Study

A simulation study is performed to assess the efficiency of the variance estimators under stratified random sampling. Three populations are generated from the normal distribution using R. The first population is generated with equal strata, the second with unequal strata, and the third with equal strata and a small sample size. The population details are given as follows.

7.1. Population I
7.2. Population II
7.3. Population III

The percentage relative efficiency (PRE) is calculated as follows:

where the index runs over the existing estimators and the proposed estimator (prop).

The mean square error and percentage relative efficiency values are given in Tables 13 and 14.
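The general shape of such a simulation can be sketched as follows. The article reports using R, and Python is used here only for illustration. Everything specific is an assumption: three equal-sized normal strata, a correlation of 0.9 between y and x, the target parameter taken as the weighted within-stratum variance $\sum_h W_h S_{yh}^2$, and only the unbiased and ratio-type estimators compared. The proposed estimator is not reproduced.

```python
import random
from statistics import variance

random.seed(1)  # fixed seed so the run is repeatable

def make_stratum(size, mu, sigma, rho):
    """Generate (y, x) pairs from a bivariate normal with correlation rho."""
    pairs = []
    for _ in range(size):
        zx = random.gauss(0, 1)
        zy = rho * zx + (1 - rho ** 2) ** 0.5 * random.gauss(0, 1)
        pairs.append((mu + sigma * zy, mu + sigma * zx))
    return pairs

# Hypothetical population: three equal-sized strata (sizes and parameters assumed).
population = [make_stratum(200, mu, sd, 0.9) for mu, sd in [(10, 1), (20, 2), (30, 3)]]
N = sum(len(s) for s in population)
weights = [len(s) / N for s in population]

# Assumed target: weighted within-stratum variance of y (divisor N_h - 1).
target = sum(w * variance([y for y, _ in s]) for w, s in zip(weights, population))
S2_x = [variance([x for _, x in s]) for s in population]  # known variances of x

def one_replicate(n_h):
    """Draw one stratified sample and return (unbiased, ratio) estimates."""
    est_unbiased = est_ratio = 0.0
    for w, stratum, S2x in zip(weights, population, S2_x):
        sample = random.sample(stratum, n_h)  # SRS without replacement
        s2_y = variance([y for y, _ in sample])
        s2_x = variance([x for _, x in sample])
        est_unbiased += w * s2_y
        est_ratio += w * s2_y * S2x / s2_x
    return est_unbiased, est_ratio

reps, n_h = 2000, 20
sq_err = [0.0, 0.0]
for _ in range(reps):
    for k, est in enumerate(one_replicate(n_h)):
        sq_err[k] += (est - target) ** 2
mse_unbiased, mse_ratio = (s / reps for s in sq_err)
pre_ratio = 100.0 * mse_unbiased / mse_ratio
print(f"MSE unbiased={mse_unbiased:.4f}  MSE ratio={mse_ratio:.4f}  PRE ratio={pre_ratio:.1f}")
```

The empirical MSE is the average squared deviation from the target over replicates, and the PRE of each estimator is then formed against the unbiased estimator in the same way as for the real data sets.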

8. Discussion

We used four real data sets to obtain the percentage relative efficiency of all variance estimators under stratified random sampling. The data descriptions of these populations are presented in Tables 1-8. The conditional values of the different estimators for the four real data sets are given in Table 9. Bias values of the proposed and existing estimators for the real data sets are presented in Table 10. The mean square errors and percentage relative efficiencies of the estimators based on the real data sets are given in Tables 11 and 12. Similarly, the MSE and PRE values from the simulation study are given in Tables 13 and 14. The numerical results show that the proposed estimator has a smaller mean square error and a higher percentage relative efficiency (PRE) than the existing estimators. A large gain in efficiency is observed for the proposed estimator over some of the existing estimators under stratified random sampling. These findings hold for both the real data sets and the simulated populations. Thus, the proposed estimator is recommended for use in practice.

9. Conclusion

Several estimators of the finite population variance have been constructed on the basis of an auxiliary variable. We proposed an improved variance estimator in stratified sampling that uses dual supplementary information, namely the auxiliary variable and its rank. The proposed estimator is a particular member of the class of estimators given by Shabbir and Gupta [10]. The bias and MSE of the existing and proposed estimators are derived up to the first order of approximation. In terms of efficiency, the proposed estimator performs best among the estimators compared. Four real data sets are used to assess the efficiency of the proposed estimator over its existing counterparts. Moreover, a simulation study is carried out to check the robustness and generalizability of the proposed variance estimator. Based on the numerical findings, the proposed estimator may be preferable for use in practical situations. A possible extension of this work is to develop an improved class of estimators of the finite population variance under stratified random sampling in the presence of nonresponse and measurement error, and for the distribution function.

Data Availability

All the data used for this study can be found inside the manuscript.

Conflicts of Interest

The authors declare that they have no conflicts of interest.