Abstract

Auxiliary variables are commonly used in survey sampling to improve the precision of estimates. Whenever auxiliary information is available, we want to utilize it in the method of estimation to obtain the most efficient estimator. In this paper, using multiauxiliary information, we propose estimators based on the geometric and harmonic means. We show that the estimators based on the harmonic mean and the geometric mean are less biased than the Olkin (1958) and Singh (1967) estimators under certain conditions. However, the MSEs of the Olkin (1958) estimator and of the geometric and harmonic estimators are the same up to the first order of approximation.

1. Introduction

The problem of estimating the population mean in the presence of an auxiliary variable has been widely discussed in finite population sampling literature. Kadilar and Cingi [1], Singh et al. [2], Singh and Vishwakarma [3], Singh et al. [4], and Koyuncu and Kadilar [5] proposed estimators in stratified random sampling. Ghosh [6] and Rao [7] have suggested estimators in stratified random sampling with multiple characteristics.

Olkin [8] considered the use of multiauxiliary variables positively correlated with the variable under study to build a multivariate ratio estimator of the population mean $\bar{Y}$.

In this paper, we consider multiauxiliary variables. Olkin's [8] estimator is based on the weighted arithmetic mean of the $r_i\bar{X}_i$'s and is given as
$$\bar{y}_{ap} = \sum_{i=1}^{k} \alpha_i r_i \bar{X}_i, \tag{1.1}$$
where (i) the $\alpha_i$'s are weights such that $\sum_{i=1}^{k}\alpha_i = 1$, (ii) the $\bar{X}_i$'s are the population means of the auxiliary variables and are assumed to be known, and (iii) $r_i = \bar{y}_{st}/\bar{x}_{ist}$, where $\bar{y}_{st}$ is the sample mean of the study variable $Y$ and $\bar{x}_{ist}$ are the sample means of the auxiliary variables $X_i$, based on a stratified random sample of size $n$ drawn from a population of size $N$. Let the population of size $N$ be stratified into $L$ strata, with the $h$th stratum containing $N_h$ units, where $h = 1, 2, 3, \ldots, L$, such that $\sum_{h=1}^{L} N_h = N$.
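As a small illustration of the quantities entering (1.1), the sketch below computes a stratified sample mean as a weighted sum of stratum sample means, with stratum weights $N_h/N$. The function name and the numbers are hypothetical and serve only to show the arithmetic.

```python
def stratified_mean(samples, N_h):
    """Stratified sample mean: sum over strata of (N_h / N) * (stratum sample mean).

    samples[h] holds the sampled values from stratum h;
    N_h[h] is the population size of stratum h.
    """
    N = sum(N_h)
    return sum((Nh / N) * (sum(s) / len(s)) for s, Nh in zip(samples, N_h))


# Hypothetical data: two strata with population sizes 40 and 60
y_st = stratified_mean([[2.0, 4.0], [10.0, 20.0, 30.0]], [40, 60])
```

Applying the same function to the sampled values of each auxiliary variable gives $\bar{x}_{ist}$, and the ratio $r_i$ then follows by division.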

Following Olkin's [8] estimator, several other estimators using multiauxiliary variables have been proposed in recent years. Singh [9] extended Olkin's [8] estimator to the case where the auxiliary variables are negatively correlated with the variable under study. Srivastava [10] and Rao and Mudholkar [11] gave estimators in which some of the characters are positively and others negatively correlated with the character under study. The main objective of presenting these estimators was to reduce the bias and mean square error.

Motivated by Singh [9, 12] and Singh et al. [13], we propose an estimator in stratified sampling as
$$\bar{y}_{s} = \prod_{i=1}^{k} r_i \bar{X}_i. \tag{1.2}$$
We also propose two alternative estimators, based on the geometric mean and the harmonic mean, as
$$\bar{y}_{gp} = \prod_{i=1}^{k}\left(r_i\bar{X}_i\right)^{\alpha_i}, \tag{1.3}$$
$$\bar{y}_{hp} = \left(\sum_{i=1}^{k}\frac{\alpha_i}{r_i\bar{X}_i}\right)^{-1}, \tag{1.4}$$
such that $\sum_{i=1}^{k}\alpha_i = 1$.
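The weighted arithmetic, geometric, and harmonic combinations in (1.1), (1.3), and (1.4) can be sketched in a few lines. The function name and the input values below are hypothetical, and the sketch assumes that every $r_i\bar{X}_i$ is positive, so that the geometric and harmonic means are well defined.

```python
import math


def combined_ratio_estimators(y_st, x_st, X_bar, alpha):
    """Return the (arithmetic, geometric, harmonic) multivariate ratio estimates.

    y_st:  stratified sample mean of the study variable
    x_st:  stratified sample means of the k auxiliary variables
    X_bar: known population means of the k auxiliary variables
    alpha: weights, assumed to sum to 1
    """
    if abs(sum(alpha) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # t_i = r_i * X_bar_i: the ratio estimate built from the i-th auxiliary variable
    t = [y_st / x * X for x, X in zip(x_st, X_bar)]
    arithmetic = sum(a * ti for a, ti in zip(alpha, t))
    geometric = math.prod(ti ** a for a, ti in zip(alpha, t))
    harmonic = 1.0 / sum(a / ti for a, ti in zip(alpha, t))
    return arithmetic, geometric, harmonic


# Hypothetical numbers, for illustration only
ap, gp, hp = combined_ratio_estimators(
    y_st=50.0, x_st=[100.0, 20.0], X_bar=[110.0, 19.0], alpha=[0.6, 0.4]
)
```

For positive inputs the three values always satisfy arithmetic ≥ geometric ≥ harmonic, which is the ordinary inequality between weighted means and not a statement about bias.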

These estimators are based on the assumption that the auxiliary characters are positively correlated with $Y$. Let $\rho_{ij}$ $(i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, k)$ be the correlation coefficient between $X_i$ and $X_j$, and $\rho_{0i}$ the correlation coefficient between $Y$ and $X_i$.

2. Bias and MSE of the Estimators

To obtain the biases and MSEs of the estimators up to the first order of approximation, we write
$$\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h = \bar{Y}(1 + e_0), \qquad \bar{x}_{ist} = \sum_{h=1}^{L} W_h \bar{x}_{ih} = \bar{X}_i(1 + e_i), \tag{2.1}$$
such that $E(e_i) = 0$ for $i = 0, 1, \ldots, k$, where
$$\bar{y}_h = \frac{1}{n_h}\sum_{j=1}^{n_h} y_{hj}, \qquad \bar{Y}_h = \frac{1}{N_h}\sum_{j=1}^{N_h} Y_{hj}, \qquad \bar{Y} = \bar{Y}_{st} = \sum_{h=1}^{L} W_h \bar{Y}_h, \qquad W_h = \frac{N_h}{N},$$
$$E(e_0^2) = \frac{\sum_{h=1}^{L} W_h^2 \gamma_h S_{yh}^2}{\bar{Y}^2}, \qquad E(e_i^2) = \frac{\sum_{h=1}^{L} W_h^2 \gamma_h S_{ixh}^2}{\bar{X}_i^2}, \qquad E(e_0 e_i) = \frac{\sum_{h=1}^{L} W_h^2 \gamma_h S_{ixyh}}{\bar{Y}\bar{X}_i}. \tag{2.2}$$
Also,
$$V(\bar{y}_{st}) = \bar{Y}^2 E(e_0^2), \qquad S_{yh}^2 = \frac{\sum_{j=1}^{N_h}\left(y_{hj} - \bar{Y}_h\right)^2}{N_h - 1}, \qquad S_{ixh}^2 = \frac{\sum_{j=1}^{N_h}\left(x_{ihj} - \bar{X}_{ih}\right)^2}{N_h - 1}, \qquad S_{ixyh} = \frac{\sum_{j=1}^{N_h}\left(y_{hj} - \bar{Y}_h\right)\left(x_{ihj} - \bar{X}_{ih}\right)}{N_h - 1}. \tag{2.3}$$
In the same way, $C_{0i}$ and $C_{ij}$ are defined. In the expressions that follow, $C_0^2$, $C_i^2$, $C_{0i}$, and $C_{ij}$ stand for $E(e_0^2)$, $E(e_i^2)$, $E(e_0 e_i)$, and $E(e_i e_j)$, respectively.
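The relative moments in (2.2) can be computed directly from stratum-level population data. The following is a minimal sketch for a single auxiliary variable. It assumes $\gamma_h = 1/n_h - 1/N_h$, a common choice that is not stated in the text, and it divides the cross moment by $\bar{Y}\bar{X}_i$, which is what the definitions of $e_0$ and $e_i$ in (2.1) imply. All names and data are hypothetical.

```python
from statistics import mean


def relative_moments(strata_y, strata_x, n_h):
    """Return (E(e0^2), E(ei^2), E(e0*ei)) for one auxiliary variable.

    strata_y[h], strata_x[h]: population values of y and x in stratum h
    n_h[h]: sample size drawn from stratum h
    """
    N = sum(len(y) for y in strata_y)
    Y_bar = sum(sum(y) for y in strata_y) / N  # equals sum_h W_h * Ybar_h
    X_bar = sum(sum(x) for x in strata_x) / N
    v_y = v_x = c_xy = 0.0
    for y, x, n in zip(strata_y, strata_x, n_h):
        N_h = len(y)
        W = N_h / N
        gamma = 1.0 / n - 1.0 / N_h  # ASSUMPTION: gamma_h is not defined in the text
        my, mx = mean(y), mean(x)
        # Stratum variances and covariance with divisor N_h - 1, as in (2.3)
        s_yy = sum((a - my) ** 2 for a in y) / (N_h - 1)
        s_xx = sum((b - mx) ** 2 for b in x) / (N_h - 1)
        s_xy = sum((a - my) * (b - mx) for a, b in zip(y, x)) / (N_h - 1)
        v_y += W ** 2 * gamma * s_yy
        v_x += W ** 2 * gamma * s_xx
        c_xy += W ** 2 * gamma * s_xy
    return v_y / Y_bar ** 2, v_x / X_bar ** 2, c_xy / (Y_bar * X_bar)


# Hypothetical population in which x is exactly 2 * y in every stratum
e0_sq, ei_sq, e0ei = relative_moments(
    [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0, 10.0]],
    [[2.0, 4.0, 6.0], [8.0, 12.0, 16.0, 20.0]],
    [2, 2],
)
```

Because $x$ is an exact multiple of $y$ in this toy population, the relative errors coincide and the three returned moments are equal, which gives a quick sanity check on the scaling.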

Further, let $\underset{\sim}{\alpha}' = (\alpha_1, \alpha_2, \ldots, \alpha_k)$ and $C = [C_{ij}]_{k \times k}$ $(i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, k)$.

Using Taylor's series expansion, under the usual assumptions, we obtain
$$\begin{aligned}
\bar{y}_{ap} &= \sum_{i=1}^{k} \alpha_i \bar{Y}(1 + e_0)(1 + e_i)^{-1} \\
&= \bar{Y}\sum_{i=1}^{k} \alpha_i \left[(1 + e_0)\left(1 - e_i + e_i^2 - e_i^3 + e_i^4 - \cdots\right)\right] \\
&= \bar{Y}\sum_{i=1}^{k} \alpha_i \left[1 + e_0 - e_i + e_i^2 - e_0 e_i + e_0 e_i^2 - e_i^3 + e_i^4 - e_0 e_i^3 + \cdots\right].
\end{aligned} \tag{2.4}$$
Subtracting $\bar{Y}$ from both sides of (2.4) and then taking the expectation of both sides, we get the bias of the estimator $\bar{y}_{ap}$ up to the first order of approximation as
$$B(\bar{y}_{ap}) = \bar{Y}\left[\sum_{i=1}^{k} \alpha_i C_i^2 - \sum_{i=1}^{k} \alpha_i C_{0i}\right]. \tag{2.5}$$
Subtracting $\bar{Y}$ from both sides of (2.4), squaring, and then taking the expectation of both sides, we get the MSE of the estimator $\bar{y}_{ap}$ up to the first order of approximation as
$$\operatorname{MSE}(\bar{y}_{ap}) = \bar{Y}^2\left[C_0^2 + \sum_{i=1}^{k} \alpha_i^2 C_i^2 - 2\sum_{i=1}^{k} \alpha_i C_{0i} + 2\sum\sum_{i<j} \alpha_i \alpha_j C_{ij}\right]. \tag{2.6}$$
In the same way, using the Taylor series expansion under the usual assumptions and the condition $\sum_{i=1}^{k}\alpha_i = 1$, we obtain
$$\begin{aligned}
\bar{y}_{gp} &= \bar{Y}(1 + e_0)\prod_{i=1}^{k}\left[1 - \alpha_i e_i + \frac{\alpha_i(1 + \alpha_i)}{2} e_i^2 - \frac{\alpha_i(1 + \alpha_i)(2 + \alpha_i)}{6} e_i^3 + \cdots\right], \\
\bar{y}_{hp} &= \bar{Y}\left[1 + e_0 - \sum_{i=1}^{k}\alpha_i e_i - \sum_{i=1}^{k}\alpha_i e_0 e_i + \left(\sum_{i=1}^{k}\alpha_i e_i\right)^2 + \left(\sum_{i=1}^{k}\alpha_i e_i\right)^2 e_0 - \left(\sum_{i=1}^{k}\alpha_i e_i\right)^3 - \left(\sum_{i=1}^{k}\alpha_i e_i\right)^3 e_0 + \cdots\right].
\end{aligned} \tag{2.7}$$

To calculate the bias and mean square error, we considered the terms having powers up to second degree only as the calculations become more complicated when the higher-order terms are included.

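So, from (2.7), the biases and mean square errors of the estimators up to $O(1/n)$ are obtained as
$$\begin{aligned}
B(\bar{y}_{gp}) &= \bar{Y}\left[\sum_{i=1}^{k} \frac{\alpha_i(\alpha_i + 1)}{2} C_i^2 + \sum\sum_{i<j} \alpha_i \alpha_j C_{ij} - \sum_{i=1}^{k} \alpha_i C_{0i}\right], \\
\operatorname{MSE}(\bar{y}_{gp}) &= \bar{Y}^2\left[C_0^2 + \sum_{i=1}^{k} \alpha_i^2 C_i^2 - 2\sum_{i=1}^{k} \alpha_i C_{0i} + 2\sum\sum_{i<j} \alpha_i \alpha_j C_{ij}\right], \\
B(\bar{y}_{hp}) &= \bar{Y}\left[\left(\sum_{i=1}^{k} \alpha_i C_i\right)^2 - \sum_{i=1}^{k} \alpha_i C_{0i}\right], \\
\operatorname{MSE}(\bar{y}_{hp}) &= \bar{Y}^2\left[C_0^2 + \sum_{i=1}^{k} \alpha_i^2 C_i^2 - 2\sum_{i=1}^{k} \alpha_i C_{0i} + 2\sum\sum_{i<j} \alpha_i \alpha_j C_{ij}\right],
\end{aligned} \tag{2.8}$$
where $\left(\sum_{i}\alpha_i C_i\right)^2$ is shorthand for $E\left[\left(\sum_{i}\alpha_i e_i\right)^2\right] = \sum_{i}\alpha_i^2 C_i^2 + 2\sum\sum_{i<j}\alpha_i\alpha_j C_{ij}$. We see that the MSEs of these estimators are the same and the biases are different. In general,
$$\operatorname{MSE}(\bar{y}_{ap}) = \operatorname{MSE}(\bar{y}_{gp}) = \operatorname{MSE}(\bar{y}_{hp}). \tag{2.9}$$
We know that in the univariate case the usual ratio-type estimator $\bar{y}_R$ for the $i$th auxiliary variable is superior to the mean per unit estimator $\bar{y}$ when
$$\frac{C_0}{C_i}\rho_{0i} > \frac{1}{2}. \tag{2.10}$$
Comparing the variance of $\bar{y}_{st}$, $V(\bar{y}_{st}) = C_0^2\bar{Y}^2$, with the mean square error of the three estimators, we note that the ratio estimators given in (1.1), (1.3), and (1.4) are more efficient than $\bar{y}_{st}$.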
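The first-order expressions in (2.5), (2.6), and (2.8) are easy to evaluate numerically once the moments are known. The sketch below is one way to do so. It assumes that the double sums run over pairs $i<j$ and that the squared sum in the harmonic-mean bias stands for the quadratic form $\alpha' C \alpha$, which is how the expansion in (2.7) works out. The function name and the input values are hypothetical.

```python
def first_order_bias_mse(Y_bar, C0_sq, C0, C, alpha):
    """First-order biases of the three estimators and their common MSE.

    C0_sq: C_0^2;  C0[i]: C_0i;  C[i][j]: C_ij, with C[i][i] = C_i^2.
    Returns (bias_ap, bias_gp, bias_hp, mse).
    """
    k = len(alpha)
    diag1 = sum(alpha[i] * C[i][i] for i in range(k))       # sum alpha_i C_i^2
    diag2 = sum(alpha[i] ** 2 * C[i][i] for i in range(k))  # sum alpha_i^2 C_i^2
    # Double sum over pairs i < j (ASSUMPTION about the index set)
    pairs = sum(
        alpha[i] * alpha[j] * C[i][j] for i in range(k) for j in range(i + 1, k)
    )
    cross = sum(alpha[i] * C0[i] for i in range(k))         # sum alpha_i C_0i
    quad = diag2 + 2.0 * pairs                              # alpha' C alpha
    bias_ap = Y_bar * (diag1 - cross)
    bias_gp = Y_bar * (0.5 * (diag2 + diag1) + pairs - cross)
    bias_hp = Y_bar * (quad - cross)
    mse = Y_bar ** 2 * (C0_sq + quad - 2.0 * cross)
    return bias_ap, bias_gp, bias_hp, mse


# Hypothetical moments for k = 2 auxiliary variables
b_ap, b_gp, b_hp, mse = first_order_bias_mse(
    Y_bar=100.0,
    C0_sq=0.04,
    C0=[0.03, 0.02],
    C=[[0.05, 0.01], [0.01, 0.03]],
    alpha=[0.5, 0.5],
)
```

Under this reading, the geometric-mean bias is the average of the arithmetic-mean and harmonic-mean biases, so it always lies between the two.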

3. Comparison of Biases

The biases may be either positive or negative. So, for comparison, we compare the absolute biases of the estimators when these are more efficient than the sample mean. The bias of the geometric-mean estimator is smaller than that of the arithmetic-mean estimator when
$$\left|B(\bar{y}_{ap})\right| > \left|B(\bar{y}_{gp})\right|. \tag{3.1}$$
Squaring and simplifying (3.1), we observe that
$$\left[\frac{1}{2}\sum_{i=1}^{k}\alpha_i^2 C_i^2 - 2\sum_{i=1}^{k}\alpha_i C_{0i} + \sum\sum_{i<j}\alpha_i\alpha_j C_{ij} + \frac{3}{2}\sum_{i=1}^{k}\alpha_i C_i^2\right] \times \left[\frac{1}{2}\sum_{i=1}^{k}\alpha_i C_i^2 - \frac{1}{2}\sum_{i=1}^{k}\alpha_i^2 C_i^2 - \sum\sum_{i<j}\alpha_i\alpha_j C_{ij}\right] > 0. \tag{3.2}$$
Thus the above inequality holds when both factors have the same sign, that is, when both are positive or both are negative. The first factor of (3.2),
$$\frac{1}{2}\sum_{i=1}^{k}\alpha_i^2 C_i^2 - 2\sum_{i=1}^{k}\alpha_i C_{0i} + \sum\sum_{i<j}\alpha_i\alpha_j C_{ij} + \frac{3}{2}\sum_{i=1}^{k}\alpha_i C_i^2, \tag{3.3}$$
is positive when
$$\frac{\sum_{i=1}^{k}\alpha_i^2 C_i^2}{\underset{\sim}{\alpha}' C \underset{\sim}{\alpha}} > \frac{1}{3}. \tag{3.4}$$
In the same way, it can be shown that the second factor of (3.2) is also positive when
$$\frac{\sum_{i=1}^{k}\alpha_i^2 C_i^2}{\underset{\sim}{\alpha}' C \underset{\sim}{\alpha}} > 1. \tag{3.5}$$
When both factors of (3.2) are negative, the inequalities in (3.4) and (3.5) are reversed.

Also, comparing the squares of the biases of the geometric and harmonic estimators, we find that the geometric estimator is more biased than the harmonic estimator.

Hence we may conclude that, in situations where the arithmetic, geometric, and harmonic estimators are more efficient than the sample mean and the relation (3.5) or
$$\frac{\sum_{i=1}^{k}\alpha_i^2 C_i^2}{\underset{\sim}{\alpha}' C \underset{\sim}{\alpha}} < \frac{1}{3} \tag{3.6}$$
is satisfied, the biases of the estimators satisfy the relation
$$\left|B(\bar{y}_{ap})\right| > \left|B(\bar{y}_{gp})\right| > \left|B(\bar{y}_{hp})\right|. \tag{3.7}$$
The weights are usually chosen to minimize the MSE of an estimator subject to the condition
$$\sum_{i=1}^{k}\alpha_i = 1. \tag{3.8}$$
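One standard way to carry out this constrained minimization is with a Lagrange multiplier. Writing the part of the first-order MSE that depends on the weights as $\alpha' C \alpha - 2\alpha' c_0$, with $c_0 = (C_{01}, \ldots, C_{0k})'$, the stationarity condition gives $\alpha = C^{-1}(c_0 + \lambda \mathbf{1})$, and the constraint fixes $\lambda = (1 - \mathbf{1}' C^{-1} c_0)/(\mathbf{1}' C^{-1} \mathbf{1})$. The sketch below implements this under the assumption that $C$ is nonsingular. The helper names and the numbers are hypothetical, and this is not necessarily the procedure used for the empirical study.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented copy of the system
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def optimal_weights(C, C0):
    """Weights minimizing alpha'C alpha - 2 alpha'c0 subject to sum(alpha) = 1."""
    k = len(C0)
    u = solve(C, C0)               # C^{-1} c0
    v = solve(C, [1.0] * k)        # C^{-1} 1
    lam = (1.0 - sum(u)) / sum(v)  # Lagrange multiplier from the constraint
    return [ui + lam * vi for ui, vi in zip(u, v)]


# Hypothetical moments for k = 2 auxiliary variables
w = optimal_weights([[0.05, 0.01], [0.01, 0.03]], [0.03, 0.02])
```

The resulting weights sum to one by construction but are not forced to be nonnegative. If nonnegative weights are required, a constrained optimizer would be needed instead.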

4. Empirical Study

In this section, we use the data set earlier used in Koyuncu and Kadilar [5].

To illustrate the efficiency of the suggested estimators, we consider data on the number of teachers as the study variable ($y$), with the number of students ($x$) and the number of classes ($z$) in both primary and secondary schools as auxiliary variables, for 923 districts in 6 regions (1: Marmara, 2: Aegean, 3: Mediterranean, 4: Central Anatolia, 5: Black Sea, and 6: East and Southeast Anatolia) of Turkey in 2007 (source: The Turkish Republic Ministry of Education). The summary statistics of the data are given in Table 1. We used Neyman allocation to allocate the sample to the different strata [14].
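Neyman allocation assigns the sample to strata in proportion to $N_h S_h$, where $S_h$ is the stratum standard deviation of the study variable. A minimal sketch follows. The stratum sizes and standard deviations are made-up values and are not the figures from Table 1.

```python
def neyman_allocation(n, N_h, S_h):
    """Fractional Neyman allocation: n_h = n * N_h * S_h / sum(N_h * S_h)."""
    products = [N * S for N, S in zip(N_h, S_h)]
    total = sum(products)
    return [n * p / total for p in products]


# Hypothetical strata (not the Table 1 values)
alloc = neyman_allocation(n=100, N_h=[200, 300, 500], S_h=[10.0, 20.0, 5.0])
```

The function returns fractional sample sizes. In practice they are rounded to integers, and any stratum whose allocation exceeds its population size is capped at $N_h$.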

5. Conclusion

From Table 2, we observe that the ratio estimator based on the harmonic mean is less biased. However, the mean square errors of the estimators $\bar{y}_{hp}$, $\bar{y}_{ap}$, and $\bar{y}_{gp}$ are the same. Hence, for this data set, we conclude that when more than one auxiliary variable is used for estimating the population parameters, it is better to use the harmonic mean as an estimator in stratified random sampling.