Journal of Probability and Statistics
Volume 2012, Article ID 214959, 12 pages
Research Article

Double Sampling with Ranked Set Selection in the Second Phase with Nonresponse: Analytical Results and Monte Carlo Experiences

1Software Development Division, Institute of Computing Training, Cuba
2Universidad de La Habana, Habana, Cuba

Received 19 May 2011; Revised 5 December 2011; Accepted 21 December 2011

Academic Editor: Man Lai Tang

Copyright © 2012 Gaajendra K. Agarwal et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


This paper studies the behavior of double sampling for dealing with nonresponse when ranked set sampling is used. The characteristics of the sampling strategies are derived. The structure of the errors made it necessary to study the optimality of the strategies through a set of Monte Carlo experiments.

1. Introduction

The usual theory of survey sampling is developed assuming that the finite population $U=\{u_1,\ldots,u_N\}$ is composed of individuals that can be perfectly identified. A sample $s$ of size $n\le N$ is selected. The variable of interest $Y$ is measured in each selected unit. Real-life surveys must deal with missing observations. There are three ways to cope with this fact: to ignore the nonrespondents, to subsample the nonrespondents, or to impute the missing values. Ignoring the nonresponses is a dangerous decision; subsampling is a conservative and costly solution. Imputation is often used to compensate for item nonresponse. For discussions on the theme see, for example, Rueda and González [1] and Singh [2].

Section 2 presents the problem of non response when a single sample is selected.

We consider the use of double sampling for obtaining information on an auxiliary variable $X$. A large first-phase sample is selected, which is assumed to be inexpensive. The values of $X$ observed in the first-phase sample are used to rank the units and to select a ranked set sample (RSS). The second-phase sample is therefore a subsample of the preliminary large sample. The literature on double sampling with simple random sampling (SRS) is large. Textbooks give the basic theory; see Singh [2] and Cochran [3]. In this paper we consider an RSS double sampling procedure. It is presented in Section 3, where a family of estimators is considered as an RSS alternative to the proposal of Singh and Kumar [4]. An expression for the gain in accuracy due to the proposed estimator is found. The estimator is compared with the simple mean and with the proposal of Singh and Kumar [4]. Real-life data are used in Section 4 to evaluate the behavior of these alternative estimators of the population mean.

2. The Nonresponse Problem: A Single Sample

Nonresponses may be caused by the refusal of some units to give the true value of $Y$ or by other causes. Hansen and Hurvitz [5] proposed in 1946 selecting a sub-sample among the nonrespondents; see Cochran [3]. The behavior of this approach depends heavily on the sub-sampling rule used. Sampling rules are due to Hansen and Hurvitz [5], Srinath [6], and Bouza [7]. The existence of nonresponses implies that $U$ is divided into two strata: $U_1=\{u\in U \mid u \text{ responds at the first visit}\}$ and $U_2=U\setminus U_1$. Similarly, $s$ is partitioned into $s_i\subset U_i$, $i=1,2$. The procedure is a particular double sampling design which, using Hansen-Hurvitz's rule (HHR), is described as follows.

Step 1. Select a sample 𝑠 from 𝑈 using srswr.

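Step 2. Evaluate $Y$ among the respondents and determine $\{y_i : i\in s_1\subset U_1,\ |s_1|=n_1\}$. Compute $\bar y_1=\sum_{i=1}^{n_1}y_i/n_1$. (2.1)

Step 3. Determine $n_2'=n_2/K$, $K>1$, where $|s_2|=n_2$ and $s_2=\{u\in s \mid u\in U_2\}$.

Step 4. Select a sub-sample $s_2'$ of size $n_2'$ from $s_2$ using srswr.

Step 5. Evaluate $Y$ among the units in $s_2'$, obtaining $\{y_i : i\in s_2'\subset s_2,\ s_2\subset U_2\}$. Compute $\bar y_2'=\sum_{i=1}^{n_2'}y_i/n_2'$. (2.2)

Step 6. Compute the estimate of $\mu$: $\bar y=\frac{n_1}{n}\bar y_1+\frac{n_2}{n}\bar y_2'=w_1\bar y_1+w_2\bar y_2'$. (2.3)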
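Steps 1 through 6 can be sketched in code as follows. This is only an illustrative sketch under stated assumptions: the function name `hansen_hurwitz_estimate`, the simulated population, and the response pattern are hypothetical, and rounding $n_2/K$ to the nearest integer (with a minimum of one unit) is an assumption, since the text does not say how a non-integer $n_2/K$ is handled.

```python
import random


def hansen_hurwitz_estimate(y_values, responds, k, rng):
    """Sketch of Steps 2-6 for a sample s that has already been selected.

    y_values : value of Y for every unit in s (a nonrespondent's value is
               only used if the unit enters the sub-sample)
    responds : True if the unit responds at the first visit
    k        : sub-sampling constant K > 1
    rng      : random.Random instance used for the srswr sub-sample
    """
    n = len(y_values)
    y1 = [y for y, r in zip(y_values, responds) if r]        # respondents, s1
    s2 = [y for y, r in zip(y_values, responds) if not r]    # nonrespondents, s2
    n1, n2 = len(y1), len(s2)
    mean1 = sum(y1) / n1 if n1 else 0.0                      # (2.1)
    if n2 == 0:
        return mean1
    # Assumption: n2' = n2 / K is rounded, with at least one unit revisited.
    n2_sub = max(1, round(n2 / k))
    sub = [rng.choice(s2) for _ in range(n2_sub)]            # srswr from s2
    mean2_sub = sum(sub) / n2_sub                            # (2.2)
    return (n1 / n) * mean1 + (n2 / n) * mean2_sub           # (2.3)


# Hypothetical usage: Step 1 draws s from U with replacement.
rng = random.Random(2012)
population = [rng.gauss(50.0, 10.0) for _ in range(1000)]
sample = [rng.choice(population) for _ in range(100)]
responds = [rng.random() < 0.6 for _ in sample]              # made-up response pattern
estimate = hansen_hurwitz_estimate(sample, responds, k=2, rng=rng)
```

The weights $w_1=n_1/n$ and $w_2=n_2/n$ are computed from the realized sample, so the estimate reduces to the ordinary sample mean when every unit responds.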

Note that (2.1) is the mean of an srswr sample selected from $U_1$, the response stratum; hence its expected value is the mean of $Y$ in the respondent stratum, $\mu_1$. The conditional expectation of (2.2) is $E(\bar y_2'\mid s)=\bar y_2$, (2.4) and, as the right-hand side of (2.4) is the mean of an srswr sample selected from the nonresponse stratum $U_2$, $E[E(\bar y_2'\mid s)]=\mu_2$. (2.5) Taking into account that $E(n_i)=nN_i/N=nW_i$ for $i=1,2$, the unbiasedness of (2.3) is easily derived.

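The variance of (2.3) is deduced by using the following decomposition: $\bar y=w_1\bar y_1+w_2\bar y_2+w_2(\bar y_2'-\bar y_2)$. (2.6) The sum of the first two terms is the mean of $s$, so its variance is $\sigma_Y^2/n$. For the last term we have that $V(w_2(\bar y_2'-\bar y_2)\mid s)=w_2^2E[((\bar y_2'-\mu_2)-(\bar y_2-\mu_2))^2\mid s]=w_2^2[E((\bar y_2'-\mu_2)^2\mid s)+E((\bar y_2-\mu_2)^2\mid s)-2E((\bar y_2'-\mu_2)(\bar y_2-\mu_2)\mid s)]$. (2.7) Conditioning on a fixed $n_2$, the expectation of the cross-product in the third term is $(\bar y_2-\mu_2)^2$. Then we have that $V(w_2(\bar y_2'-\bar y_2)\mid s)=w_2^2\left(\frac{\sigma^2_{2Y}}{n_2'}-\frac{\sigma^2_{2Y}}{n_2}\right)=w_2^2\sigma^2_{2Y}\left(\frac{K}{n_2}-\frac{1}{n_2}\right)$ and $E[V(w_2(\bar y_2'-\bar y_2)\mid s)]=\frac{W_2(K-1)\sigma^2_{2Y}}{n}$. (2.8) Hence the expected error of (2.3) is given by the well-known expression $EV(\bar y)=\frac{\sigma_Y^2}{n}+\frac{W_2(K-1)\sigma^2_{2Y}}{n}$. (2.9) Our proposal is to use the information provided by a known variable $X$ for implementing RSS.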
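As a small numerical illustration of the expected error (2.9), the sketch below evaluates $\sigma_Y^2/n+W_2(K-1)\sigma^2_{2Y}/n$ for given inputs. The function name and the numbers are hypothetical and serve only to show the size of the penalty paid for sub-sampling the nonrespondents.

```python
def hh_expected_variance(sigma2_y, sigma2_2y, w2, k, n):
    """Expected variance of the Hansen-Hurvitz estimator, following (2.9).

    sigma2_y  : variance of Y in the whole population
    sigma2_2y : variance of Y in the nonresponse stratum U2
    w2        : proportion W2 of the population in U2
    k         : sub-sampling constant K (K = 1 means every nonrespondent is revisited)
    n         : sample size
    """
    srs_term = sigma2_y / n                          # variance of the full-response mean
    subsampling_term = w2 * (k - 1) * sigma2_2y / n  # extra error due to sub-sampling
    return srs_term + subsampling_term


# Made-up values: the second term adds about 0.027 to the baseline 0.04.
value = hh_expected_variance(sigma2_y=4.0, sigma2_2y=9.0, w2=0.3, k=2, n=100)
```

With $K=1$ the second term vanishes and the expression reduces to $\sigma_Y^2/n$.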

McIntire [8] proposed the method of RSS. He noticed a gain in accuracy of the RSS mean with respect to the sample mean under srswr. Dell and Clutter [9] and Takahashi and Wakimoto [10] provided mathematical support for his claims. The following procedure describes RSS selection.

2.1. RSS Procedure

Step 1. Randomly select $m^2$ units from the target population.

Step 2. Allocate the $m^2$ selected units as randomly as possible into $m$ sets, each of size $m$.

Step 3. Without yet knowing any values of the variable of interest, rank the units within each set with respect to the variable of interest. This may be based on personal professional judgment or done using a concomitant variable correlated with the variable of interest.

Step 4. Choose a sample for actual quantification by including the smallest ranked unit in the first set, the second smallest ranked unit in the second set, and so on, until the largest ranked unit is selected from the last set.

Step 5. Repeat Steps 1 through 4 for $r$ cycles to obtain a sample of size $mr$ for actual quantification.
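The selection steps above can be sketched as follows. The function `ranked_set_sample` and its arguments are hypothetical names used only for illustration. Ranking is done here with a concomitant variable through `rank_key`, and the $m^2$ units of a cycle are drawn without replacement, which is an assumption not stated in the procedure.

```python
import random


def ranked_set_sample(population, m, r, rng, rank_key=lambda unit: unit):
    """Select an RSS of size m * r from `population` (Steps 1-5).

    rank_key returns the value used for ranking a unit, for example a
    concomitant variable X that is cheap to observe.
    """
    selected = []
    for _ in range(r):                                   # Step 5: r cycles
        units = rng.sample(population, m * m)            # Step 1: m^2 units, random order
        sets = [units[j * m:(j + 1) * m] for j in range(m)]  # Step 2: m sets of size m
        for j, group in enumerate(sets):
            ranked = sorted(group, key=rank_key)         # Step 3: rank without measuring Y
            selected.append(ranked[j])                   # Step 4: j-th smallest of set j
    return selected


# Hypothetical usage: units are (x, y) pairs and ranking uses x only.
rng = random.Random(1)
units = []
for _ in range(500):
    x = rng.gauss(0.0, 1.0)
    units.append((x, 5.0 + 2.0 * x + rng.gauss(0.0, 1.0)))
rss = ranked_set_sample(units, m=4, r=6, rng=rng, rank_key=lambda u: u[0])
```

Only the selected units would then be measured on $Y$; the remaining $m^2-m$ units of each cycle are used for ranking alone.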

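The RSS sample is the sequence of order statistics (OS) $\xi_{(1:1)t},\ldots,\xi_{(m:m)t}$, where $\xi_{(j:h)t}$ denotes the statistic of order $j$ in the $h$th sample in the cycle $t=1,\ldots,r$. We have $n=mr$ observations, and $r$ of them correspond to the $i$th order statistic, $i=1,\ldots,m$. The RSS estimator of the mean $\mu_\xi$ of a variable of interest $\xi$ is $\mu_{(rss)\xi}=\sum_{t=1}^{r}\sum_{i=1}^{m}\frac{\xi_{(i:m)t}}{rm}$, (2.10) and its variance is given by $V(\mu_{(rss)\xi})=\sum_{i=1}^{m}\frac{\sigma^2_{\xi(i:m)}}{rm^2}=\frac{\sigma^2_\xi}{rm}-\sum_{i=1}^{m}\frac{\Delta^2_{(i:m)}}{rm^2}$, (2.11) where $\sigma^2_{\xi(i:m)}=E[\xi_{(i:m)}-E(\xi_{(i:m)})]^2$ and $\Delta_{(i:m)}=E(\xi_{(i:m)})-\mu_\xi$.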

The second term of (2.11) is the gain in accuracy due to the use of RSS instead of srswr.
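The decomposition of the RSS variance into the srswr variance minus a gain term can be checked exactly in a simple case. The sketch below assumes, purely for illustration, that $\xi$ is uniform on $(0,1)$ with perfect ranking, so that the $i$th order statistic of a set of size $m$ has mean $i/(m+1)$ and variance $i(m-i+1)/((m+1)^2(m+2))$. The function names are hypothetical.

```python
from fractions import Fraction


def uniform_os_mean(i, m):
    """Mean of the i-th order statistic of m independent Uniform(0, 1) draws."""
    return Fraction(i, m + 1)


def uniform_os_var(i, m):
    """Variance of the i-th order statistic of m independent Uniform(0, 1) draws."""
    return Fraction(i * (m - i + 1), (m + 1) ** 2 * (m + 2))


def rss_variance_two_ways(m, r):
    """Return (sum of OS variances form, srswr variance minus gain, gain)."""
    sigma2 = Fraction(1, 12)          # variance of Uniform(0, 1)
    mu = Fraction(1, 2)               # mean of Uniform(0, 1)
    direct = sum(uniform_os_var(i, m) for i in range(1, m + 1)) / (r * m * m)
    gain = sum((uniform_os_mean(i, m) - mu) ** 2 for i in range(1, m + 1)) / (r * m * m)
    decomposed = sigma2 / (r * m) - gain
    return direct, decomposed, gain


direct, decomposed, gain = rss_variance_two_ways(m=3, r=4)
# Both forms give Fraction(1, 288), half of the srswr variance Fraction(1, 144).
```

In this uniform case the ratio of the RSS variance to the srswr variance works out to $2/(m+1)$, so the gain grows with the set size $m$.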

Bouza [11] developed an RSS alternative under nonresponse. The number of nonresponses in $s$ is $n_2=rm_2$. He derived that, using a subsample size $m_2'=m_2/K$, $\bar y'_{2rss}=\sum_{t=1}^{r}\sum_{i=1}^{m_2'}\frac{y_{(i:m_2')t}}{rm_2'}$ (2.12) is unbiased for the mean of $Y$ in the nonresponse stratum.

The cross-expectation’s expected value is zero. In this case the RSS is balanced and we may express the variance of the order statistics (OS) as a function of the variance of 𝑌 in 𝑈2,𝑉(𝑦(𝑖𝑚2)𝑡), and the gains in accuracy measured by the Δ22𝑌(𝑖),𝑠 as𝑉𝑦2𝑦2rss𝑠=𝜎22𝑌1𝑛21𝑛2𝑚2𝑖=1Δ22𝑌(𝑖)𝑛2𝑚2.(2.13) Substituting 𝑛2=𝑟𝑚2/𝐾2 we obtain the following:𝑉𝑦2rss𝑦2rss=𝜎𝑠22𝑌𝑟𝐾21𝑚2𝑚2𝑖=1Δ22𝑌(𝑖𝑚2)𝑟𝑚2𝐾21𝑚2=𝑉2.(2.14) Taking the RSS estimator𝑦rss=𝑛1𝑛𝑦1rss+𝑛2𝑛𝑦2rss=𝑤1𝑦rss1+𝑤2𝑦2rss,𝐸𝑉𝑦rss=𝜎2𝑌𝑛+𝑊2(𝐾1)𝜎22𝑌𝑛Ψ(𝑌).(2.15) Then there is gain in accuracy due to the use of RSS which isΨ(𝑌)=𝑊2(𝐾1)𝐸𝑚2𝑖=1Δ22𝑌𝑖𝑚2𝑚2,(2.16) where Δ22𝑌(𝑖𝑚)=(𝐸(𝑌(𝑖𝑚)𝜇𝑌)2) is the gain in accuracy due to the use or RSS in the second stage.

3. The Nonresponse Problem: Double Sampling

We will consider that double sampling is used: a first-phase sample $s^*$ of size $n^*$ is obtained from $U$ using srswr. A cheap variable $X$ is measured in the units of $s^*$. $X$ is correlated with $Y$, and we are able to compute its mean $\bar x_{s^*}$ in the first stage. There are nonresponses. In the second stage we know $\bar x_{s^*}=(\sum_{i=1}^{n^*}x_i)/n^*$ and $\bar x=(\sum_{i=1}^{n}x_i)/n$. Note that these estimates are used only in the estimation process.

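Nonresponses on $Y$ are present in the second-stage sample, and a subsample among the nonrespondents is selected. Singh and Kumar [4] considered this problem for simple random sampling. They proposed the family of estimators characterized by $\bar y_{(\alpha,\beta)}=\bar y\left(\frac{a\bar x^{*}+b}{a\bar x_{s^*}+b}\right)^{\alpha}\left(\frac{a\bar x+b}{a\bar x_{s^*}+b}\right)^{\beta}$, $\bar y=\sum_{i=1}^{n}\frac{y_i}{n}$. (3.1) Here $\bar x^{*}$ denotes the estimator of the form (2.3) computed with the values of $X$. The sampler fixes the constants $\alpha$ and $\beta$ as well as $a$ and $b$. The latter can be constants or functions, with $a$ different from zero. Take $\varepsilon=\frac{\bar y-\mu_Y}{\mu_Y}$, $\theta=\frac{\bar x^{*}-\mu_X}{\mu_X}$, $\vartheta=\frac{\bar x_{s^*}-\mu_X}{\mu_X}$, $\omega=\frac{\bar x-\mu_X}{\mu_X}$. (3.2)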

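Proposition 3.1 (see [4]). The bias of $\bar y_{(\alpha,\beta)}=\bar y\left(\frac{a\bar x^{*}+b}{a\bar x_{s^*}+b}\right)^{\alpha}\left(\frac{a\bar x+b}{a\bar x_{s^*}+b}\right)^{\beta}$ (3.3) is $B(\bar y_{(\alpha,\beta)})=\mu_Y(\varphi_1+\varphi_2)$, (3.4) defining $\varphi_1=\gamma\phi\left[\alpha\left(K_{xy}+\frac{\alpha-1}{2}\phi\right)+\beta\left(K_{xy}+\alpha\phi+\frac{\beta-1}{2}\phi\right)\right]c_x^2$, $\varphi_2=\lambda\alpha\phi\left[K_{x_2y}+\frac{\alpha-1}{2}\phi\right]c^2_{x_2}$, (3.5) where $\gamma=\frac{1}{n}-\frac{1}{n^*}$, $\lambda=\frac{W_2(K-1)}{n}$, $c_x^2=\frac{\sigma_x^2}{\mu_x^2}$, $c^2_{x_2}=\frac{\sigma^2_{x_2}}{\mu^2_{x_2}}$, $K_{xy}=\frac{\mu_x\sigma_{xy}}{\mu_y\sigma_x^2}$, $K_{x_2y}=\frac{\mu_{x_2}\sigma_{x_2y}}{\mu_y\sigma^2_{x_2}}$, $\sigma_{xy}=E[(X-\mu_x)(Y-\mu_Y)]$, $\sigma_{x_2y}=E[(X-\mu_x)(Y-\mu_Y)\mid U_2]$. (3.6) The variance is given by $V(\bar y_{(\alpha,\beta)})=\mu_Y^2(\delta_1+\delta_2)$, (3.7) defining $\delta_1=\gamma\left[c_Y^2+(\alpha+\beta)\phi\left((\alpha+\beta)\phi+2K_{xy}\right)c_x^2\right]$, $\delta_2=\lambda\left[c^2_{y_2}+\alpha\phi\left(\alpha\phi+2K_{x_2y}\right)c^2_{x_2}\right]+\frac{c_y^2}{n^*}$, $c_y^2=\frac{\sigma_y^2}{\mu_y^2}$, $c^2_{y_2}=\frac{\sigma^2_{y_2}}{\mu^2_{y_2}}$. (3.8)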

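We are going to derive the RSS counterpart of this family. The first-phase sample is selected using srswr, and the information on $X$ is used for selecting the initial sample and for subsampling the nonrespondents. Our proposal is to use $\bar y_{rss(\alpha,\beta)}=\bar y_{rss}\left(\frac{a\bar x^{*}_{rss}+b}{a\bar x_{s^*}+b}\right)^{\alpha}\left(\frac{a\bar x_{rss}+b}{a\bar x_{s^*}+b}\right)^{\beta}$, (3.9) where $\bar x_{rss}$ is the RSS mean of $X$ in the second stage and $\varepsilon_{rss}=\frac{\bar y_{rss}-\mu_Y}{\mu_Y}$, $\theta_{rss}=\frac{\bar x^{*}_{rss}-\mu_X}{\mu_X}$, $\vartheta=\frac{\bar x_{s^*}-\mu_X}{\mu_X}$, $\omega_{rss}=\frac{\bar x_{rss}-\mu_X}{\mu_X}$. (3.10) Let us represent the involved estimators by $\bar y_{rss}=\mu_Y(1+\varepsilon_{rss})$, $\bar x^{*}_{rss}=\mu_X(1+\theta_{rss})$, $\bar x_{s^*}=\mu_X(1+\vartheta)$, $\bar x_{rss}=\mu_X(1+\omega_{rss})$. (3.11) Due to the unbiasedness of the estimators, $E(Z)=0$ for $Z=\varepsilon_{rss},\theta_{rss},\vartheta,\omega_{rss}$.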

Taking𝜙=𝑎𝜇𝑋𝑎𝜇𝑥+𝑏.(3.12) We can rewrite (3.9) as𝑦rss=𝜇𝑌1+𝜀rss1+𝜙𝜃rss𝛼(1+𝜙𝜗)𝛼1+𝜙𝜔rss𝛽(1+𝜙𝜗)𝛽.(3.13) Note that𝐸𝜀rss2=𝐸𝑦rss𝜇𝑌2𝜇2𝑌=𝜎2𝑌/𝑛+𝑊2(𝐾1)𝜎22𝑌/𝑛𝜇2𝑌𝑊2(𝐾1)𝐸𝑚2𝑖=1Δ22𝑌(𝑖𝑚2)/𝑚2𝜇2𝑌,𝐸𝜃rss2=𝜎2𝑥/𝑛+𝑊2(𝐾1)𝜎22𝑥/𝑛𝜇2𝑥𝑊2(𝐾1)𝐸𝑚2𝑖=1Δ22𝑥(𝑖𝑚2)/𝑛𝑚2𝜇2𝑥,𝐸(𝜗)2=𝐸(𝑥𝑠𝜇𝑋)2𝜇2𝑋=𝜎2𝑋𝑛𝜇2𝑋,𝐸𝜔rss2=𝜎2𝑥/𝑛𝑚𝑖=1Δ2𝑥(𝑖)/𝑟𝑛𝜇2𝑥.(3.14) Under the hypothesis /𝜙𝑍/<1,𝑍=𝜀rss,𝜃rss,𝜗,𝜔rss, an expansion in Taylor series of (3.13) may be worked out. Grouping conveniently we have that𝑦rss𝜇𝑌=𝜇𝑌𝜀rss𝜔+𝛽rss+𝜀rss𝜔rss𝜀rss𝜗𝜃+𝛼𝜙rss+𝜀rss𝜃rss𝜀rss𝜗(𝛼+𝛽)𝜙𝜗+𝛼𝛽𝜙2𝜗2𝜔+𝜗rss+𝜃rss+𝜗𝜔rss𝜙2𝛽2𝜗𝜔rss+𝛼2𝜗𝜃rss+𝛽(𝛽+1)𝜙22𝜗2+𝜔2rss+𝛼(𝛼+1)𝜙22𝜗2+𝜔2rss.(3.15) The cross-products for the OS 𝑍(𝑖),𝑍=𝑋,𝑌, are expressed by𝑖=1𝑍(𝑖)𝜇𝑍(𝑖)𝑍(𝑖)𝜇𝑍(𝑖)=𝑖=1𝑍(𝑖)𝜇𝑍𝜇𝑍(𝑖)𝑍(𝑖)𝜇𝑍𝜇𝑍(𝑖)=𝑖=1𝑍(𝑖)𝜇𝑍𝑍(𝑖)𝜇𝑍𝑖=1𝑍(𝑖)Δ𝑍(𝑖)+𝑍(𝑖)Δ𝑍(𝑖)Δ𝑍(𝑖)Δ𝑍(𝑖)𝜎=(1)𝑍𝑍+Ψ𝑍𝑍.(3.16) The conditional expectations of the RSS estimators are𝐸𝑥rss/𝑠𝐸=𝐸𝑥rss/𝑠/𝑠=𝑥.(3.17) Using these results we have that𝐸𝜀rss𝜃rss=𝜎𝑋𝑌+Ψ𝑋𝑌𝑛𝜇𝑥𝜇𝑦+𝑊2𝜎(𝐾1)𝑋2𝑌+Ψ𝑋2𝑌𝑛𝜇𝑥𝜇𝑦,𝐸𝜀rss𝜗=𝜎𝑋𝑌+Ψ𝑋𝑌𝑛𝜇𝑥𝜇𝑦,𝐸𝜀rss𝜔rss=𝜎𝑋𝑌+Ψ𝑋𝑌𝑛𝜇𝑥𝜇𝑦,(3.18) withΨ𝑋2𝑌=𝐸𝑚2𝑖=1𝑋(𝑖)2Δ𝑥(𝑖)2+𝑌(𝑖)2Δ𝑦(𝑖)2Δ𝑥(𝑖)2Δ𝑦(𝑖)𝑚2,Ψ𝑋𝑌=𝐸𝑚𝑖=1𝑋(𝑖)Δ𝑥(𝑖)+𝑌(𝑖)Δ𝑦(𝑖)Δ𝑥(𝑖)Δ𝑦(𝑖)𝑚.(3.19) In addition𝐸𝜔rss𝜃rss=𝜎2𝑥+Ψ𝑋𝑛𝜇2𝑥,Ψ𝑋=𝑚𝑖=1Δ2𝑥8(𝑖)𝑟𝐸𝜗𝜃rss=𝜎2𝑥𝑛𝜇2𝑥,𝐸𝜗𝜔rss=𝜎2𝑥𝑛𝜇2𝑥.(3.20) Substituting in (3.15) after some algebraic work we obtain that the bias of (3.9) is𝐵𝑦rss=𝜇𝑌𝜑1rss+𝜑2rss,(3.21) where𝜑1rss=𝛼𝐾𝛾𝜙𝑥𝑦𝑐2𝑥+Ψ𝑋𝑌𝑛𝜇𝑥𝜇𝑦+𝛼12𝜙𝑐2𝑥+Ψ𝑋𝑛𝜇2𝑥𝐾+𝛽𝑥𝑦𝑐2𝑥+Ψ𝑋𝑌𝑛𝜇𝑥𝜇𝑦𝑐+𝛼𝜙2𝑥+Ψ𝑋𝑛𝜇2𝑥+𝛽12𝜙𝑐2𝑥Ψ𝑧2𝐸=𝑚2𝑖=1Δ22𝑧(𝑖𝑚2)/𝑚2𝑛𝜇2𝑧,𝑧=𝑥,𝑦.(3.22) For a large value of 𝑛 the bias tends to zero. Then we have proved the first statement of the following proposition.

Proposition 3.2. The estimator 𝑦rss=𝑦rss((𝑎𝑥rss+𝑏)/(𝑎𝑥𝑠+𝑏))𝛼((𝑎𝑥rss+𝑏)/(𝑎𝑥𝑠+𝑏))𝛽 is asymptotically unbiased in terms of 𝑛 and its variance is given by 𝑉𝑦rss=𝜎2𝑌𝑛+𝛾𝜇2𝑌((𝛼+𝛽)𝜙)2𝑐2𝑥+2(𝛼+𝛽)𝜙𝐾𝑥𝑦𝑐2𝑥+Ψ𝑋𝑌𝜇𝑥𝜇𝑌+𝜆𝜇2𝑌2𝜎2𝑌2𝜇2𝑌2+Ψ𝑌2𝜇2𝑌2𝜎+𝛼𝜙𝛼𝜙2𝑥𝜇2𝑥+Ψ𝑥2+2𝐾𝑥2𝑌𝑐2𝑥2+Ψ𝑋2𝑌𝜇𝑥𝜇𝑌1+Ψ𝑥2+𝜎2𝑥2𝑌𝜇𝑥𝜇𝑌.(3.23) If /𝜙𝑍/<1,𝑍=𝜀rss,𝜃rss,𝜗,𝜔rss.

Proof. An expansion in Taylor series of (𝑦rss𝜇𝑌)2 may be worked out. It is, neglecting the terms of order 𝑡>2, 𝑦rss𝜇𝑌2=𝜇2𝑌𝜏1+𝜏2+𝜏3+𝜏4,(3.24) where 𝜏1=𝜀2rss+𝛼2𝜃2rss+𝛽2𝜔2rss+2𝛼𝛽𝜀rss𝜔rss𝜙2,𝜏2=𝜀2rss+(𝛼+𝛽)2𝜗2𝜙2,𝜏3=2𝜙𝛼𝜀rss𝜃rss+𝛽𝜀rss𝜔rss,𝜏4=2(𝛼+𝛽)(𝜙𝜗𝜀rss+𝜙2𝛼𝜗𝜀rss+𝛽𝜗𝜔rss.(3.25) Calculating the expected value and grouping we have that 𝐸𝑦rss𝜇𝑌2=𝜎2𝑌𝑛+𝛾𝜇2𝑌((𝛼+𝛽)𝜙)2𝑐2𝑥+2(𝛼+𝛽)𝜙𝐾𝑥𝑦𝑐2𝑥+Ψ𝑋𝑌𝜇𝑥𝜇𝑌+𝜆𝜇2𝑌2𝜎2𝑌2𝜇2𝑌2+Ψ𝑌2𝜇2𝑌2𝜎+𝛼𝜙𝛼𝜙2𝑥𝜇2𝑥+Ψ𝑥2+2𝐾𝑥2𝑌𝑐2𝑥2+Ψ𝑋2𝑌𝜇𝑥𝜇𝑌1+Ψ𝑥2+𝜎2𝑥2𝑌𝜇𝑥𝜇𝑌.(3.26)

Remark 3.3. The gain in accuracy due to the use of (3.9) in terms of the variance is 𝐺rss=𝜎𝑥2𝑦+𝛾𝜇2𝑦Ψ𝑥𝑦+2Ψ𝑥𝑦1+Ψ2+𝜆Ψ𝑥2𝜇2𝑦𝜇𝑥𝜇𝑦.(3.27)

Hence, as $V(\bar y_{rss})=V(\bar y)+G$, the proposed method is more precise if $G<0$.

This result allows us to deduce the RSS counterparts of different double sampling estimators of the mean. For example, $(\alpha,\beta,a,b)=(-1,0,1,0)$: Khare-Srivastava-Tabasum-Khan estimator 1; $(\alpha,\beta,a,b)=(0,-1,1,0)$: Khare-Srivastava-Tabasum-Khan estimator 2; $(\alpha,\beta,a,b)=(-1,-1,1,0)$: Singh-Kumar ratio estimator; $(\alpha,\beta,a,b)=(1,0,1,0)$: Singh-Kumar product estimator. (3.28) See Khare and Srivastava [12, 13] and Singh and Kumar [4, 14, 15].
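How a choice of $(\alpha,\beta,a,b)$ selects a member of the family can be sketched as follows. The function `family_estimate`, its argument names, and the numerical inputs are hypothetical. The sketch also assumes that ratio-type members correspond to negative exponents and product-type members to positive ones, and it takes the component means as already computed.

```python
def family_estimate(y_mean, x_adjusted, x_second, x_first, alpha, beta, a=1.0, b=0.0):
    """Evaluate y * ((a*x1 + b)/(a*x0 + b))**alpha * ((a*x2 + b)/(a*x0 + b))**beta.

    y_mean     : estimate of the mean of Y from the second phase
    x_adjusted : mean of X used in the first ratio (x1)
    x_second   : second-phase mean of X used in the second ratio (x2)
    x_first    : first-phase mean of X (x0), the denominator of both ratios
    """
    first_ratio = (a * x_adjusted + b) / (a * x_first + b)
    second_ratio = (a * x_second + b) / (a * x_first + b)
    return y_mean * first_ratio ** alpha * second_ratio ** beta


# Made-up component means, used only to show how the exponents act.
y_mean, x_adjusted, x_second, x_first = 10.0, 4.0, 5.0, 8.0
plain = family_estimate(y_mean, x_adjusted, x_second, x_first, 0, 0)      # no adjustment
ratio_1 = family_estimate(y_mean, x_adjusted, x_second, x_first, -1, 0)   # y * x0 / x1
ratio_2 = family_estimate(y_mean, x_adjusted, x_second, x_first, 0, -1)   # y * x0 / x2
product = family_estimate(y_mean, x_adjusted, x_second, x_first, 1, 0)    # y * x1 / x0
```

Setting $\alpha=\beta=0$ returns the unadjusted mean, which makes it easy to see how much each auxiliary ratio moves the estimate.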

4. Numerical Comparisons

We compared the behavior of the proposed RSS method with the SRS one using data from three populations. Their description is given as follows.

Population 1
A set of 244 accounts was considered. The balance of each account in the previous semester was $X$, and $Y$ was obtained through an audit. The first-phase sample was obtained by selecting 120 accounts, and 72 nonresponses were reported. A new audit was performed. The second-stage sample was of size 24.

Population 2
The evaluation of radiographs provided the values of $X$ for 350 patients with cancer. A sample of 100 patients formed the first-phase sample, and 24 of them formed the second-phase sample. $Y$ was the size of an extirpated tumor. There were 53 missing measurements. Obtaining them required a search in the pathology department.

Population 3
The height of 1270 pigs provided the information on $X$ in the population. Of these, 170 were selected in the first phase and 24 in the second phase. $Y$ was the weight of the pigs, and 69 initial measurements were missing. The missing weights were obtained by locating the pigs before they were sent to the butchery.

The values of $r$ and $m$ were fixed conveniently for obtaining a sample of size 24. The means and variances of the OS involved were determined by forming all the possible samples and computing them. The relative gain in accuracy due to the use of RSS was measured by $\varpi=\frac{G_{rss}}{V(\bar y)}$, (4.1) for $m=3,4,6$. The results are given in Table 1. They support the claim that the use of RSS provides gains in accuracy larger than 10%.

Table 1: Gain in accuracy due to the use of RSS in three populations.

A similar study was developed by generating a sample of 240 values of $X$ and determining $Y=5+2X+\varepsilon$, (4.2) where $\varepsilon$ was generated using the same distribution. The results are given in Table 2. Note that generally the gain in efficiency is larger when the underlying distribution is symmetric. The best results are obtained when $m=4$, except for the Beta distribution.
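A Monte Carlo comparison of this kind can be sketched as follows. The sketch is not the authors' experiment: it assumes a standard normal distribution for both $X$ and $\varepsilon$, draws from an infinite population, ignores nonresponse, and only compares the variance of the RSS mean (ranking on $X$) with that of the SRS mean of $Y=5+2X+\varepsilon$. All function names are hypothetical.

```python
import random
import statistics


def draw_unit(rng):
    """Draw one (x, y) pair with y = 5 + 2x + eps; normal errors are an assumption."""
    x = rng.gauss(0.0, 1.0)
    y = 5.0 + 2.0 * x + rng.gauss(0.0, 1.0)
    return x, y


def srs_mean(n, rng):
    """Mean of Y over a simple random sample of size n."""
    return statistics.fmean(draw_unit(rng)[1] for _ in range(n))


def rss_mean(m, r, rng):
    """Mean of Y over an RSS of size m * r, ranking each set on X only."""
    measured = []
    for _ in range(r):
        for j in range(m):
            group = sorted((draw_unit(rng) for _ in range(m)), key=lambda u: u[0])
            measured.append(group[j][1])   # Y of the unit ranked j-th on X
    return statistics.fmean(measured)


def variance_ratio(m, r, reps, seed):
    """Estimated Var(RSS mean) / Var(SRS mean) for equal sample sizes m * r."""
    rng = random.Random(seed)
    srs = [srs_mean(m * r, rng) for _ in range(reps)]
    rss = [rss_mean(m, r, rng) for _ in range(reps)]
    return statistics.variance(rss) / statistics.variance(srs)


ratio = variance_ratio(m=4, r=6, reps=1000, seed=1)   # sample size 24 in both designs
```

A ratio below one indicates that, under these assumptions, the RSS mean is more precise than the SRS mean for the same number of measured units; other distributions can be tried by changing `draw_unit`.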

Table 2: Gain in accuracy due to the use of RSS of six populations: 𝑛=240 and 𝐾=0,10.

5. Conclusions

The accuracy of the proposed method seems to be better than that of the SRS method when $G_{rss}$ is analyzed. It can take negative values, but it was larger than zero in the experiments performed. It was around 0.1 in all the cases, and using $m=4$ may be the best choice.


The authors thank the referees for their helpful comments which allowed improving a previous version. This paper was supported by the CONACYT Contract 10110/62/10, FON. INST. 8/10.


  1. M. Rueda and S. González, “Missing data and auxiliary information in surveys,” Computational Statistics, vol. 19, no. 4, pp. 551–567, 2004.
  2. S. Singh, Advanced Sampling Theory with Applications, Kluwer Academic, Dordrecht, The Netherlands, 2003.
  3. W. G. Cochran, Sampling Techniques, Wiley and Sons, New York, NY, USA, 1971.
  4. H. P. Singh and S. Kumar, “A general procedure of estimating the population mean in the presence of non-response under double sampling using auxiliary information,” Statistics and Operations Research Transactions, vol. 33, no. 1, pp. 71–84, 2009.
  5. M. H. Hansen and W. N. Hurvitz, “The problem of non responses in survey sampling,” Journal of American Statistical Association, vol. 41, pp. 517–523, 1946.
  6. K. P. Srinath, “Multiphase sampling in nonresponse problems,” Journal of the American Statistical Association, vol. 66, pp. 583–589, 1971.
  7. C. N. Bouza, “Sobre el problema de la fraccion de submuestreo para el caso de las no respuestas,” Trabajos de Estadistica y de Investigacion Operativa, vol. 32, no. 2, pp. 30–36, 1981.
  8. G. A. McIntire, “A method for unbiased sampling using ranked sets,” Australian Journal of Agricultural Research, vol. 3, pp. 385–390, 1952.
  9. T. R. Dell and J. L. Clutter, “Ranked set sampling theory with order statistics background,” Biometrics, vol. 28, pp. 545–555, 1972.
  10. K. Takahashi and M. Wakimoto, “On unbiased estimates of the population mean based on the sample stratified by means ordering,” Annals of the Institute of Mathematical Statistics, vol. 20, no. 1, pp. 1–31, 1967.
  11. C. N. Bouza, “Estimation of the mean in ranked set sampling with non responses,” Metrika, vol. 56, no. 2, pp. 171–179, 2002.
  12. B. B. Khare and S. Srivastava, “Estimation of population mean using auxiliary character in presence of non-response,” National Academy of Science Letters, vol. 16, pp. 111–114, 1993.
  13. B. B. Khare and S. Srivastava, “Study of conventional and alternative two phase sampling ratio product and regression estimators in presence of non-response,” Proceedings of the National Academy of Sciences, vol. 65, pp. 195–203, 1995.
  14. H. P. Singh and S. Kumar, “Estimation of mean in presence of non-response using two phase sampling scheme,” Statistical Papers, vol. 51, no. 3, pp. 559–582, 2010.
  15. H. P. Singh and S. Kumar, “A regression approach to the estimation of the finite population mean in the presence of non-response,” Australian & New Zealand Journal of Statistics, vol. 50, no. 4, pp. 395–408, 2008.