Abstract

Population parameters are commonly estimated from sample surveys. The goal of the current investigation, however, is to estimate subpopulation parameters, such as totals, through a calibration approach. The properties of the proposed estimator are given under simple random sampling. In addition, a class of estimators for the subpopulation total is discussed. The proposed estimator is evaluated theoretically and empirically in a comparative study. The results demonstrate that the high-level calibration estimate outperforms the low-level calibration estimate for the subpopulation total.

1. Introduction

Generally, sample surveys are used to estimate population parameters. This study, however, is concerned with estimating subpopulation totals. When subpopulations are defined by characteristics such as socio-economic status or geographical region, they are also termed domains. Estimates of subpopulation (domain) totals have become a popular and effective tool for framing the programs and policies of government and the private sector. Hence, the demand for subpopulation estimates has grown rapidly in recent years. Purcell and Kish [1] classified subpopulations by their size relative to the population. The simple classification of subpopulations is as follows:

Major subpopulation: it comprises 1/10 of the population or more, for example, major geographical regions (north, east, west, south, central), 10-year age groups, or major classes such as occupations.

Minor subpopulation: it comprises between 1/10 and 1/100 of the population, for example, state populations, single years of age, or two-fold classifications such as education and occupation.

Mini subpopulation: it comprises between 1/100 and 1/10000 of the population, for example, the populations of counties (more than 3000 in the U.S.A.) or three-fold classifications such as age, education, and occupation.

Rare subpopulation: it comprises less than 1/10000 of the population, for example, health service regions classified into local regions of residence.
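The size classification above can be sketched as a small function. This is only an illustration: the function name `classify_subpopulation` is hypothetical, and the treatment of the boundary values (a share of exactly 1/100 is counted as "minor", exactly 1/10000 as "mini") is an assumption, since the text only says "between".

```python
from fractions import Fraction


def classify_subpopulation(domain_size: int, population_size: int) -> str:
    """Return the size class of a subpopulation by its share of the population.

    Boundary handling is an assumption of this sketch, not stated in the text.
    """
    if population_size <= 0 or not 0 <= domain_size <= population_size:
        raise ValueError("need 0 <= domain_size <= population_size and population_size > 0")
    share = Fraction(domain_size, population_size)  # exact arithmetic, no rounding
    if share >= Fraction(1, 10):
        return "major"
    if share >= Fraction(1, 100):
        return "minor"
    if share >= Fraction(1, 10000):
        return "mini"
    return "rare"


# Hypothetical usage: a domain of 25 units in a population of 149 units.
example_class = classify_subpopulation(25, 149)
```

Exact fractions are used so that shares lying on a class boundary are not misclassified by floating-point rounding.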

Subpopulation total estimates can be used to assess levels of social exclusion and well-being. Some environmental and epidemiological issues can also be addressed through subpopulation estimates. Domain estimation with the popular direct and indirect methods has been explained in the literature (see Rao [2] and Singh [3]). Rahman [4] presented direct and indirect estimation together with model-based ideas. The choice of method depends on the availability of units in the study subpopulation; when few units are available, we use the indirect method. In the indirect method, the sample is selected from the population (which consists of the subpopulations) rather than from the subpopulation. Because the number of sampled units in the study subpopulation may be low or near zero for some subpopulations, units from the surrounding subpopulations are also utilized. If the surrounding units are similar in nature to those of the subpopulation, the estimate is precise enough to be acceptable at the desired level. Indirect estimates of subpopulation parameters such as the total, obtained through an auxiliary variable, have been explained by Rao and Molina [5]. Singh et al. [6] estimated poverty indicators, many socio-economic indicators, and food insecurity for subpopulations. In the indirect method, the availability of the auxiliary variable for the subpopulation is of significant importance (Tikkiwal et al. [7]). Ratio and regression estimators of the population total through the calibration approach have been discussed by numerous authors, such as Särndal [8], Singh and Mohl [9], Singh et al. [10], and Wu and Sitter [11]. The extension to calibration-based mean estimation through an auxiliary variable has been discussed by Koyuncu and Kadilar [12] and Koyuncu [13]. Furthermore, estimation of the subpopulation parameter has been considered by Khare et al. [14].
The calibration estimator was initiated by Horvitz and Thompson [15], who estimated the population total of the study variable under simple random sampling without replacement designs. Deville and Särndal [16] incorporated the auxiliary variable under the restriction of minimum chi-square distance. Särndal [8] estimated the variance of the ratio and regression estimators for the population parameter. The mean square error using a model-based approach was given by Slud and Maiti [17]. More recently, classes of estimators for variance estimation have been discussed by Bhushan et al. [18, 19]. Motivated by Särndal [8], we propose a calibration estimate of the subpopulation total together with its variance. The rest of the article is organized as follows. First, the methodology is explained. The variance of the proposed estimator is then evaluated using the calibration approach at the low and high levels, respectively. Next, a class of calibration estimators is presented. Theoretical and empirical comparisons follow, and the concluding results are summarized. Finally, recommendations and applications are outlined.

2. Methodology

Consider the subpopulation with the sizes . Select a sample from the population in which the selected sampling units of domains are . The overall sample of size is . The sample is selected from the population rather than from the subpopulation. The subpopulation total of the auxiliary variable should be known, so that this auxiliary information can be utilized by the proposed estimator. In the current work, the ideas of both Deville and Särndal [16] and Tikkiwal et al. [7] are used in the same context. The proposed indirect generalized regression estimator of the subpopulation total is written as follows:

where the updated weight is close to the design weight. Utilizing the auxiliary variable can be a good option in the calibration estimator of the domain. The calibration constraint requires that the weighted sum of the auxiliary variable over the sampled units equal the known subpopulation total of the auxiliary variable. The auxiliary equation for the subpopulation total is

The minimum chi-square distance function is

where is a chosen constant. Using (2) and (3), the calibration equation can be written as follows:

Partially differentiating (4) with respect to the new weights gives

Simplifying (5), we obtain

Substituting in the auxiliary equation (2) yields the value of the Lagrange multiplier; substituting this value back in (5) gives the new weight as follows:

Substituting the new weight in (1), the proposed estimator becomes
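The steps above follow the standard chi-square calibration argument of Deville and Särndal [16]. A sketch of that argument is given here in assumed notation: $d_i$ is the design weight, $w_i$ the calibrated weight, $q_i$ the chosen constant, $x_i$ and $y_i$ the auxiliary and study values of sampled unit $i$, $X_a$ the known subpopulation total of the auxiliary variable, and $\lambda$ the Lagrange multiplier. These symbols are labels chosen for this sketch and may differ from the authors' notation and equation numbering.

```latex
% Sketch in assumed notation; not the authors' typeset equations.
\begin{align*}
\text{minimize}\quad
  & D = \sum_{i \in s} \frac{(w_i - d_i)^2}{d_i q_i}
  \quad\text{subject to}\quad \sum_{i \in s} w_i x_i = X_a, \\
\Phi
  &= \sum_{i \in s} \frac{(w_i - d_i)^2}{d_i q_i}
     - 2\lambda \Big( \sum_{i \in s} w_i x_i - X_a \Big), \\
\frac{\partial \Phi}{\partial w_i}
  &= \frac{2 (w_i - d_i)}{d_i q_i} - 2 \lambda x_i = 0
  \;\Longrightarrow\; w_i = d_i + \lambda\, d_i q_i x_i, \\
\lambda
  &= \frac{X_a - \sum_{i \in s} d_i x_i}{\sum_{i \in s} d_i q_i x_i^2}, \\
w_i
  &= d_i + \frac{d_i q_i x_i}{\sum_{j \in s} d_j q_j x_j^2}
     \Big( X_a - \sum_{j \in s} d_j x_j \Big), \\
\hat{Y}_a
  &= \sum_{i \in s} w_i y_i
   = \sum_{i \in s} d_i y_i
     + \hat{\beta} \Big( X_a - \sum_{i \in s} d_i x_i \Big),
  \qquad
  \hat{\beta} = \frac{\sum_{i \in s} d_i q_i x_i y_i}{\sum_{i \in s} d_i q_i x_i^2}.
\end{align*}
```

The multiplier $\lambda$ is obtained by inserting $w_i = d_i + \lambda d_i q_i x_i$ into the constraint, which is why the calibrated weights reproduce $X_a$ exactly.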

3. Variance of the Proposed Estimator through the Low-Level Calibration Approach

Särndal et al. [20] provided the variance of the estimator. Zaman and Bulut [21] recently discussed the variance of ratio estimators. The proposed estimator resembles a regression estimator. To obtain the variance of the population total estimator of Deville and Särndal [16], we use the form based on Yates and Grundy [22],

where

The proposed estimator is an indirect estimator for the subpopulation; hence, we rely on the asymptotic unbiasedness of the estimator. The variance of the subpopulation total is written as follows:

where and are as defined previously.

The members of the proposed estimator are as follows:
(I) For , the proposed estimator in (8) reduces to the estimator, as follows:
(II) For , the proposed estimator in (8) becomes the ratio estimator, as follows:
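The two members can be illustrated numerically with chi-square calibrated weights. The sketch below assumes the usual choices of the constant, `q_i = 1` for a regression-type member and `q_i = 1/x_i` for the ratio member; which value the authors attach to (I) and (II) is an assumption here. The function names and the toy sample are hypothetical.

```python
def calibrated_weights(d, x, q, x_total):
    """Chi-square calibrated weights w_i = d_i + lam * d_i * q_i * x_i (assumed form)."""
    ht_x = sum(di * xi for di, xi in zip(d, x))  # design-weighted total of x
    denom = sum(di * qi * xi * xi for di, qi, xi in zip(d, q, x))
    lam = (x_total - ht_x) / denom  # multiplier that enforces the constraint
    return [di + lam * di * qi * xi for di, qi, xi in zip(d, q, x)]


def calibrated_total(d, x, y, q, x_total):
    """Calibration estimate of the total of y using the weights above."""
    w = calibrated_weights(d, x, q, x_total)
    return sum(wi * yi for wi, yi in zip(w, y))


# Hypothetical toy sample: n = 4 units drawn from N = 20, so d_i = N / n = 5.
x = [10.0, 20.0, 30.0, 40.0]
y = [12.0, 19.0, 33.0, 41.0]
d = [5.0] * 4
x_total = 520.0  # assumed known subpopulation total of the auxiliary variable

# q_i = 1 / x_i: the estimate collapses to x_total * sum(d*y) / sum(d*x).
ratio_type = calibrated_total(d, x, y, [1.0 / xi for xi in x], x_total)

# q_i = 1: design-weighted total of y plus a slope-adjusted correction term.
regression_type = calibrated_total(d, x, y, [1.0] * 4, x_total)
```

With `q_i = 1/x_i` every weight is rescaled by the same factor `x_total / sum(d*x)`, which is why that choice yields the ratio estimator.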

The variance of can be obtained with the help of (9). The variance of the proposed estimator is

where

The variance of the ratio estimator is

and

The value of is obtained under simple random sampling without replacement. The probability that unit is selected is , the probability that unit is selected is , and the probability that both and units are selected is . The variance of the low-level calibration estimator of the subpopulation total is

where , and the variance of the ratio estimator given in (15) is

where .
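Under simple random sampling without replacement of n units from N, the first-order inclusion probability is n/N and the joint inclusion probability of two distinct units is n(n-1)/(N(N-1)). The sketch below checks these well-known values by enumerating all samples of a small population. The function names are hypothetical, and the Yates-Grundy-type weight `(pi_i * pi_j - pi_ij) / pi_ij` is included as an assumption about the quantity used in the variance form.

```python
from fractions import Fraction
from itertools import combinations


def srswor_inclusion_probabilities(N, n):
    """Closed-form first-order and joint inclusion probabilities under SRSWOR."""
    pi_i = Fraction(n, N)
    pi_ij = Fraction(n * (n - 1), N * (N - 1))
    return pi_i, pi_ij


def enumerated_inclusion_probabilities(N, n, i=0, j=1):
    """Same probabilities obtained by listing every possible sample."""
    samples = list(combinations(range(N), n))
    first = Fraction(sum(1 for s in samples if i in s), len(samples))
    joint = Fraction(sum(1 for s in samples if i in s and j in s), len(samples))
    return first, joint


def yates_grundy_weight(N, n):
    """(pi_i * pi_j - pi_ij) / pi_ij, which simplifies to (N - n) / (N * (n - 1))."""
    pi_i, pi_ij = srswor_inclusion_probabilities(N, n)
    return (pi_i * pi_i - pi_ij) / pi_ij


closed_form = srswor_inclusion_probabilities(6, 3)
by_enumeration = enumerated_inclusion_probabilities(6, 3)
weight = yates_grundy_weight(6, 3)
```

Exact fractions make the comparison between the closed form and the enumeration an equality check rather than a tolerance check.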

The indirect regression estimator is and .

Following the idea of Deng and Wu [23], we estimate the variance of the subpopulation total. The variance of the low-level calibration ratio estimator can be estimated for the subpopulation using (15),

where , and the expansion is written up to the second order, neglecting higher-order terms because they are small.

Substituting in (19) gives the variance of the ratio estimator.

We obtain the variance of the linear regression estimator as a special case of the variance estimator of the ratio estimator of the domain total. If the subpopulation total is equivalent to the population total, that is, , then it reduces to the linear form of the class of estimators in (19). The variance of the regression estimator is written as follows:

The low-level calibration regression estimator is more efficient than the ratio estimator when . However, for , the variance of the low-level calibration ratio estimator is always lower than that of the low-level calibration regression estimator. The variance of the estimator is equal to that of the class of estimators of Deng and Wu [23].

4. Variance of the Proposed Estimator through the High-Level Calibration Approach

High-level calibration adjusts the weight function attached to single selected units and to pairs of selected units. We estimate the variance and examine the variance of the ratio and regression estimators under the high-level calibration approach. We consider the function instead of the ; the variance of the proposed estimator is then

where is the new weight, which is very close to . The simple calibration equation is

where is the variance of the auxiliary information, which should be known for each subpopulation total. This is an auxiliary equation (see (3)) of the Horvitz–Thompson type, which is written as . We estimate the value of the estimator for the subpopulation total, utilizing information on the auxiliary variable from census registers, previous survey values, or administrative registers. Estimation of the variance of regression and ratio estimators has been given by Singh and Srivastava [24], Srivastava and Jhajj [25], Isaki [26], and Wu [27]. Fuller [28] has given the weight adjustment for the regression estimator. The minimum chi-square distance function between the new weights and the weights of the and units from the population is

where is the chosen constant. The estimated value of under the restriction of (23) is

where .

The optimum value of the Lagrange multiplier, obtained with the help of (23) and (24), is written as follows:

Substituting the value of from (26) in (22), the estimator is

where

The members of the high-level calibration approach are as follows:
(I) If we put the weights within (28), then the variance of the ratio estimator through high-level calibration is

where and ; the sample variance is an asymptotically unbiased estimate of the population variance.
(II) For , equation (28) becomes the variance of the regression estimator

where

Equation (31) shows that the high-level calibration estimator differs from that of Deng and Wu [23]. The and ratio estimators are members of the class of estimators of Srivastava and Jhajj [25],

where is a function satisfying certain regularity conditions. The high-level calibration estimator is better than the low-level calibration estimates and than the estimators of Srivastava and Jhajj [25] and Deng and Wu [23].

5. Class of Calibration Estimators

This section presents the class of estimators. A class of estimators is a collection of estimators that, under certain regularity conditions, have the same variance. We assume and . The variance of a class of estimators of the subpopulation total is

where the function is of and . The function possesses the following regularity conditions:
(1) The function exists for all values of contained in a bounded subset of the two-dimensional real space.
(2) The first- and second-order partial derivatives of the function exist and are continuous and bounded.

Many members of the class of estimators exist under the regularity conditions. Three such members are

where and are both unknown parameters. The values of and depend on the estimated value. The asymptotic variance of the three estimators is the same as that of Srivastava and Jhajj [25] and Singh and Singh [29]. Our proposed estimator is better than those of Srivastava and Jhajj [25] and Singh and Singh [29], and hence also better than the class of estimators under regularity conditions 1 and 2.

5.1. Theoretical Comparison

The theoretical comparison concerns the efficiency of the high-level and low-level indirect variance estimates. The high-level variance of the ratio estimator for the subpopulation total is

The estimated variance of the ratio estimator through the low-level calibration approach for the subpopulation total is written as follows:

From (37) and (35), the high-level variance of the ratio estimate is lower than the low-level variance for

Furthermore, the high-level variance estimate of the estimator is

The low-level variance estimate of the estimator is

From (37) and (38), the restriction is that the last term

We conclude that the high-level calibration approach is more efficient than the low-level calibration approach for the ratio and regression estimators.

6. Empirical Study

This section presents an empirical comparison based on simulation. We take a real data set from Särndal et al. [30], in which the 1975 population is the auxiliary variable and the 1985 population is the study variable. The subpopulations are regions of Sweden's municipalities. However, we take only five regions (subpopulations), 1, 2, 3, 7, and 8, out of the eight existing subpopulations 1, 2, 3, 4, 5, 6, 7, and 8. The study subpopulations have 25, 48, 32, 15, and 29 units, 149 units in total. For the simulation, we select random samples of approximately 10%, 20%, and 30% of the units (sizes 15, 30, and 45) from the population on the study and auxiliary characters. This process is repeated a finite number of times to obtain the estimated error. Equations (34)–(38) are used to obtain the variance of the ratio estimator and the generalized estimator of the subpopulation under low-level and high-level calibration.
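The resampling loop described above can be sketched as follows. This is a minimal illustration under strong assumptions: the population is synthetic (it is not the Swedish municipality data), the estimator is a plain indirect ratio estimator rather than the low-level or high-level calibration variance estimators of the paper, and all function names, the seed, and the number of repetitions are hypothetical. Only the domain sizes and sampling fractions are taken from the text.

```python
import random

DOMAIN_SIZES = [25, 48, 32, 15, 29]    # from the text; 149 units in total
SAMPLE_FRACTIONS = [0.10, 0.20, 0.30]  # from the text


def make_synthetic_population(rng):
    """Synthetic (domain, x, y) triples; a stand-in for the real data set."""
    population = []
    for domain, size in enumerate(DOMAIN_SIZES):
        for _ in range(size):
            x = rng.uniform(5.0, 100.0)
            y = 1.05 * x + rng.gauss(0.0, 2.0)
            population.append((domain, x, y))
    return population


def indirect_ratio_estimate(sample, x_domain_total):
    """Overall sample ratio times the known domain total of x.

    Under SRSWOR the equal design weights N/n cancel in the ratio.
    """
    return x_domain_total * sum(y for _, _, y in sample) / sum(x for _, x, _ in sample)


def simulated_mse(population, n, domain, reps, rng):
    """Average squared error of the estimate over repeated SRSWOR samples."""
    x_total = sum(x for a, x, _ in population if a == domain)
    y_total = sum(y for a, _, y in population if a == domain)
    squared_error = 0.0
    for _ in range(reps):
        sample = rng.sample(population, n)  # sampling without replacement
        squared_error += (indirect_ratio_estimate(sample, x_total) - y_total) ** 2
    return squared_error / reps


rng = random.Random(2024)
population = make_synthetic_population(rng)
N = len(population)
sample_sizes = [round(f * N) for f in SAMPLE_FRACTIONS]
mse_by_size = {n: simulated_mse(population, n, domain=0, reps=500, rng=rng)
               for n in sample_sizes}
```

The whole population is sampled, not the domain, so the estimate borrows strength from the surrounding units in the way the indirect method describes.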

The variances of the ratio estimator through high-level calibration are lower than the low-level calibration estimates for all subpopulations. For sample sizes 15, 30, and 45, the high-level estimates are lower than the low-level estimates by 4.5 to 10.43, 1.75 to 9.5, and 2.53 to 7.1 percent across domains, respectively. The low-level calibration variance estimates decrease as the sample size increases from 15 to 45. A similar pattern is observed for the high-level calibration approach.

The estimated variance of the generalized estimator through high-level calibration is smaller than the low-level estimate for all subpopulations. For sample sizes 15, 30, and 45, the high-level estimates are lower than the low-level estimates by 2.4 to 12.55, 1.78 to 8.8, and 2.32 to 10.41 percent across subpopulations, respectively. The range of the variance of the generalized estimator is wider than that of the ratio estimator for the subpopulations. The low-level calibration variance estimates decrease as the sample size increases from 15 to 45.

The variances of the ratio and regression estimators under the low-level and high-level calibration approaches for samples of 15, 30, and 45 units are given. Figures 1–3 show the estimated variance of the ratio estimator under low-level and high-level calibration for sample sizes 15, 30, and 45. Figures 4–6 present the estimated variance of the regression estimator under high-level and low-level calibration for the same sample sizes (15, 30, and 45).

7. Concluding Results

Tables 1 and 2 and Figures 1–6 show that the high-level calibration estimate of the ratio estimator is preferred over the low-level calibration estimate for subpopulations 1, 2, 3, 4, and 5. The ratio estimate through the calibration approach is more effective than the generalized estimate. The low-level estimate performs worse than the high-level estimate for both estimators. The variance of the regression estimate spans a wider interval across the subpopulations than that of the ratio estimate. The generalized estimator discussed here is a member of the generalized class of Srivastava and Jhajj [25]. The high-level calibration ratio estimate is superior to that of Deng and Wu [23]. Based on the theoretical and empirical findings, we conclude that the proposed estimate is more efficient than the regression estimates proposed by Srivastava and Jhajj [25] and Deng and Wu [23] for subpopulations 1, 2, 3, 4, and 5.

7.1. Recommendations and Applications

The following recommendations are given:
(1) The recommendations are based on theoretical and empirical analysis of the ratio and generalized estimates of the subpopulation total.
(2) The present estimate is useful when the domain total of the auxiliary variable is available but the number of units in the subpopulation is small.
(3) The value of the indirect estimate also depends on how close the subpopulation value of the auxiliary variable is to the value estimated from the population sample.
(4) The high-level variance estimates are a better option than the low-level variance estimates of the ratio and regression estimates for the subpopulations. The high-level variance estimate of the ratio estimator can be applied to health-related problems, environmental issues, and welfare programs, such as epidemiological issues, and to estimation for areas that are similar to other areas.

Data Availability

The data used to obtain the results are included within the study.

Conflicts of Interest

The authors declare that there are no conflicts of interest.