Abstract

This paper considers estimation of the finite population mean under systematic sampling using auxiliary information. By incorporating the maximum and minimum values of the study variable and of two auxiliary variables, we propose ratio, product, and regression type estimators. Expressions for the bias and mean squared error of the suggested and existing estimators are derived up to the first order of approximation. Efficiency comparisons are carried out on real-life data sets. Both the theoretical and the empirical investigations indicate that the suggested ratio, product, and regression estimators outperform the existing estimators in terms of mean squared error.

1. Introduction

It is common practice in sample surveys to use auxiliary information at the estimation stage to improve the precision of estimates of population parameters. Ratio, product, and regression estimators are well-known examples. A large number of estimators that use auxiliary information to estimate the finite population mean or total of the study variable, together with their properties under simple random sampling, have been discussed in the literature; see, for example, [1, 2] and the references cited therein. For some populations, such as forest areas surveyed to estimate timber volume or areas under different types of cover [3], simple random sampling and other sampling schemes are difficult to apply. In such settings, systematic sampling offers a convenient way of selecting a sample and can give precise results. One advantage of systematic sampling is that the entire sample is selected with just one random start.

In sampling theory, appropriate use of auxiliary information may increase the precision of estimators. Unfortunately, many real data sets contain unexpectedly large or small values. If such extreme values are selected in the sample, the estimator may give misleading results. To handle such situations, we propose mean estimation under a systematic sampling design in the presence of maximum and minimum values.

Assume that the population units are numbered sequentially from 1 to N. To create a sample of n units, we choose a unit at random from the first k units and every kth unit thereafter. For example, if k is 12 and the first unit drawn is number 10, the following units are numbers 22, 34, 46, and so on.
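The selection rule described above can be sketched in a few lines of Python. This is only an illustration: the function name and the parameter names `N` (population size), `k` (sampling interval), and `start` (random start) are assumptions introduced here, and the population size of 60 is made up.

```python
import random

def systematic_sample_units(N, k, start=None, rng=random):
    """Return the 1-based unit numbers start, start + k, start + 2k, ... up to N.

    N, k, and start are illustrative names: population size, sampling
    interval, and random start (drawn from 1..k when not supplied).
    """
    if start is None:
        start = rng.randint(1, k)  # one random start fixes the whole sample
    if not 1 <= start <= k:
        raise ValueError("start must lie between 1 and k")
    return list(range(start, N + 1, k))

# Interval 12 and first unit 10; N = 60 is a made-up population size.
units = systematic_sample_units(N=60, k=12, start=10)
print(units)
```

Because the units after the first are fully determined by the start, only `k` distinct samples are possible, which is what makes complete enumeration of the sampling distribution feasible in small examples.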

The whole sample is determined by the first unit chosen. Intuitively, systematic sampling appears to be more precise than simple random sampling: it divides the population into n strata, consisting of the first k units, the second k units, and so on. References [4, 5] found that systematic sampling is efficient and convenient. In addition to its simplicity, which is very important in large-scale sampling work, it gives estimators that are more efficient than those of simple random sampling under certain practical conditions. Ratio and product estimators of the finite population mean of the study variable were constructed in [6, 7]. References [7–15] and Javaid et al. [16] discuss systematic sampling at length.

2. Notation and Symbols

Consider the study variable and the auxiliary variables for a finite population of N units. We choose a systematic sample of size n by selecting the first unit at random and then selecting every kth unit thereafter. We assume N = nk, where n and k are positive integers. For the selected systematic random sample, say , , where , are the values of the unit in the selected sample for the , , and variables, respectively. For , , and , the sample means in systematic random sampling are

which are unbiased estimators of the population means of the variables , , and , respectively. To obtain the bias and mean squared error, let us define

such that, where

The coefficients of variation of , , and are , , and , respectively,

and are the intraclass correlations in the systematic sample for the study variable and the auxiliary variables and , and is the correlation between the study variable and the auxiliary variables .
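The unbiasedness of the systematic sample mean stated in this section can be checked by enumerating all possible systematic samples. The sketch below assumes the population size is an exact multiple of the sampling interval; the data values and all names are made up for illustration and are not taken from the paper.

```python
def mean(values):
    return sum(values) / len(values)

def all_systematic_samples(values, k):
    """Split a population of size N = n * k into its k possible systematic samples."""
    if len(values) % k != 0:
        raise ValueError("this sketch assumes N is an exact multiple of k")
    return [values[start::k] for start in range(k)]

# Made-up population of N = 12 values; k = 3 gives n = 4 units per sample.
y = [12, 15, 11, 19, 14, 18, 13, 17, 16, 20, 10, 21]
samples = all_systematic_samples(y, k=3)
sample_means = [mean(s) for s in samples]

# Each of the k samples is equally likely, so the expectation of the
# sample mean is the plain average of the k sample means.
expected_sample_mean = mean(sample_means)
population_mean = mean(y)
print(sample_means, expected_sample_mean, population_mean)
```

The same enumeration gives the exact sampling variance of any estimator under systematic sampling, since the sampling distribution has only `k` support points.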

Many real data sets contain unexpectedly large or small values. When such values are included in the estimation of a finite population mean, the results are sensitive to them: the estimate will be either overestimated or underestimated if and are present. To deal with this situation, [17] suggested the following unbiased estimator of the finite population mean using maximum and minimum values:

The variance of , is as follows:

where the population variance is , and the constant is . has a minimum value of

The variance of , is then as follows:

which is always less than the variance of .
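As a rough illustration of this kind of adjustment, the sketch below assumes the commonly used form in which a constant `c` is added to the sample mean when the sample contains the minimum but not the maximum, subtracted when it contains the maximum but not the minimum, and left out otherwise. The exact expression in [17] is not reproduced here; the value of `c`, the data, and all names are assumptions made for illustration, and the population is assumed to have a unique minimum and maximum.

```python
def mean(values):
    return sum(values) / len(values)

def adjusted_mean(sample, y_min, y_max, c):
    """Sample mean shifted by +c or -c depending on which extreme the sample holds."""
    ybar = mean(sample)
    has_min, has_max = y_min in sample, y_max in sample
    if has_min and not has_max:
        return ybar + c   # the minimum pulls the sample mean down
    if has_max and not has_min:
        return ybar - c   # the maximum pulls the sample mean up
    return ybar           # neither extreme, or both together

# Made-up population of N = 12 values; k = 3 systematic samples of size n = 4.
y = [12, 15, 11, 19, 14, 18, 13, 17, 16, 20, 10, 21]
k, n = 3, 4
y_min, y_max = min(y), max(y)
c = (y_max - y_min) / (2 * n)   # assumed illustrative choice of the constant

estimates = [adjusted_mean(y[start::k], y_min, y_max, c) for start in range(k)]
print(estimates, mean(estimates), mean(y))
```

In this small example the +c and -c shifts fall on different samples and cancel when averaged over the `k` equally likely samples, so the adjusted estimator keeps the same expectation as the plain sample mean.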

Under a systematic sampling approach, the usual ratio estimator is

The bias and mean squared error (MSE) of are given, up to the first order of approximation, by

and

where , and

Under systematic sampling, the product estimator is as follows:

The bias and MSE of are given, up to the first order of approximation, by

and

Under systematic sampling, the standard regression estimator of the unknown population mean is as follows:

The sample regression coefficients are and , respectively. If and are the least squares estimators of and , respectively, then, up to the first order of approximation, the variance of the estimator is as follows:
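To make the three classical estimator types reviewed in this section concrete, the sketch below implements their textbook single-auxiliary forms. This is a simplification: the estimators in this paper use two auxiliary variables, and their exact expressions are not reproduced here. All function and variable names are assumptions, and the data are made up.

```python
def mean(values):
    return sum(values) / len(values)

def ratio_estimate(y_s, x_s, X_bar):
    """Classical ratio estimator: sample mean of y scaled by X_bar / x_bar."""
    return mean(y_s) * X_bar / mean(x_s)

def product_estimate(y_s, x_s, X_bar):
    """Classical product estimator: sample mean of y scaled by x_bar / X_bar."""
    return mean(y_s) * mean(x_s) / X_bar

def regression_estimate(y_s, x_s, X_bar):
    """Classical regression estimator with the least squares slope b."""
    y_bar, x_bar = mean(y_s), mean(x_s)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_s, y_s))
    s_xx = sum((x - x_bar) ** 2 for x in x_s)
    b = s_xy / s_xx
    return y_bar + b * (X_bar - x_bar)

# Made-up population in which y = 3 + 2x exactly; systematic sample with k = 2.
x = [1, 2, 3, 4]
y = [3 + 2 * v for v in x]
X_bar = mean(x)
y_s, x_s = y[0::2], x[0::2]
print(ratio_estimate(y_s, x_s, X_bar),
      product_estimate(y_s, x_s, X_bar),
      regression_estimate(y_s, x_s, X_bar))
```

The ratio form is suited to positively correlated variables and the product form to negatively correlated ones, while the regression form recovers the population mean exactly when the relationship is exactly linear, as in this toy population.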

3. Suggested Estimators

Motivated by [18], we propose ratio, product, and regression type estimators under systematic sampling that use the study variable and two auxiliary variables. The estimators take into account the minimum and maximum values of the study variable as well as those of the two auxiliary variables.

3.1. First Situation

When the study variable and the auxiliary variable are positively correlated, a larger value of the auxiliary variable is expected to be selected together with a larger value of the study variable, and a smaller value of the auxiliary variable together with a smaller value of the study variable. To make use of this type of information, we propose the following ratio type estimator based on the auxiliary variables and the study variable:

or

The regression type estimator is

where . If the sample contains and , then . If the sample contains and , then , and for all other samples.

3.2. Second Situation

When the study variable and the auxiliary variable are negatively correlated, a larger value of the auxiliary variable is expected to be selected together with a smaller value of the study variable, and a smaller value of the auxiliary variable together with a larger value of the study variable. The proposed product type estimator based on the auxiliary variables and the study variable is given by

or

The regression type estimator is as follows:

where . If the sample contains and , then . If the sample contains and , then , and for all other samples. Also, , , and are unknown constants. The following relative error terms and their expectations are used to derive the biases and mean squared errors:

such that

Expressing (19) in terms of , we have

From (27), we have the following result up to the first order of approximation:

Taking expectations on both sides of (28), we have

Squaring (28) and taking expectations, we have

Differentiating (30), we have , , and

Substituting the optimum values of , , and in (30), we obtain the minimum MSE of , given by

where

Similarly,

and

where

In the case of positive correlation, the minimum MSE of the regression estimator is given by

where

The population regression coefficients are and .

Similarly, in the case of negative correlation, the minimum MSE of is

In the case of both positive and negative correlation between the study and the auxiliary variable, a general form for is

4. Comparison of Estimators

In this section, we compare the suggested estimators with the classical ratio, product, and regression estimators under systematic sampling.

Condition (i). By (13) and (32),

if

Condition (ii). By (16) and (33),

if

Condition (iii). By (18) and (37),

if

If Conditions (i)–(iii) are satisfied, the suggested estimators outperform the existing estimators.

5. Empirical Study

For numerical comparisons, we use two data sets.

Population 1 (Source: [19]).  = Imports of merchandise, in millions.  = Gross national GDP, in billions.  = Cumulative price index, all items (1967 = 100).

Population 2 (Source: [20]).  = Production output from 40 factories.  = Total number of employees.  = Fixed capital.

The MSEs reported in Table 1 show that the suggested estimators have lower mean squared errors than the existing estimators. Table 2 shows that the percent relative efficiencies (PREs) of all suggested estimators exceed those of the existing estimators. Among the suggested estimators, the regression estimator performs best.
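The kind of comparison summarized in Tables 1 and 2 can be sketched as follows. The code computes the exact MSE of an estimator over all possible systematic samples and then a percent relative efficiency, assumed here to be 100 times the MSE of a reference estimator divided by the MSE of the candidate. The data are made up and do not come from Population 1 or 2, only the plain sample mean and a single-auxiliary ratio estimator are compared, and all names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)

def sample_mean_estimate(y_s, x_s, X_bar):
    return mean(y_s)

def ratio_estimate(y_s, x_s, X_bar):
    return mean(y_s) * X_bar / mean(x_s)

def mse_over_systematic_samples(estimator, y, x, k):
    """Exact MSE: average squared error over the k equally likely samples."""
    Y_bar, X_bar = mean(y), mean(x)
    estimates = [estimator(y[s::k], x[s::k], X_bar) for s in range(k)]
    return mean([(e - Y_bar) ** 2 for e in estimates])

def pre(mse_reference, mse_candidate):
    """Percent relative efficiency of a candidate against a reference estimator."""
    return 100.0 * mse_reference / mse_candidate

# Made-up, positively related data with N = 12 and k = 3 (n = 4).
y = [12, 15, 11, 19, 14, 18, 13, 17, 16, 20, 10, 21]
x = [6, 8, 5, 10, 7, 9, 6, 9, 8, 10, 5, 11]

mse_mean = mse_over_systematic_samples(sample_mean_estimate, y, x, k=3)
mse_ratio = mse_over_systematic_samples(ratio_estimate, y, x, k=3)
print(mse_mean, mse_ratio, pre(mse_mean, mse_ratio))
```

A PRE above 100 means the candidate has a smaller MSE than the reference. Any other estimator with the same `(y_s, x_s, X_bar)` signature can be passed to `mse_over_systematic_samples` in the same way.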

6. Conclusion

We proposed ratio, product, and regression type estimators using auxiliary variables under systematic sampling in the presence of maximum and minimum values. Under certain conditions, the proposed estimators are more efficient than the traditional ratio, product, and regression estimators of the mean. Table 1 shows that the suggested estimators outperform the standard estimators in both populations, so the numerical study supports the theoretical finding that the suggested estimators of the finite population mean are more precise than the existing estimators considered.

Data Availability

The data used to support the numerical findings of this study are available from the corresponding author upon request. The data can also be obtained from the cited sources.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was fully supported by the Department of Statistics, Quaid-i-Azam University, Islamabad, Pakistan. The first author will pay the fee for this paper.