Estimates of Inequality Indices Based on Simple Random, Ranked Set, and Systematic Sampling

Bansal, Pooja; Arora, Sangeeta; Mahajan, Kalpana K.

doi:https://doi.org/10.1155/2013/659580

International Scholarly Research Notices

Research Article | Open Access

Volume 2013 | Article ID 659580 | https://doi.org/10.1155/2013/659580

Pooja Bansal,¹ Sangeeta Arora,¹ and Kalpana K. Mahajan¹

Academic Editors: S. Lototsky, X. Dang

Received 16 Jun 2013

Accepted 02 Aug 2013

Published 19 Sept 2013

Abstract

The Gini index, Bonferroni index, and Absolute Lorenz index are popular indices of inequality, each capturing different features of inequality measurement. Simple random sampling is the procedure most commonly used to estimate inequality indices and to draw the related inference. The assumption that samples are drawn by simple random sampling makes the calculations much simpler, but it is often violated in practice because the data do not always constitute a simple random sample. Nonsimple random samples, such as those obtained by ranked set sampling or stratified sampling, are gaining popularity for estimating these indices. The purpose of the present paper is to compare the efficiency of simple random sample estimates of inequality indices with that of their nonsimple random counterparts. Monte Carlo simulation is applied to obtain the results for some specific distributions.

1. Introduction

The Lorenz curve [1] and the associated Gini index [2] are among the most popular and frequently used tools for measuring income inequality. Certain variants of the Lorenz curve, namely, the Generalized Lorenz curve [3], Absolute Lorenz curve [4], Bonferroni curve [5], and Comic curves [3, 6], and the associated inequality indices, that is, the Generalized Lorenz index, Absolute Lorenz index, Bonferroni index, and Comic index, are popular alternatives to the Gini index used to study certain specialized features of inequality. In practice, simple random sampling is mostly used to derive the statistical inference for, or to obtain the sample estimates of, these inequality measures. The assumption that samples are drawn by simple random sampling makes the calculations much simpler, but it is often violated in practice because the data do not always constitute a simple random sample. For example, in India the socioeconomic data collected by the National Sample Survey Organization (NSSO) are not drawn through simple random sampling but follow two-stage stratified sampling techniques. Likewise, in the United States, commonly used income and earnings data, such as the Current Population Survey (CPS) and the Panel Study of Income Dynamics (PSID), are all multistage random samples in which simple random sampling is not the only method applied at each stage. Stratified sampling, cluster sampling, multistage cluster sampling, and so forth may be cited as alternatives to simple random sampling when estimating inequality indices [7].

Besides these traditional sampling methods, ranked set sampling (RSS) has also gained popularity in the literature for estimating parameters such as the population mean and variance and for the associated statistical inference. RSS has been shown to be more efficient than simple random sampling [8]. The RSS technique may therefore be extended to obtain sample estimates of these inequality measures, and the efficiency of RSS may then be compared with that of simple random and other sampling estimates. Al-Talib and Al-Nasser [9] compared the ranked set sample estimate of the Gini index with the simple random sampling estimate for a nonstandard Lorenz curve equation using simulation, but their results do not hold in general because no standard distributions were considered.

The purpose of the present work is to compare the nonsimple random estimates of the inequality measures with the simple random estimates. In particular, we apply two nonsimple random sampling techniques, ranked set sampling and systematic sampling, obtain the sample estimates of some popular inequality indices (the Gini, Bonferroni, and Absolute Lorenz indices) via Monte Carlo simulation [10], and compare their efficiencies with those of the simple random estimates. Simulated results showing the relative efficiencies are obtained for the Pareto, exponential, and power distributions. The simple random estimates turn out to be much less efficient than their nonsimple counterparts, that is, the ranked set and systematic sample estimates. We also compare the ranked set estimates with the systematic sample estimates and find that the systematic sample estimates are more efficient than the ranked set sample estimates.

The paper is organized as follows. In Section 2, we give a brief introduction to some popular inequality indices (the Gini, Bonferroni, and Absolute Lorenz indices) and obtain their Monte Carlo estimates under three sampling procedures, namely, simple random sampling, ranked set sampling, and systematic sampling. The simulation study is performed in Section 3, where we compute the efficiencies of these estimates for some specific distributions, namely, the exponential, Pareto, and power distributions. Tables comparing the mean square errors (MSE) under simple random sampling, ranked set sampling, and systematic sampling are given at the end of Section 3, along with the conclusion.

2. Monte Carlo Estimates for Different Sampling Techniques

In this section, we discuss three sampling techniques, namely, simple random sampling, ranked set sampling, and systematic sampling, and obtain the Monte Carlo estimates of the three indices: Gini, Bonferroni, and Absolute Lorenz.

2.1. Simple Random Sampling Procedure

The sample mean Monte Carlo procedure for the inequality indices using simple random sampling is explained below.

2.1.1. Gini Index

Gini index is defined as where is the Lorenz curve at ordinates [1] and Since is distributed uniformly over , that is, , An unbiased estimator of is sample mean where is the sample size.

Therefore, unbiased estimator of is Using (6) Algorithm for computing simple random estimate for Gini index is given as follows.(i) Generate a sequence of random numbers from .(ii) Compute .(iii) Compute .(iv) Compute .
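The four steps above can be sketched in Python. This is a minimal illustration, not the authors' code: it assumes the standard form G = 1 - 2 * integral of L(p) over (0, 1), uses the exponential Lorenz curve L(p) = p + (1 - p) ln(1 - p) as the test case (true Gini index 0.5), and the function names `lorenz_exponential` and `gini_srs` are hypothetical.

```python
import math
import random


def lorenz_exponential(p):
    """Lorenz curve of an exponential distribution: L(p) = p + (1 - p) ln(1 - p)."""
    if p >= 1.0:
        return 1.0  # limiting value at p = 1
    return p + (1.0 - p) * math.log(1.0 - p)


def gini_srs(lorenz, n, rng):
    """Sample-mean Monte Carlo estimate, assuming G = 1 - 2 * integral_0^1 L(p) dp."""
    ordinates = [rng.random() for _ in range(n)]    # step (i): p_i ~ U(0, 1)
    curve_values = [lorenz(p) for p in ordinates]   # step (ii): L(p_i)
    mean_curve = sum(curve_values) / n              # step (iii): sample mean of L(p_i)
    return 1.0 - 2.0 * mean_curve                   # step (iv): Gini estimate


if __name__ == "__main__":
    rng = random.Random(2013)
    # For a large n the estimate should be close to the true value 0.5.
    print(gini_srs(lorenz_exponential, 100_000, rng))
```

Passing the Lorenz curve as a function keeps the estimator independent of the distribution, so the same routine can be reused for Pareto or power Lorenz curves.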

2.1.2. Bonferroni Index

Bonferroni index is defined as where is the Bonferroni curve defined at ordinates [5] Since is distributed uniformly over , that is, , An unbiased estimator of is sample mean where is the sample size.

Therefore, unbiased estimator of is Using (14), Algorithm for computing simple random estimate for Bonferroni index is given as follows.(i) Generate a sequence of random numbers from .(ii) Compute .(iii) Compute .(iv) Compute .
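As a worked check of the sample-mean procedure, the sketch below applies it to the rectangular distribution on (0, θ). It assumes the standard definitions B = 1 - ∫ B(p) dp and B(p) = L(p)/p. The symbols B, B(p), L(p), p_i, and n are assumptions made here for illustration, since the displayed formulas are not reproduced in this text.

```latex
\begin{align*}
L(p) &= \frac{1}{\mu}\int_0^p F^{-1}(t)\,dt
      = \frac{2}{\theta}\int_0^p \theta t\,dt = p^2,
  & B(p) &= \frac{L(p)}{p} = p, \\
B &= 1 - \int_0^1 p\,dp = \frac{1}{2},
  & \hat{B}_{SRS} &= 1 - \frac{1}{n}\sum_{i=1}^{n} p_i,
    \quad p_i \sim U(0,1), \\
E\bigl[\hat{B}_{SRS}\bigr] &= 1 - \frac{1}{2} = \frac{1}{2},
  & \operatorname{Var}\bigl[\hat{B}_{SRS}\bigr] &= \frac{1}{12n}.
\end{align*}
```

In this special case the estimator is unbiased and its variance follows directly from the variance 1/12 of a U(0, 1) variable.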

2.1.3. Absolute Lorenz Index

Absolute Lorenz index is defined as where is the Absolute Lorenz curve defined at ordinates [6] Since is distributed uniformly over , that is, , An unbiased estimator of is sample mean where is the sample size.

Therefore, unbiased estimator of is using (22) Algorithm for computing simple random estimate for Absolute Lorenz index is given as follows.(i) Generate a sequence of random numbers from .(ii) Compute .(iii) Compute .(iv) Compute .

Simple random sampling estimates of the Gini, Bonferroni, and Absolute Lorenz indices using the Monte Carlo approach are given in Table 1.

2.2. Ranked Set Sampling Procedure

In general, the ranked set sampling procedure involves the following steps [6].

(i) Select a simple random sample of units from the population and order it on the attribute of interest via some ranking process, that is,

This judgement ranking can result from a variety of mechanisms, including expert opinion, visual comparisons, or the use of easy-to-obtain auxiliary variables, but it cannot involve actual measurements of the attribute of interest on the sample units.

Once the judgement ranking of the units in the initial random sample has been accomplished, the item judged to be smallest is included as the first item in the ranked set sample and the attribute of interest is formally measured on this unit; the remaining unmeasured units in the first random sample are not considered further.

This measurement is denoted by , the smallest order statistic or the smallest judgement order item. It may or may not actually have the smallest attribute measurement among the sampled units. Note that are not considered further in the selection of our ranked set sample. The sole purpose of units is to help in selecting an item for measurement that represents the smaller attribute values in the population.

(ii) Following selection of , a second independent random sample of size is selected from the population and judgement ranking without formal measurement on the attribute of interest is made and the second smallest item of units in second random sample is selected and is included in the ranked set sample for measurement of attribute of interest; that is, The second measured observation is denoted by .

(iii) Similarly, from a third independent random sample, we select the unit judgement ranked to be the third smallest, , for measurement and inclusion in the ranked set sample, that is,

This process is continued until we have selected the unit judgement ranked to be largest of units in the th random sample, denoted by ; that is, for inclusion in our ranked set sample.

This entire process is referred to as a cycle and the number of observations in each random sample, in our example, is called set size.

Thus, to complete a single ranked set cycle, we need to judge rank independent random samples of size involving a total of sample units in order to obtain measured observations. These observations represent a balanced ranked set sample with set size .

In practice, the sample size is kept small to ease the visual ranking; the RSS literature suggests or 6. Therefore, in order to obtain a ranked set sample with a desired total number of measured observations , we repeat the entire cycle process independent times, yielding the data
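The cycle structure described above can be sketched as follows. This is a hedged illustration of balanced ranked set sampling, not the authors' procedure verbatim: the names `ranked_set_sample` and `draw_unit_with_proxy`, the parameters `set_size` and `cycles`, and the noisy proxy used for judgement ranking are all hypothetical. Ranking uses only the proxy, and only the selected unit is formally measured.

```python
import random


def ranked_set_sample(draw_unit, rank_key, measure, set_size, cycles):
    """Balanced RSS: in each cycle, the i-th of `set_size` fresh random sets
    contributes the unit judged i-th smallest; only that unit is measured."""
    measurements = []
    for _ in range(cycles):
        for rank in range(set_size):
            units = [draw_unit() for _ in range(set_size)]  # fresh random set
            units.sort(key=rank_key)                        # judgement ranking only
            measurements.append(measure(units[rank]))       # measure a single unit
    return measurements


def draw_unit_with_proxy(rng):
    """Hypothetical unit: an exponential attribute plus a cheap, noisy proxy of it."""
    value = rng.expovariate(1.0)
    proxy = value + rng.gauss(0.0, 0.1)
    return proxy, value


if __name__ == "__main__":
    rng = random.Random(2013)
    sample = ranked_set_sample(
        draw_unit=lambda: draw_unit_with_proxy(rng),
        rank_key=lambda unit: unit[0],   # rank on the proxy, not the attribute
        measure=lambda unit: unit[1],    # formal measurement of the attribute
        set_size=3,
        cycles=4,
    )
    print(len(sample))  # 12 measured units drawn from 3 * 3 * 4 = 36 sampled units
```

Separating `rank_key` from `measure` mirrors the requirement that ranking must not involve formal measurement of the attribute of interest.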

Algorithm for obtaining the RSS estimate of Gini index via method of Monte Carlo simulation is given as follows.(a) Generate an RSS of size from .(b) Assign .(c) Compute .(d) Compute the ranked set sample estimate of Gini index as
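Steps (a) to (d) can be sketched as below, under stated assumptions: the ranked set sample is drawn from U(0, 1) with perfect ranking, the Gini index is taken as G = 1 - 2 * integral of L(p) over (0, 1), and the exponential Lorenz curve serves as the example. The names `rss_uniform` and `gini_rss` are hypothetical, and the paper's exact estimator formula is not reproduced here.

```python
import math
import random


def lorenz_exponential(p):
    """Lorenz curve of an exponential distribution: L(p) = p + (1 - p) ln(1 - p)."""
    if p >= 1.0:
        return 1.0
    return p + (1.0 - p) * math.log(1.0 - p)


def rss_uniform(set_size, cycles, rng):
    """Ranked set sample from U(0, 1) with perfect ranking: the i-th set of
    each cycle contributes its i-th smallest value."""
    sample = []
    for _ in range(cycles):
        for rank in range(set_size):
            ordered = sorted(rng.random() for _ in range(set_size))
            sample.append(ordered[rank])
    return sample


def gini_rss(lorenz, set_size, cycles, rng):
    """RSS Monte Carlo estimate, assuming G = 1 - 2 * integral_0^1 L(p) dp."""
    ordinates = rss_uniform(set_size, cycles, rng)            # steps (a) and (b)
    curve_values = [lorenz(p) for p in ordinates]             # step (c)
    return 1.0 - 2.0 * sum(curve_values) / len(curve_values)  # step (d)


if __name__ == "__main__":
    rng = random.Random(2013)
    # True Gini index of the exponential distribution is 0.5.
    print(gini_rss(lorenz_exponential, set_size=5, cycles=2000, rng=rng))
```

Because a balanced ranked set sample averages over all ranks, the mean of L over the sample remains unbiased for the integral of L, so the estimator targets the same quantity as the simple random version.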

Algorithm for obtaining the RSS estimate of Bonferroni index via method of Monte Carlo simulation is given as follows.(a) Generate an RSS of size from .(b) Assign .(c) Compute .(d) Compute the ranked set sample estimate of Bonferroni index as follows.

Algorithm for obtaining the RSS estimate of Absolute Lorenz index via method of Monte Carlo simulation is given as follows.(a) Generate an RSS of size from .(b) Assign .(c) Compute .(d) Compute the ranked set sample estimate of Absolute Lorenz index as

Ranked set sampling estimates of the Gini, Bonferroni, and Absolute Lorenz indices using the Monte Carlo approach are given in Table 2.

2.3. Systematic Sampling Procedure

The systematic sampling procedure is a particular case of the stratified sampling procedure in which the strata are of equal size.

2.3.1. Gini Index

Gini index is defined as Its unbiased estimator in case of stratified sampling is given as where where

Let region be divided into disjoint subregions ; that is, and where is null set.

Define Define where

Random variable is distributed according to on .

can be estimated by

Therefore, integral is estimated by where is distributed according to on where In particular, if and , we get the systematic sampling estimate of Gini index as Algorithm for computing systematic random sampling estimate is given as follows.(i) Divide the range of the cumulative distribution into intervals each of width .(ii) Generate from .(iii) Compute .(iv) Compute .(v) Compute
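One reading of steps (i) to (v) is sketched below. It assumes m equal-width strata on (0, 1) with a single uniform draw in each, the form G = 1 - 2 * integral of L(p) over (0, 1), and the exponential Lorenz curve as the example. The name `gini_systematic` and the mapping p_i = (i - 1 + U_i) / m are assumptions for illustration, not the paper's stated formula.

```python
import math
import random


def lorenz_exponential(p):
    """Lorenz curve of an exponential distribution: L(p) = p + (1 - p) ln(1 - p)."""
    if p >= 1.0:
        return 1.0
    return p + (1.0 - p) * math.log(1.0 - p)


def gini_systematic(lorenz, m, rng):
    """Equal-strata estimate, assuming G = 1 - 2 * integral_0^1 L(p) dp."""
    total = 0.0
    for i in range(m):            # step (i): stratum i covers [i/m, (i+1)/m)
        u = rng.random()          # step (ii): U ~ U(0, 1)
        p = (i + u) / m           # step (iii): ordinate placed inside stratum i
        total += lorenz(p)        # step (iv): evaluate the Lorenz curve
    return 1.0 - 2.0 * total / m  # step (v): Gini estimate


if __name__ == "__main__":
    rng = random.Random(2013)
    # True Gini index of the exponential distribution is 0.5.
    print(gini_systematic(lorenz_exponential, 100, rng))
```

Forcing one ordinate into every interval removes the between-strata part of the variance, which is the mechanism behind the efficiency gain over simple random sampling.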

2.3.2. Bonferroni Index

Similarly, Bonferroni index is defined as Its stratified estimate is given as where where

Let region be divided into disjoint subregions , that is, and , where is null set.

Define Define where

Random variable is distributed according to on .

can be estimated by Therefore, integral is estimated by where is distributed according to on where In particular, if and , we get the systematic sampling estimate of Bonferroni index as Algorithm for computing systematic random sampling estimate is given as follows.(i) Divide the range of the cumulative distribution into intervals each of width .(ii) Generate from .(iii) Compute .(iv) Compute .(v) Compute
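The stratified argument can be summarized compactly for a generic curve C(p) on (0, 1). The notation below (θ, C, D_i, P_i, n_i, p_{ij}, U_i) is assumed for illustration and need not match the paper's symbols. The last line specializes to the Bonferroni index under the assumed standard form B = 1 - ∫ L(p)/p dp, with equal-probability strata and one draw per stratum.

```latex
\begin{align*}
\theta &= \int_0^1 C(p)\,dp = \sum_{i=1}^{m} \int_{D_i} C(p)\,dp,
  \qquad D_i = \Bigl[\tfrac{i-1}{m}, \tfrac{i}{m}\Bigr), \quad P_i = \tfrac{1}{m}, \\
\hat{\theta}_{ST} &= \sum_{i=1}^{m} \frac{P_i}{n_i} \sum_{j=1}^{n_i} C(p_{ij}),
  \qquad p_{ij} \sim U(D_i), \\
\hat{\theta}_{SYS} &= \frac{1}{m} \sum_{i=1}^{m} C(p_i),
  \qquad p_i = \frac{i - 1 + U_i}{m}, \quad U_i \sim U(0,1) \quad (n_i = 1), \\
\hat{B}_{SYS} &= 1 - \frac{1}{m} \sum_{i=1}^{m} \frac{L(p_i)}{p_i}.
\end{align*}
```

Each stratum term is unbiased for its own sub-integral, so the sum is unbiased for θ regardless of how the draws are allocated across strata.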

2.3.3. Absolute Lorenz Index

Absolute Lorenz index is defined as Its stratified estimate is given as where where

Let region be divided into disjoint subregions ; that is, and where is null set.

Define Define where

Random variable is distributed according to on .

can be estimated by Therefore, integral is estimated by where is distributed according to on where In particular, if and , we get the systematic sampling estimate of Absolute Lorenz index as Algorithm for computing systematic random sampling estimate is given as follows.(i) Divide the range of the cumulative distribution into intervals each of width .(ii) Generate from .(iii) Compute .(iv) Compute .(v) Compute

Systematic random sampling estimates of the Gini, Bonferroni, and Absolute Lorenz indices using the Monte Carlo approach are given in Table 3.

3. Simulation Study

In this section, we illustrate the performance of the estimators of the three inequality indices under the sampling procedures described above. For each of the sample sizes 6, 8, 10, 12, 15, 20, 25, 30, 60, 80, 100, 120, 300, 400, 500, 600, 800, 1000, 1200, 1500, 2000, 2500, and 3000, 10,000 random samples were generated to compare the ranked set and systematic sampling estimates with the simple random sampling estimates of the Gini index from exponential, Pareto , and Power ; Bonferroni index from exponential, Pareto , Power , and Absolute Lorenz index from exponential (3), Pareto , and rectangular distribution. The simulated ranked set samples with set size and number of repetitions, , and 500 were chosen. The mean squared errors (MSE) and efficiencies were computed for the three procedures. Simulation results are shown in Tables 7–15.

The expressions for the Gini, Bonferroni, and Absolute Lorenz indices are given in Tables 4, 5, and 6. The relation between the Pareto, Power, and Rectangular distributions can be noted from the following remark.

Remark. (1) If random variable , that is, , then ; that is, . (2) The Power function distribution , with parameter and , results in rectangular distribution defined on the interval .

For simple random sampling MSEs are defined as where , , and denote the simple random sample estimates of Gini, Bonferroni, and Absolute Lorenz indices (Table 1).

Similarly, for ranked set and systematic sampling the MSEs can be defined where , , and denote the ranked set sample estimates and , , and denote the systematic random sample (SYS) estimates of Gini, Bonferroni and Absolute Lorenz indices (Tables 2 and 3).

Gini index , Bonferroni index , and Absolute Lorenz index for the given distributions are given in Tables 4–6.

Efficiencies are defined as implies that Ranked set sample estimates are better than Simple random sample estimates.

Similarly, implies that Systematic random sample (SYS) estimates are better than Simple random sample estimates. implies that Systematic random sample (SYS) estimates are better than Ranked set sample estimates.
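The comparison can be sketched as a small simulation for the Gini index of the exponential distribution (true value 0.5). Several assumptions are made here and are not taken from the paper: efficiency is computed as a ratio of MSEs with the reference procedure in the numerator, so a value above 1 favours the procedure in the denominator; ranking in RSS is perfect; and the set size, number of cycles, and replication count are arbitrary small values chosen for speed.

```python
import math
import random

TRUE_GINI_EXPONENTIAL = 0.5


def lorenz_exponential(p):
    """Lorenz curve of an exponential distribution: L(p) = p + (1 - p) ln(1 - p)."""
    if p >= 1.0:
        return 1.0
    return p + (1.0 - p) * math.log(1.0 - p)


def gini_from_ordinates(ordinates):
    """Gini estimate 1 - 2 * mean(L(p_i)) from a set of ordinates in (0, 1)."""
    return 1.0 - 2.0 * sum(lorenz_exponential(p) for p in ordinates) / len(ordinates)


def srs_ordinates(n, rng):
    return [rng.random() for _ in range(n)]


def rss_ordinates(set_size, cycles, rng):
    """Perfect-ranking RSS from U(0, 1): i-th smallest of the i-th set, per cycle."""
    return [sorted(rng.random() for _ in range(set_size))[rank]
            for _ in range(cycles) for rank in range(set_size)]


def sys_ordinates(m, rng):
    """One uniform draw inside each of m equal-width strata of (0, 1)."""
    return [(i + rng.random()) / m for i in range(m)]


def mse(estimates, true_value):
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def compare(set_size=3, cycles=4, replications=2000, seed=0):
    """Return assumed MSE-ratio efficiencies for a common sample size n."""
    rng = random.Random(seed)
    n = set_size * cycles
    srs = [gini_from_ordinates(srs_ordinates(n, rng)) for _ in range(replications)]
    rss = [gini_from_ordinates(rss_ordinates(set_size, cycles, rng))
           for _ in range(replications)]
    sys_ = [gini_from_ordinates(sys_ordinates(n, rng)) for _ in range(replications)]
    mse_srs = mse(srs, TRUE_GINI_EXPONENTIAL)
    mse_rss = mse(rss, TRUE_GINI_EXPONENTIAL)
    mse_sys = mse(sys_, TRUE_GINI_EXPONENTIAL)
    return {
        "rss_vs_srs": mse_srs / mse_rss,
        "sys_vs_srs": mse_srs / mse_sys,
        "sys_vs_rss": mse_rss / mse_sys,
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(name, round(value, 2))
```

All three procedures are compared at the same number of measured observations, so the ratios isolate the effect of the sampling design rather than the sample size.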

The results are presented in Tables 7–15. The simple random estimates are much less efficient than the corresponding ranked set or systematic sample estimates, and the systematic sample estimates are more efficient than the ranked set sample estimates. The same trend holds across all the distributions, irrespective of the inequality index used. We also observe that relative efficiency increases as increases irrespective of the sampling technique used.

4. Conclusion

Ranked set sample estimates improve the efficiency of all the inequality indices relative to simple random sampling, but they are less efficient than systematic sample estimates. For example, the relative efficiencies for the Gini index in Table 7 show that the efficiency of RSS is almost two to three times that of SRS (ref. Col. ), while the efficiency of SYS is almost four to eight times that of RSS (ref. Col. ). The efficiency of SYS relative to SRS is noticeably high, as is evident from Col. . The relative efficiency trend given in the last three columns holds for small as well as large sample sizes. One may conclude that SRS estimates, though commonly used and simple to compute, are less efficient than their nonsimple random counterparts. Nonsimple random estimates of inequality indices should be used when the underlying data or the practical situation so demands.

References

[1] M. O. Lorenz, “Methods of measuring the concentration of wealth,” Publications of the American Statistical Association, vol. 9, pp. 209–219, 1905.
[2] C. Gini, “Measurement of inequality of incomes,” The Economic Journal, vol. 31, pp. 124–126, 1921.
[3] A. F. Shorrocks, “Ranking income distributions,” Economica, vol. 50, pp. 3–17, 1983.
[4] P. Moyes, “A new concept of Lorenz domination,” Economics Letters, vol. 23, no. 2, pp. 203–207, 1987.
[5] G. M. Giorgi, “Concentration index, Bonferroni,” in Encyclopedia of Statistical Sciences, vol. 2, John Wiley & Sons, New York, NY, USA, 1998.
[6] S. Arora, K. Jain, and S. Pundir, “On cumulated mean income curve,” Model Assisted Statistics and Applications, vol. 1, no. 2, pp. 107–114, 2006.
[7] B. Zheng, “Testing Lorenz curves with non-simple random samples,” Econometrica, vol. 70, no. 3, pp. 1235–1243, 2002.
[8] G. A. McIntyre, “A method for unbiased selective sampling, using ranked sets,” Australian Journal of Agricultural Research, vol. 2, pp. 385–390, 1952.
[9] M. M. Al-Talib and A. D. Al-Nasser, “Estimation of Gini-index from continuous distribution based on ranked set sampling,” Electronic Journal of Applied Statistical Analysis, vol. 1, pp. 33–41, 2008.
[10] R. Rubinstein, Simulation and the Monte Carlo Method, John Wiley & Sons, New York, NY, USA, 1981.

Copyright

Copyright © 2013 Pooja Bansal et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
