ISRN Applied Mathematics
Volume 2012, Article ID 269385, 14 pages
http://dx.doi.org/10.5402/2012/269385
Research Article

## A Gibbs Sampler for the Multidimensional Item Response Model

Yanyan Sheng and Todd C. Headrick

Section on Statistics and Measurement, Department of EPSE, Southern Illinois University Carbondale, Wham 223, Mail Code 4618, Carbondale, IL 62901-4618, USA

Received 2 March 2012; Accepted 26 March 2012

Academic Editors: S. He and X. Xue

Copyright © 2012 Yanyan Sheng and Todd C. Headrick. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Current procedures for estimating compensatory multidimensional item response theory (MIRT) models using Markov chain Monte Carlo (MCMC) techniques are inadequate in that they do not directly model the interrelationship between latent traits. This limits the implementation of the model in various applications and further prevents the development of other types of IRT models that offer advantages not realized in existing models. In view of this, an MCMC algorithm is proposed for MIRT models so that the actual latent structure is directly modeled. It is demonstrated that the algorithm performs well in modeling parameters as well as intertrait correlations and that the MIRT model can be used to explore the relative importance of a latent trait in answering each test item.

#### 1. Introduction

Item response theory (IRT) is a popular approach used for describing probabilistic relationships between correct responses on a set of test items and continuous latent traits (see [1–4]). IRT models have also been used in other areas of applied mathematics and statistical research. Some examples include US Supreme Court decision-making processes [5], alcohol disorder analysis [6–9], nicotine dependency [10–12], multiple-recapture population estimation [13], psychiatric epidemiology [14–16], longitudinal data analysis [17, 18], latent regression models [19, 20], and missing data analysis [21].

IRT has the advantage of allowing the influence that the items and persons have on the responses to be modeled by distinct sets of parameters. As a result, a primary concern associated with IRT research has been parameter estimation, which offers the basis for the theoretical advantages of IRT. Specifically, of concern are the statistical complexities that can often arise when item and person parameters are simultaneously estimated (see [1, 22–24]). More recent attention has focused on fully Bayesian estimation where Markov chain Monte Carlo (MCMC) simulation techniques are used (e.g., [25, 26]). Over the past decade, MCMC has been implemented in the context of IRT models where one latent trait is assumed (e.g., [3, 27–29]) as well as in models where multiple traits are considered (e.g., [30–36]); for a thorough review of the historical and current developments of MCMC in terms of IRT, see [37].

The compensatory multidimensional IRT (MIRT; [38]) model assumes that each item measures multiple latent traits. It differs from some other dichotomous models insofar as it has an additional source of model indeterminacy that creates difficulties when using MCMC. Some techniques have been developed to approach this problem by imposing a special structure that constrains the item slope parameters [30, 36, 39]. However, these approaches do not directly model the actual interrelation between the distinct latent traits and, thus, are limited in certain applications. In view of the above, the present aim is to derive an efficient MCMC algorithm via Gibbs sampling [40] that (a) obviates the additional source of model indeterminacy associated with the MIRT model and (b) directly models the underlying latent trait structure. The MIRT model considered herein is presented in normal ogive form as more complicated MCMC procedures would have to be adopted for the logistic form (e.g., [3, 28, 35, 36]). Further, given that parametric probability functions of correct responses are usually modeled by a normal ogive or a logistic function and noting that the logistic and normal ogive forms of the IRT models are essentially indistinguishable in terms of model fit or parameter estimates (given proper scaling, see [41]), MCMC procedures for logistic models are not considered.

The remainder of this paper is organized as follows. In Section 2, the two-parameter normal ogive (2PNO) MIRT model is outlined. In Section 3, the Gibbs sampler is derived, and the prior specifications for the model parameters are described. Section 4 gives examples of implementing the Gibbs sampling algorithm in the context of simulated and real data to demonstrate the proposed methodology.

#### 2. Preliminaries

The MIRT model is introduced by considering a test that consists of $K$ dichotomous items, each measuring $m$ latent traits. Let $\mathbf{Y} = [y_{ij}]$ denote an $N \times K$ matrix of responses to the $K$ items, where $y_{ij} = 1$ ($y_{ij} = 0$) if the $i$th person answers the $j$th item correctly (incorrectly), for $i = 1, \dots, N$ and $j = 1, \dots, K$.

Definition 2.1. The probability of the $i$th person obtaining a correct response on the $j$th item is defined for the 2PNO MIRT model as
$$P(y_{ij} = 1 \mid \boldsymbol{\theta}_i, \boldsymbol{\alpha}_j, \gamma_j) = \Phi\!\left(\boldsymbol{\alpha}_j'\boldsymbol{\theta}_i - \gamma_j\right) = \Phi\!\left(\sum_{v=1}^{m} \alpha_{vj}\theta_{vi} - \gamma_j\right). \tag{2.1}$$
The vector $\boldsymbol{\theta}_i = (\theta_{1i}, \dots, \theta_{mi})'$ denotes the latent trait parameters associated with the $i$th person, and the vector $\boldsymbol{\alpha}_j = (\alpha_{1j}, \dots, \alpha_{mj})'$ denotes nonnegative slope parameters, where latent dimensions with larger values of $\alpha_{vj}$ have more influence on determining a success on the $j$th item. The intercept parameter $\gamma_j$ denotes the location in the latent space where the $j$th item is maximally informative, and $\Phi$ denotes the unit normal cdf. The model in (2.1) is also referred to as a compensatory MIRT model [38] because a low level of $\theta$ in one dimension can be compensated for by a high level of $\theta$ in another dimension.
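To make (2.1) concrete, the response probability can be computed as follows. This is a minimal Python sketch under the normal ogive form stated above; the function name and the numeric values are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.stats import norm

def mirt_prob(theta, alpha, gamma):
    """2PNO compensatory MIRT probability: Phi(alpha' theta - gamma)."""
    return norm.cdf(np.dot(alpha, theta) - gamma)

# A two-dimensional example: a deficit on one trait can be offset by the other.
theta = np.array([1.0, -0.5])   # person's latent traits
alpha = np.array([0.8, 0.3])    # nonnegative item slopes
gamma = 0.2                     # item intercept
p = mirt_prob(theta, alpha, gamma)   # Phi(0.8 - 0.15 - 0.2) = Phi(0.45)
```

Raising $\theta$ on either dimension increases the probability, which is the compensatory property noted above.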

Remark 2.2. If the vector of slope parameters in (2.1) is such that all but one element of $\boldsymbol{\alpha}_j$ is zero, then the MIRT model reduces to the 2PNO multi-unidimensional model
$$P(y_{ij} = 1 \mid \theta_{vi}, \alpha_{vj}, \gamma_j) = \Phi\!\left(\alpha_{vj}\theta_{vi} - \gamma_j\right), \tag{2.2}$$
where the test involves multiple latent traits $\theta_1, \dots, \theta_m$ and where each item measures one of these latent variables (see [31, 32]). The difference between (2.1) and (2.2) is analogous to the distinction made between factor analysis and factor analysis with a rotation to achieve a simple structure [42]. As such, (2.2) can be viewed as a special case of (2.1) where each item measures only one of the several latent traits. Further, the two models differ in that (2.1) is exploratory whereas (2.2) is confirmatory in nature.
The unidimensional IRT model, which has a systematic component of the form $\alpha_j\theta_i - \gamma_j$, has a well-known identification problem in terms of location and scale invariance (e.g., [43]). Common practices for resolving this problem are to impose constraints on the item parameters or to select specific values for the location and scale parameters of the prior normal distribution of $\theta_i$, for example, $\theta_i \sim N(0, 1)$ (see, e.g., [3, 27–29, 43]). Further, Bafumi et al. [5] proposed using a parameter transformation to approach the identification problem in the context of unidimensional IRT models. More specifically, the model parameters are transformed using a normalization procedure after estimation is completed. Bafumi et al. [5] noted that this transformation procedure obviates the problem of elusive convergence that results from highly correlated samples.
In terms of the multi-unidimensional IRT model in (2.2), Lee [31] extended Tsutakawa’s [43] approach by adopting a constrained covariance matrix for the latent traits and modeling the constrained covariance matrix indirectly. Lee’s [31] method not only solves the model indeterminacy problem, but also appropriately estimates the interrelationship between multiple latent traits (see also [32, 44]).
The more general MIRT model, as defined in (2.1), involves a new source of model indeterminacy called rotational invariance and is statistically more complicated than the unidimensional or multi-unidimensional models. As such, a Gibbs sampler is subsequently derived based on the ideas suggested in [5, 31] to address the general MIRT model identification problems and to model the latent structure directly.
It is noted that, in an effort to develop computer software, Sheng [45] has shown that the approaches based on [5, 31] are useful for the 2PNO additive MIRT model, whose systematic component for modeling the probability of a correct response takes the form $\alpha_{0j}\theta_{0i} + \alpha_{vj}\theta_{vi} - \gamma_j$. The model assumes that each item measures two latent traits: $\theta_{0i}$, a common latent trait that all items measure, and $\theta_{vi}$, a latent trait that is specific to items in the $v$th subtest. The difference between the model in [45] and the general MIRT model presented herein is comparable to that between a bifactor model (see [46]) and a general factor analysis model. The two models assume different latent structures, and hence the approaches for resolving their model indeterminacies are not the same.

#### 3. The Gibbs Sampler

The derivation of the Gibbs sampler associated with the MIRT model defined in (2.1) begins by considering a multivariate normal distribution for $\boldsymbol{\theta}_i$ and a linear transformation on it, which will be based on the following definitions.

Definition 3.1. Let $\boldsymbol{\theta}_i \sim N_m(\mathbf{0}, \mathbf{R})$, where $\mathbf{R}$ is a constrained covariance matrix, or a correlation matrix, with 1s on the diagonal and with correlations $\rho_{vv'}$ (between $\theta_v$ and $\theta_{v'}$) on the off-diagonal.

Definition 3.2. Let $\boldsymbol{\theta}_i^* \sim N_m(\mathbf{0}, \boldsymbol{\Sigma})$, where $\boldsymbol{\theta}_i^* = \mathbf{V}^{1/2}\boldsymbol{\theta}_i$ and $\boldsymbol{\Sigma} = \mathbf{V}^{1/2}\mathbf{R}\mathbf{V}^{1/2}$, where $\mathbf{V} = \operatorname{diag}(\sigma_1^2, \dots, \sigma_m^2)$ is an $m \times m$ diagonal matrix. Note that this variance-correlation decomposition of $\boldsymbol{\Sigma}$ [47] makes the interpretation easier [48] and is essential for modeling the correlation matrix indirectly while solving the model indeterminacy in the context of the MIRT model.
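As a small numerical illustration of this variance-correlation decomposition, the sketch below splits a covariance matrix into standard deviations and a correlation matrix; the helper name and the example matrix are assumptions for illustration, not the paper's code.

```python
import numpy as np

def decompose_cov(sigma):
    """Split a covariance matrix into standard deviations and a correlation
    matrix so that sigma = diag(s) @ R @ diag(s), per Barnard et al. [47]."""
    s = np.sqrt(np.diag(sigma))
    R = sigma / np.outer(s, s)
    return s, R

sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
s, R = decompose_cov(sigma)   # R has 1s on the diagonal
```

Reversing the decomposition (`np.outer(s, s) * R`) recovers the original covariance matrix, which is what allows the sampler to work with the unconstrained $\boldsymbol{\Sigma}$ and transform back to $\mathbf{R}$.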
From Definitions 3.1 and 3.2, it can be shown that $\boldsymbol{\theta}_i^* \sim N_m(\mathbf{0}, \boldsymbol{\Sigma})$, where $\boldsymbol{\theta}_i^*$ can be transformed from $\boldsymbol{\theta}_i$ using $\theta_{vi}^* = \sigma_v\theta_{vi}$ for $v = 1, \dots, m$. To obviate the identification problem associated with the unconstrained parameters $\boldsymbol{\theta}_i^*$, let $\boldsymbol{\theta}_i^*$ be related to the item parameters ($\boldsymbol{\alpha}_j^*$ and $\gamma_j^*$) so that the likelihoods are preserved, that is, $\boldsymbol{\alpha}_j^{*\prime}\boldsymbol{\theta}_i^* - \gamma_j^* = \boldsymbol{\alpha}_j'\boldsymbol{\theta}_i - \gamma_j$, where the item parameters ($\boldsymbol{\alpha}_j^*$ and $\gamma_j^*$) will have to be constrained accordingly. This leads us to the following proposition.

Proposition 3.3. If are constrained such that , then

Proof. It follows from (3.1) that , and thus, substituting into (3.3) gives Using (3.5), we can subsequently derive Setting in (3.6) and subsequently multiplying the left-hand side yields which leads to for . Hence, given the constraint that , each nonzero element in is .

To implement Gibbs sampling for the MIRT model in (2.1), a latent variable $Z_{ij}$ is introduced such that $Z_{ij} \sim N(\boldsymbol{\alpha}_j'\boldsymbol{\theta}_i - \gamma_j, 1)$, truncated by $y_{ij}$ to be positive if $y_{ij} = 1$ and negative if $y_{ij} = 0$ (see, e.g., [27, 49]). Further, from Definition 3.1, we assume that $\boldsymbol{\theta}_i \sim N_m(\mathbf{0}, \mathbf{R})$ to ensure unique scaling for $\boldsymbol{\theta}_i$, which precludes the identification problem associated with such models (see [45]). Furthermore, a prior distribution is assumed for the unconstrained covariance matrix $\boldsymbol{\Sigma}$ such that its full conditional distribution is inverse Wishart. Thus, with these assumed prior distributions, the joint posterior distribution of $(\mathbf{Z}, \boldsymbol{\theta}^*, \boldsymbol{\xi}, \boldsymbol{\Sigma})$ is proportional to the product of the likelihood function $f(\mathbf{y} \mid \boldsymbol{\theta}, \boldsymbol{\xi})$ and the prior densities, with the model probability function as defined in (2.1).

The proposed Gibbs sampler involves the following five steps:

1. sampling of the augmented parameters $Z_{ij}$ from the truncated normal distribution in (3.10), namely, $Z_{ij} \sim N(\boldsymbol{\alpha}_j^{*\prime}\boldsymbol{\theta}_i^* - \gamma_j^*, 1)$ truncated to be positive if $y_{ij} = 1$ and negative if $y_{ij} = 0$;
2. sampling of the latent variable (person) parameters $\boldsymbol{\theta}_i^*$ from the multivariate normal full conditional distribution in (3.11);
3. sampling of the item parameters $\boldsymbol{\xi}_j = (\boldsymbol{\alpha}_j^{*\prime}, \gamma_j^*)'$ from (3.12), assuming uniform priors for the slope and intercept parameters, or from (3.13), assuming conjugate normal priors;
4. sampling of the unconstrained covariance matrix $\boldsymbol{\Sigma}$ from the inverse Wishart full conditional distribution in (3.14), whose scale matrix is derived from (3.4);
5. a transformation from $\boldsymbol{\Sigma}$ to the correlation matrix $\mathbf{R}$.
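As an illustration of sampling the person parameters, under the working model $Z_i = A\boldsymbol{\theta}_i - \boldsymbol{\gamma} + \boldsymbol{\varepsilon}$ with $\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \mathbf{I})$ and prior $\boldsymbol{\theta}_i \sim N(\mathbf{0}, \mathbf{R})$, the full conditional of $\boldsymbol{\theta}_i$ is multivariate normal. The sketch below assumes this standard form; the names (`A` for the $K \times m$ slope matrix, and the example values) are illustrative, not the paper's notation.

```python
import numpy as np

def sample_theta(Z_i, A, gamma, R_inv, rng):
    """One draw from the multivariate normal full conditional of theta_i,
    assuming Z_ij = alpha_j' theta_i - gamma_j + N(0, 1) and theta_i ~ N(0, R)."""
    prec = A.T @ A + R_inv                 # posterior precision
    cov = np.linalg.inv(prec)
    mean = cov @ A.T @ (Z_i + gamma)       # regress (Z_i + gamma) on A
    return rng.multivariate_normal(mean, cov)

rng = np.random.default_rng(1)
A = np.array([[0.8, 0.2], [0.3, 0.9], [0.5, 0.5]])   # K = 3 items, m = 2 traits
gamma = np.array([0.1, -0.2, 0.0])
R_inv = np.linalg.inv(np.array([[1.0, 0.4], [0.4, 1.0]]))
Z_i = np.array([0.6, -0.3, 0.2])
draw = sample_theta(Z_i, A, gamma, R_inv, rng)
```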

In view of the additional model indeterminacy that results from the additive nature of $\boldsymbol{\alpha}_j^{*\prime}\boldsymbol{\theta}_i^* - \gamma_j^*$, the parameters are further normalized after each Markov transition step is completed [5, 45]. More specifically, $\theta_{vi}^*$, $\alpha_{vj}^*$, and $\gamma_j^*$ are transformed ($v = 1, \dots, m$) to the normalized parameters $\hat{\theta}_{vi} = (\theta_{vi}^* - \bar{\theta}_v^*)/s_v$, $\hat{\alpha}_{vj} = \alpha_{vj}^* s_v$, and $\hat{\gamma}_j = \gamma_j^* - \sum_{v=1}^{m}\alpha_{vj}^*\bar{\theta}_v^*$, where $\bar{\theta}_v^*$ and $s_v$ represent the mean and standard deviation of $\theta_v^*$. This rescaling preserves the likelihood because $\hat{\boldsymbol{\alpha}}_j'\hat{\boldsymbol{\theta}}_i - \hat{\gamma}_j = \boldsymbol{\alpha}_j^{*\prime}\boldsymbol{\theta}_i^* - \gamma_j^*$, while allowing the computation to proceed more efficiently [50]. Further, the transformation also assists in speeding up the convergence of the Markov chains by reducing the correlation in the posterior probability densities [51].
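The normalization step can be written compactly. This sketch assumes a per-dimension mean-0/SD-1 rescaling with the compensating item-parameter transforms; all names are illustrative.

```python
import numpy as np

def normalize(theta, alpha, gamma):
    """Rescale each latent dimension to mean 0 and SD 1, adjusting the item
    parameters so that alpha' theta - gamma (hence the likelihood) is unchanged."""
    mu = theta.mean(axis=0)        # theta is N x m
    sd = theta.std(axis=0)
    theta_n = (theta - mu) / sd
    alpha_n = alpha * sd           # alpha is K x m
    gamma_n = gamma - alpha @ mu   # gamma has length K
    return theta_n, alpha_n, gamma_n

rng = np.random.default_rng(2)
theta = rng.normal(1.0, 2.0, size=(100, 2))    # deliberately off-scale draws
alpha = rng.uniform(0.2, 1.0, size=(5, 2))
gamma = rng.normal(size=5)
theta_n, alpha_n, gamma_n = normalize(theta, alpha, gamma)
```

The systematic component `theta @ alpha.T - gamma` is identical before and after the transformation, which is what "preserves the likelihood" means here.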

Thus, with initial starting values $\boldsymbol{\theta}^{*(0)}$, $\boldsymbol{\xi}^{(0)}$, and $\boldsymbol{\Sigma}^{(0)}$, the observations $\mathbf{Z}^{(\ell)}$, $\boldsymbol{\theta}^{*(\ell)}$, $\boldsymbol{\xi}^{(\ell)}$, $\boldsymbol{\Sigma}^{(\ell)}$, and $\mathbf{R}^{(\ell)}$ can be drawn or transformed iteratively from (3.10), (3.11), (3.12), (3.14), and (3.2) (or (3.13) in lieu of (3.12)), respectively. This iterative process continues for a sufficient number of samples after the posterior distributions reach stationarity (a phase commonly referred to as burn-in). The posterior means of all the samples collected after the burn-in stage are considered to be estimates of the model parameters ($\boldsymbol{\theta}$, $\boldsymbol{\xi}$) and the intertrait correlation hyperparameter.

#### 4. Numerical Examples

To demonstrate the methodology presented above, the proposed Gibbs sampler was implemented using both simulated and real data. In terms of simulated data, tests that measure two latent traits were considered. In particular, three dichotomous data matrices were simulated from the 2PNO MIRT model, with the population correlation between the two latent traits set to 0.2, 0.4, and 0.6, respectively. The item parameters were generated randomly from uniform distributions. Gibbs sampling was subsequently implemented to recover the model parameters assuming informative normal priors or uniform priors for the slope and intercept parameters. Convergence was evaluated using the Gelman and Rubin [52] statistic for each item parameter. While the usual practice is to use multiple Markov chains from different starting points, a single chain can also be divided into subchains so that convergence is assessed by comparing the between- and within-subchain variances (see [53]). In view of the fact that a single chain is more economical in the number of iterations needed, the latter approach was adopted. The posterior estimates of the item parameters, the intertrait correlation hyperparameter, and the associated Gelman-Rubin statistics were obtained and are listed in Tables 1, 2, and 3.

Table 1: Posterior estimates and Gelman-Rubin statistics for $\alpha_{1j}$, $\alpha_{2j}$, $\gamma_j$, and $\rho$ when the specified intertrait correlation is 0.2 (chain length = 10,000, burn-in = 5,000).
Table 2: Posterior estimates and Gelman-Rubin statistics for $\alpha_{1j}$, $\alpha_{2j}$, $\gamma_j$, and $\rho$ when the specified intertrait correlation is 0.4 (chain length = 10,000, burn-in = 5,000).
Table 3: Posterior estimates and Gelman-Rubin statistics for $\alpha_{1j}$, $\alpha_{2j}$, $\gamma_j$, and $\rho$ when the specified intertrait correlation is 0.6 (chain length = 10,000, burn-in = 5,000).

The Gelman-Rubin statistic provides a numerical measure for assessing the convergence of each item parameter. Given that the obtained values are all close to 1, it can be concluded that, in the implementation of the Gibbs sampler, the Markov chains reached stationarity with a run length of 10,000 iterations and a burn-in period of 5,000 iterations. The posterior estimates of the item parameters as well as the intertrait correlation hyperparameter are fairly close to the specified parameters, suggesting that the algorithm performs well in recovering these parameters when the latent dimensions have a low to medium correlation. Further, the two sets of posterior estimates, resulting from the different prior distributions, differ only slightly from each other, signifying that the posterior estimates are not sensitive to the choice of noninformative or informative priors for the slope and intercept parameters.
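The single-chain variant of the Gelman-Rubin diagnostic described above (splitting one chain into subchains and comparing between- and within-subchain variances) can be sketched as follows; the helper is an illustrative implementation, not the paper's code.

```python
import numpy as np

def split_rhat(chain, n_splits=2):
    """Gelman-Rubin statistic from a single chain split into subchains [52, 53]."""
    subs = np.array_split(np.asarray(chain, dtype=float), n_splits)
    n = min(len(s) for s in subs)
    subs = np.array([s[:n] for s in subs])      # n_splits x n
    W = subs.var(axis=1, ddof=1).mean()         # within-subchain variance
    B = n * subs.mean(axis=1).var(ddof=1)       # between-subchain variance
    var_hat = (n - 1) / n * W + B / n           # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(3)
stationary = rng.normal(size=10_000)        # a converged chain: value near 1
trending = np.linspace(0.0, 1.0, 1_000)     # a drifting chain inflates the ratio
```

Values near 1 indicate that the subchains are indistinguishable, which is the criterion applied to each item parameter above.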

In the context of real data, a subset of the College Basic Academic Subjects Examination (CBASE; [54]) English data was used to demonstrate the methodology. Specifically, these data contain independent binary responses of 1,200 college students to 41 multiple-choice items. The English test is further organized into two subtests, namely, reading and writing, with 25 items in the reading subtest and 16 in the writing subtest. It is noted that the test was designed to conform to the multi-unidimensional model, as each item measures one of the two latent traits. However, one may use the more general MIRT model to explore the latent structure and, in particular, to assess individual test items (i.e., to determine whether the trait mainly involved in answering each item agrees with the one it is supposed to measure). This can be accomplished by examining the estimated slope parameters, as a larger $\hat{\alpha}_{vj}$ corresponds to a latent dimension that is more important in determining a person's success on the item. Hence, assuming uniform priors for the item parameters, Gibbs sampling was implemented to fit the MIRT model to the CBASE data with a run length of 10,000 iterations and a burn-in period of 5,000, which was sufficient for the chains to converge. An examination of the posterior estimates of the slopes shown in Table 4 suggests that all 16 items in the writing subtest rely on the second dimension (writing) more than on the first dimension (reading). However, some items in the reading subtest, such as items 17, 19–26, 28, and 30, require further attention and modification, as they do not appear to measure mainly reading as the rest of the items do.

Table 4: Posterior estimates and Gelman-Rubin statistics for $\alpha_{1j}$, $\alpha_{2j}$, and $\gamma_j$ for the CBASE data, assuming uniform priors (chain length = 10,000, burn-in = 5,000).
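The item-screening idea described above (comparing each item's estimated slopes across dimensions to find its dominant trait) can be sketched as follows; the estimates and the helper are illustrative, not the actual CBASE results.

```python
import numpy as np

def flag_items(alpha_hat, intended):
    """Flag items whose largest estimated slope falls on a dimension other than
    the one the item was designed to measure (alpha_hat is K x m)."""
    dominant = alpha_hat.argmax(axis=1)
    return np.flatnonzero(dominant != np.asarray(intended))

alpha_hat = np.array([[0.9, 0.2],    # intended dimension 0; dimension 0 dominates
                      [0.3, 0.8],    # intended dimension 0; dimension 1 dominates
                      [0.1, 0.7]])   # intended dimension 1; dimension 1 dominates
misfit = flag_items(alpha_hat, [0, 0, 1])   # -> item index 1 is flagged
```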

In summary, the proposed MCMC algorithm provides computationally efficient and accurate estimation in the context of both simulated and real data examples. Not only does the algorithm appropriately model parameters, but also the algorithm efficiently models the intertrait correlations for the compensatory MIRT model, which provides an exploratory approach for examining the latent structure of a test and detecting items that do not measure the trait they are designed to measure.

#### 5. Concluding Remarks

The MCMC algorithm presented in this paper offers solutions for directly modeling the underlying structure of IRT models with multiple continuous latent traits. The algorithm works well when the actual intertrait correlation is low to moderate (less than 0.8), as a high correlation tends to result in high collinearity, which makes it difficult to distinguish among multiple latent traits and estimate them. With model parameters being accurately estimated, the compensatory MIRT model can be used to explore the relative importance of a latent trait in answering each test item. This is particularly useful when the underlying structure is not known, or when it is desirable to confirm the structure by examining the performance of individual items.

#### References

1. R. D. Bock and M. Aitkin, “Marginal maximum likelihood estimation of item parameters: application of an EM algorithm,” Psychometrika, vol. 46, no. 4, pp. 443–459, 1981.
2. R. J. Mislevy, “Estimation of latent group effects,” Journal of the American Statistical Association, vol. 80, no. 392, pp. 993–997, 1985.
3. R. J. Patz and B. W. Junker, “A straightforward approach to Markov chain Monte Carlo methods for item response models,” Journal of Educational and Behavioral Statistics, vol. 24, no. 2, pp. 146–178, 1999.
4. R. K. Tsutakawa and H. Y. Lin, “Bayesian estimation of item response curves,” Psychometrika, vol. 51, no. 2, pp. 251–267, 1986.
5. J. Bafumi, A. Gelman, D. K. Park, and N. Kaplan, “Practical issues in implementing and understanding Bayesian ideal point estimation,” Political Analysis, vol. 13, no. 2, pp. 171–187, 2005.
6. C. S. Martin, T. Chung, L. Kirisci, and J. W. Langenbucher, “Item response theory analysis of diagnostic criteria for alcohol and cannabis use disorders in adolescents: implications for DSM-V,” Journal of Abnormal Psychology, vol. 115, no. 4, pp. 807–814, 2006.
7. U. Feske, L. Kirisci, R. E. Tarter, and P. A. Pilkonis, “An application of item response theory to the DSM-III-R criteria for borderline personality disorder,” Journal of Personality Disorders, vol. 21, no. 4, pp. 418–433, 2007.
8. C. L. Beseler, L. A. Taylor, and R. F. Leeman, “An item-response theory analysis of DSM-IV Alcohol-Use disorder criteria and “binge” drinking in undergraduates,” Journal of Studies on Alcohol and Drugs, vol. 71, no. 3, pp. 418–423, 2010.
9. D. A. Gilder, I. R. Gizer, and C. L. Ehlers, “Item response theory analysis of binge drinking and its relationship to lifetime alcohol use disorder symptom severity in an American Indian community sample,” Alcoholism, vol. 35, no. 5, pp. 984–995, 2011.
10. A. T. Panter and B. B. Reeve, “Assessing tobacco beliefs among youth using item response theory models,” Drug and Alcohol Dependence, vol. 68, pp. S21–S39, 2002.
11. D. Courvoisier and J. F. Etter, “Using item response theory to study the convergent and discriminant validity of three questionnaires measuring cigarette dependence,” Psychology of Addictive Behaviors, vol. 22, no. 3, pp. 391–401, 2008.
12. J. S. Rose and L. C. Dierker, “An item response theory analysis of nicotine dependence symptoms in recent onset adolescent smokers,” Drug and Alcohol Dependence, vol. 110, no. 1-2, pp. 70–79, 2010.
13. S. E. Fienberg, M. S. Johnson, and B. W. Junker, “Classical multilevel and Bayesian approaches to population size estimation using multiple lists,” Journal of the Royal Statistical Society. Series A, vol. 162, no. 3, pp. 383–405, 1999.
14. M. Reiser, “An application of the item-response model to psychiatric epidemiology,” Sociological Methods and Research, vol. 18, pp. 66–103, 1989.
15. M. Orlando, C. D. Sherbourne, and D. Thissen, “Summed-score linking using item response theory: application to depression measurement,” Psychological Assessment, vol. 12, no. 3, pp. 354–359, 2000.
16. A. Tsutsumi, N. Iwata, N. Watanabe et al., “Application of item response theory to achieve cross-cultural comparability of occupational stress measurement,” International Journal of Methods in Psychiatric Research, vol. 18, no. 1, pp. 58–67, 2009.
17. D. F. Andrade and H. R. Tavares, “Item response theory for longitudinal data: population parameter estimation,” Journal of Multivariate Analysis, vol. 95, no. 1, pp. 1–22, 2005.
18. J. M. te Marvelde, C. A. W. Glas, G. Van Landeghem, and J. Van Damme, “Application of multidimensional item response theory models to longitudinal data,” Educational and Psychological Measurement, vol. 66, no. 1, pp. 5–34, 2006.
19. M. von Davier and S. Sinharay, “An importance sampling EM algorithm for latent regression models,” Journal of Educational and Behavioral Statistics, vol. 32, no. 3, pp. 233–251, 2007.
20. M. von Davier and S. Sinharay, “Stochastic approximation methods for latent regression item response models,” Journal of Educational and Behavioral Statistics, vol. 35, no. 2, pp. 174–193, 2010.
21. R. Holman and C. A. W. Glas, “Modelling non-ignorable missing-data mechanisms with item response theory models,” The British Journal of Mathematical and Statistical Psychology, vol. 58, no. 1, pp. 1–17, 2005.
22. A. Birnbaum, “Statistical theory for logistic mental test models with a prior distribution of ability,” Journal of Mathematical Psychology, vol. 6, no. 2, pp. 258–276, 1969.
23. F. B. Baker and S.-H. Kim, Item Response Theory: Parameter Estimation Techniques, vol. 176, Marcel Dekker, New York, NY, USA, 2nd edition, 2004.
24. I. W. Molenaar, “Estimation of item parameters,” in Rasch Models: Foundations, Recent Developments, and Applications, G. H. Fischer and I. W. Molenaar, Eds., pp. 39–51, Springer, New York, NY, USA, 1995.
25. A. F. M. Smith and G. O. Roberts, “Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods,” Journal of the Royal Statistical Society. Series B, vol. 55, no. 1, pp. 3–23, 1993.
26. L. Tierney, “Markov chains for exploring posterior distributions,” The Annals of Statistics, vol. 22, no. 4, pp. 1701–1762, 1994.
27. J. H. Albert, “Bayesian estimation of normal ogive item response curves using Gibbs sampling,” Journal of Educational Statistics, vol. 17, pp. 251–269, 1992.
28. R. J. Patz and B. W. Junker, “Applications and extensions of MCMC in IRT: multiple item types, missing data, and rated responses,” Journal of Educational and Behavioral Statistics, vol. 24, no. 4, pp. 342–366, 1999.
29. S. K. Sahu, “Bayesian estimation and model choice in item response models,” Journal of Statistical Computation and Simulation, vol. 72, no. 3, pp. 217–232, 2002.
30. A. A. Béguin and C. A. W. Glas, “MCMC estimation and some model-fit analysis of multidimensional IRT models,” Psychometrika, vol. 66, no. 4, pp. 541–561, 2001.
31. H. Lee, Markov Chain Monte Carlo Methods for Estimating Multidimensional Ability in Item Response Analysis, ProQuest LLC, Ann Arbor, Mich, USA, 1995.
32. Y. Sheng and C. K. Wikle, “Comparing multiunidimensional and unidimensional item response theory models,” Educational and Psychological Measurement, vol. 67, no. 6, pp. 899–919, 2007.
33. Y. Sheng and C. K. Wikle, “Bayesian multidimensional IRT models with a hierarchical structure,” Educational and Psychological Measurement, vol. 68, no. 3, pp. 413–430, 2008.
34. Y. Sheng and C. K. Wikle, “Bayesian IRT models in incorporating general and specific abilities,” Behaviormetrika, vol. 36, no. 1, pp. 27–48, 2009.
35. L. Yao, BMIRT: Bayesian Multivariate Item Response Theory [Computer Software], CTB/McGraw-Hill, Monterey, Calif, USA, 2003.
36. L. Yao and K. A. Boughton, “A multidimensional item response modeling approach for improving subscale proficiency estimation and classification,” Applied Psychological Measurement, vol. 31, no. 2, pp. 83–105, 2007.
37. R. Levy, “The rise of Markov chain Monte Carlo estimation for psychometric modeling,” Journal of Probability and Statistics, vol. 2009, Article ID 537139, 18 pages, 2009.
38. M. D. Reckase, Ed., Multidimensional Item Response Theory, Springer, New York, NY, USA, 2009.
39. D. M. Bolt and V. F. Lall, “Estimation of compensatory and noncompensatory multidimensional item response models using Markov chain Monte Carlo,” Applied Psychological Measurement, vol. 27, no. 6, pp. 395–414, 2003.
40. S. Geman and D. Geman, “Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 6, no. 6, pp. 721–741, 1984.
41. S. E. Embretson and S. P. Reise, Item Response Theory for Psychologists, Lawrence Erlbaum Associates, Hillside, NJ, USA, 2000.
42. R. P. McDonald, Factor Analysis and Related Methods, Lawrence Erlbaum, Hillside, NJ, USA, 1985.
43. R. K. Tsutakawa, “Estimation of two-parameter logistic item response curves,” Journal of Educational Statistics, vol. 9, pp. 263–276, 1985.
44. Y. Sheng, “A MATLAB package for Markov chain Monte Carlo with a multi-unidimensional IRT model,” Journal of Statistical Software, vol. 28, no. 10, pp. 1–20, 2008.
45. Y. Sheng, “Bayesian estimation of MIRT models with general and specific latent traits in MATLAB,” Journal of Statistical Software, vol. 34, no. 3, pp. 1–27, 2010.
46. K. J. Holzinger and F. Swineford, “The Bi-factor method,” Psychometrika, vol. 2, no. 1, pp. 41–54, 1937.
47. J. Barnard, R. McCulloch, and X.-L. Meng, “Modeling covariance matrices in terms of standard deviations and correlations, with application to shrinkage,” Statistica Sinica, vol. 10, no. 4, pp. 1281–1311, 2000.
48. M. Pourahmadi, “Cholesky decompositions and estimation of a covariance matrix: orthogonality of variance-correlation parameters,” Biometrika, vol. 94, no. 4, pp. 1006–1013, 2007.
49. M. A. Tanner and W. H. Wong, “The calculation of posterior distributions by data augmentation,” Journal of the American Statistical Association, vol. 82, no. 398, pp. 528–550, 1987.
50. A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin, Bayesian Data Analysis, Chapman & Hall/CRC, Boca Raton, Fla, USA, 2nd edition, 2004.
51. W. R. Gilks, S. Richardson, and D. J. Spiegelhalter, Markov Chain Monte Carlo in Practice, Chapman and Hall, London, UK, 1996.
52. A. Gelman and D. B. Rubin, “Inference from iterative simulation using multiple sequences,” Statistical Science, vol. 7, pp. 457–511, 1992.
53. J. P. Fox, “Multilevel IRT modeling in practice with the package mlirt,” Journal of Statistical Software, vol. 20, no. 5, pp. 1–16, 2007.
54. S. Osterlind, “A national review of scholastic achievement in general education: how are we doing and why should we care?” vol. 25 of ASHE-ERIC Higher Education Report, George Washington University Graduate School of Education and Human Development, Washington, DC, USA, 1997.