Journal of Probability and Statistics
Volume 2010, Article ID 480364, 10 pages
http://dx.doi.org/10.1155/2010/480364
Research Article

Peirce's and Cohen's Measures of Rater Reliability

Beau Abar and Eric Loken

Department of Human Development and Family Studies, The Pennsylvania State University, University Park, PA 16802, USA

Received 11 November 2009; Revised 27 April 2010; Accepted 16 May 2010

Academic Editor: Junbin B. Gao

Copyright © 2010 Beau Abar and Eric Loken. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. C. S. Peirce, “The numerical measure of the success of predictions,” Science, vol. 4, no. 93, pp. 453–454, 1884.
  2. M. J. Rovine and D. R. Anderson, “Peirce and Bowditch: an American contribution to correlation and regression,” The American Statistician, vol. 58, no. 3, pp. 232–236, 2004.
  3. E. Loken and M. J. Rovine, “Peirce's 19th century mixture model approach to rater agreement,” The American Statistician, vol. 60, no. 2, pp. 158–161, 2006.
  4. D. B. Stephenson, “Use of the ‘odds ratio’ for diagnosing forecast skill,” Weather and Forecasting, vol. 15, no. 2, pp. 221–232, 2000.
  5. F. W. Wilson, “Measuring the decision support value of probabilistic forecasts,” in Proceedings of the 12th Conference on Aviation Range and Aerospace Meteorology and the 18th Conference on Probability and Statistics in the Atmospheric Sciences, Atlanta, Ga, USA, 2006.
  6. A. Martín Andrés and J. D. Luna del Castillo, “Tests and intervals in multiple choice tests: a modification of the simplest classical model,” British Journal of Mathematical and Statistical Psychology, vol. 42, pp. 251–263, 1989.
  7. M. Aickin, “Maximum likelihood estimation of agreement in the constant predictive probability model, and its relation to Cohen's kappa,” Biometrics, vol. 46, no. 2, pp. 293–302, 1990.
  8. I. Guggenmoos-Holzmann and R. Vonk, “Kappa-like indices of observer agreement viewed from a latent class perspective,” Statistics in Medicine, vol. 17, no. 8, pp. 797–812, 1998.
  9. C. Schuster and D. A. Smith, “Indexing systematic rater agreement with a latent-class model,” Psychological Methods, vol. 7, no. 3, pp. 384–395, 2002.
  10. C. Schuster and D. A. Smith, “Estimating with a latent class model the reliability of nominal judgments upon which two raters agree,” Educational and Psychological Measurement, vol. 66, no. 5, pp. 739–747, 2006.
  11. A. Martín Andrés and P. Femia Marzo, “Delta: a new measure of agreement between two raters,” The British Journal of Mathematical and Statistical Psychology, vol. 57, no. 1, pp. 1–19, 2004.
  12. A. Martín Andrés and P. Femia-Marzo, “Chance-corrected measures of reliability and validity in 2×2 tables,” Communications in Statistics, vol. 37, no. 3–5, pp. 760–772, 2008.
  13. J. Cohen, “A coefficient of agreement for nominal scales,” Educational and Psychological Measurement, vol. 20, pp. 37–46, 1960.
  14. R. L. Brennan and D. J. Prediger, “Coefficient Kappa: some uses, misuses and alternatives,” Educational and Psychological Measurement, vol. 41, pp. 687–699, 1981.
  15. R. Zwick, “Another look at interrater agreement,” Psychological Bulletin, vol. 103, no. 3, pp. 374–378, 1988.
  16. S. C. Weller and N. C. Mann, “Assessing rater performance without a ‘gold standard’ using consensus theory,” Medical Decision Making, vol. 17, no. 1, pp. 71–79, 1997.
  17. J. C. Nelson and M. S. Pepe, “Statistical description of interrater variability in ordinal ratings,” Statistical Methods in Medical Research, vol. 9, no. 5, pp. 475–496, 2000.
  18. A. Agresti and J. B. Lang, “Quasi-symmetric latent class models, with application to rater agreement,” Biometrics, vol. 49, no. 1, pp. 131–139, 1993.
  19. C. M. Dayton, “Applications and computational strategies for the two-point mixture index of fit,” The British Journal of Mathematical and Statistical Psychology, vol. 56, no. 1, pp. 1–13, 2003.
  20. T. Rudas, C. C. Clogg, and B. G. Lindsay, “A new index of fit based on mixture methods for the analysis of contingency tables,” Journal of the Royal Statistical Society, Series B, vol. 56, no. 4, pp. 623–639, 1994.
  21. J. S. Uebersax and W. M. Grove, “A latent trait finite mixture model for the analysis of rating agreement,” Biometrics, vol. 49, no. 3, pp. 823–835, 1993.
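
To make the two title measures concrete, the following minimal Python sketch computes the standard textbook definitions from a 2×2 table of counts: Cohen's chance-corrected kappa [13] and Peirce's 1884 measure, the hit rate minus the false-alarm rate [1, 4]. The cell labels a, b, c, d, the function names, and the example counts are illustrative choices of ours, not notation or data from the article.

# Illustrative sketch (assumed notation): 2x2 cell counts
#   a = both raters say "yes"  (or: event forecast and observed)
#   b = rater 1 "yes", rater 2 "no"  (forecast, not observed)
#   c = rater 1 "no", rater 2 "yes"  (not forecast, observed)
#   d = both raters say "no"  (not forecast, not observed)

def cohens_kappa(a, b, c, d):
    """Cohen's (1960) chance-corrected agreement for a 2x2 table."""
    n = a + b + c + d
    p_obs = (a + d) / n  # observed proportion of agreement
    # agreement expected by chance if the raters' marginals were independent
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

def peirce_measure(a, b, c, d):
    """Peirce's (1884) measure: hit rate minus false-alarm rate,
    algebraically equal to (a*d - b*c) / ((a + c) * (b + d))."""
    return a / (a + c) - b / (b + d)

# Example with made-up counts for 100 paired ratings
print(round(cohens_kappa(40, 10, 5, 45), 3))    # 0.7
print(round(peirce_measure(40, 10, 5, 45), 3))  # 0.707

Both indices equal 0 under chance-level agreement and 1 under perfect agreement; they differ in how they correct for the marginal rates, which is why the two functions give different values on the same table.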