ISRN Probability and Statistics
Volume 2012, Article ID 656390, 11 pages
http://dx.doi.org/10.5402/2012/656390
Research Article

On the Equivalence of Multirater Kappas Based on 2-Agreement and 3-Agreement with Binary Scores

Matthijs J. Warrens

Unit of Methodology and Statistics, Institute of Psychology, Leiden University, P.O. Box 9555, 2300 RB Leiden, The Netherlands

Received 7 August 2012; Accepted 25 August 2012

Academic Editors: J. Hu and O. Pons

Copyright © 2012 Matthijs J. Warrens. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

References

  1. R. Artstein and M. Poesio, “Kappa3 = Alpha (or beta),” NLE Technical Note 05-1, University of Essex, 2005.
  2. M. Banerjee, M. Capozzoli, L. McSweeney, and D. Sinha, “Beyond kappa: a review of interrater agreement measures,” The Canadian Journal of Statistics, vol. 27, no. 1, pp. 3–23, 1999.
  3. K. J. Berry and P. W. Mielke, “A generalization of Cohen’s kappa agreement measure to interval measurement and multiple raters,” Educational and Psychological Measurement, vol. 48, pp. 921–933, 1988.
  4. D. Cicchetti, R. Bronen, S. Spencer et al., “Rating scales, scales of measurement, issues of reliability: resolving some critical issues for clinicians and researchers,” The Journal of Nervous and Mental Disease, vol. 194, no. 8, pp. 557–564, 2006.
  5. J. Cohen, “A coefficient of agreement for nominal scales,” Educational and Psychological Measurement, vol. 20, pp. 37–46, 1960.
  6. A. J. Conger, “Integration and generalization of kappas for multiple raters,” Psychological Bulletin, vol. 88, no. 2, pp. 322–328, 1980.
  7. M. Davies and J. L. Fleiss, “Measuring agreement for multinomial data,” Biometrics, vol. 38, pp. 1047–1051, 1982.
  8. A. von Eye and E. Y. Mun, Analyzing Rater Agreement: Manifest Variable Methods, Lawrence Erlbaum Associates, 2006.
  9. J. L. Fleiss, “Measuring nominal scale agreement among many raters,” Psychological Bulletin, vol. 76, no. 5, pp. 378–382, 1971.
  10. J. L. Fleiss, “Measuring agreement between two judges on the presence or absence of a trait,” Biometrics, vol. 31, no. 3, pp. 651–659, 1975.
  11. A. P. J. M. Heuvelmans and P. F. Sanders, “Beoordelaarsovereenstemming [Rater agreement],” in Psychometrie in de Praktijk [Psychometrics in Practice], P. F. Sanders and T. J. H. M. Eggen, Eds., pp. 443–470, Cito Instituut voor Toetsontwikkeling, Arnhem, The Netherlands, 1993.
  12. L. M. Hsu and R. Field, “Interrater agreement measures: comments on kappa_n, Cohen’s kappa, Scott’s π and Aickin’s α,” Understanding Statistics, vol. 2, pp. 205–219, 2003.
  13. L. Hubert, “Kappa revisited,” Psychological Bulletin, vol. 84, no. 2, pp. 289–297, 1977.
  14. H. Janson and U. Olsson, “A measure of agreement for interval or nominal multivariate observations,” Educational and Psychological Measurement, vol. 61, no. 2, pp. 277–289, 2001.
  15. J. R. Landis and G. G. Koch, “An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers,” Biometrics, vol. 33, pp. 363–374, 1977.
  16. J. R. Landis and G. G. Koch, “The measurement of observer agreement for categorical data,” Biometrics, vol. 33, pp. 159–174, 1977.
  17. P. W. Mielke, K. J. Berry, and J. E. Johnston, “The exact variance of weighted kappa with multiple raters,” Psychological Reports, vol. 101, no. 2, pp. 655–660, 2007.
  18. P. W. Mielke, K. J. Berry, and J. E. Johnston, “Resampling probability values for weighted kappa with multiple raters,” Psychological Reports, vol. 102, no. 2, pp. 606–613, 2008.
  19. J. C. Nelson and M. S. Pepe, “Statistical description of interrater variability in ordinal ratings,” Statistical Methods in Medical Research, vol. 9, no. 5, pp. 475–496, 2000.
  20. F. P. O’Malley, S. K. Mohsin, S. Badve et al., “Interobserver reproducibility in the diagnosis of flat epithelial atypia of the breast,” Modern Pathology, vol. 19, no. 2, pp. 172–179, 2006.
  21. R. Popping, Overeenstemmingsmaten voor Nominale Data [Agreement Measures for Nominal Data], Ph.D. thesis, Rijksuniversiteit Groningen, Groningen, The Netherlands, 1983.
  22. R. Popping, “Some views on agreement to be used in content analysis studies,” Quality & Quantity, vol. 44, no. 6, pp. 1067–1078, 2010.
  23. W. A. Scott, “Reliability of content analysis: the case of nominal scale coding,” Public Opinion Quarterly, vol. 19, no. 3, pp. 321–325, 1955.
  24. S. Vanbelle and A. Albert, “Agreement between an isolated rater and a group of raters,” Statistica Neerlandica, vol. 63, no. 1, pp. 82–100, 2009.
  25. S. Vanbelle and A. Albert, “Agreement between two independent groups of raters,” Psychometrika, vol. 74, no. 3, pp. 477–491, 2009.
  26. S. Vanbelle and A. Albert, “A note on the linearly weighted kappa coefficient for ordinal scales,” Statistical Methodology, vol. 6, no. 2, pp. 157–163, 2009.
  27. M. J. Warrens, “k-adic similarity coefficients for binary (presence/absence) data,” Journal of Classification, vol. 26, no. 2, pp. 227–245, 2009.
  28. M. J. Warrens, “Inequalities between kappa and kappa-like statistics for k×k tables,” Psychometrika, vol. 75, no. 1, pp. 176–185, 2010.
  29. M. J. Warrens, “Inequalities between multi-rater kappas,” Advances in Data Analysis and Classification, vol. 4, no. 4, pp. 271–286, 2010.
  30. M. J. Warrens, “Cohen’s linearly weighted kappa is a weighted average of 2×2 kappas,” Psychometrika, vol. 76, no. 3, pp. 471–486, 2011.
  31. M. J. Warrens, “Weighted kappa is higher than Cohen’s kappa for tridiagonal agreement tables,” Statistical Methodology, vol. 8, no. 2, pp. 268–272, 2011.
  32. M. J. Warrens, “Equivalences of weighted kappas for multiple raters,” Statistical Methodology, vol. 9, no. 3, pp. 407–422, 2012.
  33. M. J. Warrens, “A family of multi-rater kappas that can always be increased and decreased by combining categories,” Statistical Methodology, vol. 9, no. 3, pp. 330–340, 2012.
  34. M. J. Warrens, “Conditional inequalities between Cohen’s kappa and weighted kappas,” Statistical Methodology, vol. 10, pp. 14–22, 2013.