Journal of Probability and Statistics
Volume 2010 (2010), Article ID 480364, 10 pages
Research Article

Peirce's i and Cohen's κ for 2 × 2 Measures of Rater Reliability

Department of Human Development and Family Studies, The Pennsylvania State University, University Park, PA 16802, USA

Received 11 November 2009; Revised 27 April 2010; Accepted 16 May 2010

This study examined a historical mixture model approach to the evaluation of ratings made in “gold standard” and two-rater contingency tables. Peirce's i and the derived average, i_ave, were discussed in relation to Cohen's κ, a widely used index of reliability in the behavioral sciences. Sample size, population base rate of occurrence, the true “science of the method”, and guessing rates were manipulated across simulations. In “gold standard” situations, Peirce's i tended to recover the true reliability of ratings as well as or better than κ. In two-rater situations, i_ave recovered the true reliability as well as or better than κ in most situations. The empirical utility and potential theoretical benefits of mixture model methods in estimating reliability are discussed, as are the associations between the i statistics and other modern mixture model approaches.
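As a rough illustration of the two indices compared in the “gold standard” case, the sketch below computes both from the four cells of a single contingency table. It assumes the textbook definitions rather than anything stated in this abstract: κ as (observed agreement − chance agreement) / (1 − chance agreement), and Peirce's i as the hit rate minus the false-alarm rate. The cell labels a, b, c, d and the function names are hypothetical, chosen only for this example.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2 x 2 table (assumed layout, not from the article):
    a = rater yes / standard yes, b = rater yes / standard no,
    c = rater no  / standard yes, d = rater no  / standard no.
    """
    n = a + b + c + d
    p_obs = (a + d) / n  # observed proportion of agreement
    # chance agreement from the row (rater) and column (standard) marginals
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)


def peirce_i(a, b, c, d):
    """Peirce's i, taken here as hit rate minus false-alarm rate,
    with the columns (standard yes / standard no) treated as the truth.
    """
    hit_rate = a / (a + c)          # standard-yes cases the rater called yes
    false_alarm_rate = b / (b + d)  # standard-no cases the rater called yes
    return hit_rate - false_alarm_rate


# Hypothetical counts: one table with a 50% base rate, one with a 20% base rate.
balanced = (40, 10, 10, 40)
unbalanced = (18, 8, 2, 72)

for table in (balanced, unbalanced):
    print(table, round(cohen_kappa(*table), 3), round(peirce_i(*table), 3))
```

With these made-up counts the two indices agree when the base rate is 50% and diverge when it is not. That is only an arithmetic illustration of why base rate is a natural factor to manipulate; it is not a reproduction of the simulations reported in the article.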