Journal of Healthcare Engineering
Volume 2017, Article ID 7653071, 5 pages
Research Article

The Impact of Diagnostic Code Misclassification on Optimizing the Experimental Design of Genetic Association Studies

1Center for Human Genetics, Marshfield Clinic Research Institute, Marshfield, WI, USA
2Computation and Informatics in Biology and Medicine, University of Wisconsin-Madison, Madison, WI, USA

Correspondence should be addressed to Steven J. Schrodi; schrodi.steven@mcrf.mfldclin.edu

Received 17 May 2017; Accepted 13 September 2017; Published 18 October 2017

Academic Editor: Richard Segall

Copyright © 2017 Steven J. Schrodi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Diagnostic codes within electronic health record systems can vary widely in accuracy. It has been noted that the accuracy of disease phenotype classification increases monotonically with the number of instances of a particular diagnostic code recorded for an individual. As a growing number of health system databases become linked with genomic data, it is critically important to understand how this misclassification affects the power of genetic association studies. Here, I investigate that effect with the aim of better informing experimental designs that use health informatics data. The trade-off between (i) reduced misclassification rates from requiring additional instances of a diagnostic code per individual and (ii) the resulting smaller sample size is explored, and general rules are presented to improve experimental designs.
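The trade-off described above can be illustrated with a minimal sketch. It is not the derivation used in this article. It assumes a simple case-control allelic z-test, assumes that misclassified cases carry the control allele frequency, and ignores misclassification among controls. The function name `allelic_power`, the positive predictive value parameter `ppv`, and every number in `designs` are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def allelic_power(n_cases, n_controls, p_control, p_case_true, ppv, alpha=5e-8):
    """Approximate power of a two-sided allelic z-test under case misclassification.

    Assumption (not from the article): a fraction `ppv` of labelled cases are
    true cases; the remainder have the control allele frequency.
    """
    # Allele frequency observed in the labelled case group is a mixture
    # of true cases and misclassified controls.
    p_case_obs = ppv * p_case_true + (1.0 - ppv) * p_control

    # Standard error of the difference in allele frequencies;
    # each individual contributes two alleles.
    se = sqrt(
        p_case_obs * (1.0 - p_case_obs) / (2 * n_cases)
        + p_control * (1.0 - p_control) / (2 * n_controls)
    )

    z = NormalDist()  # standard normal distribution
    crit = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    shift = abs(p_case_obs - p_control) / se  # expected test statistic

    # Probability that the statistic falls in either rejection region.
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


# Hypothetical designs: (minimum code instances required, cases retained, assumed PPV).
# Requiring more instances raises the assumed PPV but shrinks the case sample.
designs = [(1, 5000, 0.60), (2, 3500, 0.80), (3, 2500, 0.90)]

for min_codes, n_cases, ppv in designs:
    power = allelic_power(n_cases, 5000, 0.30, 0.34, ppv)
    print(f"codes >= {min_codes}: n_cases = {n_cases}, PPV = {ppv:.2f}, power = {power:.3f}")
```

Under these assumptions, misclassification shrinks the observed allele frequency difference by the factor `ppv`, whereas the standard error falls only with the square root of the sample size. This is one way to see why a stricter case definition can sometimes pay for the cases it discards. Whether it does in practice depends on the real relationship between code count, accuracy, and retained sample size in the data at hand.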