Epidemiology Research International
Volume 2013, Article ID 875234, 6 pages
http://dx.doi.org/10.1155/2013/875234
Research Article

First Use of Multiple Imputation with the National Tuberculosis Surveillance System

¹Division of Infectious Diseases & HIV Medicine, Drexel University College of Medicine, 245 N 15th Street MS 461, New College Building 6314, Philadelphia, PA 19102, USA
²Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
³Division of Infectious Diseases, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
⁴Division of Tuberculosis Elimination, Centers for Disease Control and Prevention, Atlanta, GA 30333, USA

Received 27 July 2012; Accepted 18 December 2012

Academic Editor: Huibert Burger

Copyright © 2013 Christopher Vinnard et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Aims. The purpose of this study was to compare methods for handling missing data in analysis of the National Tuberculosis Surveillance System of the Centers for Disease Control and Prevention. Because of the high rate of missing human immunodeficiency virus (HIV) infection status in this dataset, we used multiple imputation methods to minimize the bias that may result from less sophisticated methods.

Methods. We compared analysis based on multiple imputation methods with analysis based on deleting subjects with missing covariate data from regression analysis (case exclusion), and determined whether the use of increasing numbers of imputed datasets would lead to changes in the estimated association between isoniazid resistance and death.

Results. Following multiple imputation, the odds ratio for initial isoniazid resistance and death was 2.07 (95% CI 1.30, 3.29); with case exclusion, this odds ratio decreased to 1.53 (95% CI 0.83, 2.83). The use of more than 5 imputed datasets did not substantively change the results.

Conclusions. Our experience with the National Tuberculosis Surveillance System dataset supports the use of multiple imputation methods in epidemiologic analysis, but also demonstrates that close attention should be paid to the potential impact of missing covariates at each step of the analysis.
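
For readers unfamiliar with how estimates from several imputed datasets are combined into a single odds ratio, the following is a minimal sketch of the standard pooling step (Rubin's rules) applied to log odds ratios. The abstract does not state how the authors pooled their estimates or which software they used, so this is an illustration under that assumption rather than a description of their analysis. The function names `pool_rubin` and `odds_ratio_ci` and all input values are hypothetical, and the interval uses a normal approximation instead of the t reference distribution usually paired with Rubin's rules.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: one point estimate per imputed dataset (here, log odds ratios)
    variances: the squared standard error of each estimate
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least 2 imputed datasets")
    pooled = mean(estimates)          # average of the point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


def odds_ratio_ci(log_or, total_var, z=1.96):
    """Back-transform a pooled log odds ratio to the odds-ratio scale.

    Assumption: normal approximation (z = 1.96) for a 95% interval.
    """
    se = math.sqrt(total_var)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


if __name__ == "__main__":
    # Hypothetical log odds ratios and variances from 5 imputed datasets;
    # these are NOT values from the study.
    log_ors = [0.70, 0.75, 0.72, 0.68, 0.74]
    variances = [0.055, 0.057, 0.054, 0.056, 0.055]
    pooled, total = pool_rubin(log_ors, variances)
    or_hat, lower, upper = odds_ratio_ci(pooled, total)
    print(f"pooled OR {or_hat:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

Under this scheme, adding imputed datasets shrinks the (1 + 1/m) inflation factor and stabilizes the between-imputation variance, which is one way to see why results may change little beyond a modest number of imputations.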