ISRN Bioinformatics
Volume 2012 (2012), Article ID 371718, 10 pages
http://dx.doi.org/10.5402/2012/371718
Research Article

CallSim: Evaluation of Base Calls Using Sequencing Simulation

Center for Biotechnology Education, Johns Hopkins University, Baltimore, MD 21218, USA

Received 17 October 2012; Accepted 5 November 2012

Academic Editors: A. Bolshoy, F. Pappalardo, and J. Wang

Copyright © 2012 Jarrett D. Morrow and Brandon W. Higgs. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Accurate base calls generated from sequencing data are required for downstream biological interpretation, particularly in the case of rare variants. CallSim is a software application that provides evidence for the validity of base calls believed to be sequencing errors, and it is applicable to Ion Torrent and 454 data. The algorithm processes a single read using a Monte Carlo approach to sequencing simulation and does not depend on information from any other read in the data set. Three examples, drawn from general read correction and from error-or-variant classification, demonstrate its effectiveness as a robust base corrector for low-volume read processing. Specifically, correction of errors in Ion Torrent reads from a study involving mutations in multidrug-resistant Staphylococcus aureus illustrates an ability to classify an erroneous homopolymer call. In addition, support for a rare variant in 454 data for a mixed viral population demonstrates “base rescue” capabilities. CallSim provides evidence regarding the validity of base calls in sequences produced by 454 or Ion Torrent systems and is intended for hands-on downstream processing analysis. These downstream efforts, although time-consuming, are necessary steps for accurate identification of rare variants.