BioMed Research International
Volume 2017, Article ID 6375059, 8 pages
Research Article

CNNdel: Calling Structural Variations on Low Coverage Data Based on Convolutional Neural Networks

Department of Computer Science and Technology, Beijing University of Chemical Technology, Beijing, China

Correspondence should be addressed to Cheng Ling; lingcheng@buct.edu.cn and Jingyang Gao; gaojy@mail.buct.edu.cn

Received 29 December 2016; Revised 3 April 2017; Accepted 12 April 2017; Published 28 May 2017

Academic Editor: Jialiang Yang

Copyright © 2017 Jing Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Many structural variation (SV) detection methods have been proposed since next-generation sequencing (NGS) became widespread. These SV callers rely on different features tied to particular SV properties, but all of them suffer from poor accuracy on low coverage sequence data. Taking the union of their results achieves fairly high sensitivity, yet accuracy remains low because the combined call set contains many false positives. In this paper, we present CNNdel, an approach for calling deletions from paired-end reads. CNNdel gathers SV candidates reported by multiple tools and then extracts features from the aligned BAM files at the candidate positions. Using labeled candidates, each represented by its features, as a training set, CNNdel trains convolutional neural networks (CNNs) to distinguish true candidates from false ones among unlabeled candidates. Results show that CNNdel works well on NGS reads from 26 low coverage genomes of the 1000 Genomes Project. The paper demonstrates that CNNs can automatically prioritize SV features and effectively reduce false positives.
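The candidate-gathering step described above can be illustrated with a minimal sketch. The abstract does not say how candidates from different tools are combined, so the merge rule here (pool all calls and fuse any that overlap on the same chromosome) is an assumption for illustration only. The function name `union_candidates` and the `(chromosome, start, end)` tuple layout are hypothetical and not taken from CNNdel.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical candidate layout: (chromosome, start, end), end exclusive.
Candidate = Tuple[str, int, int]


def union_candidates(call_sets: Iterable[Iterable[Candidate]]) -> List[Candidate]:
    """Pool deletion candidates from several callers and merge overlapping ones.

    Assumed rule (not from the paper): two candidates on the same chromosome
    are fused whenever their intervals overlap.
    """
    by_chrom: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for calls in call_sets:
        for chrom, start, end in calls:
            by_chrom[chrom].append((start, end))

    merged: List[Candidate] = []
    for chrom in sorted(by_chrom):
        current_start, current_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if current_end is not None and start < current_end:
                # Overlaps the interval being built: extend it.
                current_end = max(current_end, end)
            else:
                # Disjoint: emit the finished interval and start a new one.
                if current_start is not None:
                    merged.append((chrom, current_start, current_end))
                current_start, current_end = start, end
        merged.append((chrom, current_start, current_end))
    return merged


# Made-up calls from two hypothetical tools.
tool_a = [("chr1", 100, 500), ("chr2", 1000, 1500)]
tool_b = [("chr1", 450, 700), ("chr1", 2000, 2300)]
candidates = union_candidates([tool_a, tool_b])
```

A union of this kind keeps every region that any tool reports, which is why sensitivity rises while false positives accumulate; the classifier then has to decide which of the pooled candidates to keep.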