Scientific Programming
Volume 2015, Article ID 279715, 12 pages
Research Article

Parallel Seed-Based Approach to Multiple Protein Structure Similarities Detection

1INRIA/IRISA and University of Rennes 1, Campus Universitaire de Beaulieu, 35042 Rennes Cedex, France
2Los Alamos National Laboratory, Information Sciences, P.O. Box 1663, MS B256, Los Alamos, NM 87545, USA

Received 15 April 2014; Accepted 2 November 2014

Academic Editor: Ewa Deelman

Copyright © 2015 Guillaume Chapuis et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Finding similarities between protein structures is a crucial task in molecular biology. Most existing tools require proteins to be aligned in an order-preserving way and find only a single alignment even when multiple similar regions exist. We propose a new seed-based approach that discovers multiple pairs of similar regions. Its computational complexity is polynomial and it comes with a quality guarantee: the returned alignments have both root mean squared deviations (coordinate-based as well as internal-distance-based) lower than a given threshold, if such alignments exist. We do not require the alignments to be order preserving (i.e., we consider nonsequential alignments), which makes our algorithm suitable for detecting similar domains when comparing multidomain proteins as well as for detecting structural repetitions within a single protein. Because the search space for nonsequential alignments is much larger than for sequential ones, we address the computational burden through extensive use of parallel computing techniques: coarse-grained parallelism that makes use of available CPU cores and fine-grained parallelism that exploits bit-level concurrency as well as vector instructions.
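
To make the internal-distance criterion concrete, the following minimal Python sketch scores a nonsequential alignment by the root mean squared deviation of its internal distances. It is an illustration under stated assumptions, not the implementation described in this article: the function name `internal_distance_rmsd`, the representation of an alignment as a list of index pairs, and the normalization by the number of residue pairs are all hypothetical choices, and the toy coordinates are made up for the example.

```python
import math
from itertools import combinations


def internal_distance_rmsd(coords_a, coords_b, alignment):
    """RMSD of internal distances for an alignment (illustrative sketch).

    coords_a, coords_b: lists of (x, y, z) tuples, one per residue.
    alignment: list of (i, j) pairs matching residue i of A to residue j of B.
    The pairs need not be order preserving, so nonsequential alignments work.
    Assumption: the mean is taken over all unordered pairs of matched residues.
    """
    if len(alignment) < 2:
        return 0.0  # fewer than two matches: no internal distance to compare
    squared_sum = 0.0
    n_pairs = 0
    for (i1, j1), (i2, j2) in combinations(alignment, 2):
        d_a = math.dist(coords_a[i1], coords_a[i2])  # distance inside protein A
        d_b = math.dist(coords_b[j1], coords_b[j2])  # distance inside protein B
        squared_sum += (d_a - d_b) ** 2
        n_pairs += 1
    return math.sqrt(squared_sum / n_pairs)


# Toy data: B is a translated copy of A with its residues listed in another order.
coords_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
coords_b = [(10.0, 4.0, 0.0), (10.0, 0.0, 0.0), (13.0, 0.0, 0.0)]

nonsequential = [(0, 1), (1, 2), (2, 0)]  # matches each residue to its true copy
sequential = [(0, 0), (1, 1), (2, 2)]     # forced order-preserving matching

rmsd_nonsequential = internal_distance_rmsd(coords_a, coords_b, nonsequential)
rmsd_sequential = internal_distance_rmsd(coords_a, coords_b, sequential)
```

In this toy case the nonsequential matching reproduces every internal distance exactly, whereas the order-preserving matching does not, which is the kind of similarity an order-preserving method would miss. This measure needs no superposition; the coordinate-based deviation additionally requires an optimal rigid transformation, which is not shown here.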