Scientific Programming

Open Access

Volume 20 | Article ID 958482 | 11 pages

Analyzing Influenza Virus Sequences using Binary Encoding Approach


Capturing the mutation pattern of each individual influenza virus sequence is often challenging. In this paper, we demonstrate that a binary encoding scheme coupled with a dimension reduction technique can capture the intrinsic mutation pattern of the virus. Our approach looks at the variance between sequences instead of the commonly used p-distance or Hamming distance. We first convert the influenza genetic sequences to binary strings, form a binary sequence alignment matrix, and then apply Principal Component Analysis (PCA) to this matrix. PCA also makes it possible to identify reassortant viruses by means of data projection. Owing to the sparsity of the binary strings, we were able to analyze a large volume of influenza sequence data in a very short time. For protein sequences, our scheme also allows the biophysical properties of each amino acid to be incorporated. Here, we present various encouraging results from analyzing influenza nucleotide, protein, and genome sequences using the proposed approach.
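
The pipeline described in the abstract (binary encoding, alignment matrix, PCA) can be illustrated with a minimal sketch. The abstract does not specify the encoding, so the one used here, one indicator bit per nucleotide with gaps and ambiguous symbols left as all zeros, is an assumption. The function names `encode_alignment` and `pca_project` and the toy sequences are hypothetical, and PCA is computed with a plain SVD from NumPy rather than with whatever implementation the authors used.

```python
import numpy as np

# Assumed encoding: one indicator bit per nucleotide; gaps and ambiguous
# symbols map to all zeros. The paper's actual scheme may differ.
ALPHABET = "ACGT"

def encode_alignment(seqs):
    """Turn equal-length aligned sequences into a binary matrix (n_seqs x 4*L)."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    matrix = np.zeros((len(seqs), length * len(ALPHABET)), dtype=np.uint8)
    for i, seq in enumerate(seqs):
        for j, base in enumerate(seq.upper()):
            k = ALPHABET.find(base)
            if k >= 0:  # unknown symbols leave their four bits at zero
                matrix[i, j * len(ALPHABET) + k] = 1
    return matrix

def pca_project(matrix, n_components=2):
    """Project rows onto the leading principal components via SVD."""
    centered = matrix.astype(float) - matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Toy, made-up sequences for illustration only (not influenza data).
toy = ["ACGTAC", "ACGTAT", "TCGAAC", "TCGAAT"]
binary = encode_alignment(toy)
scores = pca_project(binary)
```

In this toy input the first two sequences differ from the last two at two positions, so they land on opposite sides of the first principal component. Most entries of `binary` are zero, which is the sparsity the abstract refers to; for large data sets a sparse matrix type would be the natural choice, although this sketch uses a dense array for simplicity.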

Copyright © 2012 Hindawi Publishing Corporation. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
