Advances in Bioinformatics http://www.hindawi.com The latest articles from Hindawi Publishing Corporation © 2013, Hindawi Publishing Corporation. All rights reserved. Computational and Statistical Approaches for Modeling of Proteomic and Genomic Networks Thu, 16 May 2013 09:08:24 +0000 http://www.hindawi.com/journals/abi/2013/561968/ Mohamed Nounou, Hazem Nounou, Erchin Serpedin, Aniruddha Datta, and Yufei Huang Copyright © 2013 Mohamed Nounou et al. All rights reserved. Reverse Engineering Sparse Gene Regulatory Networks Using Cubature Kalman Filter and Compressed Sensing Wed, 08 May 2013 11:21:31 +0000 http://www.hindawi.com/journals/abi/2013/205763/ This paper proposes a novel algorithm for inferring gene regulatory networks which makes use of cubature Kalman filter (CKF) and Kalman filter (KF) techniques in conjunction with compressed sensing methods. The gene network is described using a state-space model. A nonlinear model for the evolution of gene expression is considered, while the gene expression data is assumed to follow a linear Gaussian model. The hidden states are estimated using CKF. The system parameters are modeled as a Gauss-Markov process and are estimated using compressed sensing-based KF. These parameters provide insight into the regulatory relations among the genes. The Cramér-Rao lower bound of the parameter estimates is calculated for the system model and used as a benchmark to assess the estimation accuracy. The proposed algorithm is evaluated rigorously using synthetic data in different scenarios, which include different numbers of genes and varying numbers of sample points. In addition, the algorithm is tested on the DREAM4 in silico data sets as well as the in vivo data sets from the IRMA network. The proposed algorithm shows superior performance in terms of accuracy, robustness, and scalability. Amina Noor, Erchin Serpedin, Mohamed Nounou, and Hazem Nounou Copyright © 2013 Amina Noor et al. All rights reserved. 
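As a rough illustration of the parameter-tracking idea in the abstract above (a parameter modeled as a Gauss-Markov process and estimated with a Kalman filter), the following minimal sketch runs a scalar Kalman filter on a random-walk parameter. It is not the paper's algorithm: the cubature Kalman filter for the hidden states and the compressed sensing step are both omitted, and the function name, the noise variances q and r, and the toy data are all assumptions made for illustration.

```python
def kalman_random_walk(observations, regressors, q=1e-4, r=0.1,
                       theta0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk parameter (illustrative only).

    Assumed model (hypothetical, not taken from the paper):
        theta_k = theta_{k-1} + w_k,   w_k ~ N(0, q)
        y_k     = h_k * theta_k + v_k, v_k ~ N(0, r)
    Returns the filtered estimate of theta after each observation.
    """
    theta, p = theta0, p0
    estimates = []
    for y, h in zip(observations, regressors):
        # Prediction: the random walk keeps the mean, inflates the variance.
        p = p + q
        # Update: weigh the innovation (y - h * theta) by the Kalman gain.
        s = h * p * h + r          # innovation variance
        gain = p * h / s
        theta = theta + gain * (y - h * theta)
        p = (1.0 - gain * h) * p
        estimates.append(theta)
    return estimates


if __name__ == "__main__":
    true_theta = 0.8                       # hypothetical regulatory weight
    h = [1.0, 2.0, 0.5, 1.5] * 10          # hypothetical regressor values
    y = [hk * true_theta for hk in h]      # noise-free toy observations
    print(kalman_random_walk(y, h)[-1])
```

On this noise-free toy input the estimate approaches the assumed true value; with real expression data the choice of q and r would need to be justified separately.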
Efficient Serial and Parallel Algorithms for Selection of Unique Oligos in EST Databases Mon, 08 Apr 2013 17:06:36 +0000 http://www.hindawi.com/journals/abi/2013/793130/ Obtaining unique oligos from an EST database is a problem of great importance in bioinformatics, particularly in the discovery of new genes and the mapping of the human genome. Many algorithms have been developed to find unique oligos, many of which are much less time consuming than the traditional brute force approach. An algorithm was presented by Zheng et al. (2004) which finds the solution of the unique oligos search problem efficiently. We implement this algorithm as well as several new algorithms based on some theorems included in this paper. We demonstrate how, with these new algorithms, we can obtain unique oligos much faster than with previous ones. We parallelize these new algorithms to further improve the time of finding unique oligos. All algorithms are run on ESTs obtained from a Barley EST database. Manrique Mata-Montero, Nabil Shalaby, and Bradley Sheppard Copyright © 2013 Manrique Mata-Montero et al. All rights reserved. Correction of Spatial Bias in Oligonucleotide Array Data Wed, 13 Mar 2013 15:09:36 +0000 http://www.hindawi.com/journals/abi/2013/167915/ Background. Oligonucleotide microarrays allow for high-throughput gene expression profiling assays. The technology relies on the fundamental assumption that observed hybridization signal intensities (HSIs) for each intended target, on average, correlate with their target’s true concentration in the sample. However, systematic, nonbiological variation from several sources undermines this hypothesis. Background hybridization signal has been previously identified as one such important source, one manifestation of which appears in the form of spatial autocorrelation. Results. 
We propose an algorithm, pyn, for the elimination of spatial autocorrelation in HSIs, exploiting the duality of desirable mutual information shared by probes in a common probe set and undesirable mutual information shared by spatially proximate probes. We show that this correction procedure reduces spatial autocorrelation in HSIs; increases HSI reproducibility across replicate arrays; increases differentially expressed gene detection power; and performs better than previously published methods. Conclusions. The proposed algorithm increases both precision and accuracy, while requiring virtually no changes to users’ current analysis pipelines: the correction consists merely of a transformation of raw HSIs (e.g., CEL files for Affymetrix arrays). A free, open-source implementation is provided as an R package, compatible with standard Bioconductor tools. The approach may also be tailored to other platform types and other sources of bias. Philippe Serhal and Sébastien Lemieux Copyright © 2013 Philippe Serhal and Sébastien Lemieux. All rights reserved. Gene Regulation, Modulation, and Their Applications in Gene Expression Data Analysis Wed, 13 Mar 2013 11:03:18 +0000 http://www.hindawi.com/journals/abi/2013/360678/ Common microarray and next-generation sequencing data analysis concentrate on tumor subtype classification, marker detection, and transcriptional regulation discovery during biological processes by exploring the correlated gene expression patterns and their shared functions. Genetic regulatory network (GRN) based approaches have been employed in many large studies in order to scrutinize for dysregulation and potential treatment controls. In addition to gene regulation and network construction, the concept of the network modulator that has significant systemic impact has been proposed, and detection algorithms have been developed in past years. 
Here we provide a unified mathematical description of these methods, followed by a brief survey of these modulator identification algorithms. As an early attempt to extend the concept to a new RNA regulation mechanism, competitive endogenous RNA (ceRNA), within a modulator framework, we provide two applications to illustrate the network construction, the modulation effect, and the preliminary findings from these networks. The methods we surveyed and developed are used to dissect the regulated network under different modulators. Beyond these, the concept of “modulation” can be adapted to various biological mechanisms to discover novel gene regulation mechanisms. Mario Flores, Tzu-Hung Hsiao, Yu-Chiao Chiu, Eric Y. Chuang, Yufei Huang, and Yidong Chen Copyright © 2013 Mario Flores et al. All rights reserved. Spectral Analysis on Time-Course Expression Data: Detecting Periodic Genes Using a Real-Valued Iterative Adaptive Approach Thu, 28 Feb 2013 15:42:47 +0000 http://www.hindawi.com/journals/abi/2013/171530/ Time-course expression profiles and methods for spectrum analysis have been applied for detecting transcriptional periodicities, which are valuable patterns to unravel genes associated with cell cycle and circadian rhythm regulation. However, most of the proposed methods suffer, to a certain extent, from restrictions and high false-positive rates. Additionally, in some experiments, arbitrarily irregular sampling times as well as the presence of high noise and small sample sizes make accurate detection a challenging task. A novel scheme for detecting periodicities in time-course expression data is proposed, in which a real-valued iterative adaptive approach (RIAA), originally proposed for signal processing, is applied for periodogram estimation. The inferred spectrum is then analyzed using Fisher’s hypothesis test. With a proper p-value threshold, periodic genes can be detected. 
A periodic signal, two nonperiodic signals, and four sampling strategies were considered in the simulations, including both bursts and drops. In addition, two real yeast datasets were used for validation. The simulations and real data analysis reveal that RIAA can perform competitively with the existing algorithms. The advantage of RIAA is manifested when the expression data are highly irregularly sampled, and when the number of cycles covered by the sampling time points is very small. Kwadwo S. Agyepong, Fang-Han Hsu, Edward R. Dougherty, and Erchin Serpedin Copyright © 2013 Kwadwo S. Agyepong et al. All rights reserved. Identification of Robust Pathway Markers for Cancer through Rank-Based Pathway Activity Inference Wed, 27 Feb 2013 09:47:10 +0000 http://www.hindawi.com/journals/abi/2013/618461/ One important problem in translational genomics is the identification of reliable and reproducible markers that can be used to discriminate between different classes of a complex disease, such as cancer. The typical small sample setting makes the prediction of such markers very challenging, and various approaches have been proposed to address this problem. For example, it has been shown that pathway markers, which aggregate the gene activities in the same pathway, tend to be more robust than gene markers. Furthermore, the use of gene expression ranking has been demonstrated to be robust to batch effects and can lead to more interpretable results. In this paper, we propose an enhanced pathway activity inference method that uses gene ranking to predict the pathway activity in a probabilistic manner. The main focus of this work is on identifying robust pathway markers that can ultimately lead to robust classifiers with reproducible performance across datasets. Simulation results based on multiple breast cancer datasets show that the proposed inference method identifies better pathway markers that can predict breast cancer metastasis with higher accuracy. 
Moreover, the identified pathway markers can lead to better classifiers with more consistent classification performance across independent datasets. Navadon Khunlertgit and Byung-Jun Yoon Copyright © 2013 Navadon Khunlertgit and Byung-Jun Yoon. All rights reserved. An Overview of the Statistical Methods Used for Inferring Gene Regulatory Networks and Protein-Protein Interaction Networks Thu, 21 Feb 2013 15:22:25 +0000 http://www.hindawi.com/journals/abi/2013/953814/ The large influx of data from high-throughput genomic and proteomic technologies has encouraged the researchers to seek approaches for understanding the structure of gene regulatory networks and proteomic networks. This work reviews some of the most important statistical methods used for modeling of gene regulatory networks (GRNs) and protein-protein interaction (PPI) networks. The paper focuses on the recent advances in the statistical graphical modeling techniques, state-space representation models, and information theoretic methods that were proposed for inferring the topology of GRNs. It appears that the problem of inferring the structure of PPI networks is quite different from that of GRNs. Clustering and probabilistic graphical modeling techniques are of prime importance in the statistical inference of PPI networks, and some of the recent approaches using these techniques are also reviewed in this paper. Performance evaluation criteria for the approaches used for modeling GRNs and PPI networks are also discussed. Amina Noor, Erchin Serpedin, Mohamed Nounou, Hazem Nounou, Nady Mohamed, and Lotfi Chouchane Copyright © 2013 Amina Noor et al. All rights reserved. 
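The review abstract above mentions information theoretic methods for inferring GRN topology. A minimal sketch of the basic quantity such methods build on, the mutual information between two discretized expression profiles, is given below. The function name, the plug-in (frequency-count) estimator, and the toy binary profiles are assumptions for illustration; the review itself does not prescribe this estimator, and practical methods add discretization choices and significance thresholds that are not shown.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Plug-in estimate of mutual information (in bits) between two
    equally long sequences of discrete symbols, e.g. binarized
    expression levels of two genes across samples."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with counts to limit rounding error
        mi += p_ab * log2(c * n / (count_x[a] * count_y[b]))
    return mi


if __name__ == "__main__":
    gene_a = [0, 1, 0, 1]          # hypothetical binarized profiles
    gene_b = [0, 0, 1, 1]
    print(mutual_information(gene_a, gene_a))  # identical profiles
    print(mutual_information(gene_a, gene_b))  # statistically independent
```

A high value for a gene pair is usually read as evidence for a candidate edge, although deciding which edges are direct requires further steps that this sketch does not attempt.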
Using Protein Clusters from Whole Proteomes to Construct and Augment a Dendrogram Wed, 20 Feb 2013 08:15:54 +0000 http://www.hindawi.com/journals/abi/2013/191586/ In this paper we present a new ab initio approach for constructing an unrooted dendrogram using protein clusters, an approach that has the potential for estimating relationships among several thousands of species based on their putative proteomes. We employ an open-source software program called pClust that was developed for use in metagenomic studies. Sequence alignment is performed by pClust using the Smith-Waterman algorithm, which is known to give optimal alignment and, hence, greater accuracy than BLAST-based methods. Protein clusters generated by pClust are used to create protein profiles for each species in the dendrogram, these profiles forming a correlation filter library for use with a new taxon. To augment the dendrogram with a new taxon, a protein profile for the taxon is created using BLASTp, and this new taxon is placed into a position within the dendrogram corresponding to the highest correlation with profiles in the correlation filter library. This work was initiated because of our interest in plasmids, and each step is illustrated using proteomes from Gram-negative bacterial plasmids. Proteomes for 527 plasmids were used to generate the dendrogram, and to demonstrate the utility of the insertion algorithm twelve recently sequenced pAKD plasmids were used to augment the dendrogram. Yunyun Zhou, Douglas R. Call, and Shira L. Broschat Copyright © 2013 Yunyun Zhou et al. All rights reserved. Solving the 0/1 Knapsack Problem by a Biomolecular DNA Computer Mon, 18 Feb 2013 07:55:04 +0000 http://www.hindawi.com/journals/abi/2013/341419/ Solving some mathematical problems, such as NP-complete problems, on conventional silicon-based computers is problematic and very time-consuming. DNA computing is an alternative method of computing which uses DNA molecules for computing purposes. 
DNA computers have massive degrees of parallel processing capability. The massive parallel processing characteristic of DNA computers is of particular interest in solving NP-complete and hard combinatorial problems. NP-complete problems such as the knapsack problem and other hard combinatorial problems can be easily solved by DNA computers in a very short period of time compared to conventional silicon-based computers. Sticker-based DNA computing is one of the methods of DNA computing. In this paper, sticker-based DNA computing was used for solving the 0/1 knapsack problem. At first, a biomolecular solution space was constructed by using appropriate DNA memory complexes. Then, by the application of a sticker-based parallel algorithm using biological operations, the knapsack problem was solved in polynomial time. Hassan Taghipour, Mahdi Rezaei, and Heydar Ali Esmaili Copyright © 2013 Hassan Taghipour et al. All rights reserved. MRMPath and MRMutation, Facilitating Discovery of Mass Transitions for Proteotypic Peptides in Biological Pathways Using a Bioinformatics Approach Tue, 29 Jan 2013 14:45:02 +0000 http://www.hindawi.com/journals/abi/2013/527295/ Quantitative proteomics applications in mass spectrometry depend on the knowledge of the mass-to-charge ratio (m/z) values of proteotypic peptides for the proteins under study and their product ions. MRMPath and MRMutation, web-based, platform-independent bioinformatics tools, facilitate the recovery of this information by biologists. MRMPath utilizes publicly available information related to biological pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. All the proteins involved in pathways of interest are recovered and processed in silico to extract information relevant to quantitative mass spectrometry analysis. Peptides may also be subjected to automated BLAST analysis to determine whether they are proteotypic. 
MRMutation catalogs and makes available, following processing, known (mutant) variants of proteins from the current UniProtKB database. All these results, available via the web from well-maintained, public databases, are written to an Excel spreadsheet, which the user can download and save. MRMPath and MRMutation can be freely accessed. As a system that seeks to allow two or more resources to interoperate, MRMPath represents an advance in bioinformatics tool development. As a practical matter, the MRMPath automated approach represents significant time savings to researchers. Chiquito Crasto, Chandrahas Narne, Mikako Kawai, Landon Wilson, and Stephen Barnes Copyright © 2013 Chiquito Crasto et al. All rights reserved. Statistical Analysis of Terminal Extensions of Protein β-Strand Pairs Mon, 28 Jan 2013 14:05:10 +0000 http://www.hindawi.com/journals/abi/2013/909436/ The long-range interactions required for accurate prediction of the tertiary structures of β-sheet-containing proteins are still difficult to simulate. To remedy this problem and to facilitate β-sheet structure predictions, many efforts have been made by computational methods. However, known efforts on β-sheets mainly focus on interresidue contacts or amino acid partners. In this study, to go one step further, we studied β-sheets on the strand level, in which a statistical analysis was made on the terminal extensions of paired β-strands. In most cases, the two paired β-strands have different lengths, and terminal extensions exist. The terminal extensions are the extended parts of the paired strands beyond the common paired part. However, we found that the best pairing required a terminal alignment, and β-strands tend to pair to make bigger common parts. As a result, 96.97% of β-strand pairs have a ratio of 25% of the paired common part to the whole length. Also 94.26% and 95.98% of β-strand pairs have a ratio of 40% of the paired common part to the length of the two β-strands, respectively. 
Interstrand register predictions that search for interacting β-strands among several alternative offsets should comply with this rule, which reduces the computational search space and improves the performance of algorithms. Ning Zhang, Shan Gao, Lei Zhang, Jishou Ruan, and Tao Zhang Copyright © 2013 Ning Zhang et al. All rights reserved. Literature Mining Solutions for Life Science Research Tue, 08 Jan 2013 09:25:35 +0000 http://www.hindawi.com/journals/abi/2013/320436/ Jörg Hakenberg, Goran Nenadic, Dietrich Rebholz-Schuhmann, and Jin-Dong Kim Copyright © 2013 Jörg Hakenberg et al. All rights reserved. In Silico Docking of HNF-1a Receptor Ligands Wed, 19 Dec 2012 14:32:52 +0000 http://www.hindawi.com/journals/abi/2012/705435/ Background. HNF-1a is a transcription factor that regulates glucose metabolism by expression in various tissues. Aim. To dock potential ligands of HNF-1a using docking software in silico. Methods. We performed in silico studies using HNF-1a protein 2GYP·pdb and the following software: ISIS/Draw 2.5SP4, ARGUSLAB 4.0.1, and HEX5.1. Observations. The docking distances (in angstrom units: 1 angstrom unit (Å) = 0.1 nanometer or 10⁻¹⁰ metres) with ligands, from shortest to longest, are as follows: resveratrol (3.8 Å), aspirin (4.5 Å), stearic acid (4.9 Å), retinol (6.0 Å), nitrazepam (6.8 Å), ibuprofen (7.9 Å), azulfidine (9.0 Å), simvastatin (9.0 Å), elaidic acid (10.1 Å), and oleic acid (11.6 Å). Conclusion. HNF-1a domain interacted most closely with resveratrol and aspirin. Gumpeny Ramachandra Sridhar, Padmanabhuni Venkata Nageswara Rao, Dowluru SVGK Kaladhar, Tatavarthi Uma Devi, and Sali Veeresh Kumar Copyright © 2012 Gumpeny Ramachandra Sridhar et al. All rights reserved. Do Peers See More in a Paper Than Its Authors? 
Tue, 27 Nov 2012 11:28:07 +0000 http://www.hindawi.com/journals/abi/2012/750214/ Recent years have shown a gradual shift in the content of biomedical publications that is freely accessible, from titles and abstracts to full text. 
This has enabled new forms of automatic text analysis and has given rise to some interesting questions: How informative is the abstract compared to the full-text? What important information in the full-text is not present in the abstract? What should a good summary contain that is not already in the abstract? Do authors and peers see an article differently? We answer these questions by comparing the information content of the abstract to that in citances—sentences containing citations to that article. We contrast the important points of an article as judged by its authors versus as seen by peers. Focusing on the area of molecular interactions, we perform manual and automatic analysis, and we find that the set of all citances to a target article not only covers most information (entities, functions, experimental methods, and other biological concepts) found in its abstract, but also contains 20% more concepts. We further present a detailed summary of the differences across information types, and we examine the effects other citations and time have on the content of citances. Anna Divoli, Preslav Nakov, and Marti A. Hearst Copyright © 2012 Anna Divoli et al. All rights reserved. Wavelet Packet Entropy for Heart Murmurs Classification Sun, 25 Nov 2012 15:35:27 +0000 http://www.hindawi.com/journals/abi/2012/327269/ Heart murmurs are the first signs of cardiac valve disorders. Several studies have been conducted in recent years to automatically differentiate normal heart sounds, from heart sounds with murmurs using various types of audio features. Entropy was successfully used as a feature to distinguish different heart sounds. In this paper, new entropy was introduced to analyze heart sounds and the feasibility of using this entropy in classification of five types of heart sounds and murmurs was shown. The entropy was previously introduced to analyze mammograms. 
Four common murmurs were considered including aortic regurgitation, mitral regurgitation, aortic stenosis, and mitral stenosis. Wavelet packet transform was employed for heart sound analysis, and the entropy was calculated for deriving feature vectors. Five types of classification were performed to evaluate the discriminatory power of the generated features. The best results were achieved by BayesNet with 96.94% accuracy. The promising results substantiate the effectiveness of the proposed wavelet packet entropy for heart sounds classification. Fatemeh Safara, Shyamala Doraisamy, Azreen Azman, Azrul Jantan, and Sri Ranga Copyright © 2012 Fatemeh Safara et al. All rights reserved. On the Meaning of Affinity Limits in B-Cell Epitope Prediction for Antipeptide Antibody-Mediated Immunity Wed, 14 Nov 2012 15:09:41 +0000 http://www.hindawi.com/journals/abi/2012/346765/ B-cell epitope prediction aims to aid the design of peptide-based immunogens (e.g., vaccines) for eliciting antipeptide antibodies that protect against disease, but such antibodies fail to confer protection and even promote disease if they bind with low affinity. Hence, the Immune Epitope Database (IEDB) was searched to obtain published thermodynamic and kinetic data on binding interactions of antipeptide antibodies. The data suggest that the affinity of the antibodies for their immunizing peptides appears to be limited in a manner consistent with previously proposed kinetic constraints on affinity maturation in vivo and that cross-reaction of the antibodies with proteins tends to occur with lower affinity than the corresponding reaction of the antibodies with their immunizing peptides. 
These observations better inform B-cell epitope prediction to avoid overestimating the affinity for both active and passive immunization; whereas active immunization is subject to limitations of affinity maturation in vivo and of the capacity to accumulate endogenous antibodies, passive immunization may transcend such limitations, possibly with the aid of artificial affinity-selection processes and of protein engineering. Additionally, protein disorder warrants further investigation as a possible supplementary criterion for B-cell epitope prediction, where such disorder obviates thermodynamically unfavorable protein structural adjustments in cross-reactions between antipeptide antibodies and proteins. Salvador Eugenio C. Caoili Copyright © 2012 Salvador Eugenio C. Caoili. All rights reserved. Application of an Integrative Computational Framework in Trancriptomic Data of Atherosclerotic Mice Suggests Numerous Molecular Players Tue, 06 Nov 2012 15:37:37 +0000 http://www.hindawi.com/journals/abi/2012/453513/ Atherosclerosis is a multifactorial disease involving a lot of genes and proteins recruited throughout its manifestation. The present study aims to exploit bioinformatic tools in order to analyze microarray data of atherosclerotic aortic lesions of ApoE knockout mice, a model widely used in atherosclerosis research. In particular, a dynamic analysis was performed among young and aged animals, resulting in a list of 852 significantly altered genes. Pathway analysis indicated alterations in critical cellular processes related to cell communication and signal transduction, immune response, lipid transport, and metabolism. Cluster analysis partitioned the significantly differentiated genes in three major clusters of similar expression profile. Promoter analysis applied to functional related groups of the same cluster revealed shared putative cis-elements potentially contributing to a common regulatory mechanism. 
Finally, by reverse engineering the functional relevance of differentially expressed genes with specific cellular pathways, putative genes acting as hubs, were identified, linking functionally disparate cellular processes in the context of traditional molecular description. Olga Papadodima, Allan Sirsjö, Fragiskos N. Kolisis, and Aristotelis Chatziioannou Copyright © 2012 Olga Papadodima et al. All rights reserved. Intervention in Biological Phenomena via Feedback Linearization Tue, 06 Nov 2012 11:17:02 +0000 http://www.hindawi.com/journals/abi/2012/534810/ The problems of modeling and intervention of biological phenomena have captured the interest of many researchers in the past few decades. The aim of the therapeutic intervention strategies is to move an undesirable state of a diseased network towards a more desirable one. Such an objective can be achieved by the application of drugs to act on some genes/metabolites that experience the undesirable behavior. For the purpose of design and analysis of intervention strategies, mathematical models that can capture the complex dynamics of the biological systems are needed. S-systems, which offer a good compromise between accuracy and mathematical flexibility, are a promising framework for modeling the dynamical behavior of biological phenomena. Due to the complex nonlinear dynamics of the biological phenomena represented by S-systems, nonlinear intervention schemes are needed to cope with the complexity of the nonlinear S-system models. Here, we present an intervention technique based on feedback linearization for biological phenomena modeled by S-systems. This technique is based on perfect knowledge of the S-system model. The proposed intervention technique is applied to the glycolytic-glycogenolytic pathway, and simulation results presented demonstrate the effectiveness of the proposed technique. Mohamed Amine Fnaiech, Hazem Nounou, Mohamed Nounou, and Aniruddha Datta Copyright © 2012 Mohamed Amine Fnaiech et al. 
All rights reserved. MicroRNA Response Elements-Mediated miRNA-miRNA Interactions in Prostate Cancer Sun, 04 Nov 2012 08:29:37 +0000 http://www.hindawi.com/journals/abi/2012/839837/ The cell is a highly organized system of interacting molecules including proteins, mRNAs, and miRNAs. Analyzing the cell from a systems perspective by integrating different types of data helps reveal the complexity of diseases. Although there is emerging evidence that microRNAs have a functional role in cancer, the role of microRNAs in mediating cancer progression and metastasis is not yet fully explored. As the amount of available miRNA and mRNA gene expression data grows, more systematic methods combining gene expression and biological networks become necessary to explore miRNA function. In this work I integrated functional miRNA-target interactions with mRNA and miRNA expression to infer mRNA-mediated miRNA-miRNA interactions. The inferred network represents miRNA modulation through common targets. The network is used to characterize the functional role of the microRNA response element (MRE) in mediating interactions between miRNAs targeting the MRE. Results revealed that miRNA-1 is a key player in regulating prostate cancer progression. Eleven miRNAs were identified as diagnostic and prognostic biomarkers that act as tumor suppressor miRNAs. This work demonstrates the utility of a network analysis as opposed to differential expression to find important miRNAs that regulate prostate cancer. Mohammed Alshalalfa Copyright © 2012 Mohammed Alshalalfa. All rights reserved. Flux Analysis of the Trypanosoma brucei Glycolysis Based on a Multiobjective-Criteria Bioinformatic Approach Sat, 13 Oct 2012 10:16:56 +0000 http://www.hindawi.com/journals/abi/2012/159423/ Trypanosoma brucei is a protozoan parasite of major interest in discovering new genes for drug targets. 
This parasite alternates its life cycle between the mammal host(s) (bloodstream form) and the insect vector (procyclic form), with two divergent glucose metabolisms amenable to in vitro culture. While the metabolic network of the bloodstream forms has been well characterized, the flux distribution between the different branches of the glucose metabolic network in the procyclic form has not been addressed so far. We present a computational analysis (called Metaboflux) that exploits the metabolic topology of the procyclic form, and allows the incorporation of multipurpose experimental data to increase the biological relevance of the model. The alternatives resulting from the structural complexity of networks are formulated as an optimization problem solved by a metaheuristic where experimental data are modeled in a multiobjective function. Our results show that the current metabolic model is in agreement with experimental data and confirms the observed high metabolic flexibility of glucose metabolism. In addition, Metaboflux offers a rational explanation for the high flexibility in the ratio between final products from glucose metabolism, that is, flux redistribution through the malic enzyme steps. Amine Ghozlane, Frédéric Bringaud, Hayssam Soueidan, Isabelle Dutour, Fabien Jourdan, and Patricia Thébault Copyright © 2012 Amine Ghozlane et al. All rights reserved. CMD: A Database to Store the Bonding States of Cysteine Motifs with Secondary Structures Wed, 10 Oct 2012 11:45:56 +0000 http://www.hindawi.com/journals/abi/2012/849830/ Computational approaches to the disulphide bonding state and its connectivity pattern prediction are based on various descriptors. One descriptor is the amino acid sequence motifs flanking the cysteine residue motifs. Despite the existence of disulphide bonding information in many databases and applications, there is no complete reference and motif query available at the moment. 
Cysteine motif database (CMD) is the first online resource that stores all cysteine residues, their flanking motifs with their secondary structure, and propensity values assignment derived from the laboratory data. We extracted more than 3 million cysteine motifs from PDB and UniProt data, annotated with secondary structure assignment, propensity value assignment, and frequency of occurrence and coefficiency of their bonding status. Removal of redundancies generated 15875 unique flanking motifs that are always bonded and 41577 unique patterns that are always nonbonded. Queries are based on the protein ID, FASTA sequence, sequence motif, and secondary structure individually or in batch format using the provided APIs that allow remote users to query our database via third party software and/or high throughput screening/querying. The CMD offers extensive information about the bonded, free cysteine residues, and their motifs that allows in-depth characterization of the sequence motif composition. Hamed Bostan, Naomie Salim, Zeti Azura Hussein, Peter Klappa, and Mohd Shahir Shamsir Copyright © 2012 Hamed Bostan et al. All rights reserved. A High-Throughput Computational Framework for Identifying Significant Copy Number Aberrations from Array Comparative Genomic Hybridisation Data Thu, 13 Sep 2012 17:00:14 +0000 http://www.hindawi.com/journals/abi/2012/876976/ Reliable identification of copy number aberrations (CNA) from comparative genomic hybridization data would be improved by the availability of a generalised method for processing large datasets. To this end, we developed swatCGH, a data analysis framework and region detection heuristic for computational grids. swatCGH analyses sequentially displaced (sliding) windows of neighbouring probes and applies adaptive thresholds of varying stringency to identify the 10% of each chromosome that contains the most frequently occurring CNAs. 
We used the method to analyse a published dataset, comparing data preprocessed using four different DNA segmentation algorithms and two methods for prioritising the detected CNAs. The consolidated list of the most commonly detected aberrations confirmed the value of swatCGH as a simplified high-throughput method for identifying biologically significant CNA regions of interest. Ian Roberts, Stephanie A. Carter, Cinzia G. Scarpini, Konstantina Karagavriilidou, Jenny C. J. Barna, Mark Calleja, and Nicholas Coleman Copyright © 2012 Ian Roberts et al. All rights reserved. Gap Detection for Genome-Scale Constraint-Based Models Mon, 10 Sep 2012 09:50:14 +0000 http://www.hindawi.com/journals/abi/2012/323472/ Constraint-based metabolic models are currently the most comprehensive system-wide models of cellular metabolism. Several challenges that arise when building an in silico constraint-based model of an organism must be addressed before flux balance analysis (FBA) can be applied for simulations. An algorithm called FBA-Gap is presented here that aids the construction of a working model based on plausible modifications to a given list of reactions that are known to occur in the organism. When applied to a working model, the algorithm gives a hypothesis concerning a minimal medium for sustaining the cell in culture. The utility of the algorithm is demonstrated in creating a new model organism, and the algorithm is applied to four existing working models to generate hypotheses about culture media. In modifying a partial metabolic reconstruction so that biomass may be produced using FBA, the proposed method is more efficient than a previously proposed method in that fewer new reactions are added to complete the model. The proposed method is also more accurate than other approaches in that only biologically plausible reactions and exchange reactions are used. J. Paul Brooks, William P. Burns, Stephen S. Fong, Chris M. Gowen, and Seth B. Roberts Copyright © 2012 J. Paul Brooks et al.
All rights reserved. Producing High-Accuracy Lattice Models from Protein Atomic Coordinates Including Side Chains Wed, 15 Aug 2012 08:50:00 +0000 http://www.hindawi.com/journals/abi/2012/148045/ Lattice models are a common abstraction used in the study of protein structure, folding, and refinement. They are advantageous because the discretisation of space can make extensive protein evaluations computationally feasible. Various approaches to the protein chain lattice fitting problem have been suggested, but only a single backbone-only tool is currently available. We introduce LatFit, a new tool to produce high-accuracy lattice protein models. It generates both backbone-only and backbone-side-chain models in any user-defined lattice. LatFit implements a new distance RMSD-optimisation fitting procedure in addition to the known coordinate RMSD method. We tested LatFit's accuracy and speed using a large nonredundant set of high-resolution proteins (SCOP database) on three commonly used lattices: 3D cubic, face-centred cubic, and knight's walk. Fitting speed compared favourably to other methods, and both backbone-only and backbone-side-chain models show low deviation from the original data (~1.5 Å RMSD in the FCC lattice). To our knowledge, this represents the first comprehensive study of lattice quality for on-lattice protein models including side chains, while LatFit is the only available tool for such models. Martin Mann, Rhodri Saunders, Cameron Smith, Rolf Backofen, and Charlotte M. Deane Copyright © 2012 Martin Mann et al. All rights reserved. Sequence Complexity of Chromosome 3 in Caenorhabditis elegans Fri, 20 Jul 2012 14:16:55 +0000 http://www.hindawi.com/journals/abi/2012/287486/ The nucleotide sequence complexity of chromosome 3 of Caenorhabditis elegans (C. elegans) is studied. The complexity of these sequences is compared with that of some random sequences.
Moreover, using parameters related to complexity, such as fractal dimension and frequency, together with the indicator matrix, a first classification of the sequences of C. elegans is given. In particular, the sequences with the highest and lowest fractal values are singled out. It is shown that the intrinsic nature of the low fractal dimension sequences has many features in common with the random sequences. Gaetano Pierro Copyright © 2012 Gaetano Pierro. All rights reserved. Detecting Cancer Outlier Genes with Potential Rearrangement Using Gene Expression Data and Biological Networks Thu, 28 Jun 2012 09:39:37 +0000 http://www.hindawi.com/journals/abi/2012/373506/ Gene alterations are a major component of the landscape of tumor genomes. To assess the significance of these alterations in the development of prostate cancer, it is necessary to identify these alterations and analyze them from a systems biology perspective. Here, we present a new method (EigFusion) for predicting outlier genes with potential gene rearrangement. EigFusion demonstrated excellent performance in identifying outlier genes with potential rearrangement when tested on synthetic and real data. EigFusion was able to identify previously unrecognized genes such as FABP5 and KCNH8 and confirmed their association with primary and metastatic prostate samples, while confirming the metastatic specificity of other genes such as PAH, TOP2A, and SPINK1. We applied protein network-based approaches to analyze the network context of potentially rearranged genes. Functional gene rearrangement modules are constructed by integrating functional protein networks. Rearranged genes were shown to be highly connected to well-known altered genes in cancer such as AR, RB1, MYC, and BRCA1. Finally, using clinical outcome data of prostate cancer patients, potentially rearranged genes demonstrated significant association with prostate cancer-specific death. Mohammed Alshalalfa, Tarek A.
Bismar, and Reda Alhajj Copyright © 2012 Mohammed Alshalalfa et al. All rights reserved. Literature Retrieval and Mining in Bioinformatics: State of the Art and Challenges Thu, 21 Jun 2012 10:55:43 +0000 http://www.hindawi.com/journals/abi/2012/573846/ The world has changed greatly in terms of communicating, acquiring, and storing information. Hundreds of millions of people are involved in information retrieval tasks on a daily basis, in particular while using a Web search engine or searching their e-mail, making this field the dominant form of information access, overtaking traditional database-style searching. How to handle this huge amount of information has now become a challenging issue. In this paper, after recalling the main topics concerning information retrieval, we present a survey of the main works on literature retrieval and mining in bioinformatics. While claiming that information retrieval approaches are useful in bioinformatics tasks, we discuss some challenges aimed at showing the effectiveness of these approaches applied therein. Andrea Manconi, Eloisa Vargiu, Giuliano Armano, and Luciano Milanesi Copyright © 2012 Andrea Manconi et al. All rights reserved. Exploring Biomolecular Literature with EVEX: Connecting Genes through Events, Homology, and Indirect Associations Wed, 06 Jun 2012 11:11:12 +0000 http://www.hindawi.com/journals/abi/2012/582765/ Technological advancements in the field of genetics have not only led to an abundance of experimental data but also caused an exponential increase in the number of published biomolecular studies. Text mining is widely accepted as a promising technique to help researchers in the life sciences deal with the amount of available literature. This paper presents a freely available web application built on top of 21.3 million detailed biomolecular events extracted from all PubMed abstracts.
These text mining results were generated by a state-of-the-art event extraction system and enriched with gene family associations and abstract generalizations, accounting for lexical variants and synonymy. The EVEX resource locates relevant literature on phosphorylation, regulation targets, binding partners, and several other biomolecular events and assigns confidence values to these events. The search function accepts official gene/protein symbols as well as common names from all species. Finally, the web application is a powerful tool for generating homology-based hypotheses as well as novel, indirect associations between genes and proteins such as coregulators. Sofie Van Landeghem, Kai Hakala, Samuel Rönnqvist, Tapio Salakoski, Yves Van de Peer, and Filip Ginter Copyright © 2012 Sofie Van Landeghem et al. All rights reserved. Maximum Recommended Dosage of Lithium for Pregnant Women Based on a PBPK Model for Lithium Absorption Wed, 30 May 2012 14:26:37 +0000 http://www.hindawi.com/journals/abi/2012/352729/ Treatment of bipolar disorder with lithium therapy during pregnancy is a medical challenge. Bipolar disorder is more prevalent in women and its onset is often concurrent with peak reproductive age. Treatment typically involves administration of the element lithium, which has been classified as a class D drug (legal to use during pregnancy, but may cause birth defects) and is one of only thirty known teratogenic drugs. There is no clear recommendation in the literature on the maximum acceptable dosage regimen for pregnant, bipolar women. We recommend a maximum dosage regimen based on a physiologically based pharmacokinetic (PBPK) model. The model simulates the concentration of lithium in the organs and tissues of a pregnant woman and her fetus. First, we modeled time-dependent lithium concentration profiles resulting from lithium therapy known to have caused birth defects. Next, we identified maximum and average fetal lithium concentrations during treatment. 
Then, we developed a lithium therapy regimen to maximize the concentration of lithium in the mother’s brain while keeping the fetal concentration low enough to reduce the risk of birth defects. The maximum dosage regimen suggested by the model was 400 mg lithium three times per day. Scott Horton, Amalie Tuerk, Daniel Cook, Jiadi Cook, and Prasad Dhurjati Copyright © 2012 Scott Horton et al. All rights reserved.