Advances in Bioinformatics The latest articles from Hindawi Publishing Corporation © 2014, Hindawi Publishing Corporation. All rights reserved. Elementary Flux Mode Analysis of Acetyl-CoA Pathway in Carboxydothermus hydrogenoformans Z-2901 Wed, 16 Apr 2014 08:04:40 +0000 Carboxydothermus hydrogenoformans is a carboxydotrophic, hydrogenogenic bacterium that produces molecular hydrogen by utilizing carbon monoxide (CO) or pyruvate as a carbon source. To investigate the underlying biochemical mechanism of hydrogen production, an elementary mode analysis of the acetyl-CoA pathway was performed to determine the intermediate fluxes, in combination with the linear programming (LP) method available in the CellNetAnalyzer software. We hypothesized that addition of enzymes necessary for carbon monoxide fixation and pyruvate dissimilation would enhance the theoretical yield of hydrogen. An in silico knockout of the pyk, pykC, and mdh genes of the modeled acetyl-CoA pathway allows a maximum theoretical hydrogen yield of 47.62 mmol/gCDW/h for 1 mole of carbon monoxide (CO) uptake. The obtained hydrogen yield is about two times greater than that reported in previous experimental data. Therefore, it could be concluded that this elementary flux mode analysis is a crucial way to achieve efficient hydrogen production through the acetyl-CoA pathway and can act as a model for strain improvement. Rajadurai Chinnasamy Perumal, Ashok Selvaraj, and Gopal Ramesh Kumar Copyright © 2014 Rajadurai Chinnasamy Perumal et al. All rights reserved. Objective and Comprehensive Evaluation of Bisulfite Short Read Mapping Tools Tue, 15 Apr 2014 16:28:46 +0000 Background. Large-scale bisulfite treatment and short reads sequencing technology allow comprehensive estimation of methylation states of Cs in the genomes of different tissues, cell types, and developmental stages.
Accurate characterization of DNA methylation is essential for understanding genotype phenotype association, gene and environment interaction, diseases, and cancer. Aligning bisulfite short reads to a reference genome has been a challenging task. We compared five bisulfite short read mapping tools, BSMAP, Bismark, BS-Seeker, BiSS, and BRAT-BW, representing two classes of mapping algorithms (hash table and suffix/prefix tries). We examined their mapping efficiency (i.e., the percentage of reads that can be mapped to the genomes), usability, running time, and effects of changing default parameter settings using both real and simulated reads. We also investigated how preprocessing data might affect mapping efficiency. Conclusion. Among the five programs compared, in terms of mapping efficiency, Bismark performs the best on the real data, followed by BiSS, BSMAP, and finally BRAT-BW and BS-Seeker with very similar performance. If CPU time is not a constraint, Bismark is a good choice of program for mapping bisulfite treated short reads. Data quality impacts a great deal mapping efficiency. Although increasing the number of mismatches allowed can increase mapping efficiency, it not only significantly slows down the program, but also runs the risk of having increased false positives. Therefore, users should carefully set the related parameters depending on the quality of their sequencing data. Hong Tran, Jacob Porter, Ming-an Sun, Hehuang Xie, and Liqing Zhang Copyright © 2014 Hong Tran et al. All rights reserved. Network Completion for Static Gene Expression Data Wed, 26 Mar 2014 11:30:15 +0000 We tackle the problem of completing and inferring genetic networks under stationary conditions from static data, where network completion is to make the minimum amount of modifications to an initial network so that the completed network is most consistent with the expression data in which addition of edges and deletion of edges are basic modification operations. 
For this problem, we present a new method for network completion using dynamic programming and least-squares fitting. This method can find an optimal solution in polynomial time if the maximum indegree of the network is bounded by a constant. We evaluate the effectiveness of our method through computational experiments using synthetic data. Furthermore, we demonstrate that our proposed method can distinguish the differences between two types of genetic networks under stationary conditions from lung cancer and normal gene expression data. Natsu Nakajima and Tatsuya Akutsu Copyright © 2014 Natsu Nakajima and Tatsuya Akutsu. All rights reserved. Secondary Structure Preferences of Mn2+ Binding Sites in Bacterial Proteins Mon, 17 Mar 2014 11:37:15 +0000 3D structures of proteins with coordinated Mn2+ ions from bacteria with low, average, and high genomic GC-content have been analyzed (149 PDB files were used). Major Mn2+ binders are aspartic acid (6.82% of Asp residues), histidine (14.76% of His residues), and glutamic acid (3.51% of Glu residues). We found out that the motif of secondary structure “beta strand-major binder-random coil” is overrepresented around all the three major Mn2+ binders. That motif may be followed by either alpha helix or beta strand. Beta strands near Mn2+ binding residues should be stable because they are enriched by such beta formers as valine and isoleucine, as well as by specific combinations of hydrophobic and hydrophilic amino acid residues characteristic to beta sheet. In the group of proteins from GC-rich bacteria glutamic acid residues situated in alpha helices frequently coordinate Mn2+ ions, probably, because of the decrease of Lys usage under the influence of mutational GC-pressure. On the other hand, the percentage of Mn2+ sites with at least one amino acid in the “beta strand-major binder-random coil” motif of secondary structure (77.88%) does not depend on genomic GC-content. 
Tatyana Aleksandrovna Khrustaleva Copyright © 2014 Tatyana Aleksandrovna Khrustaleva. All rights reserved. A Parallel Framework for Multipoint Spiral Search in ab Initio Protein Structure Prediction Sun, 16 Mar 2014 11:57:48 +0000 Protein structure prediction is computationally a very challenging problem. A large number of existing search algorithms attempt to solve the problem by exploring possible structures and finding the one with the minimum free energy. However, these algorithms perform poorly on large sized proteins due to an astronomically wide search space. In this paper, we present a multipoint spiral search framework that uses parallel processing techniques to expedite exploration by starting from different points. In our approach, a set of random initial solutions are generated and distributed to different threads. We allow each thread to run for a predefined period of time. The improved solutions are stored threadwise. When the threads finish, the solutions are merged together and the duplicates are removed. A selected distinct set of solutions are then split to different threads again. In our ab initio protein structure prediction method, we use the three-dimensional face-centred-cubic lattice for structure-backbone mapping. We use both the low resolution hydrophobic-polar energy model and the high-resolution energy model for search guiding. The experimental results show that our new parallel framework significantly improves the results obtained by the state-of-the-art single-point search approaches for both energy models on three-dimensional face-centred-cubic lattice. We also experimentally show the effectiveness of mixing energy models within parallel threads. Mahmood A. Rashid, Swakkhar Shatabda, M. A. Hakim Newton, Md Tamjidul Hoque, and Abdul Sattar Copyright © 2014 Mahmood A. Rashid et al. All rights reserved. 
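The multipoint framework described in the abstract above (random starting points distributed to threads, per-thread improvement, merging with duplicate removal, redistribution) can be sketched roughly as follows. This is a minimal illustration under loud assumptions, not the authors' implementation: the `energy` function is a toy stand-in on bit strings rather than an HP or high-resolution model on the face-centred-cubic lattice, each thread runs for a fixed number of steps instead of a fixed wall-clock period, and every name (`energy`, `local_search`, `multipoint_search`) is hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def energy(solution):
    """Toy stand-in for an energy model (assumption): lower is better."""
    # Counts adjacent pairs of ones, negated, so the all-ones string is optimal.
    return -sum(a == b == 1 for a, b in zip(solution, solution[1:]))


def local_search(start, steps, seed):
    """Improve one starting point with single-bit flips; never accepts a worse move."""
    rng = random.Random(seed)
    best = tuple(start)
    for _ in range(steps):
        i = rng.randrange(len(best))
        candidate = best[:i] + (1 - best[i],) + best[i + 1:]
        if energy(candidate) <= energy(best):
            best = candidate
    return best


def multipoint_search(n_threads=4, length=12, rounds=3, steps=200, seed=0):
    """Run several searches in parallel, merge, deduplicate, and redistribute."""
    rng = random.Random(seed)
    # Random initial solutions, one per thread.
    points = [tuple(rng.randint(0, 1) for _ in range(length))
              for _ in range(n_threads)]
    for _ in range(rounds):
        seeds = [rng.randrange(10**9) for _ in points]
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            improved = list(pool.map(local_search, points,
                                     [steps] * len(points), seeds))
        # Merge the per-thread results and remove duplicates.
        distinct = sorted(set(improved), key=energy)
        # Split a selected distinct set across the threads again.
        points = [distinct[i % len(distinct)] for i in range(n_threads)]
    return min(points, key=energy)


if __name__ == "__main__":
    best = multipoint_search()
    print(best, energy(best))
```

Because of Python's global interpreter lock, threads here only illustrate the control flow; a CPU-bound search would more likely use processes or a compiled language.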
A Brachytherapy Plan Evaluation Tool for Interstitial Applications Sun, 09 Feb 2014 05:54:49 +0000 Radiobiological metrics such as tumor control probability (TCP) and normal tissue complication probability (NTCP) help in assessing the quality of brachytherapy plans. Application of such metrics in clinics as well as research is still inadequate. This study presents the implementation of two indigenously designed plan evaluation modules: Brachy_TCP and Brachy_NTCP. Evaluation tools were constructed to compute TCP and NTCP from dose volume histograms (DVHs) of any interstitial brachytherapy treatment plan. The computation module was employed to estimate probabilities of tumor control and normal tissue complications in ten cervical cancer patients based on biologically effective equivalent uniform dose (BEEUD). The tumor control and normal tissue morbidity were assessed with clinical followup and were scored. The acute toxicity was graded using common terminology criteria for adverse events (CTCAE) version 4.0. Outcome score was found to be correlated with the TCP/NTCP estimates. Thus, the predictive ability of the estimates was quantified with the clinical outcomes. Biologically effective equivalent uniform dose-based formalism was found to be effective in predicting the complexities and disease control. Surega Anbumani, N. Arunai Nambiraj, Sridhar Dayalan, Kalaivany Ganesh, Pichandi Anchineyan, and Ramesh S. Bilimagga Copyright © 2014 Surega Anbumani et al. All rights reserved. Prediction of B-Cell Epitopes in Listeriolysin O, a Cholesterol Dependent Cytolysin Secreted by Listeria monocytogenes Thu, 02 Jan 2014 16:05:40 +0000 Listeria monocytogenes is a gram-positive, foodborne bacterium responsible for disease in humans and animals. Listeriolysin O (LLO) is a required virulence factor for the pathogenic effects of L. monocytogenes. Bioinformatics revealed conserved putative epitopes of LLO that could be used to develop monoclonal antibodies against LLO. 
Continuous and discontinuous epitopes were located by using four different B-cell prediction algorithms. Three-dimensional molecular models were generated to more precisely characterize the predicted antigenicity of LLO. Domain 4 was predicted to contain five of eleven continuous epitopes. A large portion of domain 4 was also predicted to comprise discontinuous immunogenic epitopes. Domain 4 of LLO may serve as an immunogen for eliciting monoclonal antibodies that can be used to study the pathogenesis of L. monocytogenes as well as develop an inexpensive assay. Morris S. Jones and J. Mark Carter Copyright © 2014 Morris S. Jones and J. Mark Carter. All rights reserved. Comparing Imputation Procedures for Affymetrix Gene Expression Datasets Using MAQC Datasets Wed, 09 Oct 2013 13:53:52 +0000 Introduction. The microarray datasets from the MicroArray Quality Control (MAQC) project have enabled the assessment of the precision, comparability of microarrays, and other various microarray analysis methods. However, to date no studies that we are aware of have reported the performance of missing value imputation schemes on the MAQC datasets. In this study, we use the MAQC Affymetrix datasets to evaluate several imputation procedures in Affymetrix microarrays. Results. We evaluated several cutting edge imputation procedures and compared them using different error measures. We randomly deleted 5% and 10% of the data and imputed the missing values using imputation tests. We performed 1000 simulations and averaged the results. The results for both 5% and 10% deletion are similar. Among the imputation methods, we observe the local least squares method with is most accurate under the error measures considered. The k-nearest neighbor method with has the highest error rate among imputation methods and error measures. Conclusions. 
We conclude for imputing missing values in Affymetrix microarray datasets, using the MAS 5.0 preprocessing scheme, the local least squares method with has the best overall performance and k-nearest neighbor method with has the worst overall performance. These results hold true for both 5% and 10% missing values. Sreevidya Sadananda Sadasiva Rao, Lori A. Shepherd, Andrew E. Bruno, Song Liu, and Jeffrey C. Miecznikowski Copyright © 2013 Sreevidya Sadananda Sadasiva Rao et al. All rights reserved. A Multilevel Gamma-Clustering Layout Algorithm for Visualization of Biological Networks Tue, 25 Jun 2013 15:26:07 +0000 Visualization of large complex networks has become an indispensable part of systems biology, where organisms need to be considered as one complex system. The visualization of the corresponding network is challenging due to the size and density of edges. In many cases, the use of standard visualization algorithms can lead to high running times and poorly readable visualizations due to many edge crossings. We suggest an approach that analyzes the structure of the graph first and then generates a new graph which contains specific semantic symbols for regular substructures like dense clusters. We propose a multilevel gamma-clustering layout visualization algorithm (MLGA) which proceeds in three subsequent steps: (i) a multilevel γ-clustering is used to identify the structure of the underlying network, (ii) the network is transformed to a tree, and (iii) finally, the resulting tree which shows the network structure is drawn using a variation of a force-directed algorithm. The algorithm has a potential to visualize very large networks because it uses modern clustering heuristics which are optimized for large graphs. Moreover, most of the edges are removed from the visual representation which allows keeping the overview over complex graphs with dense subgraphs. 
Tomas Hruz, Markus Wyss, Christoph Lucas, Oliver Laule, Peter von Rohr, Philip Zimmermann, and Stefan Bleuler Copyright © 2013 Tomas Hruz et al. All rights reserved. Computational and Statistical Approaches for Modeling of Proteomic and Genomic Networks Thu, 16 May 2013 09:08:24 +0000 Mohamed Nounou, Hazem Nounou, Erchin Serpedin, Aniruddha Datta, and Yufei Huang Copyright © 2013 Mohamed Nounou et al. All rights reserved. Reverse Engineering Sparse Gene Regulatory Networks Using Cubature Kalman Filter and Compressed Sensing Wed, 08 May 2013 11:21:31 +0000 This paper proposes a novel algorithm for inferring gene regulatory networks which makes use of cubature Kalman filter (CKF) and Kalman filter (KF) techniques in conjunction with compressed sensing methods. The gene network is described using a state-space model. A nonlinear model for the evolution of gene expression is considered, while the gene expression data is assumed to follow a linear Gaussian model. The hidden states are estimated using CKF. The system parameters are modeled as a Gauss-Markov process and are estimated using compressed sensing-based KF. These parameters provide insight into the regulatory relations among the genes. The Cramér-Rao lower bound of the parameter estimates is calculated for the system model and used as a benchmark to assess the estimation accuracy. The proposed algorithm is evaluated rigorously using synthetic data in different scenarios which include different number of genes and varying number of sample points. In addition, the algorithm is tested on the DREAM4 in silico data sets as well as the in vivo data sets from IRMA network. The proposed algorithm shows superior performance in terms of accuracy, robustness, and scalability. Amina Noor, Erchin Serpedin, Mohamed Nounou, and Hazem Nounou Copyright © 2013 Amina Noor et al. All rights reserved. 
Efficient Serial and Parallel Algorithms for Selection of Unique Oligos in EST Databases Mon, 08 Apr 2013 17:06:36 +0000 Obtaining unique oligos from an EST database is a problem of great importance in bioinformatics, particularly in the discovery of new genes and the mapping of the human genome. Many algorithms have been developed to find unique oligos, many of which are much less time consuming than the traditional brute force approach. An algorithm was presented by Zheng et al. (2004) which finds the solution of the unique oligos search problem efficiently. We implement this algorithm as well as several new algorithms based on some theorems included in this paper. We demonstrate how, with these new algorithms, we can obtain unique oligos much faster than with previous ones. We parallelize these new algorithms to further improve the time of finding unique oligos. All algorithms are run on ESTs obtained from a Barley EST database. Manrique Mata-Montero, Nabil Shalaby, and Bradley Sheppard Copyright © 2013 Manrique Mata-Montero et al. All rights reserved. Correction of Spatial Bias in Oligonucleotide Array Data Wed, 13 Mar 2013 15:09:36 +0000 Background. Oligonucleotide microarrays allow for high-throughput gene expression profiling assays. The technology relies on the fundamental assumption that observed hybridization signal intensities (HSIs) for each intended target, on average, correlate with their target’s true concentration in the sample. However, systematic, nonbiological variation from several sources undermines this hypothesis. Background hybridization signal has been previously identified as one such important source, one manifestation of which appears in the form of spatial autocorrelation. Results. We propose an algorithm, pyn, for the elimination of spatial autocorrelation in HSIs, exploiting the duality of desirable mutual information shared by probes in a common probe set and undesirable mutual information shared by spatially proximate probes. 
We show that this correction procedure reduces spatial autocorrelation in HSIs; increases HSI reproducibility across replicate arrays; increases differentially expressed gene detection power; and performs better than previously published methods. Conclusions. The proposed algorithm increases both precision and accuracy, while requiring virtually no changes to users’ current analysis pipelines: the correction consists merely of a transformation of raw HSIs (e.g., CEL files for Affymetrix arrays). A free, open-source implementation is provided as an R package, compatible with standard Bioconductor tools. The approach may also be tailored to other platform types and other sources of bias. Philippe Serhal and Sébastien Lemieux Copyright © 2013 Philippe Serhal and Sébastien Lemieux. All rights reserved. Gene Regulation, Modulation, and Their Applications in Gene Expression Data Analysis Wed, 13 Mar 2013 11:03:18 +0000 Common analyses of microarray and next-generation sequencing data concentrate on tumor subtype classification, marker detection, and transcriptional regulation discovery during biological processes by exploring the correlated gene expression patterns and their shared functions. Genetic regulatory network (GRN) based approaches have been employed in many large studies in order to look for dysregulation and potential treatment controls. In addition to gene regulation and network construction, the concept of the network modulator that has significant systemic impact has been proposed, and detection algorithms have been developed in past years. Here we provide a unified mathematical description of these methods, followed by a brief survey of these modulator identification algorithms. As an early attempt to extend the concept to a new RNA regulation mechanism, competitive endogenous RNA (ceRNA), within a modulator framework, we provide two applications to illustrate the network construction, the modulation effect, and the preliminary findings from these networks. 
The methods we surveyed and developed are used to dissect the regulated network under different modulators. Beyond these, the concept of “modulation” can be adapted to various biological mechanisms to discover novel gene regulation mechanisms. Mario Flores, Tzu-Hung Hsiao, Yu-Chiao Chiu, Eric Y. Chuang, Yufei Huang, and Yidong Chen Copyright © 2013 Mario Flores et al. All rights reserved. Spectral Analysis on Time-Course Expression Data: Detecting Periodic Genes Using a Real-Valued Iterative Adaptive Approach Thu, 28 Feb 2013 15:42:47 +0000 Time-course expression profiles and methods for spectrum analysis have been applied for detecting transcriptional periodicities, which are valuable patterns to unravel genes associated with cell cycle and circadian rhythm regulation. However, most of the proposed methods suffer, to a certain extent, from restrictions and high false-positive rates. Additionally, in some experiments, arbitrarily irregular sampling times as well as the presence of high noise and small sample sizes make accurate detection a challenging task. A novel scheme for detecting periodicities in time-course expression data is proposed, in which a real-valued iterative adaptive approach (RIAA), originally proposed for signal processing, is applied for periodogram estimation. The inferred spectrum is then analyzed using Fisher’s hypothesis test. With a proper p-value threshold, periodic genes can be detected. A periodic signal, two nonperiodic signals, and four sampling strategies were considered in the simulations, including both bursts and drops. In addition, two real yeast datasets were used for validation. The simulations and real data analysis reveal that RIAA can perform competitively with the existing algorithms. The advantage of RIAA is manifested when the expression data are highly irregularly sampled, and when the number of cycles covered by the sampling time points is small. Kwadwo S. Agyepong, Fang-Han Hsu, Edward R. 
Dougherty, and Erchin Serpedin Copyright © 2013 Kwadwo S. Agyepong et al. All rights reserved. Identification of Robust Pathway Markers for Cancer through Rank-Based Pathway Activity Inference Wed, 27 Feb 2013 09:47:10 +0000 One important problem in translational genomics is the identification of reliable and reproducible markers that can be used to discriminate between different classes of a complex disease, such as cancer. The typical small sample setting makes the prediction of such markers very challenging, and various approaches have been proposed to address this problem. For example, it has been shown that pathway markers, which aggregate the gene activities in the same pathway, tend to be more robust than gene markers. Furthermore, the use of gene expression ranking has been demonstrated to be robust to batch effects and that it can lead to more interpretable results. In this paper, we propose an enhanced pathway activity inference method that uses gene ranking to predict the pathway activity in a probabilistic manner. The main focus of this work is on identifying robust pathway markers that can ultimately lead to robust classifiers with reproducible performance across datasets. Simulation results based on multiple breast cancer datasets show that the proposed inference method identifies better pathway markers that can predict breast cancer metastasis with higher accuracy. Moreover, the identified pathway markers can lead to better classifiers with more consistent classification performance across independent datasets. Navadon Khunlertgit and Byung-Jun Yoon Copyright © 2013 Navadon Khunlertgit and Byung-Jun Yoon. All rights reserved. 
An Overview of the Statistical Methods Used for Inferring Gene Regulatory Networks and Protein-Protein Interaction Networks Thu, 21 Feb 2013 15:22:25 +0000 The large influx of data from high-throughput genomic and proteomic technologies has encouraged the researchers to seek approaches for understanding the structure of gene regulatory networks and proteomic networks. This work reviews some of the most important statistical methods used for modeling of gene regulatory networks (GRNs) and protein-protein interaction (PPI) networks. The paper focuses on the recent advances in the statistical graphical modeling techniques, state-space representation models, and information theoretic methods that were proposed for inferring the topology of GRNs. It appears that the problem of inferring the structure of PPI networks is quite different from that of GRNs. Clustering and probabilistic graphical modeling techniques are of prime importance in the statistical inference of PPI networks, and some of the recent approaches using these techniques are also reviewed in this paper. Performance evaluation criteria for the approaches used for modeling GRNs and PPI networks are also discussed. Amina Noor, Erchin Serpedin, Mohamed Nounou, Hazem Nounou, Nady Mohamed, and Lotfi Chouchane Copyright © 2013 Amina Noor et al. All rights reserved. Using Protein Clusters from Whole Proteomes to Construct and Augment a Dendrogram Wed, 20 Feb 2013 08:15:54 +0000 In this paper we present a new ab initio approach for constructing an unrooted dendrogram using protein clusters, an approach that has the potential for estimating relationships among several thousands of species based on their putative proteomes. We employ an open-source software program called pClust that was developed for use in metagenomic studies. Sequence alignment is performed by pClust using the Smith-Waterman algorithm, which is known to give optimal alignment and, hence, greater accuracy than BLAST-based methods. 
Protein clusters generated by pClust are used to create protein profiles for each species in the dendrogram, these profiles forming a correlation filter library for use with a new taxon. To augment the dendrogram with a new taxon, a protein profile for the taxon is created using BLASTp, and this new taxon is placed into a position within the dendrogram corresponding to the highest correlation with profiles in the correlation filter library. This work was initiated because of our interest in plasmids, and each step is illustrated using proteomes from Gram-negative bacterial plasmids. Proteomes for 527 plasmids were used to generate the dendrogram, and to demonstrate the utility of the insertion algorithm twelve recently sequenced pAKD plasmids were used to augment the dendrogram. Yunyun Zhou, Douglas R. Call, and Shira L. Broschat Copyright © 2013 Yunyun Zhou et al. All rights reserved. Solving the 0/1 Knapsack Problem by a Biomolecular DNA Computer Mon, 18 Feb 2013 07:55:04 +0000 Solving some mathematical problems, such as NP-complete problems, on conventional silicon-based computers is problematic and takes a very long time. DNA computing is an alternative method of computing which uses DNA molecules for computing purposes. DNA computers have a massive degree of parallel processing capability. The massively parallel processing characteristic of DNA computers is of particular interest in solving NP-complete and hard combinatorial problems. NP-complete problems such as the knapsack problem and other hard combinatorial problems can be easily solved by DNA computers in a very short period of time compared to conventional silicon-based computers. Sticker-based DNA computing is one of the methods of DNA computing. In this paper, sticker-based DNA computing was used for solving the 0/1 knapsack problem. At first, a biomolecular solution space was constructed by using appropriate DNA memory complexes. 
Then, by the application of a sticker-based parallel algorithm using biological operations, knapsack problem was resolved in polynomial time. Hassan Taghipour, Mahdi Rezaei, and Heydar Ali Esmaili Copyright © 2013 Hassan Taghipour et al. All rights reserved. MRMPath and MRMutation, Facilitating Discovery of Mass Transitions for Proteotypic Peptides in Biological Pathways Using a Bioinformatics Approach Tue, 29 Jan 2013 14:45:02 +0000 Quantitative proteomics applications in mass spectrometry depend on the knowledge of the mass-to-charge ratio (m/z) values of proteotypic peptides for the proteins under study and their product ions. MRMPath and MRMutation, web-based bioinformatics software that are platform independent, facilitate the recovery of this information by biologists. MRMPath utilizes publicly available information related to biological pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. All the proteins involved in pathways of interest are recovered and processed in silico to extract information relevant to quantitative mass spectrometry analysis. Peptides may also be subjected to automated BLAST analysis to determine whether they are proteotypic. MRMutation catalogs and makes available, following processing, known (mutant) variants of proteins from the current UniProtKB database. All these results, available via the web from well-maintained, public databases, are written to an Excel spreadsheet, which the user can download and save. MRMPath and MRMutation can be freely accessed. As a system that seeks to allow two or more resources to interoperate, MRMPath represents an advance in bioinformatics tool development. As a practical matter, the MRMPath automated approach represents significant time savings to researchers. Chiquito Crasto, Chandrahas Narne, Mikako Kawai, Landon Wilson, and Stephen Barnes Copyright © 2013 Chiquito Crasto et al. All rights reserved. 
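The MRMPath abstract above centres on the mass-to-charge ratio (m/z) of peptides. As a rough illustration of that quantity only, the sketch below computes the monoisotopic m/z of an unmodified peptide from standard residue masses. It is not MRMPath's code; the function name `peptide_mz` and the table name `MONO` are hypothetical, and post-translational modifications and product-ion (b/y) masses are deliberately left out.

```python
# Standard monoisotopic residue masses in daltons (amino acid minus one water).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276   # mass of a proton


def peptide_mz(sequence, charge=1):
    """Monoisotopic m/z of an unmodified peptide carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Neutral peptide mass: sum of residues plus one water for the termini.
    neutral_mass = sum(MONO[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(z, round(peptide_mz("PEPTIDE", z), 4))
```

An unknown residue letter raises `KeyError`, which seems preferable here to silently returning a wrong mass.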
Statistical Analysis of Terminal Extensions of Protein β-Strand Pairs Mon, 28 Jan 2013 14:05:10 +0000 The long-range interactions required for accurate prediction of the tertiary structures of β-sheet-containing proteins are still difficult to simulate. To remedy this problem and to facilitate β-sheet structure predictions, many computational efforts have been made. However, known efforts on β-sheets mainly focus on interresidue contacts or amino acid partners. In this study, to go one step further, we studied β-sheets at the strand level, performing a statistical analysis of the terminal extensions of paired β-strands. In most cases, the two paired β-strands have different lengths, and terminal extensions exist. The terminal extensions are the extended parts of the paired strands beyond the common paired part. However, we found that the best pairing required a terminal alignment, and β-strands tend to pair so as to make bigger common parts. As a result, 96.97% of β-strand pairs have a ratio of 25% of the paired common part to the whole length. Also 94.26% and 95.98% of β-strand pairs have a ratio of 40% of the paired common part to the length of the two β-strands, respectively. Interstrand register predictions that search for interacting β-strands over several alternative offsets should comply with this rule, which reduces the computational search space and improves the performance of algorithms. Ning Zhang, Shan Gao, Lei Zhang, Jishou Ruan, and Tao Zhang Copyright © 2013 Ning Zhang et al. All rights reserved. Literature Mining Solutions for Life Science Research Tue, 08 Jan 2013 09:25:35 +0000 Jörg Hakenberg, Goran Nenadic, Dietrich Rebholz-Schuhmann, and Jin-Dong Kim Copyright © 2013 Jörg Hakenberg et al. All rights reserved. In Silico Docking of HNF-1a Receptor Ligands Wed, 19 Dec 2012 14:32:52 +0000 Background. HNF-1a is a transcription factor that regulates glucose metabolism by expression in various tissues. Aim. 
To dock potential ligands of HNF-1a using docking software in silico. Methods. We performed in silico studies using HNF-1a protein 2GYP.pdb and the following software: ISIS/Draw 2.5SP4, ARGUSLAB 4.0.1, and HEX5.1. Observations. The docking distances (in angstrom units: 1 angstrom unit (Å) = 0.1 nanometer or $10^{-10}$ metres) with ligands, in increasing order of distance, are as follows: resveratrol (3.8 Å), aspirin (4.5 Å), stearic acid (4.9 Å), retinol (6.0 Å), nitrazepam (6.8 Å), ibuprofen (7.9 Å), azulfidine (9.0 Å), simvastatin (9.0 Å), elaidic acid (10.1 Å), and oleic acid (11.6 Å). Conclusion. HNF-1a domain interacted most closely with resveratrol and aspirin. Gumpeny Ramachandra Sridhar, Padmanabhuni Venkata Nageswara Rao, Dowluru SVGK Kaladhar, Tatavarthi Uma Devi, and Sali Veeresh Kumar Copyright © 2012 Gumpeny Ramachandra Sridhar et al. All rights reserved. Do Peers See More in a Paper Than Its Authors? Tue, 27 Nov 2012 11:28:07 +0000 Recent years have shown a gradual shift in the content of biomedical publications that is freely accessible, from titles and abstracts to full text. This has enabled new forms of automatic text analysis and has given rise to some interesting questions: How informative is the abstract compared to the full text? What important information in the full text is not present in the abstract? What should a good summary contain that is not already in the abstract? Do authors and peers see an article differently? We answer these questions by comparing the information content of the abstract to that in citances, that is, sentences containing citations to that article. We contrast the important points of an article as judged by its authors versus as seen by peers. 
Focusing on the area of molecular interactions, we perform manual and automatic analysis, and we find that the set of all citances to a target article not only covers most information (entities, functions, experimental methods, and other biological concepts) found in its abstract, but also contains 20% more concepts. We further present a detailed summary of the differences across information types, and we examine the effects other citations and time have on the content of citances. Anna Divoli, Preslav Nakov, and Marti A. Hearst Copyright © 2012 Anna Divoli et al. All rights reserved. Wavelet Packet Entropy for Heart Murmurs Classification Sun, 25 Nov 2012 15:35:27 +0000 Heart murmurs are the first signs of cardiac valve disorders. Several studies have been conducted in recent years to automatically differentiate normal heart sounds from heart sounds with murmurs using various types of audio features. Entropy was successfully used as a feature to distinguish different heart sounds. In this paper, a new entropy was introduced to analyze heart sounds, and the feasibility of using this entropy in classification of five types of heart sounds and murmurs was shown. The entropy was previously introduced to analyze mammograms. Four common murmurs were considered including aortic regurgitation, mitral regurgitation, aortic stenosis, and mitral stenosis. Wavelet packet transform was employed for heart sound analysis, and the entropy was calculated for deriving feature vectors. Five types of classification were performed to evaluate the discriminatory power of the generated features. The best results were achieved by BayesNet with 96.94% accuracy. The promising results substantiate the effectiveness of the proposed wavelet packet entropy for heart sounds classification. Fatemeh Safara, Shyamala Doraisamy, Azreen Azman, Azrul Jantan, and Sri Ranga Copyright © 2012 Fatemeh Safara et al. All rights reserved. 
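To make the wavelet packet entropy idea from the heart murmur abstract concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' method: the abstract does not specify the wavelet or the entropy definition, so this sketch assumes a Haar wavelet and substitutes the Shannon entropy of the normalized energies of the wavelet packet leaf nodes. The function names (`haar_step`, `wavelet_packet_nodes`, `packet_entropy`) are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_packet_nodes(signal, levels):
    """Full wavelet packet tree: split every node at each level, return the leaves."""
    if levels < 0 or len(signal) == 0 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a positive multiple of 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def packet_entropy(signal, levels):
    """Shannon entropy (bits) of the energy distribution over the leaf nodes.

    Assumption: Shannon entropy stands in for the entropy used in the paper,
    which the abstract does not define.
    """
    nodes = wavelet_packet_nodes(signal, levels)
    energies = [sum(c * c for c in node) for node in nodes]
    total = sum(energies)
    if total == 0.0:
        return 0.0
    probs = [e / total for e in energies]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)
```

With `levels` decomposition levels there are `2**levels` leaf nodes, so the value lies between 0 (all energy in one sub-band) and `levels` bits (energy spread evenly). In a classification setting, such values computed per heart-sound segment could serve as entries of a feature vector, though the feature construction in the paper may differ.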
On the Meaning of Affinity Limits in B-Cell Epitope Prediction for Antipeptide Antibody-Mediated Immunity Wed, 14 Nov 2012 15:09:41 +0000 B-cell epitope prediction aims to aid the design of peptide-based immunogens (e.g., vaccines) for eliciting antipeptide antibodies that protect against disease, but such antibodies fail to confer protection and even promote disease if they bind with low affinity. Hence, the Immune Epitope Database (IEDB) was searched to obtain published thermodynamic and kinetic data on binding interactions of antipeptide antibodies. The data suggest that the affinity of the antibodies for their immunizing peptides appears to be limited in a manner consistent with previously proposed kinetic constraints on affinity maturation in vivo and that cross-reaction of the antibodies with proteins tends to occur with lower affinity than the corresponding reaction of the antibodies with their immunizing peptides. These observations better inform B-cell epitope prediction to avoid overestimating the affinity for both active and passive immunization; whereas active immunization is subject to limitations of affinity maturation in vivo and of the capacity to accumulate endogenous antibodies, passive immunization may transcend such limitations, possibly with the aid of artificial affinity-selection processes and of protein engineering. Additionally, protein disorder warrants further investigation as a possible supplementary criterion for B-cell epitope prediction, where such disorder obviates thermodynamically unfavorable protein structural adjustments in cross-reactions between antipeptide antibodies and proteins. Salvador Eugenio C. Caoili Copyright © 2012 Salvador Eugenio C. Caoili. All rights reserved. 
Application of an Integrative Computational Framework in Trancriptomic Data of Atherosclerotic Mice Suggests Numerous Molecular Players Tue, 06 Nov 2012 15:37:37 +0000 Atherosclerosis is a multifactorial disease involving many genes and proteins recruited throughout its manifestation. The present study aims to exploit bioinformatic tools in order to analyze microarray data of atherosclerotic aortic lesions of ApoE knockout mice, a model widely used in atherosclerosis research. In particular, a dynamic analysis was performed among young and aged animals, resulting in a list of 852 significantly altered genes. Pathway analysis indicated alterations in critical cellular processes related to cell communication and signal transduction, immune response, lipid transport, and metabolism. Cluster analysis partitioned the significantly differentiated genes into three major clusters of similar expression profile. Promoter analysis applied to functionally related groups of the same cluster revealed shared putative cis-elements potentially contributing to a common regulatory mechanism. Finally, by reverse engineering the functional relevance of differentially expressed genes with specific cellular pathways, putative genes acting as hubs were identified, linking functionally disparate cellular processes in the context of traditional molecular description. Olga Papadodima, Allan Sirsjö, Fragiskos N. Kolisis, and Aristotelis Chatziioannou Copyright © 2012 Olga Papadodima et al. All rights reserved. Intervention in Biological Phenomena via Feedback Linearization Tue, 06 Nov 2012 11:17:02 +0000 The problems of modeling and intervention of biological phenomena have captured the interest of many researchers in the past few decades. The aim of the therapeutic intervention strategies is to move an undesirable state of a diseased network towards a more desirable one. 
Such an objective can be achieved by the application of drugs to act on some genes/metabolites that experience the undesirable behavior. For the purpose of design and analysis of intervention strategies, mathematical models that can capture the complex dynamics of the biological systems are needed. S-systems, which offer a good compromise between accuracy and mathematical flexibility, are a promising framework for modeling the dynamical behavior of biological phenomena. Due to the complex nonlinear dynamics of the biological phenomena represented by S-systems, nonlinear intervention schemes are needed to cope with the complexity of the nonlinear S-system models. Here, we present an intervention technique based on feedback linearization for biological phenomena modeled by S-systems. This technique is based on perfect knowledge of the S-system model. The proposed intervention technique is applied to the glycolytic-glycogenolytic pathway, and the simulation results presented demonstrate the effectiveness of the proposed technique. Mohamed Amine Fnaiech, Hazem Nounou, Mohamed Nounou, and Aniruddha Datta Copyright © 2012 Mohamed Amine Fnaiech et al. All rights reserved. MicroRNA Response Elements-Mediated miRNA-miRNA Interactions in Prostate Cancer Sun, 04 Nov 2012 08:29:37 +0000 The cell is a highly organized system of interacting molecules including proteins, mRNAs, and miRNAs. Analyzing the cell from a systems perspective by integrating different types of data helps reveal the complexity of diseases. Although there is emerging evidence that microRNAs have a functional role in cancer, the role of microRNAs in mediating cancer progression and metastasis has not been fully explored. As the amount of available miRNA and mRNA gene expression data grows, more systematic methods combining gene expression and biological networks become necessary to explore miRNA function. 
In this work I integrated functional miRNA-target interactions with mRNA and miRNA expression to infer mRNA-mediated miRNA-miRNA interactions. The inferred network represents miRNA modulation through common targets. The network is used to characterize the functional role of microRNA response element (MRE) to mediate interactions between miRNAs targeting the MRE. Results revealed that miRNA-1 is a key player in regulating prostate cancer progression. 11 miRNAs were identified as diagnostic and prognostic biomarkers that act as tumor suppressor miRNAs. This work demonstrates the utility of a network analysis as opposed to differential expression to find important miRNAs that regulate prostate cancer. Mohammed Alshalalfa Copyright © 2012 Mohammed Alshalalfa. All rights reserved. Flux Analysis of the Trypanosoma brucei Glycolysis Based on a Multiobjective-Criteria Bioinformatic Approach Sat, 13 Oct 2012 10:16:56 +0000 Trypanosoma brucei is a protozoan parasite of major interest in discovering new genes for drug targets. This parasite alternates its life cycle between the mammal host(s) (bloodstream form) and the insect vector (procyclic form), with two divergent forms of glucose metabolism amenable to in vitro culture. While the metabolic network of the bloodstream forms has been well characterized, the flux distribution between the different branches of the glucose metabolic network in the procyclic form has not been addressed so far. We present a computational analysis (called Metaboflux) that exploits the metabolic topology of the procyclic form, and allows the incorporation of multipurpose experimental data to increase the biological relevance of the model. The alternatives resulting from the structural complexity of networks are formulated as an optimization problem solved by a metaheuristic where experimental data are modeled in a multiobjective function. 
Our results show that the current metabolic model is in agreement with experimental data and confirms the observed high metabolic flexibility of glucose metabolism. In addition, Metaboflux offers a rational explanation for the high flexibility in the ratio between final products from glucose metabolism, that is, flux redistribution through the malic enzyme steps. Amine Ghozlane, Frédéric Bringaud, Hayssam Soueidan, Isabelle Dutour, Fabien Jourdan, and Patricia Thébault Copyright © 2012 Amine Ghozlane et al. All rights reserved.