Experimental results show that our solution based on the Sadakane’s compressed index consumes significantly less space than the ones based on noncompressed data structures like the suffix tree and the enhanced suffix array. Our experimental results show that our parallel algorithm is efficient and scales well with increasing number of processors. Maan Haj Rachid, Qutaibah Malluhi, and Mohamed Abouelhoda Copyright © 2014 Maan Haj Rachid et al. All rights reserved. A Knowledge-Driven Approach to Extract Disease-Related Biomarkers from the Literature Wed, 16 Apr 2014 15:51:54 +0000 The biomedical literature represents a rich source of biomarker information. However, both the size of literature databases and their lack of standardization hamper the automatic exploitation of the information contained in these resources. Text mining approaches have proven to be useful for the exploitation of information contained in the scientific publications. Here, we show that a knowledge-driven text mining approach can exploit a large literature database to extract a dataset of biomarkers related to diseases covering all therapeutic areas. Our methodology takes advantage of the annotation of MEDLINE publications pertaining to biomarkers with MeSH terms, narrowing the search to specific publications and, therefore, minimizing the false positive ratio. It is based on a dictionary-based named entity recognition system and a relation extraction module. The application of this methodology resulted in the identification of 131,012 disease-biomarker associations between 2,803 genes and 2,751 diseases, and represents a valuable knowledge base for those interested in disease-related biomarkers. Additionally, we present a bibliometric analysis of the journals reporting biomarker related information during the last 40 years. À. Bravo, M. Cases, N. Queralt-Rosinach, F. Sanz, and L. I. Furlong Copyright © 2014 À. Bravo et al. All rights reserved. 
Integrated Analysis of Gene Network in Childhood Leukemia from Microarray and Pathway Databases Tue, 15 Apr 2014 14:07:22 +0000 Glucocorticoids (GCs) have been used as therapeutic agents for children with acute lymphoblastic leukaemia (ALL) for over 50 years. However, much remains to be understood about the molecular mechanism of GCs actions in ALL subtypes. In this study, we delineate differential responses of ALL subtypes, B- and T-ALL, to GCs treatment at systems level by identifying the differences among biological processes, molecular pathways, and interaction networks that emerge from the action of GCs through the use of a selected number of available bioinformatics methods and tools. We provide biological insight into GC-regulated genes, their related functions, and their networks specific to the ALL subtypes. We show that differentially expressed GC-regulated genes participate in distinct underlying biological processes affected by GCs in B-ALL and T-ALL with little to no overlap. These findings provide the opportunity towards identifying new therapeutic targets. Amphun Chaiboonchoe, Sandhya Samarasinghe, Don Kulasiri, and Kourosh Salehi-Ashtiani Copyright © 2014 Amphun Chaiboonchoe et al. All rights reserved. A Novel Algorithm for Detecting Protein Complexes with the Breadth First Search Thu, 10 Apr 2014 11:03:26 +0000 Most biological processes are carried out by protein complexes. A substantial number of false positives of the protein-protein interaction (PPI) data can compromise the utility of the datasets for complexes reconstruction. In order to reduce the impact of such discrepancies, a number of data integration and affinity scoring schemes have been devised. The methods encode the reliabilities (confidence) of physical interactions between pairs of proteins. The challenge now is to identify novel and meaningful protein complexes from the weighted PPI network. 
To address this problem, a novel protein complex mining algorithm ClusterBFS (Cluster with Breadth-First Search) is proposed. Based on the weighted density, ClusterBFS detects protein complexes of the weighted network by the breadth first search algorithm, which originates from a given seed protein used as starting-point. The experimental results show that ClusterBFS performs significantly better than the other computational approaches in terms of the identification of protein complexes. Xiwei Tang, Jianxin Wang, Min Li, Yiming He, and Yi Pan Copyright © 2014 Xiwei Tang et al. All rights reserved. Gene Expression Correlation for Cancer Diagnosis: A Pilot Study Wed, 09 Apr 2014 14:12:08 +0000 Poor prognosis for late-stage, high-grade, and recurrent cancers has been motivating cancer researchers to search for more efficient biomarkers to identify the onset of cancer. Recent advances in constructing and dynamically analyzing biomolecular networks for different types of cancer have provided a promising novel strategy to detect tumorigenesis and metastasis. The observation of different biomolecular networks associated with normal and cancerous states led us to hypothesize that correlations for gene expressions could serve as valid indicators of early cancer development. In this pilot study, we tested our hypothesis by examining whether the mRNA expressions of three randomly selected cancer-related genes PIK3C3, PIM3, and PTEN were correlated during cancer progression and the correlation coefficients could be used for cancer diagnosis. Strong correlations were observed between PIK3C3 and PIM3 in breast cancer, between PIK3C3 and PTEN in breast and ovary cancers, and between PIM3 and PTEN in breast, kidney, liver, and thyroid cancers during disease progression, implicating that the correlations for cancer network gene expressions could serve as a supplement to current clinical biomarkers, such as cancer antigens, for early cancer diagnosis. 
Binbing Ling, Lifeng Chen, Qiang Liu, and Jian Yang Copyright © 2014 Binbing Ling et al. All rights reserved. Computational Systems Biology Methods in Molecular Biology, Chemistry Biology, Molecular Biomedicine, and Biopharmacy Wed, 09 Apr 2014 13:17:43 +0000 Yudong Cai, Julio Vera González, Zengrong Liu, and Tao Huang Copyright © 2014 Yudong Cai et al. All rights reserved. Tools and Databases of the KOMICS Web Portal for Preprocessing, Mining, and Dissemination of Metabolomics Data Wed, 09 Apr 2014 12:35:01 +0000 A metabolome—the collection of comprehensive quantitative data on metabolites in an organism—has been increasingly utilized for applications such as data-intensive systems biology, disease diagnostics, biomarker discovery, and assessment of food quality. A considerable number of tools and databases have been developed to date for the analysis of data generated by various combinations of chromatography and mass spectrometry. We report here a web portal named KOMICS (The Kazusa Metabolomics Portal), where the tools and databases that we developed are available for free to academic users. KOMICS includes the tools and databases for preprocessing, mining, visualization, and publication of metabolomics data. Improvements in the annotation of unknown metabolites and dissemination of comprehensive metabolomic data are the primary aims behind the development of this portal. For this purpose, PowerGet and FragmentAlign include a manual curation function for the results of metabolite feature alignments. A metadata-specific wiki-based database, Metabolonote, functions as a hub of web resources related to the submitters' work. This feature is expected to increase citation of the submitters' work, thereby promoting data publication. As an example of the practical use of KOMICS, a workflow for a study on Jatropha curcas is presented. 
The tools and databases available at KOMICS should contribute to enhanced production, interpretation, and utilization of metabolomic Big Data. Nozomu Sakurai, Takeshi Ara, Mitsuo Enomoto, Takeshi Motegi, Yoshihiko Morishita, Atsushi Kurabayashi, Yoko Iijima, Yoshiyuki Ogata, Daisuke Nakajima, Hideyuki Suzuki, and Daisuke Shibata Copyright © 2014 Nozomu Sakurai et al. All rights reserved. An Infrastructure to Mine Molecular Descriptors for Ligand Selection on Virtual Screening Wed, 09 Apr 2014 11:34:08 +0000 The receptor-ligand interaction evaluation is one important step in rational drug design. The databases that provide the structures of the ligands are growing on a daily basis. This makes it impossible to test all the ligands for a target receptor. Hence, a ligand selection before testing the ligands is needed. One possible approach is to evaluate a set of molecular descriptors. With the aim of describing the characteristics of promising compounds for a specific receptor we introduce a data warehouse-based infrastructure to mine molecular descriptors for virtual screening (VS). We performed experiments that consider as target the receptor HIV-1 protease and different compounds for this protein. A set of 9 molecular descriptors are taken as the predictive attributes and the free energy of binding is taken as a target attribute. By applying the J48 algorithm over the data we obtain decision tree models that achieved up to 84% of accuracy. The models indicate which molecular descriptors and their respective values are relevant to influence good FEB results. Using their rules we performed ligand selection on ZINC database. Our results show important reduction in ligands selection to be applied in VS experiments; for instance, the best selection model picked only 0.21% of the total amount of drug-like ligands. Vinicius Rosa Seus, Giovanni Xavier Perazzo, Ana T. Winck, Adriano V. Werhli, and Karina S. Machado Copyright © 2014 Vinicius Rosa Seus et al. 
All rights reserved. An Intelligent Clinical Decision Support System for Patient-Specific Predictions to Improve Cervical Intraepithelial Neoplasia Detection Wed, 09 Apr 2014 08:12:50 +0000 Nowadays, there are molecular biology techniques providing information related to cervical cancer and its cause, the human papillomavirus (HPV), including DNA microarrays identifying HPV subtypes, mRNA techniques such as nucleic acid based amplification or flow cytometry identifying E6/E7 oncogenes, and immunocytochemistry techniques such as overexpression of p16. Each of these techniques has its own performance, limitations, and advantages; thus, a combinatorial approach via computational intelligence methods could exploit the benefits of each method and produce more accurate results. In this article we propose a clinical decision support system (CDSS), composed of artificial neural networks, that intelligently combines the results of classic and ancillary techniques to improve diagnostic accuracy. We evaluated this method on 740 cases with complete series of cytological assessment, molecular tests, and colposcopy examination. The CDSS demonstrated high sensitivity (89.4%), high specificity (97.1%), high positive predictive value (89.4%), and high negative predictive value (97.1%) for detecting cervical intraepithelial neoplasia grade 2 or worse (CIN2+). In comparison to the tests involved in this study and their combinations, the CDSS produced the most balanced results in terms of sensitivity, specificity, PPV, and NPV. The proposed system may reduce the referral rate for colposcopy and guide personalised management and therapeutic interventions. Panagiotis Bountris, Maria Haritou, Abraham Pouliakis, Niki Margari, Maria Kyrgiou, Aris Spathis, Asimakis Pappas, Ioannis Panayiotides, Evangelos A. Paraskevaidis, Petros Karakitsos, and Dimitrios-Dionyssios Koutsouris Copyright © 2014 Panagiotis Bountris et al. All rights reserved. 
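As a reading aid for the four figures reported in the CDSS abstract above, the following minimal sketch shows how sensitivity, specificity, PPV, and NPV are derived from a confusion matrix. The class name, function name, and counts are hypothetical and chosen only for illustration; they are not the study's data. The example also shows that sensitivity equals PPV and specificity equals NPV whenever the numbers of false positives and false negatives coincide, which is one way the matching pairs in the abstract could arise (an inference, not something the abstract states).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary diagnostic test against a reference standard."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the four standard diagnostic accuracy measures as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # diseased cases that test positive
        "specificity": c.tn / (c.tn + c.fp),  # healthy cases that test negative
        "ppv": c.tp / (c.tp + c.fp),          # positive results that are correct
        "npv": c.tn / (c.tn + c.fn),          # negative results that are correct
    }


# Hypothetical counts (NOT taken from the study); fp == fn by construction.
example = ConfusionCounts(tp=90, fp=10, tn=290, fn=10)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```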
Supervised Clustering Based on DPClusO: Prediction of Plant-Disease Relations Using Jamu Formulas of KNApSAcK Database Mon, 07 Apr 2014 14:04:55 +0000 Indonesia has the largest medicinal plant species in the world and these plants are used as Jamu medicines. Jamu medicines are popular traditional medicines from Indonesia and we need to systemize the formulation of Jamu and develop basic scientific principles of Jamu to meet the requirement of Indonesian Healthcare System. We propose a new approach to predict the relation between plant and disease using network analysis and supervised clustering. At the preliminary step, we assigned 3138 Jamu formulas to 116 diseases of International Classification of Diseases (ver. 10) which belong to 18 classes of disease from National Center for Biotechnology Information. The correlation measures between Jamu pairs were determined based on their ingredient similarity. Networks are constructed and analyzed by selecting highly correlated Jamu pairs. Clusters were then generated by using the network clustering algorithm DPClusO. By using matching score of a cluster, the dominant disease and high frequency plant associated to the cluster are determined. The plant to disease relations predicted by our method were evaluated in the context of previously published results and were found to produce around 90% successful predictions. Sony Hartono Wijaya, Husnawati Husnawati, Farit Mochamad Afendi, Irmanida Batubara, Latifah K. Darusman, Md. Altaf-Ul-Amin, Tetsuo Sato, Naoaki Ono, Tadao Sugiura, and Shigehiko Kanaya Copyright © 2014 Sony Hartono Wijaya et al. All rights reserved. Combining Haar Wavelet and Karhunen Loeve Transforms for Medical Images Watermarking Mon, 07 Apr 2014 08:14:41 +0000 This paper presents a novel watermarking method, applied to the medical imaging domain, used to embed the patient’s data into the corresponding image or set of images used for the diagnosis. 
The main objective behind the proposed technique is to perform the watermarking of the medical images in such a way that the three main attributes of the hidden information (i.e., imperceptibility, robustness, and integration rate) can be jointly improved as much as possible. These attributes determine the effectiveness of the watermark, its resistance to external attacks, and the integration rate. In order to improve the robustness, a combination of the characteristics of Discrete Wavelet and Karhunen Loeve Transforms is proposed. The Karhunen Loeve Transform is applied on the subblocks (sized ) of the different wavelet coefficients (in the HL2, LH2, and HH2 subbands). In this manner, the watermark will be adapted according to the energy values of each of the Karhunen Loeve components, with the aim of ensuring a better watermark extraction under various types of attacks. For the correct identification of inserted data, an Error Correcting Code (ECC) mechanism is required to check and, if possible, correct errors introduced into the inserted data. Concerning the enhancement of the imperceptibility factor, the main goal is to determine the optimal value of the visibility factor, which depends on several parameters of the DWT and the KLT transforms. As a first step, a Fuzzy Inference System (FIS) has been set up and then applied to determine an initial visibility factor value. Several features extracted from the cooccurrence matrix are used as input to the FIS to determine an initial visibility factor for each block; these values are subsequently reweighted as a function of the eigenvalues extracted from each subblock. Regarding the integration rate, previous works insert one bit per coefficient. In our proposal, the data to be hidden are integrated at 3 bits per coefficient, so that the integration rate increases by a factor of 3. 
Mohamed Ali Hajjaji, El-Bay Bourennane, Abdessalem Ben Abdelali, and Abdellatif Mtibaa Copyright © 2014 Mohamed Ali Hajjaji et al. All rights reserved. A Novel Feature Selection Strategy for Enhanced Biomedical Event Extraction Using the Turku System Sun, 06 Apr 2014 07:51:12 +0000 Feature selection is of paramount importance for text-mining classifiers with high-dimensional features. The Turku Event Extraction System (TEES), which relies heavily on high-dimensional features, is the best performing tool in the GENIA BioNLP 2009/2011 shared tasks. This paper describes research which, based on an implementation of an accumulated effect evaluation (AEE) algorithm applying the greedy search strategy, analyses the contribution of every single feature class in TEES with a view to identifying important features and modifying the feature set accordingly. With an updated feature set, a new system is acquired with enhanced performance which achieves an increased F-score of 53.27%, up from 51.21%, for Task 1 under strict evaluation criteria and 57.24% according to the approximate span and recursive criterion. Jingbo Xia, Alex Chengyu Fang, and Xing Zhang Copyright © 2014 Jingbo Xia et al. All rights reserved. A Novel Bioinformatics Method for Efficient Knowledge Discovery by BLSOM from Big Genomic Sequence Data Thu, 03 Apr 2014 13:31:48 +0000 With the remarkable increase in genomic sequence data for a wide range of species, novel tools are needed for comprehensive analyses of the big sequence data. Self-Organizing Map (SOM) is an effective tool for clustering and visualizing high-dimensional data such as oligonucleotide composition on one map. By modifying the conventional SOM, we have previously developed Batch-Learning SOM (BLSOM), which allows classification of sequence fragments according to species, solely depending on the oligonucleotide composition. In the present study, we introduce the oligonucleotide BLSOM used for characterization of vertebrate genome sequences. 
We first analyzed pentanucleotide compositions in 100 kb sequences derived from a wide range of vertebrate genomes and then the compositions in the human and mouse genomes in order to investigate an efficient method for detecting differences between closely related genomes. BLSOM can recognize the species-specific key combination of oligonucleotide frequencies in each genome, which is called a “genome signature,” and the regions specifically enriched in transcription-factor-binding sequences. Because its classification and visualization power is very high, BLSOM is an efficient and powerful tool for extracting a wide range of information from massive amounts of genomic sequences (i.e., big sequence data). Yu Bai, Yuki Iwasaki, Shigehiko Kanaya, Yue Zhao, and Toshimichi Ikemura Copyright © 2014 Yu Bai et al. All rights reserved. msiDBN: A Method of Identifying Critical Proteins in Dynamic PPI Networks Wed, 02 Apr 2014 12:56:21 +0000 The dynamics of protein-protein interactions (PPIs) reveal the recondite principles of biological processes inside a cell. As shown in a wealth of studies, only a small group of proteins, rather than the majority, plays essential roles at crucial points of biological processes. The present work focuses on identifying these critical proteins, which exhibit dramatic structural changes in dynamic PPI networks. First, a comprehensive way of modeling the dynamic PPIs is presented which simultaneously analyzes the activity of proteins and assembles the dynamic coregulation correlation between proteins at each time point. Second, a novel method named msiDBN is proposed, which models a common representation of multiple PPI networks using a deep belief network framework and analyzes the reconstruction errors and the variabilities across the time courses in the biological process. Experiments were conducted on yeast cell cycle data. 
We evaluated our network construction method by comparing the functional representations of the derived networks with two other traditional construction methods. The ranking results of critical proteins in msiDBN were compared with the results from the baseline methods. The results of comparison showed that msiDBN had better reconstruction rate and identified more proteins of critical value to yeast cell cycle process. Yuan Zhang, Nan Du, Kang Li, Jinchao Feng, Kebin Jia, and Aidong Zhang Copyright © 2014 Yuan Zhang et al. All rights reserved. Applied Graph-Mining Algorithms to Study Biomolecular Interaction Networks Wed, 02 Apr 2014 11:57:36 +0000 Protein-protein interaction (PPI) networks carry vital information on the organization of molecular interactions in cellular systems. The identification of functionally relevant modules in PPI networks is one of the most important applications of biological network analysis. Computational analysis is becoming an indispensable tool to understand large-scale biomolecular interaction networks. Several types of computational methods have been developed and employed for the analysis of PPI networks. Of these computational methods, graph comparison and module detection are the two most commonly used strategies. This review summarizes current literature on graph kernel and graph alignment methods for graph comparison strategies, as well as module detection approaches including seed-and-extend, hierarchical clustering, optimization-based, probabilistic, and frequent subgraph methods. Herein, we provide a comprehensive review of the major algorithms employed under each theme, including our recently published frequent subgraph method, for detecting functional modules commonly shared across multiple cancer PPI networks. Ru Shen and Chittibabu Guda Copyright © 2014 Ru Shen and Chittibabu Guda. All rights reserved. 
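Of the module detection strategies listed in the review above, seed-and-extend is the simplest to illustrate. The sketch below is a generic greedy variant written for this feed, not any specific published algorithm: it grows a module from a seed protein and admits a neighbour only while the module's weighted density (sum of internal edge weights divided by the number of possible pairs) stays at or above a threshold. The function names, the density definition, the threshold of 0.5, and the toy network are all assumptions made for illustration.

```python
from itertools import combinations


def weighted_density(nodes, weights):
    """Sum of edge weights inside `nodes` divided by the number of node pairs."""
    n = len(nodes)
    if n < 2:
        return 0.0
    total = sum(weights.get(frozenset(pair), 0.0) for pair in combinations(nodes, 2))
    return total / (n * (n - 1) / 2)


def seed_and_extend(seed, weights, min_density=0.5):
    """Greedily grow a module from `seed` while weighted density >= min_density."""
    # Build an adjacency map from the undirected weighted edge list.
    neighbors = {}
    for edge in weights:
        a, b = tuple(edge)
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    module = {seed}
    while True:
        # Candidates are all nodes adjacent to the current module.
        frontier = set().union(*(neighbors.get(v, set()) for v in module)) - module
        best, best_density = None, -1.0
        for candidate in sorted(frontier):  # sorted for deterministic tie-breaking
            density = weighted_density(module | {candidate}, weights)
            if density >= min_density and density > best_density:
                best, best_density = candidate, density
        if best is None:
            return module
        module.add(best)


# Toy weighted PPI network (hypothetical confidence scores).
toy_weights = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.8,
    frozenset(("B", "C")): 0.7,
    frozenset(("C", "D")): 0.1,
}
module = seed_and_extend("A", toy_weights)
print(sorted(module))
```

In this toy network the weakly connected protein D is left out because adding it would pull the weighted density below the threshold.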
An Unsupervised Approach to Predict Functional Relations between Genes Based on Expression Data Mon, 31 Mar 2014 07:16:16 +0000 This work presents a novel approach to predict functional relations between genes using gene expression data. Genes may have various types of relations between them, for example, regulatory relations, or they may be concerned with the same protein complex or metabolic/signaling pathways and obviously gene expression data should contain some clues to such relations. The present approach first digitizes the log-ratio type gene expression data of S. cerevisiae to a matrix consisting of 1, 0, and −1 indicating highly expressed, no major change, and highly suppressed conditions for genes, respectively. For each gene pair, a probability density mass function table is constructed indicating nine joint probabilities. Then gene pairs were selected based on linear and probabilistic relation between their profiles indicated by the sum of probability density masses in selected points. The selected gene pairs share many Gene Ontology terms. Furthermore a network is constructed by selecting a large number of gene pairs based on FDR analysis and the clustering of the network generates many modules rich with similar function genes. Also, the promoters of the gene sets in many modules are rich with binding sites of known transcription factors indicating the effectiveness of the proposed approach in predicting regulatory relations. Md. Altaf-Ul-Amin, Tetsuo Katsuragi, Tetsuo Sato, Naoaki Ono, and Shigehiko Kanaya Copyright © 2014 Md. Altaf-Ul-Amin et al. All rights reserved. Protein Sequence Classification with Improved Extreme Learning Machine Algorithms Sun, 30 Mar 2014 09:04:21 +0000 Precisely classifying a protein sequence from a large biological protein sequences database plays an important role for developing competitive pharmacological products. 
Conventional methods, which compare the unseen sequence with all the identified protein sequences and return the category index of the protein with the highest similarity score, are usually time-consuming. Therefore, it is urgent and necessary to build an efficient protein sequence classification system. In this paper, we study the performance of protein sequence classification using single hidden layer feedforward neural networks (SLFNs). The recent efficient extreme learning machine (ELM) and its variants are utilized as the training algorithms. The optimal pruned ELM (OP-ELM) is first employed for protein sequence classification in this paper. To further enhance the performance, an ensemble-based SLFN structure is constructed in which multiple SLFNs with the same number of hidden nodes and the same activation function are used as ensemble members. For each ensemble, the same training algorithm is adopted. The final category index is derived using the majority voting method. Two approaches, namely, the basic ELM and the OP-ELM, are adopted for the ensemble-based SLFNs. The performance is analyzed and compared with several existing methods using datasets obtained from the Protein Information Resource center. The experimental results show the superiority of the proposed algorithms. Jiuwen Cao and Lianglin Xiong Copyright © 2014 Jiuwen Cao and Lianglin Xiong. All rights reserved. Association between ε2/3/4, Promoter Polymorphism (−491A/T, −427T/C, and −219T/G) at the Apolipoprotein E Gene, and Mental Retardation in Children from an Iodine Deficiency Area, China Tue, 25 Mar 2014 12:55:09 +0000 Background. Several common single-nucleotide polymorphisms (SNPs) at apolipoprotein E (ApoE) have been linked with late onset sporadic Alzheimer’s disease and declining normative cognitive ability in elderly people, but their relationship with cognition in children is unclear. Results. 
We studied the −491A/T, −427T/C, and −219T/G promoter polymorphisms and ε2/ε3/ε4 at ApoE among children with mental retardation (MR), borderline MR, and controls from an iodine deficiency area in China. The allelic and genotypic distributions of the individual loci did not differ significantly among the three groups with the Mantel-Haenszel test. However, frequencies of haplotype of /// were distributed as MR > borderline MR > controls (uncorrected P = 0.004), indicating that the presence of this haplotype may increase the risk of disease. Conclusions. In this large population-based study in children, we did not find any significant association between any single locus of the four common ApoE polymorphisms (−491A/T, −427T/C, −219T/G, and ε2/ε3/ε4) and MR or borderline MR. However, we found that the presence of ATT haplotype was associated with an increased risk of MR and borderline MR. Our present work may help enlarge our knowledge of the cognitive role of ApoE across the lifespan and the mechanisms of human cognition. Jun Li, Fuchang Zhang, Yunliang Wang, Yan Wang, Wei Qin, Qinghe Xing, Xueqing Qian, Tingwei Guo, Xiaocai Gao, Lin He, and Jianjun Gao Copyright © 2014 Jun Li et al. All rights reserved. Survey of Network-Based Approaches to Research of Cardiovascular Diseases Thu, 20 Mar 2014 08:21:55 +0000 Cardiovascular diseases (CVDs) are the leading health problem worldwide. Investigating the causes and mechanisms of CVDs calls for an integrative approach that takes into account their complex etiology. Biological networks generated from available data on biomolecular interactions are an excellent platform for understanding the interconnectedness of all processes within a living cell, including processes that underlie diseases. Consequently, the topology of biological networks has successfully been used for identifying genes, pathways, and modules that govern molecular actions underlying various complex diseases. 
Here, we review approaches that explore and use relationships between topological properties of biological networks and mechanisms underlying CVDs. Anida Sarajlić and Nataša Pržulj Copyright © 2014 Anida Sarajlić and Nataša Pržulj. All rights reserved. New Strategies for Evaluation and Analysis of SELEX Experiments Wed, 19 Mar 2014 13:58:14 +0000 Aptamers are an interesting alternative to antibodies in pharmaceutics and biosensorics, because they are able to bind to a multitude of possible target molecules with high affinity. Therefore the process of finding such aptamers, which is commonly a SELEX screening process, becomes crucial. The standard SELEX procedure schedules the validation of certain found aptamers via binding experiments, which is not leading to any detailed specification of the aptamer enrichment during the screening. For the purpose of advanced analysis of the accrued enrichment within the SELEX library we used sequence information gathered by next generation sequencing techniques in addition to the standard SELEX procedure. As sequence motifs are one possibility of enrichment description, the need of finding those recurring sequence motifs corresponding to substructures within the aptamers, which are characteristically fitted to specific binding sites of the target, arises. In this paper a motif search algorithm is presented, which helps to describe the aptamers enrichment in more detail. The extensive characterization of target and binding aptamers may later reveal a functional connection between these molecules, which can be modeled and used to optimize future SELEX runs in case of the generation of target-specific starting libraries. Rico Beier, Elke Boschke, and Dirk Labudde Copyright © 2014 Rico Beier et al. All rights reserved. 
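The SELEX abstract above does not describe its motif search algorithm, so the following is only a generic stand-in showing one simple way recurring sequence motifs can be tracked across selection rounds: count, for each k-mer, the fraction of sequences containing it in an early and a late pool, and report k-mers whose fraction rises by at least a chosen fold. The function names, the pseudocount, the fold threshold, and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmer_support(sequences, k):
    """For each k-mer, count the number of sequences containing it at least once."""
    support = Counter()
    for seq in sequences:
        # A set ensures each k-mer is counted once per sequence.
        support.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return support


def enriched_motifs(early_round, late_round, k, min_fold=2.0):
    """Return {k-mer: fold change} for k-mers enriched in the late round."""
    early = kmer_support(early_round, k)
    late = kmer_support(late_round, k)
    result = {}
    for kmer, count in late.items():
        late_frac = count / len(late_round)
        # Pseudocount of one sequence avoids division by zero for unseen k-mers.
        early_frac = (early.get(kmer, 0) + 1) / (len(early_round) + 1)
        fold = late_frac / early_frac
        if fold >= min_fold:
            result[kmer] = fold
    return result


# Toy pools (hypothetical sequences, not experimental data).
early_pool = ["ACGTAC", "TTGACA", "CCATGG", "GATTAC"]
late_pool = ["GGGATT", "AGGGAC", "TGGGAA", "GGGTCA"]
motifs = enriched_motifs(early_pool, late_pool, k=3)
top_motif = max(motifs, key=motifs.get)
print(top_motif, round(motifs[top_motif], 2))
```

Counting support per sequence rather than total occurrences keeps a single repetitive sequence from dominating the ranking; a real analysis of sequencing data would also need to handle read depth and sequencing errors.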
Essential Functional Modules for Pathogenic and Defensive Mechanisms in Candida albicans Infections Tue, 18 Mar 2014 12:20:46 +0000 The clinical and biological significance of the study of fungal pathogen Candida albicans (C. albicans) has markedly increased. However, the explicit pathogenic and invasive mechanisms of such host-pathogen interactions have not yet been fully elucidated. Therefore, the essential functional modules involved in C. albicans-zebrafish interactions were investigated in this study. Adopting a systems biology approach, the early-stage and late-stage protein-protein interaction (PPI) networks for both C. albicans and zebrafish were constructed. By comparing PPI networks at the early and late stages of the infection process, several critical functional modules were identified in both pathogenic and defensive mechanisms. Functional modules in C. albicans, like those involved in hyphal morphogenesis, ion and small molecule transport, protein secretion, and shifts in carbon utilization, were seen to play important roles in pathogen invasion and damage caused to host cells. Moreover, the functional modules in zebrafish, such as those involved in immune response, apoptosis mechanisms, ion transport, protein secretion, and hemostasis-related processes, were found to be significant as defensive mechanisms during C. albicans infection. The essential functional modules thus determined could provide insights into the molecular mechanisms of host-pathogen interactions during the infection process and thereby devise potential therapeutic strategies to treat C. albicans infection. Yu-Chao Wang, I-Chun Tsai, Che Lin, Wen-Ping Hsieh, Chung-Yu Lan, Yung-Jen Chuang, and Bor-Sen Chen Copyright © 2014 Yu-Chao Wang et al. All rights reserved. A Diverse Stochastic Search Algorithm for Combination Therapeutics Wed, 12 Mar 2014 00:00:00 +0000 Background. 
Design of drug combination cocktails to maximize sensitivity for individual patients presents a challenge in terms of minimizing the number of experiments to attain the desired objective. The enormous number of possible drug combinations constrains exhaustive experimentation approaches, and personal variations in genetic diseases restrict the use of prior knowledge in optimization. Results. We present a stochastic search algorithm that consists of a parallel experimentation phase followed by a combination of focused and diversified sequential search. We evaluated our approach on seven synthetic examples, four of which were evaluated twice with different parameters, and on two biological examples of bacterial and lung cancer cell inhibition response to combination drugs. The performance of our approach was superior to that of the recently proposed adaptive reference update approach for all the examples considered, achieving an average 45% reduction in the number of experimental iterations. Conclusions. As the results illustrate, the proposed diverse stochastic search algorithm can produce optimized combinations in a relatively small number of iterative steps. This approach can be combined with available knowledge on the genetic makeup of the patient to design optimal selection of drug cocktails. Mehmet Umut Caglar and Ranadip Pal Copyright © 2014 Mehmet Umut Caglar and Ranadip Pal. All rights reserved. Visualization of Genome Signatures of Eukaryote Genomes by Batch-Learning Self-Organizing Map with a Special Emphasis on Drosophila Genomes Tue, 11 Mar 2014 09:27:17 +0000 A strategy of evolutionary studies that can compare vast numbers of genome sequences is becoming increasingly important with the remarkable progress of high-throughput DNA sequencing methods. 
We previously established a sequence alignment-free clustering method “BLSOM” for di-, tri-, and tetranucleotide compositions in genome sequences, which can characterize sequence characteristics (genome signatures) of a wide range of species. In the present study, we generated BLSOMs for tetra- and pentanucleotide compositions in approximately one million sequence fragments derived from 101 eukaryotes, for which almost complete genome sequences were available. BLSOM recognized phylotype-specific characteristics (e.g., key combinations of oligonucleotide frequencies) in the genome sequences, permitting phylotype-specific clustering of the sequences without any information regarding the species. In our detailed examination of 12 Drosophila species, the correlation between their phylogenetic classification and the classification on the BLSOMs was observed to visualize oligonucleotides diagnostic for species-specific clustering. Takashi Abe, Yuta Hamano, and Toshimichi Ikemura Copyright © 2014 Takashi Abe et al. All rights reserved. Exact and Heuristic Methods for Network Completion for Time-Varying Genetic Networks Sun, 09 Mar 2014 11:48:52 +0000 Robustness in biological networks can be regarded as an important feature of living systems. A system maintains its functions against internal and external perturbations, leading to topological changes in the network with varying delays. To understand the flexibility of biological networks, we propose a novel approach to analyze time-dependent networks, based on the framework of network completion, which aims to make the minimum amount of modifications to a given network so that the resulting network is most consistent with the observed data. We have developed a novel network completion method for time-varying networks by extending our previous method for the completion of stationary networks. In particular, we introduce a double dynamic programming technique to identify change time points and required modifications. 
Although this extended method allows us to guarantee the optimality of the solution, it has relatively low computational efficiency. To resolve this difficulty, we developed a heuristic method for speeding up the calculation of minimum least squares errors. We demonstrate the effectiveness of our proposed methods through computational experiments using synthetic data and real microarray gene expression data. The results indicate that our methods exhibit good performance in terms of completing and inferring gene association networks with time-varying structures. Natsu Nakajima and Tatsuya Akutsu Copyright © 2014 Natsu Nakajima and Tatsuya Akutsu. All rights reserved. Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks Thu, 06 Mar 2014 13:34:51 +0000 Biomedical Named Entity Recognition (BNER), which extracts important entities such as genes and proteins, is a crucial step of natural language processing in the biomedical domain. Various machine learning-based approaches have been applied to BNER tasks and have shown good performance. In this paper, we systematically investigated three different types of word representation (WR) features for BNER, including clustering-based representation, distributional representation, and word embeddings. We selected one algorithm from each of the three types of WR features and applied them to the JNLPBA and BioCreAtIvE II BNER tasks. Our results showed that all three WR algorithms were beneficial to machine learning-based BNER systems. Moreover, combining these different types of WR features further improved BNER performance, indicating that they are complementary to each other. By combining all three types of WR features, the improvements in F-measure on the BioCreAtIvE II GM and JNLPBA corpora were 3.75% and 1.39%, respectively, when compared with the systems using baseline features. 
To the best of our knowledge, this is the first study to systematically evaluate the effect of three different types of WR features for BNER tasks. Buzhou Tang, Hongxin Cao, Xiaolong Wang, Qingcai Chen, and Hua Xu Copyright © 2014 Buzhou Tang et al. All rights reserved. Identifying Gastric Cancer Related Genes Using the Shortest Path Algorithm and Protein-Protein Interaction Network Wed, 05 Mar 2014 16:35:58 +0000 Gastric cancer, one of the leading causes of cancer related deaths worldwide, causes about 800,000 deaths per year. The mechanism underlying this disease has not yet been fully uncovered. Identifying genes related to this disease is an important step that can help to understand its underlying mechanism and thereby to design effective treatments. In this study, some novel gastric cancer related genes were discovered based on the knowledge of known gastric cancer related ones. These genes were searched by applying the shortest path algorithm in a protein-protein interaction network. The analysis results suggest that some of them are indeed involved in the biological process of gastric cancer, which indicates that they are, with high probability, actual gastric cancer related genes. We hope that the findings in this study may help promote the study of this disease and that the methods can provide new insights for studying various diseases. Yang Jiang, Yang Shu, Ying Shi, Li-Peng Li, Fei Yuan, and Hui Ren Copyright © 2014 Yang Jiang et al. All rights reserved. TF2LncRNA: Identifying Common Transcription Factors for a List of lncRNA Genes from ChIP-Seq Data Tue, 04 Mar 2014 07:37:38 +0000 High-throughput genomic technologies like lncRNA microarray and RNA-Seq often generate a set of lncRNAs of interest, yet little is known about the transcriptional regulation of the set of lncRNA genes. 
Here, based on ChIP-Seq peak lists of transcription factors (TFs) from ENCODE and annotated human lncRNAs from GENCODE, we developed a web-based interface titled “TF2lncRNA,” where TF peaks from each ChIP-Seq experiment are crossed with the genomic coordinates of a set of input lncRNAs, to identify which TFs present a statistically significant number of binding sites (peaks) within the regulatory region of the input lncRNA genes. The input can be a set of coexpressed lncRNA genes or any other cluster of lncRNA genes. Users can thus infer which TFs are likely to be common transcription regulators of the set of lncRNAs. In addition, users can retrieve all lncRNAs potentially regulated by a specific TF in a specific cell line of interest or retrieve all TFs that have one or more binding sites in the regulatory region of a given lncRNA in the specific cell line. TF2LncRNA is an efficient and easy-to-use web-based tool. Qinghua Jiang, Jixuan Wang, Yadong Wang, Rui Ma, Xiaoliang Wu, and Yu Li Copyright © 2014 Qinghua Jiang et al. All rights reserved. Comparative Metagenomic Analysis of Human Gut Microbiome Composition Using Two Different Bioinformatic Pipelines Tue, 25 Feb 2014 09:12:11 +0000 Technological advances in next-generation sequencing-based approaches have greatly impacted the analysis of microbial community composition. In particular, 16S rRNA-based methods have been widely used to analyze the whole set of bacteria present in a target environment. As a consequence, several specific bioinformatic pipelines have been developed to manage these data. MetaGenome Rapid Annotation using Subsystem Technology (MG-RAST) and Quantitative Insights Into Microbial Ecology (QIIME) are two freely available tools for metagenomic analyses that have been used in a wide range of studies. Here, we report the comparative analysis of the same dataset with both QIIME and MG-RAST in order to evaluate their accuracy in taxonomic assignment and in diversity analysis. 
We found that taxonomic assignment was more accurate with QIIME which, at family level, assigned a significantly higher number of reads. Thus, QIIME generated a more accurate BIOM file, which in turn improved the diversity analysis output. Finally, although informatics skills are needed to install QIIME, it offers a wide range of metrics that are useful for downstream applications and, not less important, it is not dependent on server times. Valeria D’Argenio, Giorgio Casaburi, Vincenza Precone, and Francesco Salvatore Copyright © 2014 Valeria D’Argenio et al. All rights reserved. Approaches for Recognizing Disease Genes Based on Network Mon, 24 Feb 2014 07:40:35 +0000 Diseases are closely related to genes, thus indicating that genetic abnormalities may lead to certain diseases. The recognition of disease genes has long been a goal in biology, which may contribute to the improvement of health care and understanding gene functions, pathways, and interactions. However, few large-scale gene-gene association datasets, disease-disease association datasets, and gene-disease association datasets are available. A number of machine learning methods have been used to recognize disease genes based on networks. This paper states the relationship between disease and gene, summarizes the approaches used to recognize disease genes based on network, analyzes the core problems and challenges of the methods, and outlooks future research direction. Quan Zou, Jinjin Li, Chunyu Wang, and Xiangxiang Zeng Copyright © 2014 Quan Zou et al. All rights reserved. Predicting Glycerophosphoinositol Identities in Lipidomic Datasets Using VaLID (Visualization and Phospholipid Identification)—An Online Bioinformatic Search Engine Thu, 20 Feb 2014 09:59:05 +0000 The capacity to predict and visualize all theoretically possible glycerophospholipid molecular identities present in lipidomic datasets is currently limited. 
To address this issue, we expanded the search-engine and compositional databases of the online Visualization and Phospholipid Identification (VaLID) bioinformatic tool to include the glycerophosphoinositol superfamily. VaLID v1.0.0 originally allowed exact and average mass libraries of 736,584 individual species from eight phospholipid classes: glycerophosphates, glyceropyrophosphates, glycerophosphocholines, glycerophosphoethanolamines, glycerophosphoglycerols, glycerophosphoglycerophosphates, glycerophosphoserines, and cytidine 5′-diphosphate 1,2-diacyl-sn-glycerols to be searched for any mass to charge value (with adjustable tolerance levels) under a variety of mass spectrometry conditions. Here, we describe an update that now includes all possible glycerophosphoinositols, glycerophosphoinositol monophosphates, glycerophosphoinositol bisphosphates, and glycerophosphoinositol trisphosphates. This update expands the total number of lipid species represented in the VaLID v2.0.0 database to 1,473,168 phospholipids. Each phospholipid can be generated in skeletal representation. A subset of species curated by the Canadian Institutes of Health Research Training Program in Neurodegenerative Lipidomics (CTPNL) team is provided as an array of high-resolution structures. VaLID is freely available and responds to all users through the CTPNL resources web site. Graeme S. V. McDowell, Alexandre P. Blanchard, Graeme P. Taylor, Daniel Figeys, Stephen Fai, and Steffany A. L. Bennett Copyright © 2014 Graeme S. V. McDowell et al. All rights reserved. Integrative Analysis of miRNA-mRNA and miRNA-miRNA Interactions Wed, 12 Feb 2014 15:17:43 +0000 MicroRNAs (miRNAs) are small, noncoding regulatory molecules. They are involved in many essential biological processes and act by suppressing gene expression. The present work reports an integrative analysis of miRNA-mRNA and miRNA-miRNA interactions and their regulatory patterns using high-throughput miRNA and mRNA datasets. 
Aberrantly expressed miRNA and mRNA profiles were obtained based on fold change analysis, and qRT-PCR was used for further validation of deregulated miRNAs. miRNAs and target mRNAs were found to show various expression patterns. miRNA-miRNA interactions and clustered/homologous miRNAs were also found to contribute to the flexible and selective regulatory network. Interacting miRNAs (e.g., miRNA-103a and miR-103b) showed more pronounced differences in expression, which suggests the potential “restricted interaction” in the miRNA world. miRNAs from the same gene clusters (e.g., miR-23b gene cluster) or gene families (e.g., miR-10 gene family) always showed the same types of deregulation patterns, although they sometimes differed in expression levels. These clustered and homologous miRNAs may have close functional relationships, which may indicate collaborative interactions between miRNAs. The integrative analysis of miRNA-mRNA based on biological characteristics of miRNA will further enrich miRNA study. Li Guo, Yang Zhao, Sheng Yang, Hui Zhang, and Feng Chen Copyright © 2014 Li Guo et al. All rights reserved. Network-Assisted Prediction of Potential Drugs for Addiction Sun, 09 Feb 2014 12:25:55 +0000 Drug addiction is a chronic and complex brain disease, adding much burden on the community. Though numerous efforts have been made to identify the effective treatment, it is necessary to find more novel therapeutics for this complex disease. As network pharmacology has become a promising approach for drug repurposing, we proposed to apply the approach to drug addiction, which might provide new clues for the development of effective addiction treatment drugs. We first extracted 44 addictive drugs from the NIDA and their targets from DrugBank. Then, we constructed two networks: an addictive drug-target network and an expanded addictive drug-target network by adding other drugs that have at least one common target with these addictive drugs. 
By performing network analyses, we found that those addictive drugs with similar actions tended to cluster together. Additionally, we predicted 94 nonaddictive drugs with potential pharmacological functions to the addictive drugs. By examining the PubMed data, we found that 51 drugs cooccurred with addictive keywords significantly more often than expected. Thus, the network analyses provide a list of candidate drugs for further investigation of their potential in addiction treatment or risk. Jingchun Sun, Liang-Chin Huang, Hua Xu, and Zhongming Zhao Copyright © 2014 Jingchun Sun et al. All rights reserved. Erratum to “New Optical Methods for Liveness Detection on Fingers” Sun, 02 Feb 2014 13:42:02 +0000 Martin Drahansky, Michal Dolezel, Jan Vana, Eva Brezinova, Jaegeol Yim, and Kyubark Shim Copyright © 2014 Martin Drahansky et al. All rights reserved. A Novel Approach for Discovering Condition-Specific Correlations of Gene Expressions within Biological Pathways by Using Cloud Computing Technology Wed, 22 Jan 2014 17:16:42 +0000 Microarrays are widely used to assess gene expressions. Most microarray studies focus primarily on identifying differential gene expressions between conditions (e.g., cancer versus normal cells), for discovering the major factors that cause diseases. Because previous studies have not identified the correlations of differential gene expression between conditions, crucial but abnormal regulations that cause diseases might have been disregarded. This paper proposes an approach for discovering the condition-specific correlations of gene expressions within biological pathways. Because analyzing gene expression correlations is time consuming, an Apache Hadoop cloud computing platform was implemented. Three microarray data sets of breast cancer were collected from the Gene Expression Omnibus, and pathway information from the Kyoto Encyclopedia of Genes and Genomes was applied for discovering meaningful biological correlations. 
The results showed that adopting the Hadoop platform considerably decreased the computation time. Several correlations of differential gene expressions were discovered between the relapse and nonrelapse breast cancer samples, and most of them were involved in cancer regulation and cancer-related pathways. The results showed that breast cancer recurrence might be highly associated with the abnormal regulations of these gene pairs, rather than with their individual expression levels. The proposed method was computationally efficient and reliable, and stable results were obtained when different data sets were used. The proposed method is effective in identifying meaningful biological regulation patterns between conditions. Tzu-Hao Chang, Shih-Lin Wu, Wei-Jen Wang, Jorng-Tzong Horng, and Cheng-Wei Chang Copyright © 2014 Tzu-Hao Chang et al. All rights reserved. Microsatellites in the Genome of the Edible Mushroom, Volvariella volvacea Sun, 19 Jan 2014 00:00:00 +0000 Using bioinformatics software and database, we have characterized the microsatellite pattern in the V. volvacea genome and compared it with microsatellite patterns found in the genomes of four other edible fungi: Coprinopsis cinerea, Schizophyllum commune, Agaricus bisporus, and Pleurotus ostreatus. A total of 1346 microsatellites have been identified, with mono-nucleotides being the most frequent motif. The relative abundance of microsatellites was lower in coding regions with 21 No./Mb. However, the microsatellites in the V. volvacea gene models showed a greater tendency to be located in the CDS regions. There was also a higher preponderance of trinucleotide repeats, especially in the kinase genes, which implied a possible role in phenotypic variation. Among the five fungal genomes, microsatellite abundance appeared to be unrelated to genome size. Furthermore, the short motifs (mono- to tri-nucleotides) outnumbered other categories although these differed in proportion. 
Data analysis indicated a possible relationship between the most frequent microsatellite types and the genetic distance between the five fungal genomes. Ying Wang, Mingjie Chen, Hong Wang, Jing-Fang Wang, and Dapeng Bao Copyright © 2014 Ying Wang et al. All rights reserved. Integration of High-Volume Molecular and Imaging Data for Composite Biomarker Discovery in the Study of Melanoma Thu, 16 Jan 2014 16:36:04 +0000 In this work, the effects of simple imputations are studied with regard to the integration of multimodal data originating from different patients. Two separate datasets of cutaneous melanoma are used: an image analysis (dermoscopy) dataset and a transcriptomic one, specifically DNA microarrays. Each modality is related to a different set of patients, and four imputation methods are employed to form a unified, integrative dataset. The application of backward selection together with ensemble classifiers (random forests), followed by principal components analysis and linear discriminant analysis, illustrates the implications of the imputations for feature selection and dimensionality reduction methods. The results suggest that the expansion of the feature space through data integration, achieved by the exploitation of imputation schemes in general, aids the classification task and imparts stability to the derivation of putative classifiers. In particular, although the biased imputation methods significantly increase the predictive performance and the class discrimination of the datasets, they still contribute to the study of prominent features and their relations. The fusion of separate datasets, which provide a multimodal description of the same pathology, represents an innovative, promising avenue, enhancing robust composite biomarker derivation and promoting the interpretation of the biomedical problem studied. 
Konstantinos Moutselos, Ilias Maglogiannis, and Aristotelis Chatziioannou Copyright © 2014 Konstantinos Moutselos et al. All rights reserved. Network Analysis of Neurodegenerative Disease Highlights a Role of Toll-Like Receptor Signaling Thu, 16 Jan 2014 13:33:49 +0000 Despite significant advances in the study of the molecular mechanisms altered in the development and progression of neurodegenerative diseases (NDs), the etiology is still enigmatic and the distinctions between diseases are not always entirely clear. We present an efficient computational method based on a protein-protein interaction (PPI) network to model the functional network of NDs. The aim of this work is fourfold: (i) reconstruction of a PPI network relating to the NDs, (ii) construction of an association network between diseases based on proximity in the disease PPI network, (iii) quantification of disease associations, and (iv) inference of potential molecular mechanisms involved in the diseases. The functional links of diseases not only showed overlap with the traditional classification in clinical settings, but also offered new insight into connections between diseases with limited clinical overlap. To gain an expanded view of the molecular mechanisms involved in NDs, both direct and indirect connector proteins were investigated. The method uncovered molecular relationships that are common to apparently distinct diseases and provided important insight into the molecular networks implicated in disease pathogenesis. In particular, the current analysis highlighted the Toll-like receptor signaling pathway as a potential candidate pathway to be targeted by therapy in neurodegeneration. Thanh-Phuong Nguyen, Laura Caberlotto, Melissa J. Morine, and Corrado Priami Copyright © 2014 Thanh-Phuong Nguyen et al. All rights reserved. 
Computational Analysis of Transcriptional Circuitries in Human Embryonic Stem Cells Reveals Multiple and Independent Networks Thu, 09 Jan 2014 14:26:11 +0000 It has been known that three core transcription factors (TFs), NANOG, OCT4, and SOX2, collaborate to form a transcriptional circuitry to regulate pluripotency and self-renewal of human embryonic stem (ES) cells. Similarly, MYC also plays an important role in regulating pluripotency and self-renewal of human ES cells. However, the precise mechanism by which the transcriptional regulatory networks control the activity of ES cells remains unclear. In this study, we reanalyzed an extended core network, which includes the set of genes that are cobound by the three core TFs and additional TFs that also bind to these cobound genes. Our results show that beyond the core transcriptional network, additional transcriptional networks are potentially important in the regulation of the fate of human ES cells. Several gene families that encode TFs play a key role in the transcriptional circuitry of ES cells. We also demonstrate that MYC acts independently of the core module in the regulation of the fate of human ES cells, consistent with the established argument. We find that TP53 is a key connecting molecule between the core-centered and MYC-centered modules. This study provides additional insights into the underlying regulatory mechanisms involved in the fate determination of human ES cells. Xiaosheng Wang and Chittibabu Guda Copyright © 2014 Xiaosheng Wang and Chittibabu Guda. All rights reserved. De Novo Assembly and Characterization of Sophora japonica Transcriptome Using RNA-seq Thu, 02 Jan 2014 11:42:06 +0000 Sophora japonica Linn (Chinese Scholar Tree) is a shrub species belonging to the subfamily Faboideae of the pea family Fabaceae. In this study, RNA sequencing of S. japonica transcriptome was performed to produce large expression datasets for functional genomic analysis. 
Approximately 86.1 million high-quality clean reads were generated and assembled de novo into 143010 unique transcripts and 57614 unigenes. The average length of unigenes was 901 bps with an N50 of 545 bps. Four public databases, including the NCBI nonredundant protein (NR), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG), and the Cluster of Orthologous Groups (COG), were used to annotate unigenes through the NCBI BLAST procedure. A total of 27541 of 57614 unigenes (47.8%) were annotated for gene descriptions, conserved protein domains, or gene ontology. Moreover, an interaction network of unigenes in S. japonica was predicted based on known protein-protein interactions of putative orthologs of well-studied plant genomes. The transcriptome data of S. japonica reported here represent the first genome-scale investigation of gene expressions in Faboideae plants. We expect that our study will provide a useful resource for further studies on gene expression, genomics, functional genomics, and protein-protein interaction in S. japonica. Liucun Zhu, Ying Zhang, Wenna Guo, Xin-Jian Xu, and Qiang Wang Copyright © 2014 Liucun Zhu et al. All rights reserved. Application of Systems Biology and Bioinformatics Methods in Biochemistry and Biomedicine Tue, 31 Dec 2013 11:31:30 +0000 Yudong Cai, Tao Huang, Lei Chen, and Bin Niu Copyright © 2013 Yudong Cai et al. All rights reserved. HGF Accelerates Wound Healing by Promoting the Dedifferentiation of Epidermal Cells through β1-Integrin/ILK Pathway Mon, 30 Dec 2013 13:52:26 +0000 Skin wound healing is a critical and complex biological process after trauma. This process is activated by signaling pathways of both epithelial and nonepithelial cells, which release a myriad of different cytokines and growth factors. Hepatocyte growth factor (HGF) is a cytokine known to play multiple roles during the various stages of wound healing. 
This study evaluated the benefits of HGF on reepithelialization during wound healing and investigated its mechanisms of action. Gross and histological results showed that HGF significantly accelerated reepithelialization in diabetic (DB) rats. HGF increased the expression of the cell adhesion molecule β1-integrin and the cytoskeleton remodeling protein integrin-linked kinase (ILK) in epidermal cells in vivo and in vitro. Silencing of ILK gene expression by RNA interference reduced expression of β1-integrin, ILK, and c-met in epidermal cells, concomitantly decreasing the proliferation and migration ability of epidermal cells. β1-Integrin can be an important marker of poorly differentiated epidermal cells. Therefore, these data demonstrate that epidermal cells enter a poorly differentiated state and regain some characteristics of epidermal stem cells under the influence of HGF after wounding. Taken together, the results provide evidence that HGF can accelerate reepithelialization in skin wound healing by dedifferentiation of epidermal cells in a manner related to the β1-integrin/ILK pathway. Jin-Feng Li, Hai-Feng Duan, Chu-Tse Wu, Da-Jin Zhang, Youping Deng, Hong-Lei Yin, Bing Han, Hui-Cui Gong, Hong-Wei Wang, and Yun-Liang Wang Copyright © 2013 Jin-Feng Li et al. All rights reserved. Prediction of Substrate-Enzyme-Product Interaction Based on Molecular Descriptors and Physicochemical Properties Sun, 22 Dec 2013 18:10:05 +0000 It is important to correctly and efficiently predict substrate-enzyme interactions and their products in metabolic pathways. In this work, a novel approach was introduced to encode substrate/product and enzyme molecules with molecular descriptors and physicochemical properties, respectively. Based on this encoding method, KNN was adopted to build the substrate-enzyme-product interaction network. 
After selecting the optimal features that are able to represent the main factors of substrate-enzyme-product interaction in our prediction, a total of 160 features out of 290 were obtained, which can be clustered into ten categories: elemental analysis, geometry, chemistry, amino acid composition, predicted secondary structure, hydrophobicity, polarizability, solvent accessibility, normalized van der Waals volume, and polarity. As a result, our prediction model achieved an MCC of 0.423 and an overall prediction accuracy of 89.1% in a 10-fold cross-validation test. Bing Niu, Guohua Huang, Linfeng Zheng, Xueyuan Wang, Fuxue Chen, Yuhui Zhang, and Tao Huang Copyright © 2013 Bing Niu et al. All rights reserved. Identification of Age-Related Macular Degeneration Related Genes by Applying Shortest Path Algorithm in Protein-Protein Interaction Network Wed, 18 Dec 2013 12:38:15 +0000 This study attempted to find novel age-related macular degeneration (AMD) related genes based on 36 known AMD genes. The well-known shortest path algorithm, Dijkstra’s algorithm, was applied to find the shortest path connecting each pair of known AMD related genes in a protein-protein interaction (PPI) network. The genes occurring in any shortest path were considered as candidate AMD related genes. As a result, 125 novel AMD genes were predicted. Further analysis based on betweenness and a permutation test indicates that 10 of these genes are involved in the formation or development of AMD and may, with high probability, be actual AMD related genes. We hope that this contribution will promote the study of age-related macular degeneration and the discovery of novel effective treatments. Jian Zhang, Min Jiang, Fei Yuan, Kai-Yan Feng, Yu-Dong Cai, Xun Xu, and Lei Chen Copyright © 2013 Jian Zhang et al. All rights reserved. Biometrics and Biosecurity 2013 Tue, 10 Dec 2013 13:42:09 +0000 Tai-hoon Kim, Sabah Mohammed, and Wai-Chi Fang Copyright © 2013 Tai-hoon Kim et al. All rights reserved. 
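As an illustration of the shortest-path search described in the age-related macular degeneration abstract above, the following is a minimal sketch rather than the authors' implementation. It assumes the PPI network is given as a dictionary of non-negative edge weights (the abstract does not say how edges were weighted), and the names `shortest_path` and `candidate_genes` are hypothetical.

```python
import heapq
from itertools import combinations


def shortest_path(graph, source, target):
    """Dijkstra's algorithm. graph maps node -> {neighbor: non-negative weight}.

    Returns one shortest path as a list of nodes, or None if target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        if node == target:
            break
        for nbr, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return None
    # Walk the predecessor links back from target to source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def candidate_genes(graph, known):
    """Collect genes lying strictly inside a shortest path between known genes."""
    known = set(known)
    candidates = set()
    for a, b in combinations(sorted(known), 2):
        path = shortest_path(graph, a, b)
        if path:
            candidates.update(g for g in path[1:-1] if g not in known)
    return candidates
```

This sketch keeps only one shortest path per pair when several tie, so a fuller treatment would need to enumerate all tied paths before any downstream filtering such as the betweenness and permutation analysis mentioned in the abstract.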
iEzy-Drug: A Web Server for Identifying the Interaction between Enzymes and Drugs in Cellular Networking Tue, 26 Nov 2013 18:00:45 +0000 With the features of extremely high selectivity and efficiency in catalyzing almost all the chemical reactions in cells, enzymes play vitally important roles for the life of an organism and hence have become frequent targets for drug design. An essential step in developing drugs by targeting enzymes is to identify drug-enzyme interactions in cells. It is both time-consuming and costly to do this by experimental techniques alone. Although some computational methods have been developed in this regard based on knowledge of the three-dimensional structure of the enzyme, their usage is quite limited because the three-dimensional structures of many enzymes are still unknown. Here, we report a sequence-based predictor, called “iEzy-Drug,” in which each drug compound was formulated by a molecular fingerprint with 258 feature components, each enzyme by Chou’s pseudo amino acid composition generated via incorporating sequential evolution information and physicochemical features derived from its sequence, and the prediction engine was operated by the fuzzy K-nearest neighbor algorithm. The overall success rate achieved by iEzy-Drug via rigorous cross-validations was about 91%. Moreover, to maximize the convenience for the majority of experimental scientists, a user-friendly web server was established, by which users can easily obtain their desired results. Jian-Liang Min, Xuan Xiao, and Kuo-Chen Chou Copyright © 2013 Jian-Liang Min et al. All rights reserved. Multiple Biomarker Panels for Early Detection of Breast Cancer in Peripheral Blood Tue, 26 Nov 2013 14:26:09 +0000 Detecting breast cancer at early stages can be challenging. Traditional mammography and tissue microarray that have been studied for early breast cancer detection and prediction have many drawbacks. 
Therefore, there is a need for more reliable diagnostic tools for early detection of breast cancer. In this paper, we present a five-marker panel approach based on SVM for early detection of breast cancer in peripheral blood and show how to use SVM to model the classification and prediction problem of early detection of breast cancer in peripheral blood. We found that the five-marker panel can improve the prediction performance (area under curve) in the testing data set from 0.5826 to 0.7879. Further pathway analysis showed that the top four five-marker panels are associated with signaling, steroid hormones, metabolism, immune system, and hemostasis, which is consistent with previous findings. Our prediction model can serve as a general model for multibiomarker panel discovery in early detection of other cancers. Fan Zhang, Youping Deng, and Renee Drabier Copyright © 2013 Fan Zhang et al. All rights reserved. Gene Prioritization of Resistant Rice Gene against Xanthomas oryzae pv. oryzae by Using Text Mining Technologies Mon, 25 Nov 2013 16:01:48 +0000 To effectively assess the likelihood that an unknown rice protein is resistant to Xanthomonas oryzae pv. oryzae, a hybrid strategy is proposed to enhance gene prioritization by combining text mining technologies with a sequence-based approach. The text mining technique of term frequency inverse document frequency is used to measure the importance of distinguished terms which reflect biomedical activity in rice before candidate genes are screened and vital terms are produced. Afterwards, a built-in classifier under the chaos game representation algorithm is used to sieve the best possible candidate gene. Our experimental results show that the combination of these two methods achieves enhanced gene prioritization. Jingbo Xia, Xing Zhang, Daojun Yuan, Lingling Chen, Jonathan Webster, and Alex Chengyu Fang Copyright © 2013 Jingbo Xia et al. All rights reserved. 
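The term frequency inverse document frequency weighting mentioned in the rice gene prioritization abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: documents are assumed to be pre-tokenized lists of terms, the plain `tf * log(N / df)` variant is used (the abstract does not say which variant was applied), and the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. Weight = (count / doc length) * log(N / df),
    where df is the number of documents containing the term.
    """
    n_docs = len(docs)
    # Document frequency: count each term once per document.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

With this variant, a term that appears in every document gets weight zero, so only terms that distinguish some documents from others rank highly.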
QSBR Study of Bitter Taste of Peptides: Application of GA-PLS in Combination with MLR, SVM, and ANN Approaches Mon, 25 Nov 2013 08:38:58 +0000 Detailed information about the relationships between structures and properties/activities of peptides as drugs and nutrients is useful in the development of drugs and functional foods containing peptides as active compounds. The bitterness of the peptides is an undesirable property which should be reduced during drug/nutrient production, and quantitative structure bitter taste relationship (QSBR) studies can help researchers to design less bitter peptides with higher target efficiency. Calculated structural parameters were used to develop three different QSBR models (i.e., multiple linear regression, support vector machine, and artificial neural network) to predict the bitterness of 229 peptides (containing 2–12 amino acids, obtained from the literature). The developed models were validated using internal and external validation methods, and the prediction errors were checked using mean percentage deviation and absolute average error values. All developed models predicted the activities successfully (with prediction errors less than experimental error values), and the prediction errors for nonlinear methods were lower than those for linear methods. The selected structural descriptors successfully differentiated between bitter and nonbitter peptides. Somaieh Soltani, Hossein Haghaei, Ali Shayanfar, Javad Vallipour, Karim Asadpour Zeynali, and Abolghasem Jouyban Copyright © 2013 Somaieh Soltani et al. All rights reserved. Expression Sensitivity Analysis of Human Disease Related Genes Sun, 24 Nov 2013 11:16:16 +0000 Background. Genome-wide association studies (GWAS) have shown revolutionary power in identifying genetic loci that influence complex diseases. Thousands of replicated loci for common traits are helpful in disease risk assessment.
However, it is currently still difficult to identify the variations in these loci that directly cause disease susceptibility by disrupting the expression or function of a protein. Results. We evaluate the expression features of disease related genes and find that different disease related genes show different expression perturbation sensitivities in various conditions. It is worth noting that the expression of some robust disease-genes does not change significantly in their corresponding diseases; these genes might be easily overlooked in expression profile analysis. Conclusion. Gene ontology enrichment analysis indicates that robust disease-genes execute essential functions in comparison with sensitive disease-genes. The diseases associated with robust genes seem to be relatively lethal, like cancer and aging. On the other hand, the diseases associated with sensitive genes are apparently nonlethal, like psychiatric and chemical dependency diseases. Liang-Xiao Ma, Ya-Jun Wang, Jing-Fang Wang, Xuan Li, and Pei Hao Copyright © 2013 Liang-Xiao Ma et al. All rights reserved. Translational Biomedical Informatics and Computational Systems Medicine Thu, 21 Nov 2013 14:39:08 +0000 Zhongming Zhao, Bairong Shen, Xinghua Lu, and Wanwipa Vongsangnak Copyright © 2013 Zhongming Zhao et al. All rights reserved. An Improved Biometrics-Based Remote User Authentication Scheme with User Anonymity Thu, 21 Nov 2013 13:09:31 +0000 The authors review the biometrics-based user authentication scheme proposed by An in 2012. The authors show that there exist loopholes in the scheme which are detrimental to its security. Therefore the authors propose an improved scheme eradicating the flaws of An’s scheme. Then a detailed security analysis of the proposed scheme is presented followed by its efficiency comparison. The proposed scheme not only withstands security problems found in An’s scheme but also provides some extra features with the addition of only two hash operations.
The proposed scheme allows user to freely change his password and also provides user anonymity with untraceability. Muhammad Khurram Khan and Saru Kumari Copyright © 2013 Muhammad Khurram Khan and Saru Kumari. All rights reserved. Prediction of Drugs Target Groups Based on ChEBI Ontology Wed, 20 Nov 2013 17:06:28 +0000 Most drugs have beneficial as well as adverse effects and exert their biological functions by adjusting and altering the functions of their target proteins. Thus, knowledge of drugs target proteins is essential for the improvement of therapeutic effects and mitigation of undesirable side effects. In the study, we proposed a novel prediction method based on drug/compound ontology information extracted from ChEBI to identify drugs target groups from which the kind of functions of a drug may be deduced. By collecting data in KEGG, a benchmark dataset consisting of 876 drugs, categorized into four target groups, was constructed. To evaluate the method more thoroughly, the benchmark dataset was divided into a training dataset and an independent test dataset. It is observed by jackknife test that the overall prediction accuracy on the training dataset was 83.12%, while it was 87.50% on the test dataset—the predictor exhibited an excellent generalization. The good performance of the method indicates that the ontology information of the drugs contains rich information about their target groups, and the study may become an inspiration to solve the problems of this sort and bridge the gap between ChEBI ontology and drugs target groups. Yu-Fei Gao, Lei Chen, Guo-Hua Huang, Tao Zhang, Kai-Yan Feng, Hai-Peng Li, and Yang Jiang Copyright © 2013 Yu-Fei Gao et al. All rights reserved. Identifying Breast Cancer Subtype Related miRNAs from Two Constructed miRNAs Interaction Networks in Silico Method Wed, 20 Nov 2013 08:32:57 +0000 Background. 
It is known that microRNAs (miRNAs) regulate the expression of multiple proteins and therefore are likely to emerge as more effective targets of selective therapeutic modalities for breast cancer. Although recent lines of evidence have shown that miRNAs are associated with the most common molecular breast cancer subtypes, subtype-specific miRNAs have not been well characterized. Objectives. In this study, we propose an in silico method to identify breast cancer subtype related miRNAs based on two constructed miRNAs interaction networks using miRNA-mRNA dual expression profiling data arising from the same samples. Methods. Firstly, we used a new mutual information estimation method to construct two miRNAs interaction networks based on miRNA-mRNA dual expression profiling data. Secondly, we compared and analyzed the topological properties of these two networks. Finally, miRNAs showing outstanding topological properties in both networks were identified. Results. Further functional analysis and literature evidence confirm that the identified potential breast cancer subtype related miRNAs are essential to unraveling their biological function. Conclusions. This study provides a new in silico method to predict candidate miRNAs of breast cancer subtypes at the systems biology level and can support functional studies of important breast cancer subtype related miRNAs. Lin Hua, Lin Li, and Ping Zhou Copyright © 2013 Lin Hua et al. All rights reserved. DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis Sun, 17 Nov 2013 10:21:45 +0000 Analysis of genome-scale gene networks (GNs) using large-scale gene expression data provides unprecedented opportunities to uncover gene interactions and regulatory networks involved in various biological processes and developmental programs, leading to accelerated discovery of novel knowledge of various biological processes, pathways and systems.
The widely used context likelihood of relatedness (CLR) method based on the mutual information (MI) for scoring the similarity of gene pairs is one of the accurate methods currently available for inferring GNs. However, the MI-based reverse engineering method can achieve satisfactory performance only when the sample size exceeds one hundred. This in turn limits its application to GN construction from expression data sets with small sample sizes. We developed a high performance web server, DeGNServer, to reverse engineer and decipher genome-scale networks. It extends the CLR method by integrating different correlation methods that are suitable for analyzing data sets ranging from moderate to large scale, such as expression profiles with tens to hundreds of microarray hybridizations, and implements all analysis algorithms using parallel computing techniques to infer gene-gene associations at extraordinary speed. In addition, we integrated the SNBuilder and GeNa algorithms for subnetwork extraction and functional module discovery. DeGNServer is publicly and freely available online. Jun Li, Hairong Wei, and Patrick Xuechun Zhao Copyright © 2013 Jun Li et al. All rights reserved. A Systems’ Biology Approach to Study MicroRNA-Mediated Gene Regulatory Networks Sun, 17 Nov 2013 09:00:43 +0000 MicroRNAs (miRNAs) are potent effectors in gene regulatory networks where aberrant miRNA expression can contribute to human diseases such as cancer. For a better understanding of the regulatory role of miRNAs in coordinating gene expression, we here present a systems biology approach combining data-driven modeling and model-driven experiments. Such an approach is characterized by an iterative process, including biological data acquisition and integration, network construction, mathematical modeling and experimental validation. To demonstrate the application of this approach, we adopt it to investigate mechanisms of collective repression on p21 by multiple miRNAs.
We first construct a p21 regulatory network based on data from the literature and further expand it using algorithms that predict molecular interactions. Based on the network structure, a detailed mechanistic model is established and its parameter values are determined using data. Finally, the calibrated model is used to study the effect of different miRNA expression profiles and cooperative target regulation on p21 expression levels in different biological contexts. Xin Lai, Animesh Bhattacharya, Ulf Schmitz, Manfred Kunz, Julio Vera, and Olaf Wolkenhauer Copyright © 2013 Xin Lai et al. All rights reserved. Novel Natural Structure Corrector of ApoE4 for Checking Alzheimer’s Disease: Benefits from High Throughput Screening and Molecular Dynamics Simulations Wed, 13 Nov 2013 08:27:06 +0000 A major genetic suspect for Alzheimer’s disease is the pathological conformation assumed by apolipoprotein E4 (ApoE4) through intramolecular interaction. In the present study, a large library of natural compounds was screened against ApoE4 to identify novel therapeutic molecules that can prevent ApoE4 from being converted to its pathological conformation. We report two such natural compounds PHC and IAH that bound to the active site of ApoE4 during the docking process. The binding analysis suggested that they have a strong mechanistic ability to correct the pathological structural orientation of ApoE4 by preventing repulsion between Arg 61 and Arg 112, thus inhibiting the formation of a salt bridge between Arg 61 and Glu 255. However, when the molecular dynamics simulations were carried out, structural changes in the PHC-bound complex forced PHC to move out of the cavity thus destabilizing the complex. However, IAH was structurally stable inside the binding pocket throughout the simulations trajectory. 
Our simulation results indicate that the initial receptor-ligand interaction observed after docking could be limited due to the receptor rigid docking algorithm and that the conformations and interactions observed after simulation runs are more energetically favored and should be better representations of derivative poses in the receptor. Manisha Goyal, Sonam Grover, Jaspreet Kaur Dhanjal, Sukriti Goyal, Chetna Tyagi, Sajeev Chacko, and Abhinav Grover Copyright © 2013 Manisha Goyal et al. All rights reserved. Efficient Haplotype Block Partitioning and Tag SNP Selection Algorithms under Various Constraints Mon, 11 Nov 2013 14:36:46 +0000 Patterns of linkage disequilibrium play a central role in genome-wide association studies aimed at identifying genetic variation responsible for common human diseases. These patterns in human chromosomes show a block-like structure, and regions of high linkage disequilibrium are called haplotype blocks. A small subset of SNPs, called tag SNPs, is sufficient to capture the haplotype patterns in each haplotype block. Previously developed algorithms completely partition a haplotype sample into blocks while attempting to minimize the number of tag SNPs. However, when resource limitations prevent genotyping all the tag SNPs, it is desirable to restrict their number. We propose two dynamic programming algorithms, incorporating many diversity evaluation functions, for haplotype block partitioning using a limited number of tag SNPs. We use the proposed algorithms to partition the chromosome 21 haplotype data. When the sample is fully partitioned into blocks by our algorithms, the 2,266 blocks and 3,260 tag SNPs are fewer than those identified by previous studies. We also demonstrate that our algorithms find the optimal solution by exploiting the nonmonotonic property of a common haplotype-evaluation function. Wen-Pei Chen, Che-Lun Hung, and Yaw-Ling Lin Copyright © 2013 Wen-Pei Chen et al. All rights reserved.
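The idea of partitioning haplotypes into blocks under a tag SNP budget, as described in the abstract above, can be illustrated with a small dynamic program. The sketch below is not the authors' algorithm. It assumes a simplified, hypothetical cost model in which a block with p distinct haplotype patterns needs max(1, ceil(log2 p)) tag SNPs, and a hypothetical diversity constraint `max_patterns`; the function name `max_coverage` is likewise invented for the example. It maximizes the number of SNPs covered by disjoint blocks whose total tag SNP cost stays within the budget.

```python
import math


def max_coverage(haps, budget, max_patterns=4):
    """Max number of SNPs covered by disjoint blocks using <= budget tag SNPs.

    haps: equal-length strings, one per haplotype (e.g. "0011").
    Assumed cost model: a block with p distinct patterns costs
    max(1, ceil(log2(p))) tag SNPs; blocks with more than
    max_patterns patterns are not allowed.
    """
    n = len(haps[0])
    # best[j][t]: best coverage within the first j SNPs using at most t tags.
    best = [[0] * (budget + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for t in range(budget + 1):
            best[j][t] = best[j - 1][t]  # leave SNP j-1 uncovered
        for i in range(j):
            patterns = len({h[i:j] for h in haps})
            if patterns > max_patterns:
                continue
            cost = max(1, math.ceil(math.log2(patterns)))
            for t in range(cost, budget + 1):
                # End a block at SNPs [i, j) and pay its tag SNP cost.
                best[j][t] = max(best[j][t], best[i][t - cost] + (j - i))
    return best[n][budget]


# Hypothetical toy sample: four haplotypes over four SNPs.
haps = ["0000", "0011", "1100", "1111"]
coverage_with_one_tag = max_coverage(haps, budget=1)
coverage_with_two_tags = max_coverage(haps, budget=2)
```

With this toy sample, one tag SNP is enough to cover a two-SNP block, and two tag SNPs cover all four SNPs. A realistic implementation would replace the assumed cost model with a proper diversity evaluation function and tag SNP selection step.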
QPLOT: A Quality Assessment Tool for Next Generation Sequencing Data Mon, 11 Nov 2013 11:16:47 +0000 Background. Next generation sequencing (NGS) is being widely used to identify genetic variants associated with human disease. Although the approach is cost effective, the underlying data is susceptible to many types of error. Since NGS technologies and protocols are rapidly evolving, with constantly changing steps ranging from sample preparation to data processing software updates, it is important to enable researchers to routinely assess the quality of sequencing and alignment data prior to downstream analyses. Results. Here we describe QPLOT, an automated tool that can facilitate the quality assessment of sequencing run performance. Taking standard sequence alignments as input, QPLOT generates a series of diagnostic metrics summarizing run quality and produces convenient graphical summaries for these metrics. QPLOT is computationally efficient, generates webpages for interactive exploration of detailed results, and can handle the joint output of many sequencing runs. Conclusion. QPLOT is an automated tool that facilitates assessment of sequence run quality. We routinely apply QPLOT to ensure quick detection of sequencing run problems. We hope that QPLOT will be useful to the community as well. Bingshan Li, Xiaowei Zhan, Mary-Kate Wing, Paul Anderson, Hyun Min Kang, and Goncalo R. Abecasis Copyright © 2013 Bingshan Li et al. All rights reserved. A Comparative Analysis of Biomarker Selection Techniques Sun, 10 Nov 2013 09:15:14 +0000 Feature selection has become an essential step in biomarker discovery from high-dimensional genomics data. It is recognized that different feature selection techniques may result in different sets of biomarkers, that is, different groups of genes highly correlated to a given pathological condition, but few direct comparisons exist which quantify these differences in a systematic way.
In this paper, we propose a general methodology for comparing the outcomes of different selection techniques in the context of biomarker discovery. The comparison is carried out along two dimensions: (i) measuring the similarity/dissimilarity of selected gene sets; (ii) evaluating the implications of these differences in terms of both predictive performance and stability of selected gene sets. As a case study, we considered three benchmarks deriving from DNA microarray experiments and conducted a comparative analysis among eight selection methods, representatives of different classes of feature selection techniques. Our results show that the proposed approach can provide useful insight about the pattern of agreement of biomarker discovery techniques. Nicoletta Dessì, Emanuele Pascariello, and Barbara Pes Copyright © 2013 Nicoletta Dessì et al. All rights reserved. Prediction of Gene Phenotypes Based on GO and KEGG Pathway Enrichment Scores Thu, 07 Nov 2013 14:53:49 +0000 Observing what phenotype the overexpression or knockdown of a gene can cause is the basic method of investigating gene functions. Many advanced biotechnologies, such as RNAi, were developed to study gene phenotypes, but many limitations remain. Besides the time and cost, the knockdown of some genes may be lethal, which makes the observation of other phenotypes impossible. For ethical and technological reasons, the knockdown of genes in complex species, such as mammals, is extremely difficult. Thus, we proposed a new sequence-based computational method, called the kNNA-based method, for gene phenotype prediction. Unlike traditional sequence-based computational methods, our method regards the multiple phenotypes as a whole network, which can rank the possible phenotypes associated with the query protein and shows a more comprehensive view of the protein's biological effects.
According to the prediction results for yeast, we also find some more related features, including GO and KEGG information, which contribute more to identifying protein phenotypes. This method can be applied in gene phenotype prediction in other species. Tao Zhang, Min Jiang, Lei Chen, Bing Niu, and Yudong Cai Copyright © 2013 Tao Zhang et al. All rights reserved. ASPic-GeneID: A Lightweight Pipeline for Gene Prediction and Alternative Isoforms Detection Thu, 07 Nov 2013 13:15:40 +0000 New genomes are being sequenced at an increasingly rapid rate, far outpacing the rate at which manual gene annotation can be performed. Automated genome annotation is thus necessitated by this growth in genome projects; however, full-fledged annotation systems are usually home-grown and customized to a particular genome. There is thus a renewed need for accurate ab initio gene prediction methods. However, it is apparent that fully ab initio methods fall short of the required level of sensitivity and specificity for a quality annotation. Evidence in the form of expressed sequences gives the single biggest improvement in accuracy when used to inform gene predictions. Here, we present a lightweight pipeline for first-pass gene prediction on newly sequenced genomes. The two main components are ASPic, a program that derives highly accurate, albeit not necessarily complete, EST-based transcript annotations from EST alignments, and GeneID, a standard gene prediction program, which we have modified to take intron annotations as evidence. The introns output by ASPic CDS predictions are given to GeneID to constrain the exon-chaining process and produce predictions consistent with the underlying EST alignments. The pipeline was successfully tested on the entire C. elegans genome and the 44 ENCODE human pilot regions. Tyler Alioto, Ernesto Picardi, Roderic Guigó, and Graziano Pesole Copyright © 2013 Tyler Alioto et al. All rights reserved.
Comparative Study of Exome Copy Number Variation Estimation Tools Using Array Comparative Genomic Hybridization as Control Mon, 04 Nov 2013 14:18:32 +0000 Exome sequencing using next-generation sequencing technologies is a cost-efficient approach to selectively sequencing coding regions of the human genome for detection of disease variants. One of the lesser known yet important applications of exome sequencing data is to identify copy number variation (CNV). There have been many exome CNV tools developed over the last few years, but the performance and accuracy of these programs have not been thoroughly evaluated. In this study, we systematically compared four popular exome CNV tools (CoNIFER, cn.MOPS, exomeCopy, and ExomeDepth) and evaluated their effectiveness against array comparative genomic hybridization (array CGH) platforms. We found that exome CNV tools are capable of identifying CNVs, but they can have problems such as high false positives, low sensitivity, and duplication bias when compared to array CGH platforms. While exome CNV tools do serve their purpose for data mining, careful evaluation and additional validation are highly recommended. Based on all these results, we recommend CoNIFER and cn.MOPS for nonpaired exome CNV detection over the other two tools due to a low false-positive rate, although none of the four exome CNV tools performed at an outstanding level when compared to array CGH. Yan Guo, Quanghu Sheng, David C. Samuels, Brian Lehmann, Joshua A. Bauer, Jennifer Pietenpol, and Yu Shyr Copyright © 2013 Yan Guo et al. All rights reserved. Enabling Large-Scale Biomedical Analysis in the Cloud Thu, 31 Oct 2013 09:08:49 +0000 Recent progress in high-throughput instrumentations has led to an astonishing growth in both volume and complexity of biomedical data collected from various sources. The planet-size data bring serious challenges to the storage and computing technologies.
Cloud computing offers an alternative because it addresses both storage and high-performance computing on large-scale data. This work briefly introduces the data intensive computing system and summarizes existing cloud-based resources in bioinformatics. These developments and applications would help biomedical research make the vast amount of diverse data meaningful and usable. Ying-Chih Lin, Chin-Sheng Yu, and Yen-Jen Lin Copyright © 2013 Ying-Chih Lin et al. All rights reserved. Classifying Human Voices by Using Hybrid SFX Time-Series Preprocessing and Ensemble Feature Selection Tue, 29 Oct 2013 15:28:36 +0000 Voice is a biometric characteristic that differs between individuals. Due to this uniqueness, voice classification has found useful applications in classifying speakers’ gender, mother tongue or ethnicity (accent), emotion states, identity verification, verbal command control, and so forth. In this paper, we adopt a new preprocessing method named Statistical Feature Extraction (SFX) for extracting important features in training a classification model, based on piecewise transformation treating an audio waveform as a time-series. Using SFX we can faithfully remodel statistical characteristics of the time-series; together with spectral analysis, a substantial number of features are extracted in combination. An ensemble is utilized in selecting only the influential features to be used in classification model induction. We focus on the comparison of effects of various popular data mining algorithms on multiple datasets. Our experiment consists of classification tests over four typical categories of human voice data, namely, Female and Male, Emotional Speech, Speaker Identification, and Language Recognition.
The experiments yield encouraging results supporting the view that heuristically choosing significant features from both time and frequency domains indeed produces better performance in voice classification than traditional signal processing techniques alone, like wavelets and LPC-to-CC. Simon Fong, Kun Lan, and Raymond Wong Copyright © 2013 Simon Fong et al. All rights reserved. New aQTL SNPs for the CYP2D6 Identified by a Novel Mediation Analysis of Genome-Wide SNP Arrays, Gene Expression Arrays, and CYP2D6 Activity Tue, 22 Oct 2013 09:11:04 +0000 Background. The genome-wide association studies (GWAS) have been successful during the last few years. A key challenge is that the interpretation of the results is not straightforward, especially for trans-acting SNPs. Integration of transcriptome data into GWAS may provide clues elucidating the mechanisms by which a genetic variant leads to a disease. Methods. Here, we developed a novel mediation analysis approach to identify new expression quantitative trait loci (eQTL) driving CYP2D6 activity by combining genotype, gene expression, and enzyme activity data. Results. 389,573 and 1,214,416 SNP-transcript-CYP2D6 activity trios are found strongly associated (, % and 11.7%) for two different genotype platforms, namely, Affymetrix and Illumina, respectively. The majority of eQTLs are trans-SNPs. A single polymorphism leads to widespread downstream changes in the expression of distant genes by affecting major regulators or transcription factors (TFs), which would be visible as an eQTL hotspot and can lead to large and consistent biological effects. Overlapped eQTL hotspots with the mediators lead to the discovery of 64 TFs. Conclusions. Our mediation analysis is a powerful approach in identifying the trans-QTL-phenotype associations. It improves our understanding of the functional genetic variations for the liver metabolism mechanisms.
Guanglong Jiang, Arindom Chakraborty, Zhiping Wang, Malaz Boustani, Yunlong Liu, Todd Skaar, and Lang Li Copyright © 2013 Guanglong Jiang et al. All rights reserved. A Quantitative Analysis of the Impact on Chromatin Accessibility by Histone Modifications and Binding of Transcription Factors in DNase I Hypersensitive Sites Tue, 22 Oct 2013 08:36:13 +0000 It is known that chromatin features such as histone modifications and the binding of transcription factors exert a significant impact on the “openness” of chromatin. In this study, we present a quantitative analysis of the genome-wide relationship between chromatin features and chromatin accessibility in DNase I hypersensitive sites. We found that these features show distinct preference to localize in open chromatin. In order to elucidate the exact impact, we derived quantitative models to directly predict the “openness” of chromatin using histone modification features and transcription factor binding features, respectively. We show that these two types of features are highly predictive for chromatin accessibility in a statistical viewpoint. Moreover, our results indicate that these features are highly redundant and only a small number of features are needed to achieve a very high predictive power. Our study provides new insights into the true biological phenomena and the combinatorial effects of chromatin features to differential DNase I hypersensitivity. Peng Cui, Jing Li, Bo Sun, Menghuan Zhang, Baofeng Lian, Yixue Li, and Lu Xie Copyright © 2013 Peng Cui et al. All rights reserved. A Review for Detecting Gene-Gene Interactions Using Machine Learning Methods in Genetic Epidemiology Mon, 21 Oct 2013 14:59:30 +0000 Recently, the greatest statistical computational challenge in genetic epidemiology is to identify and characterize the genes that interact with other genes and environment factors that bring the effect on complex multifactorial disease. 
These gene-gene interactions are also known as epistasis, a phenomenon that cannot be resolved by traditional statistical methods because of the high dimensionality of the data and the occurrence of multiple polymorphisms. Hence, several machine learning methods, namely neural networks (NNs), support vector machines (SVMs), and random forests (RFs), have been applied to identify susceptibility genes in such common, multifactorial diseases. This paper gives an overview of machine learning methods, describing the methodology of each method and its application in detecting gene-gene and gene-environment interactions. Lastly, this paper discusses the strengths and weaknesses of each machine learning method in detecting gene-gene interactions in complex human disease. Ching Lee Koo, Mei Jing Liew, Mohd Saberi Mohamad, and Abdul Hakim Mohamed Salleh Copyright © 2013 Ching Lee Koo et al. All rights reserved. Systems Approaches to Modeling Chronic Mucosal Inflammation Mon, 21 Oct 2013 09:18:54 +0000 The respiratory mucosa is a major coordinator of the inflammatory response in chronic airway diseases, including asthma and chronic obstructive pulmonary disease (COPD). Signals produced by the chronic inflammatory process induce epithelial mesenchymal transition (EMT) that dramatically alters the epithelial cell phenotype. Although the effects of EMT on epigenetic reprogramming and the activation of transcriptional networks are known, its effects on the innate inflammatory response are underexplored. We used a multiplex gene expression profiling platform to investigate the perturbations of the innate pathways induced by TGFβ in a primary airway epithelial cell model of EMT. EMT had dramatic effects on the induction of the innate pathway and the coupling interval of the canonical and noncanonical NF-κB pathways.
Simulation experiments demonstrate that rapid, coordinated cap-independent translation of TRAF-1 and NF-κB2 is required to reduce the noncanonical pathway coupling interval. Experiments using amantadine confirmed the prediction that TRAF-1 and NF-κB2/p100 production is mediated by an IRES-dependent mechanism. These data indicate that the epigenetic changes produced by EMT induce dynamic state changes of the innate signaling pathway. Further applications of systems approaches will provide understanding of this complex phenotype through deterministic modeling and multidimensional (genomic and proteomic) profiling. Mridul Kalita, Bing Tian, Boning Gao, Sanjeev Choudhary, Thomas G. Wood, Joseph R. Carmical, Istvan Boldogh, Sankar Mitra, John D. Minna, and Allan R. Brasier Copyright © 2013 Mridul Kalita et al. All rights reserved. Statistical Fractal Models Based on GND-PCA and Its Application on Classification of Liver Diseases Wed, 09 Oct 2013 17:37:31 +0000 A new method is proposed to establish a statistical fractal model for liver disease classification. Firstly, fractal theory is used to construct a high-order tensor. Then Generalized N-dimensional Principal Component Analysis (GND-PCA) is used to establish the statistical fractal model and select features from the liver region, with different features given different weights. Finally, a support vector machine optimized by ant colony optimization (ACO-SVM) is used to establish the classifier for the recognition of liver disease. In order to verify the effectiveness of the proposed method, the PCA eigenface method and the normal SVM method are chosen as the contrast methods. The experimental results show that the proposed method can reconstruct liver volume better and improve the classification accuracy of liver diseases. Huiyan Jiang, Tianjiao Feng, Di Zhao, Benqiang Yang, Libo Zhang, and Yenwei Chen Copyright © 2013 Huiyan Jiang et al. All rights reserved.
Reducing the Complexity of Complex Gene Coexpression Networks by Coupling Multiweighted Labeling with Topological Analysis Mon, 07 Oct 2013 18:38:01 +0000 Undirected gene coexpression networks obtained from experimental expression data coupled with efficient computational procedures are increasingly used to identify potentially relevant biological information (e.g., biomarkers) for a particular disease. However, coexpression networks built from experimental expression data are in general large highly connected networks with an elevated number of false-positive interactions (nodes and edges). In order to infer relevant information, the network must be properly filtered and its complexity reduced. Given the complexity and the multivariate nature of the information contained in the network, this requires the development and application of efficient feature selection algorithms to be able to exploit the topological characteristics of the network to identify relevant nodes and edges. This paper proposes an efficient multivariate filtering designed to analyze the topological properties of a coexpression network in order to identify potential relevant genes for a given disease. The algorithm has been tested on three datasets for three well known and studied diseases: acute myeloid leukemia, breast cancer, and diffuse large B-cell lymphoma. Results have been validated resorting to bibliographic data automatically mined using the ProteinQuest literature mining tool. Alfredo Benso, Paolo Cornale, Stefano Di Carlo, Gianfranco Politano, and Alessandro Savino Copyright © 2013 Alfredo Benso et al. All rights reserved. Reconstruction and Analysis of Human Kidney-Specific Metabolic Network Based on Omics Data Sat, 05 Oct 2013 14:29:08 +0000 With the advent of the high-throughput data production, recent studies of tissue-specific metabolic networks have largely advanced our understanding of the metabolic basis of various physiological and pathological processes. 
However, for kidney, which plays an essential role in the body, the available kidney-specific model remains incomplete. This paper reports the reconstruction and characterization of the human kidney metabolic network based on transcriptome and proteome data. In silico simulations revealed that house-keeping genes were more essential than kidney-specific genes in maintaining kidney metabolism. Importantly, a total of 267 potential metabolic biomarkers for kidney-related diseases were successfully explored using this model. Furthermore, we found that the discrepancies in metabolic processes of different tissues are directly corresponding to tissue's functions. Finally, the phenotypes of the differentially expressed genes in diabetic kidney disease were characterized, suggesting that these genes may affect disease development through altering kidney metabolism. Thus, the human kidney-specific model constructed in this study may provide valuable information for the metabolism of kidney and offer excellent insights into complex kidney diseases. Ai-Di Zhang, Shao-Xing Dai, and Jing-Fei Huang Copyright © 2013 Ai-Di Zhang et al. All rights reserved. A Guide RNA Sequence Design Platform for the CRISPR/Cas9 System for Model Organism Genomes Thu, 03 Oct 2013 15:34:20 +0000 Cas9/CRISPR has been reported to efficiently induce targeted gene disruption and homologous recombination in both prokaryotic and eukaryotic cells. Thus, we developed a Guide RNA Sequence Design Platform for the Cas9/CRISPR silencing system for model organisms. The platform is easy to use for gRNA design with input query sequences. It finds potential targets by PAM and ranks them according to factors including uniqueness, SNP, RNA secondary structure, and AT content. The platform allows users to upload and share their experimental results. In addition, most guide RNA sequences from published papers have been put into our database. Ming Ma, Adam Y. 
Ye, Weiguo Zheng, and Lei Kong Copyright © 2013 Ming Ma et al. All rights reserved. Systems Approaches Evaluating the Perturbation of Xenobiotic Metabolism in Response to Cigarette Smoke Exposure in Nasal and Bronchial Tissues Thu, 03 Oct 2013 11:51:13 +0000 Capturing the effects of exposure in a specific target organ is a major challenge in risk assessment. Exposure to cigarette smoke (CS) implicates the field of tissue injury in the lung as well as nasal and airway epithelia. Xenobiotic metabolism in particular becomes an attractive tool for chemical risk assessment because of its responsiveness against toxic compounds, including those present in CS. This study describes an efficient integration from transcriptomic data to quantitative measures, which reflect the responses against xenobiotics that are captured in a biological network model. We show here that our novel systems approach can quantify the perturbation in the network model of xenobiotic metabolism. We further show that this approach efficiently compares the perturbation upon CS exposure in bronchial and nasal epithelial cells in vivo samples obtained from smokers. Our observation suggests the xenobiotic responses in the bronchial and nasal epithelial cells of smokers were similar to those observed in their respective organotypic models exposed to CS. Furthermore, the results suggest that nasal tissue is a reliable surrogate to measure xenobiotic responses in bronchial tissue. Anita R. Iskandar, Florian Martin, Marja Talikka, Walter K. Schlage, Radina Kostadinova, Carole Mathis, Julia Hoeng, and Manuel C. Peitsch Copyright © 2013 Anita R. Iskandar et al. All rights reserved. Biocloud: Cloud Computing for Biological, Genomics, and Drug Design Wed, 02 Oct 2013 15:53:48 +0000 Ching-Hsien Hsu, Chun-Yuan Lin, Ming Ouyang, and Yi Ke Guo Copyright © 2013 Ching-Hsien Hsu et al. All rights reserved. 
An Accurate Method for Prediction of Protein-Ligand Binding Site on Protein Surface Using SVM and Statistical Depth Function Mon, 30 Sep 2013 15:09:38 +0000 Since proteins carry out their functions through interactions with other molecules, accurately identifying the protein-ligand binding site plays an important role in protein functional annotation and rational drug discovery. In the past two decades, many algorithms have been presented to predict the protein-ligand binding site. In this paper, we introduce a statistical depth function to define negative samples and propose an SVM-based method which integrates sequence and structural information to predict the binding site. The results show that the present method performs better than the existing ones. The accuracy, sensitivity, and specificity on the training set are 77.55%, 56.15%, and 87.96%, respectively; on the independent test set, the accuracy, sensitivity, and specificity are 80.36%, 53.53%, and 92.38%, respectively. Kui Wang, Jianzhao Gao, Shiyi Shen, Jack A. Tuszynski, Jishou Ruan, and Gang Hu Copyright © 2013 Kui Wang et al. All rights reserved. Highly Ordered Architecture of MicroRNA Cluster Mon, 30 Sep 2013 09:44:09 +0000 Although it is known that the placement of genes in a cluster may be critical for proper expression patterns, it remains largely unclear whether the order of members in an miRNA cluster carries biological meaning. By investigating the relationship between expression and order for miRNAs from the oncogenic miR-17-92 cluster, we observed a highly ordered architecture in this cluster. A significant correlation between miRNA expression level and its placement was revealed. More importantly, the placement of these miRNAs is associated with their dysregulation in cancer. Here, we present the view that miRNA clusters are not arranged randomly but show highly ordered architectures, which may have critical roles in physiology and pathology. 
Bing Shi, Mingxuan Zhu, Shuang Liu, and Mandun Zhang Copyright © 2013 Bing Shi et al. All rights reserved. Genome-Wide Analysis of Human MicroRNA Stability Sat, 28 Sep 2013 12:12:40 +0000 A growing number of studies have shown that microRNA (miRNA) stability plays important roles in physiology. However, the global picture of miRNA stability remains largely unknown. Here, we analyzed genome-wide miRNA stability across 10 diverse cell types using miRNA arrays. We found that miRNA stability shows high dynamics and diversity both within individual cells and across cell types. Strikingly, we observed a negative correlation between miRNA stability and miRNA expression level, which differs from current findings on other biological molecules such as proteins and mRNAs, which show positive rather than negative correlations between stability and expression level. This finding indicates that miRNA has a distinct action mode, which we called “rapid production, rapid turnover; slow production, slow turnover.” This mode further suggests that highly expressed miRNAs normally degrade fast and may endow the cell with special properties that facilitate cellular status-transition. Moreover, we revealed that the stability of miRNAs is affected by cohorts of factors that include miRNA targets, transcription factors, nucleotide content, evolution, associated disease, and environmental factors. Together, our results provide an extensive description of the global landscape, dynamics, and distinct mode of human miRNA stability, which should help in investigating their functions in physiology and pathophysiology. Yang Li, Zhixin Li, Shixin Zhou, Jinhua Wen, Bin Geng, Jichun Yang, and Qinghua Cui Copyright © 2013 Yang Li et al. All rights reserved. Computer-Assisted System with Multiple Feature Fused Support Vector Machine for Sperm Morphology Diagnosis Thu, 26 Sep 2013 14:23:31 +0000 Sperm morphology analysis is an important technique for assessing sperm health. 
In this paper we present a new system and novel approaches to classify different kinds of sperm images in order to assess their health. Our approach mainly relies on a one-dimensional feature which is extracted from the sperm’s contour with gray level information. Our approach can handle rotation and scaling of the image. Moreover, it is fused with SVM classification to improve its accuracy. In our evaluation, our method has better performance than the existing approaches to sperm classification. Kuo-Kun Tseng, Yifan Li, Chih-Yu Hsu, Huang-Nan Huang, Ming Zhao, and Mingyue Ding Copyright © 2013 Kuo-Kun Tseng et al. All rights reserved. Enzyme Reaction Annotation Using Cloud Techniques Thu, 26 Sep 2013 12:13:04 +0000 An understanding of the activities of enzymes could help to elucidate the metabolic pathways of thousands of chemical reactions that are catalyzed by enzymes in living systems. Sophisticated applications such as drug design and metabolic reconstruction could be developed using accurate enzyme reaction annotation. Because accurate enzyme reaction annotation methods create potential for enhanced production capacity in these applications, they have received greater attention in the global market. We propose the enzyme reaction prediction (ERP) method as a novel tool to deduce enzyme reactions from domain architecture. We used several frequency relationships between architectures and reactions to enhance the annotation rates for single and multiple catalyzed reactions. The deluge of information which arose from high-throughput techniques in the postgenomic era has improved our understanding of biological data, although it presents obstacles in the data-processing stage. The high computational capacity provided by cloud computing has resulted in an exponential growth in the volume of incoming data. Cloud services also relieve the requirement for large-scale memory space required by this approach to analyze enzyme kinetic data. 
Our tool is designed as a single execution file; thus, it could be applied to any cloud platform in which multiple queries are supported. Chuan-Ching Huang, Chun-Yuan Lin, Cheng-Wen Chang, and Chuan Yi Tang Copyright © 2013 Chuan-Ching Huang et al. All rights reserved. The Quantitative Overhead Analysis for Effective Task Migration in Biosensor Networks Thu, 26 Sep 2013 10:16:33 +0000 We present a quantitative overhead analysis for effective task migration in biosensor networks. A biosensor network is a key technology that can automatically provide accurate and specific parameters of a human in real time. Biosensor nodes are typically very small devices, so their computing resources are restricted. Because of this limitation, the biosensor network is vulnerable to external attacks that aim to exhaust system availability. Since biosensor nodes generally deal with sensitive and private data, their malfunction can cause unexpected damage to the system. Therefore, a task migration process is needed to avoid the malfunction of particular biosensor nodes. It is also essential to analyze overhead accurately in order to apply a proper migration process. In this paper, we calculated the task processing time of nodes to analyze system overhead and compared the task processing time under the migration process with that of a general method. We focused on the cluster ratio and the differences in processing time between biosensor nodes in our simulation environment. The results of the performance evaluation show that task execution time is greatly influenced by the cluster ratio and the differences in processing time of biosensor nodes. In the results, the proposed algorithm reduces total task execution time in a migration process. Sung-Min Jung, Tae-Kyung Kim, Jung-Ho Eom, and Tai-Myoung Chung Copyright © 2013 Sung-Min Jung et al. All rights reserved. 
Mixing Energy Models in Genetic Algorithms for On-Lattice Protein Structure Prediction Wed, 25 Sep 2013 11:32:53 +0000 Protein structure prediction (PSP) is computationally a very challenging problem. The challenge largely comes from the fact that the energy function that needs to be minimised in order to obtain the native structure of a given protein is not clearly known. A high resolution energy model could better capture the behaviour of the actual energy function than a low resolution energy model such as hydrophobic polar (HP). However, the fine grained details of the high resolution interaction energy matrix are often not very informative for guiding the search. In contrast, a low resolution energy model could effectively bias the search towards certain promising directions. In this paper, we develop a genetic algorithm that mainly uses a high resolution energy model for protein structure evaluation but uses a low resolution HP energy model to focus the search on structures that have hydrophobic cores. We experimentally show that this mixing of energy models leads to significantly lower energy structures compared to the state-of-the-art results. Mahmood A. Rashid, M. A. Hakim Newton, Md. Tamjidul Hoque, and Abdul Sattar Copyright © 2013 Mahmood A. Rashid et al. All rights reserved. Advanced Systems Biology Methods in Drug Discovery and Translational Biomedicine Thu, 19 Sep 2013 13:38:59 +0000 Systems biology has developed exponentially in recent years and has been widely utilized in biomedicine to better understand the molecular basis of human disease and the mechanism of drug action. Here, we discuss the fundamental concept of systems biology and its two commonly used computational methods, that is, network analysis and dynamical modeling. 
The applications of systems biology in elucidating human disease are highlighted, consisting of human disease networks, treatment response prediction, investigation of disease mechanisms, and disease-associated gene prediction. In addition, important advances in drug discovery, to which systems biology makes significant contributions, are discussed, including drug-target networks, prediction of drug-target interactions, investigation of drug adverse effects, drug repositioning, and drug combination prediction. The systems biology methods and applications covered in this review provide a framework for addressing disease mechanism and approaching drug discovery, which will facilitate the translation of research findings into clinical benefits such as novel biomarkers and promising therapies. Jun Zou, Ming-Wu Zheng, Gen Li, and Zhi-Guang Su Copyright © 2013 Jun Zou et al. All rights reserved. Evaluation of Stream Mining Classifiers for Real-Time Clinical Decision Support System: A Case Study of Blood Glucose Prediction in Diabetes Therapy Thu, 19 Sep 2013 10:05:03 +0000 Earlier on, a conceptual design on the real-time clinical decision support system (rt-CDSS) with data stream mining was proposed and published. The new system is introduced that can analyze medical data streams and can make real-time prediction. This system is based on a stream mining algorithm called VFDT. The VFDT is extended with the capability of using pointers to allow the decision tree to remember the mapping relationship between leaf nodes and the history records. In this paper, which is a sequel to the rt-CDSS design, several popular machine learning algorithms are investigated for their suitability to be a candidate in the implementation of classifier at the rt-CDSS. 
A classifier essentially needs to accurately map the events input to the system into one of several predefined classes of assessments, such that the rt-CDSS can follow up by recommending the prescribed remedies to the clinicians. For a real-time system like rt-CDSS, the major technological challenges lie in the capability of the classifier to process, analyze, and classify the dynamic input data quickly and with the utmost reliability. An experimental comparison is conducted. This paper contributes to the insight of choosing and embedding a stream mining classifier into rt-CDSS with a case study of diabetes therapy. Simon Fong, Yang Zhang, Jinan Fiaidhi, Osama Mohammed, and Sabah Mohammed Copyright © 2013 Simon Fong et al. All rights reserved. New Optical Methods for Liveness Detection on Fingers Wed, 18 Sep 2013 19:13:30 +0000 This paper is devoted to new optical methods intended for liveness detection on fingers. First we describe the basics of fake finger use in the fingerprint recognition process and the possibilities of liveness detection. We then introduce three new liveness detection methods, which we developed and tested in the scope of our research activities: the first is based on measurement of the pulse, the second on variations of optical characteristics caused by pressure change, and the third on the reaction of skin to illumination with different wavelengths. The last part deals with the influence of skin diseases on fingerprint recognition, especially on liveness detection. Martin Drahansky, Michal Dolezel, Jan Vana, Eva Brezinova, Jaegeol Yim, and Kyubark Shim Copyright © 2013 Martin Drahansky et al. All rights reserved. 
CADe System Integrated within the Electronic Health Record Tue, 17 Sep 2013 13:34:59 +0000 The latest technological advances and information support systems for clinics and hospitals produce a wide range of possibilities in the storage and retrieval of an ever-growing amount of clinical information as well as in detection and diagnosis. In this work, an Electronic Health Record (EHR) combined with a Computer Aided Detection (CADe) system for breast cancer diagnosis has been implemented. Our objective is to provide to radiologists a comprehensive working environment that facilitates the integration, the image visualization, and the use of aided tools within the EHR. For this reason, a development methodology based on hardware and software system features in addition to system requirements must be present during the whole development process. This will lead to a complete environment for displaying, editing, and reporting results not only for the patient information but also for their medical images in standardised formats such as DICOM and DICOM-SR. As a result, we obtain a CADe system which helps in detecting breast cancer using mammograms and is completely integrated into an EHR. Noelia Vállez, Gloria Bueno, Óscar Déniz, María del Milagro Fernández, Carlos Pastor, Miguel Ángel Rienda, Pablo Esteve, and María Arias Copyright © 2013 Noelia Vállez et al. All rights reserved. Study of MicroRNAs Related to the Liver Regeneration of the Whitespotted Bamboo Shark, Chiloscyllium plagiosum Tue, 17 Sep 2013 09:55:52 +0000 To understand the mechanisms of liver regeneration better to promote research examining liver diseases and marine biology, normal and regenerative liver tissues of Chiloscyllium plagiosum were harvested 0 h and 24 h after partial hepatectomy (PH) and used to isolate small RNAs for miRNA sequencing. In total, 91 known miRNAs and 166 putative candidate (PC) miRNAs were identified for the first time in Chiloscyllium plagiosum. 
Through target prediction and GO analysis, 46 of 91 known miRNAs were screened specifically for cellular proliferation and growth. Differential expression levels of three miRNAs (xtr-miR-125b, fru-miR-204, and hsa-miR-142-3p_R-1) related to cellular proliferation and apoptosis were measured in normal and regenerating liver tissues at 0 h, 6 h, 12 h, and 24 h using real-time PCR. The expression of these miRNAs showed a rising trend in regenerative liver tissues at 6 h and 12 h but exhibited a downward trend compared to normal levels at 24 h. Differentially expressed genes were screened in normal and regenerating liver tissues at 24 h by DDRT-PCR, and ten sequences were identified. This study provided information regarding the function of genes related to liver regeneration, deepened the understanding of mechanisms of liver regeneration, and resulted in the addition of a significant number of novel miRNA sequences to GenBank. Conger Lu, Jie Zhang, Zuoming Nie, Jian Chen, Wenping Zhang, Xiaoyuan Ren, Wei Yu, Lili Liu, Caiying Jiang, Yaozhou Zhang, Jiangfeng Guo, Wutong Wu, Jianhong Shu, and Zhengbing Lv Copyright © 2013 Conger Lu et al. All rights reserved. A Study on User Authentication Methodology Using Numeric Password and Fingerprint Biometric Information Tue, 17 Sep 2013 08:35:22 +0000 The prevalence of computers and the development of the Internet have made information easy to access. As people become concerned about the security of user information, interest in user authentication methods is growing. The most common computer authentication method is the use of alphanumerical usernames and passwords. Password authentication systems currently in use are easy to use, but they are vulnerable because anyone who knows the password can authenticate. User authentication using fingerprints is strong because it relies on information specific to the user. However, it has disadvantages, such as the user being unable to change the authentication key. 
In this study, we propose an authentication methodology that combines a numeric password with a biometric fingerprint authentication system. The authentication keys use the information in the user's fingerprint to obtain security. In addition, because the numeric password can easily be changed, the authentication keys are designed to provide flexibility. Seung-hwan Ju, Hee-suk Seo, Sung-hyu Han, Jae-cheol Ryou, and Jin Kwak Copyright © 2013 Seung-hwan Ju et al. All rights reserved. Biomarker Selection and Classification of “-Omics” Data Using a Two-Step Bayes Classification Framework Wed, 11 Sep 2013 11:40:11 +0000 Identification of suitable biomarkers for accurate prediction of phenotypic outcomes is a goal for personalized medicine. However, current machine learning approaches are either too complex or perform poorly. Here, a novel two-step machine-learning framework is presented to address this need. First, a Naïve Bayes estimator is used to rank features, so that the top-ranked features will most likely contain the most informative ones for prediction of the underlying biological classes. The top-ranked features are then used in a Hidden Naïve Bayes classifier to construct a classification prediction model from these filtered attributes. In order to obtain the minimum set of the most informative biomarkers, the bottom-ranked features are successively removed from the Naïve Bayes-filtered feature list one at a time, and the classification accuracy of the Hidden Naïve Bayes classifier is checked for each pruned feature set. The performance of the proposed two-step Bayes classification framework was tested on different types of -omics datasets including gene expression microarray, single nucleotide polymorphism microarray (SNParray), and surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) proteomic data. 
The proposed two-step Bayes classification framework was equal to and, in some cases, outperformed other classification methods in terms of prediction accuracy, minimum number of classification markers, and computational time. Anunchai Assawamakin, Supakit Prueksaaroon, Supasak Kulawonganunchai, Philip James Shaw, Vara Varavithya, Taneth Ruangrajitpakorn, and Sissades Tongsima Copyright © 2013 Anunchai Assawamakin et al. All rights reserved. Cloud Infrastructures for In Silico Drug Discovery: Economic and Practical Aspects Tue, 10 Sep 2013 08:15:34 +0000 Cloud computing opens new perspectives for small-medium biotechnology laboratories that need to perform bioinformatics analysis in a flexible and effective way. This seems particularly true for hybrid clouds that couple the scalability offered by general-purpose public clouds with the greater control and ad hoc customizations supplied by the private ones. A hybrid cloud broker, acting as an intermediary between users and public providers, can support customers in the selection of the most suitable offers, optionally adding the provisioning of dedicated services with higher levels of quality. This paper analyses some economic and practical aspects of exploiting cloud computing in a real research scenario for the in silico drug discovery in terms of requirements, costs, and computational load based on the number of expected users. In particular, our work is aimed at supporting both the researchers and the cloud broker delivering an IaaS cloud infrastructure for biotechnology laboratories exposing different levels of nonfunctional requirements. Daniele D'Agostino, Andrea Clematis, Alfonso Quarati, Daniele Cesini, Federica Chiappori, Luciano Milanesi, and Ivan Merelli Copyright © 2013 Daniele D'Agostino et al. All rights reserved. 
Characterization of Schizophrenia Adverse Drug Interactions through a Network Approach and Drug Classification Mon, 09 Sep 2013 17:58:21 +0000 Antipsychotic drugs are medications commonly used for schizophrenia (SCZ) treatment, and they fall into two groups: typical and atypical. SCZ patients have multiple comorbidities, and the coadministration of drugs is quite common. This may result in adverse drug-drug interactions, which are events that occur when the effect of a drug is altered by the coadministration of another drug. Therefore, it is important to provide a comprehensive view of these interactions for further coadministration improvement. Here, we extracted SCZ drugs and their adverse drug interactions from DrugBank and compiled an SCZ-specific adverse drug interaction network. This network included 28 SCZ drugs, 241 non-SCZ drugs, and 991 interactions. By integrating the Anatomical Therapeutic Chemical (ATC) classification with the network analysis, we characterized those interactions. Our results indicated that SCZ drugs tended to have more adverse drug interactions than other drugs. Furthermore, SCZ typical drugs had significant interactions with drugs of the “alimentary tract and metabolism” category while SCZ atypical drugs had significant interactions with drugs of the categories “nervous system” and “antiinfectives for systemic uses.” This study is the first to characterize the adverse drug interactions in the course of SCZ treatment and might provide useful information for future SCZ treatment. Jingchun Sun, Min Zhao, Ayman H. Fanous, and Zhongming Zhao Copyright © 2013 Jingchun Sun et al. All rights reserved. 
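As a rough illustration of the kind of tally described in the schizophrenia drug-interaction abstract above, the following sketch counts how many interactions fall into each ATC top-level category for typical and atypical drugs, and how many distinct interaction partners each SCZ drug has. The drug identifiers, the toy interaction list, and both function names are invented placeholders rather than DrugBank data or the authors' code, and the sketch makes no attempt at the significance testing the abstract mentions.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (SCZ drug, drug group, interacting drug, ATC top-level category).
# None of these rows come from DrugBank; they only illustrate the data shape.
INTERACTIONS = [
    ("scz_drug_1", "typical", "drug_a", "alimentary tract and metabolism"),
    ("scz_drug_1", "typical", "drug_b", "nervous system"),
    ("scz_drug_2", "atypical", "drug_b", "nervous system"),
    ("scz_drug_2", "atypical", "drug_c", "antiinfectives for systemic use"),
    ("scz_drug_3", "atypical", "drug_d", "nervous system"),
]


def count_by_group_and_category(interactions):
    """Return {drug group: Counter mapping ATC category -> number of interactions}."""
    counts = defaultdict(Counter)
    for _scz_drug, group, _partner, category in interactions:
        counts[group][category] += 1
    return counts


def degree_per_scz_drug(interactions):
    """Return the number of distinct interaction partners for each SCZ drug."""
    partners = defaultdict(set)
    for scz_drug, _group, partner, _category in interactions:
        partners[scz_drug].add(partner)
    return {drug: len(found) for drug, found in partners.items()}


if __name__ == "__main__":
    for group, counter in count_by_group_and_category(INTERACTIONS).items():
        print(group, dict(counter))
    print(degree_per_scz_drug(INTERACTIONS))
```

With real data, the per-category counts would still need to be compared against a background distribution before any category could be called significant; that step is deliberately left out here.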
Structural and Sequence Similarities of Hydra Xeroderma Pigmentosum A Protein to Human Homolog Suggest Early Evolution and Conservation Thu, 05 Sep 2013 15:39:25 +0000 Xeroderma pigmentosum group A (XPA) is a protein that binds to damaged DNA, verifies presence of a lesion, and recruits other proteins of the nucleotide excision repair (NER) pathway to the site. Though its homologs from yeast, Drosophila, humans, and so forth are well studied, XPA has not so far been reported from protozoa and lower animal phyla. Hydra is a fresh-water cnidarian with a remarkable capacity for regeneration and apparent lack of organismal ageing. Cnidarians are among the first metazoa with a defined body axis, tissue grade organisation, and nervous system. We report here for the first time presence of XPA gene in hydra. Putative protein sequence of hydra XPA contains nuclear localization signal and bears the zinc-finger motif. It contains two conserved Pfam domains and various characterized features of XPA proteins like regions for binding to excision repair cross-complementing protein-1 (ERCC1) and replication protein A 70 kDa subunit (RPA70) proteins. Hydra XPA shows a high degree of similarity with vertebrate homologs and clusters with deuterostomes in phylogenetic analysis. Homology modelling corroborates the very close similarity between hydra and human XPA. The protein thus most likely functions in hydra in the same manner as in other animals, indicating that it arose early in evolution and has been conserved across animal phyla. Apurva Barve, Saroj Ghaskadbi, and Surendra Ghaskadbi Copyright © 2013 Apurva Barve et al. All rights reserved. In Silico Determination and Validation of Baumannii Acinetobactin Utilization A Structure and Ligand Binding Site Thu, 05 Sep 2013 15:17:04 +0000 Acinetobacter baumannii is a deadly nosocomial pathogen. Iron is an essential element for the pathogen. 
Under iron-restricted conditions, the bacterium expresses iron-regulated outer membrane proteins (IROMPs). Baumannii acinetobactin utilization (BauA) is the most important member of IROMPs in A. baumannii. Determination of its tertiary structure could help deduce its functions and its interactions with ligands. The present study unveils the BauA 3D structure via in silico approaches. Apart from ab initio modeling, other rational methods such as homology modeling and threading were invoked to achieve the purpose. For homology modeling, BLAST was run on the sequence in order to find the best template. The template was then used to model the 3D structure. All the models built were evaluated qualitatively. The best model predicted by LOMETS was selected for analyses. Refinement of the 3D structure as well as determination of its clefts and ligand binding sites was carried out on the structure. In contrast to the typical trimeric arrangement found in porins, BauA is monomeric. The barrel is formed by 22 antiparallel transmembrane β-strands. There are short periplasmic turns and longer surface-located loops. An N-terminal domain, referred to as the cork, the plug, or the hatch domain, occludes the β-barrel. Fatemeh Sefid, Iraj Rasooli, and Abolfazl Jahangiri Copyright © 2013 Fatemeh Sefid et al. All rights reserved. Prediction of Effective Drug Combinations by Chemical Interaction, Protein Interaction and Target Enrichment of KEGG Pathways Thu, 05 Sep 2013 11:22:39 +0000 Drug combinatorial therapy could be more effective in treating some complex diseases than single agents due to better efficacy and reduced side effects. Although some drug combinations are being used, their underlying molecular mechanisms are still poorly understood. Therefore, it is of great interest to deduce novel drug combinations from their molecular mechanisms in a robust and rigorous way. 
This paper attempts to predict effective drug combinations by a combined consideration of: (1) chemical interaction between drugs, (2) protein interactions between drugs’ targets, and (3) target enrichment of KEGG pathways. A benchmark dataset was constructed, consisting of 121 confirmed effective combinations and 605 random combinations. Each drug combination was represented by 465 features derived from the aforementioned three properties. Some feature selection techniques, including Minimum Redundancy Maximum Relevance and Incremental Feature Selection, were adopted to extract the key features. Random forest model was built with its performance evaluated by 5-fold cross-validation. As a result, 55 key features providing the best prediction result were selected. These important features may help to gain insights into the mechanisms of drug combinations, and the proposed prediction model could become a useful tool for screening possible drug combinations. Lei Chen, Bi-Qing Li, Ming-Yue Zheng, Jian Zhang, Kai-Yan Feng, and Yu-Dong Cai Copyright © 2013 Lei Chen et al. All rights reserved. Predicting Drugs Side Effects Based on Chemical-Chemical Interactions and Protein-Chemical Interactions Wed, 04 Sep 2013 08:31:26 +0000 A drug side effect is an undesirable effect which occurs in addition to the intended therapeutic effect of the drug. The unexpected side effects that many patients suffer from are the major causes of large-scale drug withdrawal. To address the problem, it is highly demanded by pharmaceutical industries to develop computational methods for predicting the side effects of drugs. In this study, a novel computational method was developed to predict the side effects of drug compounds by hybridizing the chemical-chemical and protein-chemical interactions. Compared to most of the previous works, our method can rank the potential side effects for any query drug according to their predicted level of risk. 
A training dataset and test datasets were constructed from the benchmark dataset that contains 835 drug compounds to evaluate the method. By a jackknife test on the training dataset, the 1st order prediction accuracy was 86.30%, while it was 89.16% on the test dataset. It is expected that the new method may become a useful tool for drug design, and that the findings obtained by hybridizing various interactions in a network system may provide useful insights for conducting in-depth pharmacological research as well, particularly at the level of systems biomedicine. Lei Chen, Tao Huang, Jian Zhang, Ming-Yue Zheng, Kai-Yan Feng, Yu-Dong Cai, and Kuo-Chen Chou Copyright © 2013 Lei Chen et al. All rights reserved. Information Content-Based Gene Ontology Semantic Similarity Approaches: Toward a Unified Framework Theory Mon, 02 Sep 2013 14:28:42 +0000 Several approaches have been proposed for computing term information content (IC) and semantic similarity scores within the gene ontology (GO) directed acyclic graph (DAG). These approaches contributed to improving protein analyses at the functional level. Considering the recent proliferation of these approaches, a unified theory in a well-defined mathematical framework is necessary in order to provide a theoretical basis for validating these approaches. We review the existing IC-based ontological similarity approaches developed in the context of biomedical and bioinformatics fields to propose a general framework and unified description of all these measures. We have conducted an experimental evaluation to assess the impact of IC approaches, different normalization models, and correction factors on the performance of a functional similarity metric. Results reveal that considering only parents or only children of terms when assessing information content or semantic similarity scores negatively impacts the approach under consideration. 
This study produces a unified framework for current and future GO semantic similarity measures and provides theoretical basics for comparing different approaches. The experimental evaluation of different approaches based on different term information content models paves the way towards a solution to the issue of scoring a term’s specificity in the GO DAG. Gaston K. Mazandu and Nicola J. Mulder Copyright © 2013 Gaston K. Mazandu and Nicola J. Mulder. All rights reserved. Network-Based Inference Framework for Identifying Cancer Genes from Gene Expression Data Sun, 01 Sep 2013 13:24:33 +0000 Great efforts have been devoted to alleviate uncertainty of detected cancer genes as accurate identification of oncogenes is of tremendous significance and helps unravel the biological behavior of tumors. In this paper, we present a differential network-based framework to detect biologically meaningful cancer-related genes. Firstly, a gene regulatory network construction algorithm is proposed, in which a boosting regression based on likelihood score and informative prior is employed for improving accuracy of identification. Secondly, with the algorithm, two gene regulatory networks are constructed from case and control samples independently. Thirdly, by subtracting the two networks, a differential-network model is obtained and then used to rank differentially expressed hub genes for identification of cancer biomarkers. Compared with two existing gene-based methods (t-test and lasso), the method has a significant improvement in accuracy both on synthetic datasets and two real breast cancer datasets. Furthermore, identified six genes (TSPYL5, CD55, CCNE2, DCK, BBC3, and MUC1) susceptible to breast cancer were verified through the literature mining, GO analysis, and pathway functional enrichment analysis. 
Among these oncogenes, TSPYL5 and CCNE2 have been already known as prognostic biomarkers in breast cancer, CD55 has been suspected of playing an important role in breast cancer prognosis from literature evidence, and other three genes are newly discovered breast cancer biomarkers. More generally, the differential-network schema can be extended to other complex diseases for detection of disease associated-genes. Bo Yang, Junying Zhang, Yaling Yin, and Yuanyuan Zhang Copyright © 2013 Bo Yang et al. All rights reserved. Secure Encapsulation and Publication of Biological Services in the Cloud Computing Environment Sun, 01 Sep 2013 11:35:18 +0000 Secure encapsulation and publication for bioinformatics software products based on web service are presented, and the basic function of biological information is realized in the cloud computing environment. In the encapsulation phase, the workflow and function of bioinformatics software are conducted, the encapsulation interfaces are designed, and the runtime interaction between users and computers is simulated. In the publication phase, the execution and management mechanisms and principles of the GRAM components are analyzed. The functions such as remote user job submission and job status query are implemented by using the GRAM components. The services of bioinformatics software are published to remote users. Finally the basic prototype system of the biological cloud is achieved. Weizhe Zhang, Xuehui Wang, Bo Lu, and Tai-hoon Kim Copyright © 2013 Weizhe Zhang et al. All rights reserved. Selecting Summary Statistics in Approximate Bayesian Computation for Calibrating Stochastic Models Sun, 01 Sep 2013 09:47:51 +0000 Approximate Bayesian computation (ABC) is an approach for using measurement data to calibrate stochastic computer models, which are common in biology applications. 
ABC is becoming the “go-to” option when the data and/or parameter dimension is large because it relies on user-chosen summary statistics rather than the full data and is therefore computationally feasible. One technical challenge with ABC is that the quality of the approximation to the posterior distribution of model parameters depends on the user-chosen summary statistics. In this paper, the user requirement to choose effective summary statistics in order to accurately estimate the posterior distribution of model parameters is investigated and illustrated by example, using a model and corresponding real data of mitochondrial DNA population dynamics. We show that for some choices of summary statistics, the posterior distribution of model parameters is closely approximated and for other choices of summary statistics, the posterior distribution is not closely approximated. A strategy to choose effective summary statistics is suggested in cases where the stochastic computer model can be run at many trial parameter settings, as in the example. Tom Burr and Alexei Skurikhin Copyright © 2013 Tom Burr and Alexei Skurikhin. All rights reserved. Molecular Dynamic Simulation and Inhibitor Prediction of Cysteine Synthase Structured Model as a Potential Drug Target for Trichomoniasis Sun, 01 Sep 2013 08:07:54 +0000 In our presented research, we made an attempt to predict the 3D model for cysteine synthase (A2GMG5_TRIVA) using homology-modeling approaches. To investigate deeper into the predicted structure, we further performed a molecular dynamics simulation for 10 ns and calculated several supporting analysis for structural properties such as RMSF, radius of gyration, and the total energy calculation to support the predicted structured model of cysteine synthase. The present findings led us to conclude that the proposed model is stereochemically stable. The overall PROCHECK G factor for the homology-modeled structure was −0.04. 
On the basis of the virtual screening for cysteine synthase against the NCI subset II molecule, we present the molecule 1-N, 4-N-bis [3-(1H-benzimidazol-2-yl) phenyl] benzene-1,4-dicarboxamide (ZINC01690699) having the minimum energy score (−13.0 Kcal/Mol) and a log P value of 6 as a potential inhibitory molecule used to inhibit the growth of T. vaginalis infection. Satendra Singh, Gaurav Sablok, Rohit Farmer, Atul Kumar Singh, Budhayash Gautam, and Sunil Kumar Copyright © 2013 Satendra Singh et al. All rights reserved. Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data Thu, 29 Aug 2013 15:03:53 +0000 Phylogenetic trees are used to represent the evolutionary relationship among various groups of species. In this paper, a novel method for inferring prokaryotic phylogenies using multiple genomic information is proposed. The method is called CGCPhy and based on the distance matrix of orthologous gene clusters between whole-genome pairs. CGCPhy comprises four main steps. First, orthologous genes are determined by sequence similarity, genomic function, and genomic structure information. Second, genes involving potential HGT events are eliminated, since such genes are considered to be the highly conserved genes across different species and the genes located on fragments with abnormal genome barcode. Third, we calculate the distance of the orthologous gene clusters between each genome pair in terms of the number of orthologous genes in conserved clusters. Finally, the neighbor-joining method is employed to construct phylogenetic trees across different species. CGCPhy has been examined on different datasets from 617 complete single-chromosome prokaryotic genomes and achieved applicative accuracies on different species sets in agreement with Bergey's taxonomy in quartet topologies. 
Simulation results show that CGCPhy achieves high average accuracy and has a low standard deviation on different datasets, so it has an applicative potential for phylogenetic analysis. Wei Du, Zhongbo Cao, Yan Wang, Ying Sun, Enrico Blanzieri, and Yanchun Liang Copyright © 2013 Wei Du et al. All rights reserved. SeedSeq: Off-Target Transcriptome Database Thu, 29 Aug 2013 08:34:55 +0000 Detection of potential cross-reaction between a short oligonucleotide sequence and a longer (unintended) sequence is crucial for many biological applications, such as high content screening (HCS), microarray nucleotide probes, or short interfering RNAs (siRNAs). However, owing to a tolerance for mismatches and gaps in base-pairing with target transcripts, siRNAs could have up to hundreds of potential target sequences in a genome, and some small RNAs in mammalian systems have been shown to affect the levels of many messenger RNAs (off-targets) besides their intended target transcripts (on-targets). The reference sequence (RefSeq) collection aims to provide a comprehensive, integrated, nonredundant, well-annotated set of sequences, including mRNA transcripts. We performed a detailed off-target analysis of three most commonly used kinome siRNA libraries based on the latest RefSeq version. To simplify the access to off-target transcripts, we created a SeedSeq database, a new unique format to store off-target information. Shaoli Das, Suman Ghosal, Jayprokas Chakrabarti, and Karol Kozak Copyright © 2013 Shaoli Das et al. All rights reserved. Position-Specific Analysis and Prediction of Protein Pupylation Sites Based on Multiple Features Mon, 26 Aug 2013 14:43:31 +0000 Pupylation is one of the most important posttranslational modifications of proteins; accurate identification of pupylation sites will facilitate the understanding of the molecular mechanism of pupylation. 
Besides the conventional experimental approaches, computational prediction of pupylation sites is highly desirable for its convenience and speed. In this study, we developed a novel predictor of pupylation sites. First, the maximum relevance minimum redundancy (mRMR) and incremental feature selection methods were applied to five kinds of features to select the optimal feature set. Then the prediction model was built on the optimal feature set with the assistance of the support vector machine algorithm. As a result, the overall jackknife success rate of the new predictor on a newly constructed benchmark dataset was 0.764, and the Matthews correlation coefficient was 0.522, indicating a good prediction. Feature analysis showed that all feature types contributed to the prediction of protein pupylation sites. Further site-specific feature analysis revealed that the features of sites surrounding the central lysine contributed more to the determination of pupylation sites than the other sites. Xiaowei Zhao, Jiangyan Dai, Qiao Ning, Zhiqiang Ma, Minghao Yin, and Pingping Sun Copyright © 2013 Xiaowei Zhao et al. All rights reserved. Recognition of Multiple Imbalanced Cancer Types Based on DNA Microarray Data Using Ensemble Classifiers Mon, 26 Aug 2013 13:41:52 +0000 DNA microarray technology can measure the activities of tens of thousands of genes simultaneously, which provides an efficient way to diagnose cancer at the molecular level. Although this strategy has attracted significant research attention, most studies neglect an important problem, namely, that most DNA microarray datasets are skewed, which causes traditional learning algorithms to produce inaccurate results. Some studies have considered this problem, yet they focus only on binary-class problems. In this paper, we address the multiclass imbalanced classification problem, as encountered in cancer DNA microarray data, by using ensemble learning.
We utilized one-against-all coding strategy to transform multiclass to multiple binary classes, each of them carrying out feature subspace, which is an evolving version of random subspace that generates multiple diverse training subsets. Next, we introduced one of two different correction technologies, namely, decision threshold adjustment or random undersampling, into each training subset to alleviate the damage of class imbalance. Specifically, support vector machine was used as base classifier, and a novel voting rule called counter voting was presented for making a final decision. Experimental results on eight skewed multiclass cancer microarray datasets indicate that unlike many traditional classification approaches, our methods are insensitive to class imbalance. Hualong Yu, Shufang Hong, Xibei Yang, Jun Ni, Yuanyuan Dan, and Bin Qin Copyright © 2013 Hualong Yu et al. All rights reserved. SubMito-PSPCP: Predicting Protein Submitochondrial Locations by Hybridizing Positional Specific Physicochemical Properties with Pseudoamino Acid Compositions Wed, 21 Aug 2013 11:35:47 +0000 Knowing the submitochondrial location of a mitochondrial protein is an important step in understanding its function. We developed a new method for predicting protein submitochondrial locations by introducing a new concept: positional specific physicochemical properties. With the framework of general form pseudoamino acid compositions, our method used only about 100 features to represent protein sequences, which is much simpler than the existing methods. On the dataset of SubMito, our method achieved over 93% overall accuracy, with 98.60% for inner membrane, 93.90% for matrix, and 70.70% for outer membrane, which are comparable to all state-of-the-art methods. As our method can be used as a general method to upgrade all pseudoamino-acid-composition-based methods, it should be very useful in future studies. We implement our method as an online service: SubMito-PSPCP. 
Pufeng Du and Yuan Yu Copyright © 2013 Pufeng Du and Yuan Yu. All rights reserved. An Approach for Identifying Cytokines Based on a Novel Ensemble Classifier Wed, 21 Aug 2013 10:26:33 +0000 Identifying cytokines and investigating their various functions and biochemical mechanisms is meaningful and important in biology. However, several issues remain, including the large scale of benchmark datasets, serious imbalance of data, and discovery of new gene families. In this paper, we employ a machine learning approach based on a novel ensemble classifier to predict cytokines. We directly selected amino acid sequences as research objects. First, we pretreated the benchmark data accurately. Next, we analyzed the physicochemical properties and distribution of whole amino acids and then extracted a group of 120-dimensional (120D) valid features to represent sequences. Third, in view of the serious imbalance in benchmark datasets, we utilized a sampling approach based on the synthetic minority oversampling technique algorithm and K-means clustering undersampling algorithm to rebuild the training set. Finally, we built a library for dynamic selection and circulating combination based on clustering (LibD3C) and employed the new training set to realize cytokine classification. Experiments showed that the geometric mean of sensitivity and specificity obtained through our approach is as high as 93.3%, which indicates that our approach is effective for identifying cytokines. Quan Zou, Zhen Wang, Xinjun Guan, Bin Liu, Yunfeng Wu, and Ziyu Lin Copyright © 2013 Quan Zou et al. All rights reserved. Optimal Control of Gene Regulatory Networks with Effectiveness of Multiple Drugs: A Boolean Network Approach Wed, 21 Aug 2013 09:10:43 +0000 Developing control theory of gene regulatory networks is one of the significant topics in the field of systems biology, and the obtained results are expected to be applied to gene therapy technologies in the future.
In this paper, a control method using a Boolean network (BN) is studied. A BN is widely used as a model of gene regulatory networks, and gene expression is expressed by a binary value (0 or 1). In the control problem, we assume that the concentration level of a part of genes is arbitrarily determined as the control input. However, there are cases that no gene satisfying this assumption exists, and it is important to consider structural control via external stimuli. Furthermore, these controls are realized by multiple drugs, and it is also important to consider multiple effects such as duration of effect and side effects. In this paper, we propose a BN model with two types of the control inputs and an optimal control method with duration of drug effectiveness. First, a BN model and duration of drug effectiveness are discussed. Next, the optimal control problem is formulated and is reduced to an integer linear programming problem. Finally, numerical simulations are shown. Koichi Kobayashi and Kunihiko Hiraishi Copyright © 2013 Koichi Kobayashi and Kunihiko Hiraishi. All rights reserved. Image Analysis of Endosocopic Ultrasonography in Submucosal Tumor Using Fuzzy Inference Mon, 19 Aug 2013 08:30:54 +0000 Endoscopists usually make a diagnosis in the submucosal tumor depending on the subjective evaluation about general images obtained by endoscopic ultrasonography. In this paper, we propose a method to extract areas of gastrointestinal stromal tumor (GIST) and lipoma automatically from the ultrasonic image to assist those specialists. We also propose an algorithm to differentiate GIST from non-GIST by fuzzy inference from such images after applying ROC curve with mean and standard deviation of brightness information. In experiments using real images that medical specialists use, we verify that our method is sufficiently helpful for such specialists for efficient classification of submucosal tumors. 
Kwang Baek Kim and Gwang Ha Kim Copyright © 2013 Kwang Baek Kim and Gwang Ha Kim. All rights reserved. An Efficient Ensemble Learning Method for Gene Microarray Classification Wed, 14 Aug 2013 09:09:15 +0000 Gene microarray analysis and classification have proven to be an effective way to diagnose diseases and cancers. However, it has also been revealed that the basic classification techniques have intrinsic drawbacks in achieving accurate gene classification and cancer diagnosis. On the other hand, classifier ensembles have received increasing attention in various applications. Here, we address the gene classification issue using the RotBoost ensemble methodology. This method is a combination of the Rotation Forest and AdaBoost techniques, which in turn preserves both desirable features of an ensemble architecture, that is, accuracy and diversity. To select a concise subset of informative genes, 5 different feature selection algorithms are considered. To assess the efficiency of RotBoost, other nonensemble/ensemble techniques including Decision Trees, Support Vector Machines, Rotation Forest, AdaBoost, and Bagging are also deployed. Experimental results have revealed that the combination of the fast correlation-based feature selection method with the ICA-based RotBoost ensemble is highly effective for gene classification. In fact, the proposed method can create ensemble classifiers which outperform not only the classifiers produced by conventional machine learning but also the classifiers generated by two widely used conventional ensemble learning methods, that is, Bagging and AdaBoost. Alireza Osareh and Bita Shadgar Copyright © 2013 Alireza Osareh and Bita Shadgar. All rights reserved. Designing a Bioengine for Detection and Analysis of Base String on an Affected Sequence in High-Concentration Regions Tue, 13 Aug 2013 11:38:18 +0000 We design an algorithm for a bioengine.
The program searches for optimal alignments between two sequences: the host sequence (normal plant) and the query sequence (virus). Searching for homologues has become a routine operation on biological sequences, here performed in 4 × 4 combination with different subsequence lengths (word sizes). The program takes advantage of the high degree of homology between such sequences to construct an alignment of the matching regions. The main aim is to detect overlapping reading frames. The program also helps to select highly infected clones by finding the highest matching regions with minimum gap or mismatch zones and unique virus clone matches. This is a small, portable, interactive, front-end program intended for finding the matching regions between the host sequence and query subsequences. All the operations are carried out in a fraction of a second, depending on the required task and on the sequence length. Debnath Bhattacharyya, Bijoy Kumar Mandal, and Tai-hoon Kim Copyright © 2013 Debnath Bhattacharyya et al. All rights reserved. Prediction and Analysis of Retinoblastoma Related Genes through Gene Ontology and KEGG Tue, 13 Aug 2013 10:15:12 +0000 One of the most important and challenging problems in biomedicine is how to predict cancer related genes. Retinoblastoma (RB) is the most common primary intraocular malignancy, usually occurring in childhood. Early detection of RB could reduce the morbidity and increase the probability of disease-free survival. Therefore, it is of great importance to identify RB genes. In this study, we developed a computational method to predict RB related genes based on Dagging, with the maximum relevance minimum redundancy (mRMR) method followed by incremental feature selection (IFS). 119 RB genes were compiled from two previous RB related studies, while 5,500 non-RB genes were randomly selected from Ensembl genes. Ten datasets were constructed based on all these RB and non-RB genes.
Each gene was encoded with a 13,126-dimensional vector including 12,887 Gene Ontology enrichment scores and 239 KEGG enrichment scores. Finally, an optimal feature set including 1061 GO terms and 8 KEGG pathways was obtained. Analysis showed that these features were closely related to RB. It is anticipated that the method can be applied to predict the other cancer related genes as well. Zhen Li, Bi-Qing Li, Min Jiang, Lei Chen, Jian Zhang, Lin Liu, and Tao Huang Copyright © 2013 Zhen Li et al. All rights reserved. Method for Rapid Protein Identification in a Large Database Tue, 13 Aug 2013 08:44:10 +0000 Protein identification is an integral part of proteomics research. The available tools to identify proteins in tandem mass spectrometry experiments are not optimized to face current challenges in terms of identification scale and speed owing to the exponential growth of the protein database and the accelerated generation of mass spectrometry data, as well as the demand for nonspecific digestion and post-modifications in complex-sample identification. As a result, a rapid method is required to mitigate such complexity and computation challenges. This paper thus aims to present an open method to prevent enzyme and modification specificity on a large database. This paper designed and developed a distributed program to facilitate application to computer resources. With this optimization, nearly linear speedup and real-time support are achieved on a large database with nonspecific digestion, thus enabling testing with two classical large protein databases in a 20-blade cluster. This work aids in the discovery of more significant biological results, such as modification sites, and enables the identification of more complex samples, such as metaproteomics samples. Wenli Zhang and Xiaofang Zhao Copyright © 2013 Wenli Zhang and Xiaofang Zhao. All rights reserved. 
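To make the cost of nonspecific digestion in the preceding abstract concrete, the following minimal Python sketch compares the number of candidate peptides from an idealized tryptic digest with the number from a nonspecific digest of the same sequence. It is an illustration only, not the authors' distributed program: the function names, the length bounds, and the simplified trypsin rule (cleave after K or R, but not before P) are assumptions introduced here.

```python
import re


def tryptic_peptides(protein, min_len=6, max_len=30):
    """Idealized tryptic digest: cleave after K or R unless the next residue is P.

    No missed cleavages are considered (a simplifying assumption).
    """
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in pieces if min_len <= len(p) <= max_len]


def nonspecific_peptides(protein, min_len=6, max_len=30):
    """Nonspecific digest: every substring within the length bounds is a candidate."""
    n = len(protein)
    return [
        protein[start:end]
        for start in range(n)
        for end in range(start + min_len, min(start + max_len, n) + 1)
    ]


if __name__ == "__main__":
    # Hypothetical 28-residue example sequence used only for illustration.
    seq = "MKWVTFISLLLLFSSAYSRGVFRRDTHK"
    print(len(tryptic_peptides(seq)), "tryptic candidates")
    print(len(nonspecific_peptides(seq)), "nonspecific candidates")
```

Because the nonspecific candidate set grows roughly with sequence length times the width of the allowed length window, the search space for a whole database is far larger than for an enzyme-specific search, which is the kind of workload that motivates distributing the computation.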
Diagnosis Value of the Serum Amyloid A Test in Neonatal Sepsis: A Meta-Analysis Mon, 05 Aug 2013 12:20:23 +0000 Neonatal sepsis (NS), a common disorder for humans, is recognized as a leading global public health challenge. This meta-analysis was performed to assess the accuracy of the serum amyloid A (SAA) test for diagnosing NS. Studies that evaluated the SAA test as a diagnostic marker were searched in PubMed, EMBASE, the Cochrane Library, and Google Network between January 1996 and June 2013. A total of nine studies including 823 neonates were included in our meta-analysis. The quality of each study was evaluated by the quality assessment of diagnostic accuracy studies tool (QUADAS). The SAA test showed moderate accuracy in the diagnosis of NS both at the first suspicion of sepsis and 8–96 h after the sepsis onset, both with , which is similar to the PCT and CRP tests for the diagnosis of NS in the same period. Heterogeneity between studies was also explained by cut-off point, SAA assay, and age of included neonates. On the basis of our meta-analysis, therefore, SAA could be promising and meaningful in the diagnosis of NS. Haining Yuan, Jie Huang, Bokun Lv, Wenying Yan, Guang Hu, Jian Wang, and Bairong Shen Copyright © 2013 Haining Yuan et al. All rights reserved. Identifying Interacting Genetic Variations by Fish-Swarm Logic Regression Mon, 05 Aug 2013 12:07:01 +0000 Understanding associations between genotypes and complex traits is a fundamental problem in human genetics. A major open problem in mapping phenotypes is that of identifying a set of interacting genetic variants, which might contribute to complex traits. Logic regression (LR) is a powerful multivariant association tool. Several LR-based approaches have been successfully applied to different datasets. However, these approaches are not adequate with regard to accuracy and efficiency.
In this paper, we propose a new LR-based approach, called fish-swarm logic regression (FSLR), which improves the logic regression process by incorporating swarm optimization. In our approach, a school of fish agents are conducted in parallel. Each fish agent holds a regression model, while the school searches for better models through various preset behaviors. A swarm algorithm improves the accuracy and the efficiency by speeding up the convergence and preventing it from dropping into local optimums. We apply our approach on a real screening dataset and a series of simulation scenarios. Compared to three existing LR-based approaches, our approach outperforms them by having lower type I and type II error rates, being able to identify more preset causal sites, and performing at faster speeds. Xuanping Zhang, Jiayin Wang, Aiyuan Yang, Chunxia Yan, Feng Zhu, Zhongmeng Zhao, and Zhi Cao Copyright © 2013 Xuanping Zhang et al. All rights reserved. A Novel Framework for the Identification and Analysis of Duplicons between Human and Chimpanzee Thu, 01 Aug 2013 13:19:43 +0000 Human and other primate genomes consist of many segmental duplications (SDs) due to fixation of copy number variations (CNVs). Structure of these duplications within the human genome has been shown to be a complex mosaic composed of juxtaposed subunits (called duplicons). These duplicons are difficult to be uncovered from the mosaic repeat structure. In addition, the distribution and evolution of duplicons among primates are still poorly investigated. In this paper, we develop a statistical framework for discovering duplicons via integration of a Hidden Markov Model (HMM) and a permutation test. Our comparative analysis indicates that the mosaic structure of duplicons is common in CNV/SD regions of both human and chimpanzee genomes, and a subset of core duplicons is shared by the majority of CNVs/SDs. Phylogenetic analyses using duplicons suggested that most CNVs/SDs share common duplication ancestry. 
Many human/chimpanzee duplicons flank both ends of CNVs, which may be hotspots of nonallelic homologous recombination. Trees-Juen Chuang, Shian-Zu Wu, and Yao-Ting Huang Copyright © 2013 Trees-Juen Chuang et al. All rights reserved. Immunoinformatic Docking Approach for the Analysis of KIR3DL1/HLA-B Interaction Thu, 01 Aug 2013 12:54:01 +0000 KIR3DL1 is among the most interesting receptors studied, within the killer immunoglobulin receptor (KIR) family. Human leukocyte antigen (HLA) class I Bw4 epitope inhibits strongly Natural Killer (NK) cell’s activity through interaction with KIR3DL1 receptor, while Bw6 generally does not. This interaction has been indicated to play an important role in the immune control of different viral infectious diseases. However, the structural interaction between the KIR3DL1 receptor and different HLA-B alleles has been scarcely studied. To understand the complexity of KIR3DL1-HLA-B interaction, HLA-B alleles carrying Bw4/Bw6 epitope and KIR3DL1*001 allele in presence of different peptides has been evaluated by using a structural immunoinformatic approach. Different energy minimization force fields (ff) have been tested and NOVA ff enables the successful prediction of ligand-receptor interaction. HLA-B alleles carrying Bw4 epitope present the highest capability of interaction with KIR3DL1*001 compared to the HLA-B alleles presenting Bw6. The presence of the epitope Bw4 determines a conformational change which leads to a stronger interaction between nonpolymorphic arginine at position 79 of HLA-B and KIR3DL1*001 136–142 loop. The data shed new light on the modalities of KIR3DL1 interaction with HLA-B alleles essential for the modulation of NK immune-mediated response. Alba Grifoni, Carla Montesano, Atanas Patronov, Vittorio Colizzi, and Massimo Amicosante Copyright © 2013 Alba Grifoni et al. All rights reserved. 
Optimized Periocular Template Selection for Human Recognition Wed, 31 Jul 2013 08:18:54 +0000 A novel approach for selecting a rectangular template around periocular region optimally potential for human recognition is proposed. A comparatively larger template of periocular image than the optimal one can be slightly more potent for recognition, but the larger template heavily slows down the biometric system by making feature extraction computationally intensive and increasing the database size. A smaller template, on the contrary, cannot yield desirable recognition though the smaller template performs faster due to low computation for feature extraction. These two contradictory objectives (namely, (a) to minimize the size of periocular template and (b) to maximize the recognition through the template) are aimed to be optimized through the proposed research. This paper proposes four different approaches for dynamic optimal template selection from periocular region. The proposed methods are tested on publicly available unconstrained UBIRISv2 and FERET databases and satisfactory results have been achieved. Thus obtained template can be used for recognition of individuals in an organization and can be generalized to recognize every citizen of a nation. Sambit Bakshi, Pankaj K. Sa, and Banshidhar Majhi Copyright © 2013 Sambit Bakshi et al. All rights reserved. HyDEn: A Hybrid Steganocryptographic Approach for Data Encryption Using Randomized Error-Correcting DNA Codes Sun, 28 Jul 2013 08:40:06 +0000 This paper presents a novel hybrid DNA encryption (HyDEn) approach that uses randomized assignments of unique error-correcting DNA Hamming code words for single characters in the extended ASCII set. HyDEn relies on custom-built quaternary codes and a private key used in the randomized assignment of code words and the cyclic permutations applied on the encoded message. 
Along with its ability to detect and correct errors, HyDEn equals or outperforms existing cryptographic methods and represents a promising in silico DNA steganographic approach. Dan Tulpan, Chaouki Regoui, Guillaume Durand, Luc Belliveau, and Serge Léger Copyright © 2013 Dan Tulpan et al. All rights reserved. NCBI2RDF: Enabling Full RDF-Based Access to NCBI Databases Sun, 28 Jul 2013 08:28:23 +0000 RDF has become the standard technology for enabling interoperability among heterogeneous biomedical databases. The NCBI provides access to a large set of life sciences databases through a common interface called Entrez. However, the latter does not provide RDF-based access to such databases, and, therefore, they cannot be integrated with other RDF-compliant databases and accessed via SPARQL query interfaces. This paper presents the NCBI2RDF system, aimed at providing RDF-based access to the complete NCBI data repository. This API creates a virtual endpoint for servicing SPARQL queries over different NCBI repositories and presenting to users the query results in SPARQL results format, thus enabling this data to be integrated and/or stored with other RDF-compliant repositories. SPARQL queries are dynamically resolved, decomposed, and forwarded to the NCBI-provided E-utilities programmatic interface to access the NCBI data. Furthermore, we show how our approach increases the expressiveness of the native NCBI querying system, allowing several databases to be accessed simultaneously. This feature significantly boosts productivity when working with complex queries and saves time and effort to biomedical researchers. Our approach has been validated with a large number of SPARQL queries, thus proving its reliability and enhanced capabilities in biomedical environments. Alberto Anguita, Miguel García-Remesal, Diana de la Iglesia, and Victor Maojo Copyright © 2013 Alberto Anguita et al. All rights reserved. 
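As a rough illustration of the forwarding step described in the preceding abstract, the sketch below builds one E-utilities ESearch request per target database for a single search term. It is not the NCBI2RDF implementation and performs no SPARQL parsing: the helper names `esearch_url` and `fan_out`, the default `retmax`, and the choice of JSON output are assumptions made here, and only the request URLs are constructed (no network call is made).

```python
from urllib.parse import urlencode

# Public base address of the NCBI E-utilities programmatic interface.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch request URL for one Entrez database (hypothetical helper)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def fan_out(term, databases):
    """Map one search term onto several databases, one request URL per database."""
    return {db: esearch_url(db, term) for db in databases}


if __name__ == "__main__":
    for db, url in fan_out("BRCA1", ["pubmed", "gene"]).items():
        print(db, url)
```

A real resolver would additionally translate each returned record into RDF triples before answering the SPARQL query; that part is omitted because the abstract does not describe its details.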
A Universal Model for Predicting Dynamics of the Epidemics Caused by Special Pathogens Wed, 24 Jul 2013 13:27:20 +0000 A universal model intended primarily for predicting dynamics of the mass epidemics (outbreaks) caused by special pathogens is being developed at the State Research Center of Virology and Biotechnology Vector. The model includes the range of major countermeasures: preventive and emergency mass vaccination, vaccination of risk groups as well as search for and isolation/observation of infected cases, contacts, and suspects, and quarantine. The intensity of interventions depends on the availability of the relevant resources. The effect of resource limitations on the development of a putative epidemic of Ebola hemorrhagic fever is demonstrated. The modeling results allow for estimation of the material and human resources necessary for eradication of an epidemic. Alexander G. Bachinsky and Lily Ph. Nizolenko Copyright © 2013 Alexander G. Bachinsky and Lily Ph. Nizolenko. All rights reserved. Cloud Prediction of Protein Structure and Function with PredictProtein for Debian Thu, 18 Jul 2013 12:12:11 +0000 We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome. 
László Kaján, Guy Yachdav, Esmeralda Vicedo, Martin Steinegger, Milot Mirdita, Christof Angermüller, Ariane Böhm, Simon Domke, Julia Ertl, Christian Mertes, Eva Reisinger, Cedric Staniewski, and Burkhard Rost Copyright © 2013 László Kaján et al. All rights reserved. Translational Bioinformatics for Diagnostic and Prognostic Prediction of Prostate Cancer in the Next-Generation Sequencing Era Mon, 15 Jul 2013 14:25:48 +0000 The discovery of prostate cancer biomarkers has been boosted by the advent of next-generation sequencing (NGS) technologies. Nevertheless, many challenges still exist in exploiting the flood of sequence data and translating them into routine diagnostics and prognosis of prostate cancer. Here we review the recent developments in prostate cancer biomarkers by high throughput sequencing technologies. We highlight some fundamental issues of translational bioinformatics and the potential use of cloud computing in NGS data processing for the improvement of prostate cancer treatment. Jiajia Chen, Daqing Zhang, Wenying Yan, Dongrong Yang, and Bairong Shen Copyright © 2013 Jiajia Chen et al. All rights reserved. Identification of Interconnected Markers for T-Cell Acute Lymphoblastic Leukemia Mon, 15 Jul 2013 13:03:06 +0000 T-cell acute lymphoblastic leukemia (T-ALL) is a complex disease, resulting from proliferation of differentially arrested immature T cells. The molecular mechanisms and the genes involved in the generation of T-ALL remain largely undefined. In this study, we propose a set of genes to differentiate individuals with T-ALL from the nonleukemia/healthy ones and genes that are not differential themselves but interconnected with highly differentially expressed ones. We provide new suggestions for pathways involved in the cause of T-ALL and show that network-based classification techniques produce fewer genes with more meaningful and successful results than expression-based approaches. 
We have identified 19 significant subnetworks, containing 102 genes. The classification/prediction accuracies of subnetworks are considerably high, as high as 98%. Subnetworks contain 6 nondifferentially expressed genes, which could potentially participate in pathogenesis of T-ALL. Although these genes are not differential, they may serve as biomarkers if their loss/gain of function contributes to generation of T-ALL via SNPs. We conclude that transcription factors, zinc-ion-binding proteins, and tyrosine kinases are the important protein families to trigger T-ALL. These potential disease-causing genes in our subnetworks may serve as biomarkers, alternative to the traditional ones used for the diagnosis of T-ALL, and help understand the pathogenesis of the disease. Emine Guven Maiorov, Ozlem Keskin, Ozden Hatirnaz Ng, Ugur Ozbek, and Attila Gursoy Copyright © 2013 Emine Guven Maiorov et al. All rights reserved. Robust Cell Size Checkpoint from Spatiotemporal Positive Feedback Loop in Fission Yeast Thu, 11 Jul 2013 14:11:49 +0000 Cells must maintain appropriate cell size during proliferation. Size control may be regulated by a size checkpoint that couples cell size to cell division. Biological experimental data suggests that the cell size is coupled to the cell cycle in two ways: the rates of protein synthesis and the cell polarity protein kinase Pom1 provide spatial information that is used to regulate mitosis inhibitor Wee1. Here a mathematical model involving these spatiotemporal regulations was developed and used to explore the mechanisms underlying the size checkpoint in fission yeast. Bifurcation analysis shows that when the spatiotemporal regulation is coupled to the positive feedback loops (active Cdc2 promotes its activator, Cdc25, and suppress its inhibitor, Wee1), the mitosis-promoting factor (MPF) exhibits a bistable steady-state relationship with the cell size. 
The switch-like response from the positive feedback loops naturally generates the cell size checkpoint. Further analysis indicated that the spatial regulation provided by Pom1 enhances the robustness of the size checkpoint in fission yeast. This was consistent with experimental data. Jie Yan, Xin Ni, and Ling Yang Copyright © 2013 Jie Yan et al. All rights reserved. Integrative Analysis of Methylome and Transcriptome Reveals the Importance of Unmethylated CpGs in Non-CpG Island Gene Activation Wed, 10 Jul 2013 09:02:08 +0000 Background. Promoter methylation is associated with gene repression; however, little is known about its mechanism. It was proposed that the repression of methylated genes is achieved through the recruitment of methyl binding proteins (MBPs) that participate in closing the chromatin. An alternative mechanism suggests that methylation interferes with the binding of either site specific activators or more general activators that bind to the CpG dinucleotide. However, the relative contribution of these two mechanisms to gene repression is not known. Results. Bioinformatics analyses of genome-wide transcriptome and methylome data support the latter hypothesis by demonstrating a strong association between transcription and the number of unmethylated CpGs at the promoter of genes lacking CpG islands. Conclusions. Our results suggest that methylation represses gene expression mainly by preventing the binding of CpG binding activators. Amichai Marx, Tamar Kahan, and Itamar Simon Copyright © 2013 Amichai Marx et al. All rights reserved. Molecular Dynamics Studies on the Conformational Transitions of Adenylate Kinase: A Computational Evidence for the Conformational Selection Mechanism Thu, 27 Jun 2013 08:36:37 +0000 Escherichia coli adenylate kinase (ADK) is a monomeric phosphotransferase enzyme that catalyzes reversible transfer of phosphoryl group from ATP to AMP with a large-scale domain motion. 
The detailed mechanism for this conformational transition remains unknown. In the current study, we performed long time-scale molecular dynamics simulations on both open and closed states of ADK. Based on the structural analyses of the simulation trajectories, we detected over 20 conformational transitions between the open and closed states of ADK and identified two novel conformations as intermediate states in the catalytic processes. With these findings, we proposed a possible mechanism for the large-scale domain motion of Escherichia coli ADK and its catalytic process: (1) the substrate-free ADK adopted an open conformation; (2) ATP bound with LID domain closure; (3) AMP bound with NMP domain closure; (4) phosphoryl transfer occurred, with ATP and AMP converted into two ADPs and no conformational transition detected in the enzyme; (5) the LID domain opened with one ADP released; (6) the other ADP was released with the NMP domain opening. As both open and closed states sampled a wide range of conformational transitions, our simulation strongly supported the conformational selection mechanism for Escherichia coli ADK. Jie Ping, Pei Hao, Yi-Xue Li, and Jing-Fang Wang Copyright © 2013 Jie Ping et al. All rights reserved. Application of Improved Three-Dimensional Kernel Approach to Prediction of Protein Structural Class Wed, 26 Jun 2013 13:44:22 +0000 Kernel methods, such as kernel PCA, kernel PLS, and support vector machines, are widely known machine learning techniques in biology, medicine, chemistry, and material science. Based on nonlinear mapping and Coulomb function, two 3D kernel approaches were improved and applied to predictions of the four protein tertiary structural classes of domains (all-α, all-β, α/β, and α + β) and five membrane protein types with satisfactory results. In a benchmark test, the performance of the improved 3D kernel approaches was compared with that of neural networks, support vector machines, and ensemble algorithms.
Demonstration through leave-one-out cross-validation on working datasets constructed by investigators indicated that new kernel approaches outperformed other predictors. It has not escaped our notice that 3D kernel approaches may hold a high potential for improving the quality in predicting the other protein features as well. Or at the very least, it will play a complementary role to many of the existing algorithms in this regard. Xu Liu, Yuchao Zhang, Hua Yang, Lisheng Wang, and Shuaibing Liu Copyright © 2013 Xu Liu et al. All rights reserved. MicroRNA-Mediated Regulation in Biological Systems with Oscillatory Behavior Wed, 26 Jun 2013 11:16:56 +0000 As a class of small noncoding RNAs, microRNAs (miRNAs) regulate stability or translation of mRNA transcripts. Some reports bring new insights into possible roles of microRNAs in modulating cell cycle. In this paper, we focus on the mechanism and effectiveness of microRNA-mediated regulation in the cell cycle. We first describe two specific regulatory circuits that incorporate base-pairing microRNAs and show their fine-tuning roles in the modulation of periodic behavior. Furthermore, we analyze the effects of miR369-3 on the modulation of the cell cycle, confirming that miR369-3 plays a role in shortening the period of the cell cycle. These results are consistent with experimental observations. Zhiyong Zhang, Fengdan Xu, Zengrong Liu, Ruiqi Wang, and Tieqiao Wen Copyright © 2013 Zhiyong Zhang et al. All rights reserved. Dynamic Folding Pathway Models of the Trp-Cage Protein Mon, 24 Jun 2013 15:27:11 +0000 Using action-derived molecular dynamics (ADMD), we study the dynamic folding pathway models of the Trp-cage protein by providing its sequential conformational changes from its initial disordered structure to the final native structure at atomic details. 
We find that the numbers of native contacts and native hydrogen bonds are highly correlated, implying that the native structure of Trp-cage is achieved through the concurrent formation of native contacts and native hydrogen bonds. In the early stage, an unfolded state appears with partially formed native contacts (~40%) and native hydrogen bonds (~30%). Afterward, the folding is initiated by the contact of the side chain of Tyr3 with that of Trp6, together with the formation of the N-terminal α-helix. Then, the C-terminal polyproline structure docks onto the Trp6 and Tyr3 rings, resulting in the formation of the hydrophobic core of Trp-cage and its near-native state. Finally, the slow adjustment of the near-native states into the native structure dominates the later stage. The ADMD results are in agreement with those of the experimental folding studies on Trp-cage and consistent with most other computational studies. In-Ho Lee and Seung-Yeon Kim Copyright © 2013 In-Ho Lee and Seung-Yeon Kim. All rights reserved. Predicting the DPP-IV Inhibitory Activity Based on Their Physicochemical Properties Thu, 20 Jun 2013 08:34:06 +0000 The second development program developed in this work was introduced to obtain physicochemical properties of DPP-IV inhibitors. Based on the computation of molecular descriptors, a two-stage feature selection method called mRMR-BFS (minimum redundancy maximum relevance-backward feature selection) was adopted. Then, support vector regression (SVR) was used to establish a model that maps DPP-IV inhibitors to their corresponding inhibitory activity. The squared correlation coefficients for the LOOCV training set and the test set are 0.815 and 0.884, respectively. An online server for predicting the inhibitory activity pIC50 of the DPP-IV inhibitors as described in this paper is given in the introduction.
Tianhong Gu, Xiaoyan Yang, Minjie Li, Milin Wu, Qiang Su, Wencong Lu, and Yuhui Zhang Copyright © 2013 Tianhong Gu et al. All rights reserved. In Silico Screening and Molecular Dynamics Simulation of Disease-Associated nsSNP in TYRP1 Gene and Its Structural Consequences in OCA3 Wed, 19 Jun 2013 14:17:46 +0000 Oculocutaneous albinism type III (OCA3), caused by mutations of the TYRP1 gene, is an autosomal recessive disorder characterized by reduced biosynthesis of melanin pigment in the hair, skin, and eyes. The TYRP1 gene encodes a protein called tyrosinase-related protein-1 (Tyrp1). Tyrp1 is involved in maintaining the stability of tyrosinase protein and modulating its catalytic activity in eumelanin synthesis. Tyrp1 is also involved in maintenance of melanosome structure and affects melanocyte proliferation and cell death. In this work we used computational analysis to filter the most probable mutations that might be associated with OCA3. We found R326H and R356Q to be the most deleterious and disease-associated mutations by using PolyPhen 2.0, SIFT, PANTHER, I-mutant 3.0, PhD-SNP, SNP&GO, Pmut, and Mutpred tools. To understand the atomic arrangement in 3D space, the native and mutant (R326H and R356Q) structures were modelled. Finally, the structures of native and mutant Tyrp1 proteins were analyzed using a molecular dynamics simulation (MDS) approach. MDS results showed more flexibility in the native Tyrp1 structure. The mutations made the Tyrp1 protein more rigid, which might disturb the structural conformation and catalytic function of the protein and might also play a significant role in inducing OCA3. The results obtained from this study could help wet-lab researchers develop potent drug therapies against OCA3. Balu Kamaraj and Rituraj Purohit Copyright © 2013 Balu Kamaraj and Rituraj Purohit. All rights reserved.
Dynamic Actin Gene Family Evolution in Primates Thu, 06 Jun 2013 16:22:42 +0000 Actin is one of the most highly conserved proteins and plays crucial roles in many vital cellular functions. In most eukaryotes, it is encoded by a multigene family. Although the actin gene family has been studied extensively, few investigators have compared the actin gene family across related species. Here, we systematically investigate the characteristics and evolutionary pattern of the actin gene family in primates. We identified 233 actin genes in human, chimpanzee, gorilla, orangutan, gibbon, rhesus monkey, and marmoset genomes. Phylogenetic analysis showed that actin genes in the seven species could be divided into two major types of clades: orthologous group versus complex group. Codon usages and gene expression patterns of actin gene copies were highly consistent among the groups because of basic functions needed by the organisms, but diverged considerably within species due to functional diversification. In addition, many potential pseudogenes were found with incomplete open reading frames due to frameshifts or early stop codons. These results imply that the actin gene family in primates followed a “birth and death” model of evolution. Under this model, actin genes experienced strong negative selection and increased their functional complexity through gene duplication. Liucun Zhu, Ying Zhang, Yijun Hu, Tieqiao Wen, and Qiang Wang Copyright © 2013 Liucun Zhu et al. All rights reserved. Exploring the Cooccurrence Patterns of Multiple Sets of Genomic Intervals Tue, 28 May 2013 13:55:49 +0000 Background. Exploring the spatial relationship of different genomic features has been of great interest since the early days of genomic research. The relationship sometimes provides useful information for understanding certain biological processes.
Recent advances in high-throughput technologies such as ChIP-seq produce large amount of data in the form of genomic intervals. Most of the existing methods for assessing spatial relationships among the intervals are designed for pairwise comparison and cannot be easily scaled up. Results. We present a statistical method and software tool to characterize the cooccurrence patterns of multiple sets of genomic intervals. The occurrences of genomic intervals are described by a simple finite mixture model, where each component represents a distinct cooccurrence pattern. The model parameters are estimated via an EM algorithm and can be viewed as sufficient statistics of the cooccurrence patterns. Simulation and real data results show that the model can accurately capture the patterns and provide biologically meaningful results. The method is implemented in a freely available R package giClust. Conclusions. The method and the software provide a convenient way for biologists to explore the cooccurrence patterns among a relatively large number of sets of genomic intervals. Hao Wu and Zhaohui S. Qin Copyright © 2013 Hao Wu and Zhaohui S. Qin. All rights reserved. Identification of Lung-Cancer-Related Genes with the Shortest Path Approach in a Protein-Protein Interaction Network Wed, 22 May 2013 15:44:42 +0000 Lung cancer is one of the leading causes of cancer mortality worldwide. The main types of lung cancer are small cell lung cancer (SCLC) and nonsmall cell lung cancer (NSCLC). In this work, a computational method was proposed for identifying lung-cancer-related genes with a shortest path approach in a protein-protein interaction (PPI) network. Based on the PPI data from STRING, a weighted PPI network was constructed. 54 NSCLC- and 84 SCLC-related genes were retrieved from associated KEGG pathways. Then the shortest paths between each pair of these 54 NSCLC genes and 84 SCLC genes were obtained with Dijkstra’s algorithm. 
Finally, all the genes on the shortest paths were extracted, and 25 and 38 shortest path genes with a permutation value less than 0.05 for NSCLC and SCLC, respectively, were selected for further analysis. Some of the shortest path genes have been reported to be related to lung cancer. Intriguingly, the candidate genes we identified from the PPI network contained more cancer genes than those identified from the gene expression profiles. Furthermore, these genes possessed more functional similarity with the known cancer genes than those identified from the gene expression profiles. This study demonstrated the efficiency of the proposed method and showed promising results. Bi-Qing Li, Jin You, Lei Chen, Jian Zhang, Ning Zhang, Hai-Peng Li, Tao Huang, Xiang-Yin Kong, and Yu-Dong Cai Copyright © 2013 Bi-Qing Li et al. All rights reserved. Cloud Computing for Protein-Ligand Binding Site Comparison Thu, 16 May 2013 15:18:47 +0000 The proteome-wide analysis of protein-ligand binding sites and their interactions with ligands is important in structure-based drug design and in understanding ligand cross reactivity and toxicity. The well-known and commonly used software, SMAP, has been designed for 3D ligand binding site comparison and similarity searching of a structural proteome. SMAP can also predict drug side effects and reassign existing drugs to new indications. However, the computing scale of SMAP is limited. We have developed a high-availability, high-performance system that expands the comparison scale of SMAP. This cloud computing service, called Cloud-PLBS, combines the SMAP and Hadoop frameworks and is deployed on a virtual cloud computing platform. To handle the vast amount of experimental data on protein-ligand binding site pairs, Cloud-PLBS exploits the MapReduce paradigm as a management and parallelizing tool. Cloud-PLBS provides a web portal and scalability through which biologists can address a wide range of computer-intensive questions in biology and drug discovery.
Che-Lun Hung and Guan-Jie Hua Copyright © 2013 Che-Lun Hung and Guan-Jie Hua. All rights reserved. Secure Method for Biometric-Based Recognition with Integrated Cryptographic Functions Wed, 15 May 2013 15:33:02 +0000 Biometric systems refer to biometric technologies which can be used to achieve authentication. Unlike in cryptography-based technologies, the matching ratio required for certification in biometric systems need not reach 100%. However, biometric data can only be directly compared through proximal access to the scanning device and cannot be combined with cryptographic techniques. Moreover, repeated use, improper storage, or transmission leaks may compromise security. Prior studies have attempted to combine cryptography and biometrics, but these methods require the synchronization of internal systems and are vulnerable to power analysis attacks, fault-based cryptanalysis, and replay attacks. This paper presents a new secure cryptographic authentication method using biometric features. The proposed system combines the advantages of biometric identification and cryptographic techniques. By adding a subsystem to existing biometric recognition systems, we can simultaneously achieve the security of cryptographic technology and the error tolerance of biometric recognition. This method can be used for biometric data encryption, signatures, and other types of cryptographic computation. The method offers a high degree of security with protection against power analysis attacks, fault-based cryptanalysis, and replay attacks. Moreover, it can be used to improve the confidentiality of biological data storage and biodata identification processes. Remote biometric authentication can also be safely applied. Shin-Yan Chiou Copyright © 2013 Shin-Yan Chiou. All rights reserved.
A High Performance Cloud-Based Protein-Ligand Docking Prediction Algorithm Tue, 14 May 2013 17:44:36 +0000 Predicting druggability for a particular disease by integrating biological and computer science technologies has seen success in recent years. Although computer science technologies can be used to reduce the costs of pharmaceutical research, the computation time of structure-based protein-ligand docking prediction remains unsatisfactory. Hence, in this paper, a novel docking prediction algorithm, named fast cloud-based protein-ligand docking prediction algorithm (FCPLDPA), is presented to accelerate docking prediction. The proposed algorithm works by leveraging two high-performance operators: (1) the novel migration (information exchange) operator is designed specially for cloud-based environments to reduce the computation time; (2) the efficient operator is aimed at filtering out the worst search directions. Our simulation results illustrate that the proposed method outperforms the other docking algorithms compared in this paper in terms of both the computation time and the quality of the end result. Jui-Le Chen, Chun-Wei Tsai, Ming-Chao Chiang, and Chu-Sing Yang Copyright © 2013 Jui-Le Chen et al. All rights reserved. Structural Adaptation of Cold-Active RTX Lipase from Pseudomonas sp. Strain AMS8 Revealed via Homology and Molecular Dynamics Simulation Approaches Tue, 07 May 2013 17:36:51 +0000 The psychrophilic enzyme is an interesting subject to study due to its special ability to adapt to extreme temperatures, unlike typical enzymes. Utilizing computer-aided software, the predicted structure and function of the enzyme lipase AMS8 (LipAMS8) (isolated from the psychrophilic Pseudomonas sp., obtained from the Antarctic soil) are studied. The enzyme shows significant sequence similarities with lipases from Pseudomonas sp. MIS38 and Serratia marcescens.
These similarities aid in the prediction of the 3D molecular structure of the enzyme. In this study, 12 ns MD simulation is performed at different temperatures for structural flexibility and stability analysis. The results show that the enzyme is most stable at 0°C and 5°C. In terms of stability and flexibility, the catalytic domain (N-terminus) maintained its stability more than the noncatalytic domain (C-terminus), but the non-catalytic domain showed higher flexibility than the catalytic domain. The analysis of the structure and function of LipAMS8 provides new insights into the structural adaptation of this protein at low temperatures. The information obtained could be a useful tool for low temperature industrial applications and molecular engineering purposes, in the near future. Mohd. Shukuri Mohamad Ali, Siti Farhanie Mohd Fuzi, Menega Ganasen, Raja Noor Zaliha Raja Abdul Rahman, Mahiran Basri, and Abu Bakar Salleh Copyright © 2013 Mohd. Shukuri Mohamad Ali et al. All rights reserved. Streaming Support for Data Intensive Cloud-Based Sequence Analysis Wed, 24 Apr 2013 15:16:37 +0000 Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of “resources-on-demand” and “pay-as-you-go”, scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client’s site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. 
We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation. Shadi A. Issa, Romeo Kienzler, Mohamed El-Kalioby, Peter J. Tonellato, Dennis Wall, Rémy Bruggmann, and Mohamed Abouelhoda Copyright © 2013 Shadi A. Issa et al. All rights reserved. Intelligent Informatics in Biomedicine Mon, 22 Apr 2013 12:52:50 +0000 Hao-Teng Chang, Raffaele A. Calogero, Sorin Draghici, Oliver Ray, and Tun-Wen Pai Copyright © 2013 Hao-Teng Chang et al. All rights reserved. A Novel Method of Predicting Protein Disordered Regions Based on Sequence Features Mon, 22 Apr 2013 10:42:21 +0000 With a large number of disordered proteins and their important functions discovered, it is highly desirable to develop effective methods to computationally predict protein disordered regions. In this study, based on Random Forest (RF), Maximum Relevancy Minimum Redundancy (mRMR), and Incremental Feature Selection (IFS), we developed a new method to predict disordered regions in proteins. The mRMR criterion was used to rank the importance of all candidate features. Finally, the top 128 features were selected from the ranked feature list to build the optimal model, including 92 Position Specific Scoring Matrix (PSSM) conservation score features and 36 secondary structure features. As a result, a Matthews correlation coefficient (MCC) of 0.3895 was achieved on the training set by 10-fold cross-validation. On the basis of the prediction results obtained for each query sequence with the method, we used the scanning and modification strategy to improve the performance. The accuracy (ACC) and MCC were increased by 4% and almost 0.2%, respectively, compared with three other popular predictors: DISOPRED, DISOclust, and OnD-CRF.
The selected features may shed some light on the understanding of the formation mechanism of disordered structures, providing guidelines for experimental validation. Tong-Hui Zhao, Min Jiang, Tao Huang, Bi-Qing Li, Ning Zhang, Hai-Peng Li, and Yu-Dong Cai Copyright © 2013 Tong-Hui Zhao et al. All rights reserved. Exploiting GPUs in Virtual Machine for BioCloud Mon, 22 Apr 2013 09:41:58 +0000 Recently, biological applications start to be reimplemented into the applications which exploit many cores of GPUs for better computation performance. Therefore, by providing virtualized GPUs to VMs in cloud computing environment, many biological applications will willingly move into cloud environment to enhance their computation performance and utilize infinite cloud computing resource while reducing expenses for computations. In this paper, we propose a BioCloud system architecture that enables VMs to use GPUs in cloud environment. Because much of the previous research has focused on the sharing mechanism of GPUs among VMs, they cannot achieve enough performance for biological applications of which computation throughput is more crucial rather than sharing. The proposed system exploits the pass-through mode of PCI express (PCI-E) channel. By making each VM be able to access underlying GPUs directly, applications can show almost the same performance as when those are in native environment. In addition, our scheme multiplexes GPUs by using hot plug-in/out device features of PCI-E channel. By adding or removing GPUs in each VM in on-demand manner, VMs in the same physical host can time-share their GPUs. We implemented the proposed system using the Xen VMM and NVIDIA GPUs and showed that our prototype is highly effective for biological GPU applications in cloud environment. Heeseung Jo, Jinkyu Jeong, Myoungho Lee, and Dong Hoon Choi Copyright © 2013 Heeseung Jo et al. All rights reserved. 
wFReDoW: A Cloud-Based Web Environment to Handle Molecular Docking Simulations of a Fully Flexible Receptor Model Thu, 11 Apr 2013 14:43:53 +0000 Molecular docking simulations of fully flexible protein receptor (FFR) models are coming of age. In our studies, an FFR model is represented by a series of different conformations derived from a molecular dynamic simulation trajectory of the receptor. For each conformation in the FFR model, a docking simulation is executed and analyzed. An important challenge is to perform virtual screening of millions of ligands using an FFR model in a sequential mode since it can become computationally very demanding. In this paper, we propose a cloud-based web environment, called web Flexible Receptor Docking Workflow (wFReDoW), which reduces the CPU time in the molecular docking simulations of FFR models to small molecules. It is based on the new workflow data pattern called self-adaptive multiple instances (P-SaMIs) and on a middleware built on Amazon EC2 instances. P-SaMI reduces the number of molecular docking simulations while the middleware speeds up the docking experiments using a High Performance Computing (HPC) environment on the cloud. The experimental results show a reduction in the total elapsed time of docking experiments and the quality of the new reduced receptor models produced by discarding the nonpromising conformations from an FFR model ruled by the P-SaMI data pattern. Renata De Paris, Fábio A. Frantz, Osmar Norberto de Souza, and Duncan D. A. Ruiz Copyright © 2013 Renata De Paris et al. All rights reserved. GPU-Based Cloud Service for Smith-Waterman Algorithm Using Frequency Distance Filtration Scheme Wed, 03 Apr 2013 14:15:40 +0000 As the conventional means of analyzing the similarity between a query sequence and database sequences, the Smith-Waterman algorithm is feasible for a database search owing to its high sensitivity. However, this algorithm is still quite time consuming. 
CUDA programming can improve computations efficiently by using the computational power of massive computing hardware such as graphics processing units (GPUs). This work presents a novel Smith-Waterman algorithm with a frequency-based filtration method on GPUs, which filters out unnecessary comparisons rather than merely accelerating all comparisons and expending computational resources on the unnecessary ones. A user-friendly interface is also designed for potential cloud server applications with GPUs. Additionally, two data sets, H1N1 protein sequences (query sequence set) and human protein database (database set), are selected, followed by a comparison of CUDA-SW and CUDA-SW with the filtration method, referred to herein as CUDA-SWf. Experimental results indicate that reducing unnecessary sequence alignments can improve the computational time by up to 41%. Importantly, by using CUDA-SWf as a cloud service, this application can be accessed from any device with an Internet connection without time constraints. Sheng-Ta Lee, Chun-Yuan Lin, and Che Lun Hung Copyright © 2013 Sheng-Ta Lee et al. All rights reserved. Time Series Expression Analyses Using RNA-seq: A Statistical Approach Sun, 24 Mar 2013 12:07:25 +0000 RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. 
Here, we discuss several methods that can be applied to model time-course RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis. Sunghee Oh, Seongho Song, Gregory Grabowski, Hongyu Zhao, and James P. Noonan Copyright © 2013 Sunghee Oh et al. All rights reserved. Signal Propagation in Protein Interaction Network during Colorectal Cancer Progression Wed, 20 Mar 2013 09:24:19 +0000 Colorectal cancer is generally categorized into the following four stages according to its development or degree of severity: Dukes A, B, C, and D. Since each stage of colorectal cancer corresponds to a different activated region of the network, the transitions between network states may reflect its pathological changes. In view of this, we compared the gene expressions among the colorectal cancer patients in the aforementioned four stages and obtained the early and late stage biomarkers, respectively. Subsequently, the two kinds of biomarkers were both mapped onto the protein interaction network. If an early biomarker and a late biomarker were close in the network and their expression levels were also correlated in the Dukes B and C patients, then a signal propagation path from the early stage biomarker to the late one was identified. Many transition genes in the signal propagation paths were involved in signal transduction, cell communication, and cellular process regulation. Some transition hubs were known as colorectal cancer genes. The findings reported here may provide useful insights for revealing the mechanism of colorectal cancer progression at the cellular systems biology level. Yang Jiang, Tao Huang, Lei Chen, Yu-Fei Gao, Yudong Cai, and Kuo-Chen Chou Copyright © 2013 Yang Jiang et al. All rights reserved. 
Translational Biomedical Informatics in the Cloud: Present and Future Sun, 17 Mar 2013 15:32:35 +0000 Next generation sequencing and other high-throughput experimental techniques of recent decades have driven the exponential growth in publicly available molecular and clinical data. This information explosion has prepared the ground for the development of translational bioinformatics. The scale and dimensionality of data, however, pose obvious challenges in data mining, storage, and integration. In this paper we demonstrate the utility and promise of cloud computing for tackling the big data problems. We also outline our vision that cloud computing could be an enabling tool to facilitate translational bioinformatics research. Jiajia Chen, Fuliang Qian, Wenying Yan, and Bairong Shen Copyright © 2013 Jiajia Chen et al. All rights reserved. Gene Entropy-Fractal Dimension Informatics with Application to Mouse-Human Translational Medicine Sun, 17 Mar 2013 15:01:52 +0000 DNA informatics measures, namely Shannon entropy and fractal dimension, have been used to form 2D maps of related genes in various mammals. The distance between points on these maps for corresponding mRNA sequences in different species is used to study evolution. By quantifying the similarity of genes between species, this distance might indicate when studies on one species (mouse) would tend to be valid in the other (human). The hypothesis that a small distance from mouse to human could facilitate mouse to human translational medicine success is supported by the studied ESR-1, LMNA, Myc, and RNF4 sequences. ID1 and PLCZ1 have larger separation. The collinearity of displacement vectors is further analyzed with a regression model, and the ID1 result suggests a mouse-chimp-human translational medicine approach. 
Further inference was found in the tumor suppression gene, p53, with a new hypothesis of including the bovine PKM2 pathways for targeting the glycolysis preference in many types of cancerous cells, consistent with quantum metabolism models. The distance between mRNA and protein coding CDS is proposed as a measure of the pressure associated with noncoding processes. The Y-chromosome DYS14 in fetal microchimerism that could offer protection from Alzheimer's disease is given as an example. T. Holden, E. Cheung, S. Dehipawala, J. Ye, G. Tremberger Jr., D. Lieberman, and T. Cheung Copyright © 2013 T. Holden et al. All rights reserved. Predicting β-Turns in Protein Using Kernel Logistic Regression Tue, 19 Feb 2013 11:28:34 +0000 A β-turn is a secondary protein structure type that plays a significant role in protein configuration and function. On average 25% of amino acids in protein structures are located in β-turns. It is very important to develop an accurate and efficient method for β-turns prediction. Most of the current successful β-turns prediction methods use support vector machines (SVMs) or neural networks (NNs). The kernel logistic regression (KLR) is a powerful classification technique that has been applied successfully in many classification problems. However, it is rarely used in β-turns classification, mainly because it is computationally expensive. In this paper, we used KLR to obtain sparse β-turns prediction in short evolution time. Secondary structure information and position-specific scoring matrices (PSSMs) are utilized as input features. We achieved a Qtotal of 80.7% and an MCC of 50% on the BT426 dataset. These results show that the KLR method with the right algorithm can yield performance equivalent to or even better than NNs and SVMs in β-turns prediction. In addition, KLR yields a probabilistic outcome and has a well-defined extension to the multiclass case. 
Murtada Khalafallah Elbashir, Yu Sheng, Jianxin Wang, FangXiang Wu, and Min Li Copyright © 2013 Murtada Khalafallah Elbashir et al. All rights reserved. State-of-the-Art Fusion-Finder Algorithms Sensitivity and Specificity Sun, 17 Feb 2013 07:59:18 +0000 Background. Gene fusions arising from chromosomal translocations have been implicated in cancer. RNA-seq has the potential to discover such rearrangements generating functional proteins (chimera/fusion). Recently, many methods for chimeras detection have been published. However, specificity and sensitivity of those tools were not extensively investigated in a comparative way. Results. We tested eight fusion-detection tools (FusionHunter, FusionMap, FusionFinder, MapSplice, deFuse, Bellerophontes, ChimeraScan, and TopHat-fusion) to detect fusion events using synthetic and real datasets encompassing chimeras. The comparison analysis run only on synthetic data could generate misleading results since we found no counterpart on real dataset. Furthermore, most tools report a very high number of false positive chimeras. In particular, the most sensitive tool, ChimeraScan, reports a large number of false positives that we were able to significantly reduce by devising and applying two filters to remove fusions not supported by fusion junction-spanning reads or encompassing large intronic regions. Conclusions. The discordant results obtained using synthetic and real datasets suggest that synthetic datasets encompassing fusion events may not fully catch the complexity of RNA-seq experiment. Moreover, fusion detection tools are still limited in sensitivity or specificity; thus, there is space for further improvement in the fusion-finder algorithms. Matteo Carrara, Marco Beccuti, Fulvio Lazzarato, Federica Cavallo, Francesca Cordero, Susanna Donatelli, and Raffaele A. Calogero Copyright © 2013 Matteo Carrara et al. All rights reserved. 
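The two post-filters mentioned in the fusion-finder abstract above (dropping fusions that lack junction-spanning reads, and dropping fusions that encompass large intronic regions) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record fields `spanning_reads` and `intronic_bp`, the function name, and both thresholds are hypothetical, since the abstract gives no concrete values or data format.

```python
from dataclasses import dataclass


@dataclass
class FusionCall:
    """One candidate fusion reported by a detection tool (hypothetical layout)."""
    gene5: str            # 5' partner gene
    gene3: str            # 3' partner gene
    spanning_reads: int   # reads spanning the fusion junction (assumed field)
    intronic_bp: int      # intronic bases encompassed by the call (assumed field)


def filter_fusions(calls, min_spanning_reads=1, max_intronic_bp=1000):
    """Keep calls that pass both filters; thresholds are illustrative only."""
    kept = []
    for call in calls:
        # Filter 1: require support from junction-spanning reads.
        if call.spanning_reads < min_spanning_reads:
            continue
        # Filter 2: discard calls encompassing large intronic regions.
        if call.intronic_bp > max_intronic_bp:
            continue
        kept.append(call)
    return kept


candidates = [
    FusionCall("BCR", "ABL1", spanning_reads=12, intronic_bp=0),
    FusionCall("GENE_A", "GENE_B", spanning_reads=0, intronic_bp=0),
    FusionCall("GENE_C", "GENE_D", spanning_reads=5, intronic_bp=50000),
]
surviving = filter_fusions(candidates)
```

With the hypothetical thresholds above, only the first candidate passes both filters; in practice the cutoffs would have to be tuned against real data.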
Simpute: An Efficient Solution for Dense Genotypic Data Sun, 03 Feb 2013 13:35:37 +0000 Single nucleotide polymorphism (SNP) data derived from array-based technology or massive parallel sequencing often contain missing data. Missing SNPs can bias the results of association analyses. To maximize information usage, imputation is often adopted to compensate for the missing data by filling in the most probable values. To better understand the available tools for this purpose, we compare the imputation performances among BEAGLE, IMPUTE, BIMBAM, SNPMStat, MACH, and PLINK with data generated by randomly masking the genotype data from the International HapMap Phase III project. In addition, we propose a new algorithm called simple imputation (Simpute) that benefits from the high resolution of the SNPs in the array platform. Simpute does not require any reference data. The best feature of Simpute is its computational efficiency: its complexity is bounded in terms of the number of missing SNPs, the number of positions of the missing SNPs, and the number of people considered. Simpute is suitable for regular screening of large-scale SNP genotyping, particularly when the sample size is large and efficiency is a major concern in the analysis. Yen-Jen Lin, Chun-Tien Chang, Chuan Yi Tang, and Wen-Ping Hsieh Copyright © 2013 Yen-Jen Lin et al. All rights reserved. On the Structural Context and Identification of Enzyme Catalytic Residues Sun, 03 Feb 2013 07:58:50 +0000 Enzymes play important roles in most biological processes. Although only a small fraction of residues are directly involved in catalytic reactions, these catalytic residues are the most crucial parts in enzymes. The study of the fundamental and unique features of catalytic residues benefits the understanding of enzyme functions and catalytic mechanisms. In this work, we analyze the structural context of catalytic residues based on theoretical and experimental structure flexibility. 
The results show that catalytic residues have distinct structural features and context. Their neighboring residues, whether sequence or structure neighbors within a specific range, are usually structurally more rigid than those of noncatalytic residues. The structural context feature is combined with a support vector machine to identify catalytic residues from enzyme structure. The prediction results are better than or comparable to those of recent structure-based prediction methods. Yu-Tung Chien and Shao-Wei Huang Copyright © 2013 Yu-Tung Chien and Shao-Wei Huang. All rights reserved. In Silico Prediction and In Vitro Characterization of Multifunctional Human RNase3 Thu, 17 Jan 2013 14:51:29 +0000 The human ribonuclease A (hRNaseA) superfamily consists of thirteen members with high structural similarity that exhibit divergent physiological functions other than RNase activity. During evolution, the hRNaseA superfamily has gained novel functions, which may be preserved in a unique region or domain to account for additional molecular interactions. hRNase3 has multiple functions including ribonucleolytic, heparan sulfate (HS) binding, cellular binding, endocytic, lipid destabilization, cytotoxic, and antimicrobial activities. In this study, three putative multifunctional regions, 34RWRCK38 (HBR1), 75RSRFR79 (HBR2), and 101RPGRR105 (HBR3), of hRNase3 have been identified employing in silico sequence analysis and validated employing in vitro activity assays. A heparin binding peptide containing HBR1 is characterized to act as a key element associated with HS binding, cellular binding, and lipid binding activities. In this study, we provide novel insights to identify functional regions of hRNase3 that may have implications for all hRNaseA superfamily members. Pei-Chun Lien, Ping-Hsueh Kuo, Chien-Jung Chen, Hsiu-Hui Chang, Shun-lung Fang, Wei-Shuo Wu, Yiu-Kay Lai, Tun-Wen Pai, and Margaret Dah-Tsyr Chang Copyright © 2013 Pei-Chun Lien et al. All rights reserved. 
Using Nanoinformatics Methods for Automatically Identifying Relevant Nanotoxicology Entities from the Literature Thu, 27 Dec 2012 14:16:33 +0000 Nanoinformatics is an emerging research field that uses informatics techniques to collect, process, store, and retrieve data, information, and knowledge on nanoparticles, nanomaterials, and nanodevices and their potential applications in health care. In this paper, we have focused on the solutions that nanoinformatics can provide to facilitate nanotoxicology research. For this, we have taken a computational approach to automatically recognize and extract nanotoxicology-related entities from the scientific literature. The desired entities belong to four different categories: nanoparticles, routes of exposure, toxic effects, and targets. The entity recognizer was trained using a corpus that we specifically created for this purpose and was validated by two nanomedicine/nanotoxicology experts. We evaluated the performance of our entity recognizer using 10-fold cross-validation. Precision values range from 87.6% (targets) to 93.0% (routes of exposure), while recall values range from 82.6% (routes of exposure) to 87.4% (toxic effects). These results support the feasibility of using computational approaches to reliably perform different named entity recognition (NER)-dependent tasks, such as augmented reading or semantic searches. This research is a “proof of concept” that can be expanded to stimulate further developments that could assist researchers in managing data, information, and knowledge at the nanolevel, thus accelerating research in nanomedicine. Miguel García-Remesal, Alejandro García-Ruiz, David Pérez-Rey, Diana de la Iglesia, and Víctor Maojo Copyright © 2013 Miguel García-Remesal et al. All rights reserved. 
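The per-category precision and recall figures quoted in the nanoinformatics abstract above follow the standard definitions, which can be illustrated with a small sketch. The function name and the counts below are hypothetical and chosen only to make the arithmetic visible; they are not data from the study.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) for one entity category.

    precision = TP / (TP + FP): fraction of predicted entities that are correct.
    recall    = TP / (TP + FN): fraction of true entities that were found.
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical counts for a single category in a single fold.
p, r = precision_recall(true_pos=90, false_pos=10, false_neg=30)
```

In a 10-fold cross-validation such as the one the abstract describes, these values would typically be computed per fold and then averaged for each entity category.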
On the Difference in Quality between Current Heuristic and Optimal Solutions to the Protein Structure Alignment Problem Sun, 23 Dec 2012 13:19:38 +0000 The importance of pairwise protein structural comparison in biomedical research is fueling the search for algorithms capable of finding more accurate structural match of two input proteins in a timely manner. In recent years, we have witnessed rapid advances in the development of methods for approximate and optimal solutions to the protein structure matching problem. Albeit slow, these methods can be extremely useful in assessing the accuracy of more efficient, heuristic algorithms. We utilize a recently developed approximation algorithm for protein structure matching to demonstrate that a deep search of the protein superposition space leads to increased alignment accuracy with respect to many well-established measures of alignment quality. The results of our study suggest that a large and important part of the protein superposition space remains unexplored by current techniques for protein structure alignment. Mauricio Arriagada and Aleksandar Poleksic Copyright © 2013 Mauricio Arriagada and Aleksandar Poleksic. All rights reserved. Cancer Vaccines: State of the Art of the Computational Modeling Approaches Sun, 23 Dec 2012 10:50:59 +0000 Cancer vaccines are a real application of the extensive knowledge of immunology to the field of oncology. Tumors are dynamic complex systems in which several entities, events, and conditions interact among them resulting in growth, invasion, and metastases. The immune system includes many cells and molecules that cooperatively act to protect the host organism from foreign agents. Interactions between the immune system and the tumor mass include a huge number of biological factors. 
Testing of some cancer vaccine features, such as the best conditions for vaccine administration or the identification of candidate antigenic stimuli, can be very difficult or even impossible through experiments with biological models alone, simply because a high number of variables need to be considered at the same time. This is where computational models, and, to this extent, immunoinformatics, can prove handy, as they have been shown to reproduce enough biological complexity to be of use in suggesting new experiments. Indeed, computational models can be used in addition to biological models. We now observe that biologists and medical doctors are increasingly convinced that modeling can be of great help in understanding experimental results and planning new experiments. This will boost this research in the future. Francesco Pappalardo, Ferdinando Chiacchio, and Santo Motta Copyright © 2013 Francesco Pappalardo et al. All rights reserved. Three-Dimensional Visualization with Large Data Sets: A Simulation of Spreading Cortical Depression in Human Brain Wed, 31 Oct 2012 13:17:01 +0000 We developed 3D simulation software of human organs/tissues; we developed a database to store the related data, a data management system to manage the created data, and a metadata system for the management of data. This approach provides two benefits. First, the developed system does not need to keep the patient's/subject's medical images on the system, which reduces memory usage. Second, the system provides 3D simulation and modification options, giving clinicians the tools they need for visualization and modification operations. The developed system is tested in a case study, in which a 3D human brain model is created and simulated from 2D MRI images of a human brain, and we extended the 3D model to include the spreading cortical depression (SCD) wave front, which is an electrical phenomenon that is believed to cause migraine. 
Korhan Levent Ertürk and Gökhan Şengül Copyright © 2012 Korhan Levent Ertürk and Gökhan Şengül. All rights reserved. Signal-BNF: A Bayesian Network Fusing Approach to Predict Signal Peptides Mon, 15 Oct 2012 16:08:34 +0000 A signal peptide is a short peptide chain that directs the transport of a protein and has become a crucial vehicle in finding new drugs or reprogramming cells for gene therapy. With the avalanche of new protein sequences generated in the postgenomic era, the challenge of identifying new signal sequences has become even more urgent and critical in biomedical engineering. In this paper, we propose a novel predictor called Signal-BNF to predict the N-terminal signal peptide as well as its cleavage site based on a Bayesian reasoning network. Signal-BNF is formed by fusing the results of different Bayesian classifiers, which use different feature datasets as their input, through a weighted voting system. Experimental results show that Signal-BNF is superior to popular online predictors such as Signal-3L and PrediSi. Signal-BNF features high prediction accuracy and may serve as a useful tool for further investigating many unclear details regarding the molecular mechanism of the zip code protein-sorting system in cells. Zhi Zheng, Youying Chen, Liping Chen, Gongde Guo, Yongxian Fan, and Xiangzeng Kong Copyright © 2012 Zhi Zheng et al. All rights reserved. Erratum to “Unsupervised Two-Way Clustering of Metagenomic Sequences” Wed, 03 Oct 2012 10:17:54 +0000 Shruthi Prabhakara and Raj Acharya Copyright © 2012 Shruthi Prabhakara and Raj Acharya. All rights reserved. Biometrics and Biosecurity Mon, 03 Sep 2012 13:31:21 +0000 Tai-hoon Kim, Sabah Mohammed, Carlos Ramos, Osvaldo Gervasi, Wai-Chi Fang, and Adrian Stoica Copyright © 2012 Tai-hoon Kim et al. All rights reserved. 
A Privacy-Preserved Analytical Method for eHealth Database with Minimized Information Loss Thu, 30 Aug 2012 09:58:45 +0000 Digitizing medical information is an emerging trend that employs information and communication technology (ICT) to manage health records, diagnostic reports, and other medical data more effectively, in order to improve the overall quality of medical services. However, medical information is highly confidential and involves private information, so even legitimate access to data raises privacy concerns. Medical records provide health information on an as-needed basis for diagnosis and treatment, and the information is also important for medical research and other health management applications. Traditional privacy risk management systems have focused on reducing re-identification risk, and they do not consider information loss. In addition, such systems cannot identify and isolate data that carries high risk of privacy violations. This paper proposes the Hiatus Tailor (HT) system, which ensures low re-identification risk for medical records, while providing more authenticated information to database users and identifying high-risk data in the database for better system management. The experimental results demonstrate that the HT system achieves much lower information loss than traditional risk management methods, with the same risk of re-identification. Ya-Ling Chen, Bo-Chao Cheng, Hsueh-Lin Chen, Chia-I Lin, Guo-Tan Liao, Bo-Yu Hou, and Shih-Chun Hsu Copyright © 2012 Ya-Ling Chen et al. All rights reserved. Anatomy of Biometric Passports Sun, 26 Aug 2012 10:04:18 +0000 Travelling is becoming available to more and more people. Millions of people are on the move every day. That is why better control over global human movement and a more reliable identity check are desired. A recent trend in the field of personal identification documents is to use RFID (Radio Frequency Identification) technology and biometrics, especially (but not only) in passports. 
This paper provides an insight into the electronic passports (also called e-passport or ePassport) implementation chosen in the Czech Republic. Such a summary is needed for further studies of biometric passports implementation security and biometric passports analysis. A separate description of the Czech solution is a prerequisite for a planned analysis, because of the uniqueness of each implementation. (Each country can choose the implementation details within a range specified by the ICAO (International Civil Aviation Organisation); moreover, specific security mechanisms are optional and can be omitted). Dominik Malčík and Martin Drahanský Copyright © 2012 Dominik Malčík and Martin Drahanský. All rights reserved. Advanced Pulse Oximetry System for Remote Monitoring and Management Thu, 09 Aug 2012 09:11:00 +0000 Pulse oximetry data such as saturation of peripheral oxygen (SpO2) and pulse rate are vital signals for early diagnosis of heart disease. Therefore, various pulse oximeters have been developed continuously. However, some of the existing pulse oximeters are not equipped with communication capabilities, and consequently, the continuous monitoring of patient health is restricted. Moreover, even though certain oximeters have been built as network models, they focus on exchanging only pulse oximetry data, and they do not provide sufficient device management functions. In this paper, we propose an advanced pulse oximetry system for remote monitoring and management. The system consists of a networked pulse oximeter and a personal monitoring server. The proposed pulse oximeter measures a patient’s pulse oximetry data and transmits the data to the personal monitoring server. The personal monitoring server then analyzes the received data and displays the results to the patient. 
Furthermore, for device management purposes, operational errors that occur in the pulse oximeter are reported to the personal monitoring server, and the system configurations of the pulse oximeter, such as thresholds and measurement targets, are modified by the server. We verify that the proposed pulse oximetry system operates efficiently and that it is appropriate for monitoring and managing a pulse oximeter in real time. Ju Geon Pak and Kee Hyun Park Copyright © 2012 Ju Geon Pak and Kee Hyun Park. All rights reserved. A Collaborative Molecular Modeling Environment Using a Virtual Tunneling Service Wed, 08 Aug 2012 10:25:56 +0000 Collaborative research on three-dimensional molecular modeling can be limited by different time zones and locations. A networked virtual environment can be utilized to overcome the problems caused by these temporal and spatial differences. However, traditional approaches did not sufficiently consider the integration of different computing environments, which are characterized by types of applications, roles of users, and so on. We propose a collaborative molecular modeling environment to integrate different molecular modeling systems using a virtual tunneling service. We integrated Co-Coot, which is a collaborative crystallographic object-oriented toolkit, with VRMMS, which is a virtual reality molecular modeling system, through a collaborative tunneling system. The proposed system showed reliable quantitative and qualitative results through pilot experiments. Jun Lee, Jee-In Kim, and Lin-Woo Kang Copyright © 2012 Jun Lee et al. All rights reserved. A Classification Method of Normal and Overweight Females Based on Facial Features for Automated Medical Applications Sun, 05 Aug 2012 09:12:46 +0000 Obesity and overweight have become serious public health problems worldwide. Obesity and abdominal obesity are associated with type 2 diabetes, cardiovascular diseases, and metabolic syndrome. 
In this paper, we first suggest a method of predicting normal and overweight females according to body mass index (BMI) based on facial features. A total of 688 subjects participated in this study. We obtained the area under the ROC curve (AUC) value of 0.861 and kappa value of 0.521 in Female: 21–40 (females aged 21–40 years) group, and AUC value of 0.76 and kappa value of 0.401 in Female: 41–60 (females aged 41–60 years) group. In two groups, we found many features showing statistical differences between normal and overweight subjects by using an independent two-sample t-test. We demonstrated that it is possible to predict BMI status using facial characteristics. Our results provide useful information for studies of obesity and facial characteristics, and may provide useful clues in the development of applications for alternative diagnosis of obesity in remote healthcare. Bum Ju Lee, Jun-Hyeong Do, and Jong Yeol Kim Copyright © 2012 Bum Ju Lee et al. All rights reserved. A New Method of Diagnosing Constitutional Types Based on Vocal and Facial Features for Personalized Medicine Tue, 31 Jul 2012 13:55:41 +0000 The aim of the present study is to develop an accurate constitution diagnostic method based solely on the individual’s physical characteristics, irrespective of psychologic traits, characteristics of clinical medicine, and genetic factors. In this paper, we suggest a novel method for diagnosing constitutional types using only speech and face characteristics. Based on 514 subjects, the area under the receiver operating characteristics curve (AUC) values of classification models in age and gender groups ranged from 0.64 to 0.89. We identified significant features showing statistical differences among three constitutional types by performing statistical analysis. Also, we selected a compact and discriminative feature subset for constitution diagnosis in each age and gender group. 
Our method may support the direction of improved diagnosis prediction and will serve to develop a personal and automatic constitution diagnosis software for improvement of the effectiveness of prescribed medications and development of personalized medicine. Bum Ju Lee, Boncho Ku, Kihyun Park, Keun Ho Kim, and Jong Yeol Kim Copyright © 2012 Bum Ju Lee et al. All rights reserved. An Integrated Gateway for Various PHDs in U-Healthcare Environments Tue, 31 Jul 2012 13:40:24 +0000 We propose an integrated gateway for various personal health devices (PHDs). This gateway receives measurements from various PHDs and conveys them to a remote monitoring server (MS). It provides two kinds of transmission modes: immediate transmission and integrated transmission. The former mode operates if a measurement exceeds a predetermined threshold or in the case of an emergency. In the latter mode, the gateway retains the measurements instead of forwarding them. When the reporting time comes, the gateway extracts all the stored measurements, integrates them into one message, and transmits the integrated message to the MS. Through this mechanism, the transmission overhead can be reduced. On the basis of the proposed gateway, we construct a u-healthcare system comprising an activity monitor, a medication dispenser, and a pulse oximeter. The evaluation results show that the size of separate messages from various PHDs is reduced through the integration process, and the process does not require much time; the integration time is negligible. KeeHyun Park and JuGeon Pak Copyright © 2012 KeeHyun Park and JuGeon Pak. All rights reserved. Security Analysis and Enhancements of an Effective Biometric-Based Remote User Authentication Scheme Using Smart Cards Tue, 31 Jul 2012 08:42:22 +0000 Recently, many biometrics-based user authentication schemes using smart cards have been proposed to improve the security weaknesses in user authentication system. 
In 2011, Das proposed an efficient biometric-based remote user authentication scheme using smart cards that can provide strong authentication and mutual authentication. In this paper, we analyze the security of Das’s authentication scheme and show that it is still insecure against various attacks. We also propose an enhanced scheme that removes these security problems of Das’s authentication scheme, even if the secret information stored in the smart card is revealed to an attacker. Our security analysis shows that the enhanced scheme is secure against the user impersonation attack, the server masquerading attack, the password guessing attack, and the insider attack and provides mutual authentication between the user and the server. Younghwa An Copyright © 2012 Younghwa An. All rights reserved. A Hybrid Technique for Medical Image Segmentation Mon, 30 Jul 2012 13:14:26 +0000 Medical image segmentation is an essential and challenging aspect in computer-aided diagnosis and also in pattern recognition research. This paper proposes a hybrid method for magnetic resonance (MR) image segmentation. We first remove impulsive noise inherent in MR images by utilizing a vector median filter. Subsequently, Otsu thresholding is used as an initial coarse segmentation method that finds the homogeneous regions of the input image. Finally, an enhanced suppressed fuzzy c-means, which employs an optimal suppression factor for the perfect clustering in the given data set, is used to partition brain MR images into multiple segments. To evaluate the robustness of the proposed approach in a noisy environment, we add different types and amounts of noise to T1-weighted brain MR images. Experimental results show that the proposed algorithm outperforms other FCM based algorithms in terms of segmentation accuracy for both noise-free and noise-inserted MR images. 
Alamgir Nyma, Myeongsu Kang, Yung-Keun Kwon, Cheol-Hong Kim, and Jong-Myon Kim Copyright © 2012 Alamgir Nyma et al. All rights reserved. Construction of a Smart Medication Dispenser with High Degree of Scalability and Remote Manageability Thu, 26 Jul 2012 14:06:03 +0000 We propose a smart medication dispenser having a high degree of scalability and remote manageability. We construct the dispenser to have extensible hardware architecture for achieving scalability, and we install an agent program in it for achieving remote manageability. The dispenser operates as follows: when the real-time clock reaches the predetermined medication time and the user presses the dispense button at that time, the predetermined medication is dispensed from the medication dispensing tray (MDT). In the proposed dispenser, the medication for each patient is stored in an MDT. One smart medication dispenser contains mainly one MDT; however, the dispenser can be extended to include more MDTs in order to support multiple users using one dispenser. For remote management, the proposed dispenser transmits the medication status and the system configurations to the monitoring server. In the case of a specific event such as a shortage of medication, memory overload, software error, or non-adherence, the event is transmitted immediately. All these operations are performed automatically without the intervention of patients, through the agent program installed in the dispenser. Results of implementation and verification show that the proposed dispenser operates normally and performs the management operations from the medication monitoring server suitably. JuGeon Pak and KeeHyun Park Copyright © 2012 JuGeon Pak and KeeHyun Park. All rights reserved. Real-Time Clinical Decision Support System with Data Stream Mining Wed, 18 Jul 2012 15:51:18 +0000 This research aims to describe a new design of data stream mining system that can analyze medical data stream and make real-time prediction. 
The research is motivated by a growing interest in combining software technology and medical functions to develop software applications that can be used in the medical fields of chronic disease prognosis and diagnosis, children's healthcare, diabetes diagnosis, and so forth. Most of the existing software technologies are case-based data mining systems. They can only analyze finite and structured data sets, worked well only in their early years, and can hardly meet today's medical requirements. In this paper, we describe a clinical support system based on data stream mining technology; the design takes into account the shortcomings of the existing clinical support systems. Yang Zhang, Simon Fong, Jinan Fiaidhi, and Sabah Mohammed Copyright © 2012 Yang Zhang et al. All rights reserved. A Survey and Proposed Framework on the Soft Biometrics Technique for Human Identification in Intelligent Video Surveillance System Mon, 16 Jul 2012 14:23:28 +0000 Biometrics verification can be efficiently used for intrusion detection and intruder identification in video surveillance systems. Biometrics techniques can be broadly divided into traditional and so-called soft biometrics. Whereas traditional biometrics deals with physical characteristics such as facial features, the iris, and fingerprints, soft biometrics is concerned with such information as gender, national origin, and height. Traditional biometrics is versatile and highly accurate. However, it is very difficult to obtain traditional biometric data from a distance and without personal cooperation. Soft biometrics, although less accurate, can be used much more freely. Recently, much research has been conducted on human identification using soft biometrics data collected from a distance. In this paper, we use both traditional and soft biometrics for human identification and propose a framework for solving such problems as lighting, occlusion, and shadowing.
Min-Gu Kim, Hae-Min Moon, Yongwha Chung, and Sung Bum Pan Copyright © 2012 Min-Gu Kim et al. All rights reserved. Bayesian Integration of Isotope Ratio for Geographic Sourcing of Castor Beans Sun, 15 Jul 2012 17:34:44 +0000 Recent years have seen an increase in the forensic interest associated with the poison ricin, which is extracted from the seeds of the Ricinus communis plant. Both light element (C, N, O, and H) and strontium (Sr) isotope ratios have previously been used to associate organic material with geographic regions of origin. We present a Bayesian integration methodology that can more accurately predict the region of origin for a castor bean than individual models developed independently for light element stable isotopes or Sr isotope ratios. Our results demonstrate a clear improvement in the ability to correctly classify regions based on the integrated model with a class accuracy of 60.9±2.1% versus 55.9±2.1% and 40.2±1.8% for the light element and strontium (Sr) isotope ratios, respectively. In addition, we show graphically the strengths and weaknesses of each dataset in respect to class prediction and how the integration of these datasets strengthens the overall model. Bobbie-Jo Webb-Robertson, Helen Kreuzer, Garret Hart, James Ehleringer, Jason West, Gary Gill, and Douglas Duckworth Copyright © 2012 Bobbie-Jo Webb-Robertson et al. All rights reserved. Secure Remote Health Monitoring with Unreliable Mobile Devices Sun, 15 Jul 2012 12:34:44 +0000 As the nation’s healthcare information infrastructure continues to evolve, new technologies promise to provide readily accessible health information that can help people address personal and community health concerns. In particular, wearable and implantable medical sensors and portable computing devices present many opportunities for providing timely health information to health providers, public health professionals, and consumers. 
Concerns about privacy and information quality, however, may impede the development and deployment of these technologies for remote health monitoring. Patients may fail to apply sensors correctly, devices can be stolen or compromised (exposing the medical data therein to a malicious party), low-cost sensors controlled by a capable attacker might generate falsified data, and sensor data sent to the server can be captured in the air by an eavesdropper; there are many opportunities for sensitive health data to be lost, forged, or exposed. In this paper, we design a framework for secure remote health-monitoring systems; we build a realistic risk model for sensor-data quality and propose a new health-monitoring architecture that is secure despite the weaknesses of common personal devices. For evaluation, we plan to implement a proof of concept for secure health monitoring. Minho Shin Copyright © 2012 Minho Shin. All rights reserved. Criminal Genomic Pragmatism: Prisoners' Representations of DNA Technology and Biosecurity Mon, 25 Jun 2012 15:55:10 +0000 Background. Within the context of the use of DNA technology in crime investigation, biosecurity is perceived by different stakeholders according to their particular rationalities and interests. Very little is known about prisoners’ perceptions and assessments of the uses of DNA technology in solving crime. Aim. To propose a conceptual model that serves to analyse and interpret prisoners’ representations of DNA technology and biosecurity. Methods. A qualitative study using an interpretative approach based on 31 semi-structured tape-recorded interviews was carried out between May and September 2009, involving male inmates in three prisons located in the north of Portugal. The content analysis focused on the following topics: the meanings attributed to DNA and assessments of the risks and benefits of the uses of DNA technology and databasing in forensic applications. Results.
DNA was described as a record of identity, an exceptional material, and a powerful biometric identifier. The interviewees believed that DNA can be planted to incriminate suspects. Convicted offenders argued for the need to extend the criteria for the inclusion of DNA profiles in forensic databases and to restrict the removal of profiles. Conclusions. The conceptual model entitled criminal genomic pragmatism allows for an understanding of the views of prison inmates regarding DNA technology and biosecurity. Helena Machado and Susana Silva Copyright © 2012 Helena Machado and Susana Silva. All rights reserved. Comparison of Two Suspension Arrays for Simultaneous Detection of Five Biothreat Bacterial in Powder Samples Tue, 29 May 2012 08:06:50 +0000 We have developed novel Bio-Plex assays for simultaneous detection of Bacillus anthracis, Yersinia pestis, Brucella spp., Francisella tularensis, and Burkholderia pseudomallei. In one assay, universal primers were used to amplify a highly conserved region located within the 16S rRNA amplicon, followed by hybridization to pathogen-specific probes for identification of these five organisms. The other assay is based on multiplex PCR to simultaneously amplify five species-specific pathogen identification-targeted regions, each unique to an individual pathogen. Both arrays were validated to be flexible and sensitive for simultaneous detection of bioterrorism bacteria. However, the universal primer PCR-based array could not identify Bacillus anthracis, Yersinia pestis, and Brucella spp. at the species level because of the high conservation of 16S rDNA within the same genus. The two suspension arrays can be utilized to detect Bacillus anthracis Sterne spores and Yersinia pestis EV76 in mimic “white powder” samples, which also shows that the suspension array system will be a valuable tool for diagnosis of bacterial biothreat agents in environmental samples. Yu Yang, Jing Wang, Haiyan Wen, and Hengchuan Liu Copyright © 2012 Yu Yang et al.
All rights reserved. Advanced Computational Methods in Molecular Medicine Sun, 20 May 2012 15:26:46 +0000 Alejandro Giorgetti, Paolo Ruggerone, Sergio Pantano, and Paolo Carloni Copyright © 2012 Alejandro Giorgetti et al. All rights reserved. Finger Vein Recognition Based on (2D)2 PCA and Metric Learning Sun, 20 May 2012 10:21:57 +0000 Finger vein recognition is a promising biometric recognition technology, which verifies identities via the vein patterns in the fingers. In this paper, (2D)2 PCA is applied to extract features of finger veins, based on which a new recognition method is proposed in conjunction with metric learning. It learns a KNN classifier for each individual, which is different from the traditional methods where a fixed threshold is employed for all individuals. Besides, the SMOTE technology is adopted to solve the class-imbalance problem. Our experiments show that the proposed method is effective by achieving a recognition rate of 99.17%. Gongping Yang, Xiaoming Xi, and Yilong Yin Copyright © 2012 Gongping Yang et al. All rights reserved. Influence of Skin Diseases on Fingerprint Recognition Thu, 10 May 2012 09:07:53 +0000 There are many people who suffer from some of the skin diseases. These diseases have a strong influence on the process of fingerprint recognition. People with fingerprint diseases are unable to use fingerprint scanners, which is discriminating for them, since they are not allowed to use their fingerprints for the authentication purposes. First in this paper the various diseases, which might influence functionality of the fingerprint-based systems, are introduced, mainly from the medical point of view. This overview is followed by some examples of diseased finger fingerprints, acquired both from dactyloscopic card and electronic sensors. At the end of this paper the proposed fingerprint image enhancement algorithm is described. 
Martin Drahansky, Michal Dolezel, Jaroslav Urbanek, Eva Brezinova, and Tai-hoon Kim Copyright © 2012 Martin Drahansky et al. All rights reserved. Using Hierarchical Time Series Clustering Algorithm and Wavelet Classifier for Biometric Voice Classification Thu, 26 Apr 2012 11:42:16 +0000 Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. Another application, voice classification, which plays an important role in grouping unlabelled voice samples, has not been widely studied, however. Lately, voice classification has been found useful in phone monitoring, classifying speakers’ gender, ethnicity, and emotional states, and so forth. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms are a combination of hierarchical clustering, dynamic time warping, discrete wavelet transform, and decision tree. The proposed algorithms are relatively more transparent and interpretable than the existing ones, though many techniques such as Artificial Neural Networks, Support Vector Machine, and Hidden Markov Model (which inherently function like a black box) have been applied for voice verification and voice identification. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of our proposed voice classification algorithm. Simon Fong Copyright © 2012 Simon Fong. All rights reserved. Facing Current Quantification Challenges in Protein Microarrays Tue, 24 Apr 2012 08:02:28 +0000 The proteome is highly variable and differs from cell to cell. The reasons are posttranslational modifications, splice variants, and polymorphisms. Techniques like next-generation sequencing can only give an inadequate picture of the protein status of a cell.
Protein microarrays are able to track these changes on the level they occur: the proteomic level. Therefore, protein microarrays are powerful tools for relative protein quantification, for unveiling new interaction partners, and for tracking posttranslational modifications. This paper gives an overview of current protein microarray techniques and discusses recent advances in relative protein quantification. Robert Wellhausen and Harald Seitz Copyright © 2012 Robert Wellhausen and Harald Seitz. All rights reserved. An Integrative Approach to Infer Regulation Programs in a Transcription Regulatory Module Network Wed, 11 Apr 2012 15:06:11 +0000 The module network method, a special type of Bayesian network algorithm, has been proposed to infer transcription regulatory networks from gene expression data. In this method, a module represents a set of genes that have similar expression profiles and are regulated by the same transcription factors. The process of learning module networks consists of two steps: first clustering genes into modules and then inferring the regulation program (transcription factors) of each module. Many algorithms have been designed to infer the regulation program of a given gene module, and these algorithms show very different biases in detecting regulatory relationships. In this work, we explore the possibility of integrating results from different algorithms. The integration methods we select are union, intersection, and weighted rank aggregation. Experiments on a yeast dataset show that the union and weighted rank aggregation methods produce more accurate predictions than those given by individual algorithms, whereas the intersection method does not yield any improvement in the accuracy of predictions. In addition, somewhat surprisingly, the union method, which has a lower computational cost than rank aggregation, achieves results comparable to those of rank aggregation. Jianlong Qi, Tom Michoel, and Gregory Butler Copyright © 2012 Jianlong Qi et al.
All rights reserved. FGF Receptor-Mediated Gene Delivery Using Ligands Coupled to PEI-β-CyD Wed, 11 Apr 2012 14:05:04 +0000 A novel vector with high gene delivery efficiency and special cell-targeting ability was developed using a good strategy that utilized low-molecular-weight polyethylenimine (PEI; molecular weight: 600 KDa [PEI600]) crosslinked to β-cyclodextrin (β-CyD) via a facile synthetic route. Fibroblast growth factor receptors (FGFRs) are highly expressed in a variety of human cancer cells and are potential targets for cancer therapy. In this paper, CY11 peptides, which have been proven to combine especially with FGFRs on cell membranes were coupled to PEI-β-CyD using N-succinimidyl-3-(2-pyridyldithio) propionate as a linker. The ratios of PEI600, β-CyD, and peptide were calculated based on proton integral values obtained from the 1H-NMR spectra of the resulting products. Electron microscope observations showed that CY11-PEI-β-CyD can efficiently condense plasmid DNA (pDNA) into nanoparticles of about 200 nm, and MTT assays suggested the decreased toxicity of the polymer. Experiments on gene delivery efficiency in vitro showed that CY11-PEI-β-CyD/pDNA polyplexes had significantly greater transgene activities than PEI-β-CyD/pDNA in the COS-7 and HepG2 cells, which positively expressed FGFR, whereas no such effect was observed in the PC-3 cells, which negatively expressed FGFR. Our current research indicated that the synthesized nonviral vector shows improved gene delivery efficiency and targeting specificity in FGFR-positive cells. Yiping Hu, Guping Tang, Jun Liu, Wenxiang Cheng, Ye Yue, Jinchao Li, and Peng Zhang Copyright © 2012 Yiping Hu et al. All rights reserved. 
Application of Serial Analysis of Gene Expression to the Study of the Gene Expression Profile of Leishmania infantum chagasi Promastigote Tue, 10 Apr 2012 17:30:10 +0000 This study describes the application of the LongSAGE methodology to study the gene expression profile in promastigotes of Leishmania infantum chagasi. A tag library was created using the LongSAGE method and consisted of 14,208 tags of 17 bases. Of these, 8,427 (59.3%) were distinct. BLAST searches of the 1,645 most abundant tags showed that 12.8% of them identified the coding sequences of genes, while 82% (1,349/1,645) identified one or more genomic sequences that did not correspond to open reading frames. Only 5.2% (84/1,645) of the tags were not aligned to any position in the L. infantum genome. The UTR size of Leishmania and the lack of CATG sites in some transcripts were decisive for the generation of tags in these regions. Additional analysis will allow a better understanding of the expression profile and the discovery of the key genes in this life cycle. Adelino Soares Lima Neto, Osvaldo Pompílio de Melo Neto, and Carlos Henrique Nery Costa Copyright © 2012 Adelino Soares Lima Neto et al. All rights reserved. Unsupervised Two-Way Clustering of Metagenomic Sequences Thu, 05 Apr 2012 10:30:15 +0000 A major challenge facing metagenomics is the development of tools for the characterization of functional and taxonomic content of vast amounts of short metagenome reads. The efficacy of clustering methods depends on the number of reads in the dataset, the read length, and the relative abundances of source genomes in the microbial community. In this paper, we formulate an unsupervised naive Bayes multispecies, multidimensional mixture model for reads from a metagenome. We use the proposed model to cluster metagenomic reads by their species of origin and to characterize the abundance of each species.
We model the distribution of word counts along a genome as a Gaussian for shorter, frequent words and as a Poisson for longer words that are rare. We employ either a mixture of Gaussians or mixture of Poissons to model reads within each bin. Further, we handle the high-dimensionality and sparsity associated with the data, by grouping the set of words comprising the reads, resulting in a two-way mixture model. Finally, we demonstrate the accuracy and applicability of this method on simulated and real metagenomes. Our method can accurately cluster reads as short as 100 bps and is robust to varying abundances, divergences and read lengths. Shruthi Prabhakara and Raj Acharya Copyright © 2012 Shruthi Prabhakara and Raj Acharya. All rights reserved. MarVis-Filter: Ranking, Filtering, Adduct and Isotope Correction of Mass Spectrometry Data Thu, 05 Apr 2012 10:28:56 +0000 Statistical ranking, filtering, adduct detection, isotope correction, and molecular formula calculation are essential tasks in processing mass spectrometry data in metabolomics studies. In order to obtain high-quality data sets, a framework which incorporates all these methods is required. We present the MarVis-Filter software, which provides well-established and specialized methods for processing mass spectrometry data. For the task of ranking and filtering multivariate intensity profiles, MarVis-Filter provides the ANOVA and Kruskal-Wallis tests with adjustment for multiple hypothesis testing. Adduct and isotope correction are based on a novel algorithm which takes the similarity of intensity profiles into account and allows user-defined ionization rules. The molecular formula calculation utilizes the results of the adduct and isotope correction. For a comprehensive analysis, MarVis-Filter provides an interactive interface to combine data sets deriving from positive and negative ionization mode. 
As an example, the software is applied in a metabolic case study, where octadecanoids could be identified as markers for wounding in plants. Alexander Kaever, Manuel Landesfeind, Mareike Possienke, Kirstin Feussner, Ivo Feussner, and Peter Meinicke Copyright © 2012 Alexander Kaever et al. All rights reserved. Using Medical History Embedded in Biometrics Medical Card for User Identity Authentication: Privacy Preserving Authentication Model by Features Matching Thu, 05 Apr 2012 08:37:35 +0000 Many forms of biometrics have been proposed and studied for biometrics authentication. Recently, researchers have been looking into longitudinal pattern matching that is based on more than a single biometric; data from a user’s activities are used to characterise the identity of the user. In this paper we advocate a novel type of authentication that uses a user’s medical history, which can be electronically stored in a biometric security card. This paper is a sequel to our previous work on defining an abstract format for medical data to be queried and tested upon authentication. The challenge to overcome is preserving the user’s privacy by choosing only the useful features from the medical data for use in authentication. The features should contain less sensitive elements and be implicitly related to the target illness. Exchanging questions and answers about a few carefully chosen features in an open channel would therefore not easily or directly expose the illness, yet it can verify by inference whether the user has a record of it stored in his smart card. The design of a privacy preserving model by backward inference is introduced in this paper. Some live medical data are used in experiments for validation and demonstration. Simon Fong and Yan Zhuang Copyright © 2012 Simon Fong and Yan Zhuang. All rights reserved.
Using Medical History Embedded in Biometrics Medical Card for User Identity Authentication: Data Representation by AVT Hierarchical Data Tree Wed, 04 Apr 2012 14:07:51 +0000 User authentication has been widely used by biometric applications that work on unique bodily features, such as fingerprints, retina scans, and palm vessel recognition. This paper proposes a novel concept of biometric authentication by exploiting a user’s medical history. Although medical history may not be absolutely unique to every individual person, the chances of two persons sharing an exactly identical trail of medical and prognosis history are slim. Therefore, in addition to common biometric identification methods, medical history can be used as an ingredient for generating Q&A challenges upon user authentication. This concept is motivated by a recent advancement in smart-card technology that enables future identity cards to carry patients’ medical history like a mobile database. Privacy, however, may be a concern when medical history is used for authentication. Therefore, in this paper, a new method is proposed for abstracting the medical data, by using attribute value taxonomies, into a hierarchical data tree (h-Data). Questions can be abstracted to various levels of resolution (and hence sensitivity of private data) for use in the authentication process. The method is described and a case study is given in this paper. Simon Fong and Yan Zhuang Copyright © 2012 Simon Fong and Yan Zhuang. All rights reserved. Combined QM/MM Study of Thyroid and Steroid Hormone Analogue Interactions with 𝛼v𝛽3 Integrin Mon, 02 Apr 2012 09:57:23 +0000 Recent biochemical studies have identified a cell surface receptor for thyroid and steroid hormones that bind near the arginine-glycine-aspartate (RGD) recognition site on the heterodimeric αvβ3 integrin.
To further characterize the intermolecular interactions for a series of hormone analogues, combined quantum mechanical and molecular mechanical (QM/MM) methods were used to calculate their interaction energies. All calculations were performed in the presence of either calcium (Ca2+) or magnesium (Mg2+) ions. These data reveal that 3,5,3′-triiodothyronine (T3) and 3,5,3′,5′-tetraiodothyroacetic acid (T4ac) bound in two different modes, occupying two alternate sites, one of which is along the Arg side chain of the RGD cyclic peptide site. These orientations differ from those of the other ligands whose alternate binding modes placed the ligands deeper within the RGD binding pocket. These observations are consistent with biological data that indicate the presence of two discrete binding sites that control distinct downstream signal transduction pathways for T3. Marek Freindorf, Thomas R. Furlani, Jing Kong, Vivian Cody, Faith B. Davis, and Paul J. Davis Copyright © 2012 Marek Freindorf et al. All rights reserved. Biological Applications of Hybrid Quantum Mechanics/Molecular Mechanics Calculation Wed, 28 Mar 2012 11:16:21 +0000 Since in most cases biological macromolecular systems including solvent water molecules are remarkably large, the computational costs of performing ab initio calculations for the entire structures are prohibitive. Accordingly, QM calculations that are joined with MM calculations are crucial to evaluate the long-range electrostatic interactions, which significantly affect the electronic structures of biological macromolecules. A UNIX-shell-based interface program connecting the quantum mechanics (QM) and molecular mechanics (MM) calculation engines, GAMESS and AMBER, was developed in our lab. The system was applied to a metalloenzyme, azurin, and a PU.1-DNA complex; thereby, the significance of the environmental effects on the electronic structures of the site of interest was elucidated.
Subsequently, hybrid QM/MM molecular dynamics (MD) simulation using the calculation system was employed for investigation of mechanisms of hydrolysis (editing reaction) in leucyl-tRNA synthetase complexed with the misaminoacylated tRNALeu, and a novel mechanism of the enzymatic reaction was revealed. Thus, our interface program can play a critical role as a powerful tool for state-of-the-art sophisticated hybrid ab initio QM/MM MD simulations of large systems, such as biological macromolecules. Jiyoung Kang, Yohsuke Hagiwara, and Masaru Tateno Copyright © 2012 Jiyoung Kang et al. All rights reserved. Antilisterial Activity of Nisin-Like Bacteriocin-Producing Lactococcus lactis subsp. lactis Isolated from Traditional Sardinian Dairy Products Tue, 27 Mar 2012 14:36:36 +0000 With the aim of selecting LAB strains with antilisterial activity to be used as protective cultures to enhance the safety of dairy products, the antimicrobial properties of 117 Lactococcus lactis subsp. lactis isolated from artisanal Sardinian dairy products were evaluated, and six strains were found to produce bacteriocin-like substances. The capacity of these strains to antagonize Listeria monocytogenes during cocultivation in skimmed milk was evaluated, showing a reduction of L. monocytogenes counts of approximately 4 log units compared to the positive control after 24 h of incubation. In order for a strain to be used as bioprotective culture, it should be carefully evaluated for the presence of virulence factors, to determine what potential risks might be involved in its use. None of the strains tested was found to produce biogenic amines or to possess haemolytic activity. In addition, all strains were sensitive to clinically important antibiotics such as ampicillin, tetracycline, and vancomycin. Our results suggest that these bac+ strains could be potentially applied in cheese manufacturing to control the growth of L. monocytogenes. 
Sofia Cosentino, Maria Elisabetta Fadda, Maura Deplano, Roberta Melis, Rita Pomata, and Maria Barbara Pisano Copyright © 2012 Sofia Cosentino et al. All rights reserved. Investigation of Antimicrobial Activity and Statistical Optimization of Bacillus subtilis SPB1 Biosurfactant Production in Solid-State Fermentation Sat, 24 Mar 2012 20:07:09 +0000 In recent years, several applications of biosurfactants with medical purposes have been reported. Biosurfactants are considered relevant molecules for applications in combating many diseases. However, their use is currently extremely limited due to their high cost in relation to that of chemical surfactants. Use of inexpensive substrates can drastically decrease their production cost. Here, twelve solid substrates were screened for the production of Bacillus subtilis SPB1 biosurfactant and the maximum yield was found with millet. A Plackett-Burman design was then used to evaluate the effects of five variables (temperature, moisture, initial pH, inoculum age, and inoculum size). Statistical analyses showed that temperature, inoculum age, and moisture content had a significantly positive effect on SPB1 biosurfactant production. Their values were further optimized using a central composite design and a response surface methodology. The optimal conditions of temperature, inoculum age, and moisture content obtained under the conditions of study were 37°C, 14 h, and 88%, respectively. The evaluation of the antimicrobial activity of this compound was carried out against 11 bacteria and 8 fungi. The results demonstrated that this biosurfactant exhibited an important antimicrobial activity against microorganisms with multidrug-resistant profiles. Its activity was very effective against Staphylococcus aureus, Staphylococcus xylosus, Enterococcus faecalis, Klebsiella pneumoniae, and so forth.
Dhouha Ghribi, Lobna Abdelkefi-Mesrati, Ines Mnif, Radhouan Kammoun, Imen Ayadi, Imen Saadaoui, Sameh Maktouf, and Semia Chaabouni-Ellouze Copyright © 2012 Dhouha Ghribi et al. All rights reserved. Computer-Based Annotation of Putative AraC/XylS-Family Transcription Factors of Known Structure but Unknown Function Tue, 13 Mar 2012 10:30:26 +0000 Currently, about 20 crystal structures per day are released and deposited in the Protein Data Bank. A significant fraction of these structures is produced by research groups associated with the structural genomics consortium. The biological function of many of these proteins is generally unknown or not validated by experiment. Therefore, a growing need for functional prediction of protein structures has emerged. Here we present an integrated bioinformatics method that combines sequence-based relationships and three-dimensional (3D) structural similarity of transcriptional regulators with computer prediction of their cognate DNA binding sequences. We applied this method to the AraC/XylS family of transcription factors, which is a large family of transcriptional regulators found in many bacteria controlling the expression of genes involved in diverse biological functions. Three putative new members of this family with known 3D structure but unknown function were identified for which a probable functional classification is provided. Our bioinformatics analyses suggest that they could be involved in plant cell wall degradation (Lin2118 protein from Listeria innocua, PDB code 3oou), symbiotic nitrogen fixation (protein from Chromobacterium violaceum, PDB code 3oio), and either metabolism of plant-derived biomass or nitrogen fixation (protein from Rhodopseudomonas palustris, PDB code 3mn2). Andreas Schüller, Alex W. Slater, Tomás Norambuena, Juan J. Cifuentes, Leonardo I. Almonacid, and Francisco Melo Copyright © 2012 Andreas Schüller et al. All rights reserved. 
Structural Insights into Interaction between Mammalian Methionine Sulfoxide Reductase B1 and Thioredoxin Tue, 13 Mar 2012 10:21:28 +0000 Maintenance of the cellular redox balance is vitally important for correct organism functioning. Methionine sulfoxide reductases (Msrs) are among the key members of the cellular antioxidant defence system. To work properly, methionine sulfoxide reductases need to be reduced by their biological partner, thioredoxin (Trx). This process, according to the available kinetic data, represents the slowest step in the Msrs catalytic cycle. In the present paper, we investigated structural aspects of the intermolecular complex formation between mammalian MsrB1 and Trx. NMR spectroscopy and biocomputing were the two approaches most used throughout the research. The formation of an NMR-detectable MsrB1/Trx complex was monitored and studied in an attempt to understand the MsrB1 reduction mechanism. Using NMR data, molecular mechanics, protein docking, and molecular dynamics simulations, it was found that the intermediate MsrB1/Trx complex is stabilized by an interprotein β-layer. Complex formation, accompanied by distortion of the disulfide bond within MsrB1, facilitates the reduction of oxidized MsrB1, as evidenced by the obtained data. Olena Dobrovolska, Georgy Rychkov, Elena Shumilina, Kirill Nerinovski, Alexander Schmidt, Konstantin Shabalin, Alexander Yakimov, and Alexander Dikiy Copyright © 2012 Olena Dobrovolska et al. All rights reserved. Artificial Neural Network for the Prediction of Tyrosine-Based Sorting Signal Recognition by Adaptor Complexes Sun, 11 Mar 2012 09:20:29 +0000 Sorting of transmembrane proteins to various intracellular compartments depends on specific signals present within their cytosolic domains. Among these sorting signals, the tyrosine-based motif (YXXØ) is one of the best characterized and is recognized by μ-subunits of the four clathrin-associated adaptor complexes (AP-1 to AP-4).
Despite their overlap in specificity, each μ-subunit has a distinct sequence preference dependent on the nature of the X-residues. Moreover, combinations of these residues exert cooperative or inhibitory effects towards interaction with the various APs. This complexity makes it impossible to predict a priori the specificity of a given tyrosine-signal for a particular μ-subunit. Here, we describe the results obtained with a computational approach based on the Artificial Neural Network (ANN) paradigm that addresses the issue of tyrosine-signal specificity, enabling the prediction of YXXØ-μ interactions with accuracies over 90%. Therefore, this approach constitutes a powerful tool to help predict mechanisms of intracellular protein sorting. Debarati Mukherjee, Claudia B. Hanna, and R. Claudio Aguilar Copyright © 2012 Debarati Mukherjee et al. All rights reserved. Molecular Modeling of the M3 Acetylcholine Muscarinic Receptor and Its Binding Site Mon, 27 Feb 2012 12:42:20 +0000 The present study reports the results of a combined computational and site mutagenesis study designed to provide new insights into the orthosteric binding site of the human M3 muscarinic acetylcholine receptor. For this purpose, a three-dimensional structure of the receptor at atomic resolution was built by homology modeling, using the crystallographic structure of bovine rhodopsin as a template. Then, the antagonist N-methylscopolamine was docked in the model and subsequently embedded in a lipid bilayer for its refinement using molecular dynamics simulations. Two different lipid bilayer compositions were studied: one-component palmitoyl-oleyl phosphatidylcholine (POPC) and two-component palmitoyl-oleyl phosphatidylcholine/palmitoyl-oleyl phosphatidylserine (POPC-POPS). Analysis of the results suggested that residues F222 and T235 may contribute to the ligand-receptor recognition.
Accordingly, alanine mutants at positions 222 and 235 were constructed and expressed, and their binding properties were determined. The results confirmed the role of these residues in modulating the binding affinity of the ligand. Marlet Martinez-Archundia, Arnau Cordomi, Pere Garriga, and Juan J. Perez Copyright © 2012 Marlet Martinez-Archundia et al. All rights reserved. Studying Interactions by Molecular Dynamics Simulations at High Concentration Wed, 22 Feb 2012 19:10:32 +0000 Molecular dynamics simulations have been used to study molecular encounters and recognition. In recent works, simulations using high concentrations of interacting molecules have been performed. In this paper, we consider the practical problems of setting up such a simulation and analysing its results. The simulation of beta 2-microglobulin association and the simulation of the binding of hydrogen peroxide by glutathione peroxidase are provided as examples. Federico Fogolari, Alessandra Corazza, Stefano Toppo, Silvio C. E. Tosatto, Paolo Viglino, Fulvio Ursini, and Gennaro Esposito Copyright © 2012 Federico Fogolari et al. All rights reserved. Predicting Protein Interactions by Brownian Dynamics Simulations Wed, 15 Feb 2012 13:15:16 +0000 We present a newly adapted Brownian-Dynamics (BD)-based protein docking method for predicting native protein complexes. The approach includes global BD conformational sampling, compact complex selection, and local energy minimization. In order to reduce the computational costs for energy evaluations, a shell-based grid force field was developed to represent the receptor protein and solvation effects. The performance of this BD protein docking approach has been evaluated on a test set of 24 crystal protein complexes. Reproduction of experimental structures in the test set indicates adequate conformational sampling and accurate scoring by this BD protein docking approach.
Furthermore, we have developed an approach to account for the flexibility of proteins, which has been successfully applied to reproduce the experimental complex structure from the structures of two unbound proteins. These results indicate that this adapted BD protein docking approach can be useful for the prediction of protein-protein interactions. Xuan-Yu Meng, Yu Xu, Hong-Xing Zhang, Mihaly Mezei, and Meng Cui Copyright © 2012 Xuan-Yu Meng et al. All rights reserved. Synergistic Applications of MD and NMR for the Study of Biological Systems Thu, 26 Jan 2012 11:18:26 +0000 Modern biological sciences are becoming more and more multidisciplinary. At the same time, theoretical and computational approaches gain in reliability and their field of application widens. In this short paper, we discuss recent advances in the areas of solution nuclear magnetic resonance (NMR) spectroscopy and molecular dynamics (MD) simulations that were made possible by the combination of both methods, that is, through their synergistic use. We present the main NMR observables and parameters that can be computed from simulations, and how they are used in a variety of complementary applications, including dynamics studies, model-free analysis, force field validation, and structural studies. Olivier Fisette, Patrick Lagüe, Stéphane Gagné, and Sébastien Morin Copyright © 2012 Olivier Fisette et al. All rights reserved. Robust Design of Biological Circuits: Evolutionary Systems Biology Approach Wed, 07 Dec 2011 12:00:26 +0000 Artificial gene circuits that function as switches, timers, oscillators, and Boolean logic gates have been proposed for embedding into microbial cells. Building more complex systems from these basic gene circuit components is one key advance for biologic circuit design and synthetic biology. However, the behavior of bioengineered gene circuits remains unstable and uncertain.
In this study, a nonlinear stochastic system is proposed to model the biological systems with intrinsic parameter fluctuations and environmental molecular noise from the cellular context in the host cell. Based on an evolutionary systems biology algorithm, the design parameters of target gene circuits can evolve to specific values in order to robustly track a desired biologic function in spite of intrinsic and environmental noise. The fitness function is selected to be inversely proportional to the tracking error so that the evolutionary biological circuit can achieve optimal tracking, mimicking the evolutionary process of a gene circuit. Finally, several design examples are given in silico with the Monte Carlo simulation to illustrate the design procedure and to confirm the robust performance of the proposed design method. The results show that the designed gene circuits can robustly track desired behaviors with minimal errors even with nontrivial intrinsic and external noise. Bor-Sen Chen, Chih-Yuan Hsu, and Jing-Jia Liou Copyright © 2011 Bor-Sen Chen et al. All rights reserved. HKC: An Algorithm to Predict Protein Complexes in Protein-Protein Interaction Networks Sat, 26 Nov 2011 15:50:59 +0000 With the availability of more and more genome-scale protein-protein interaction (PPI) networks, research interest is gradually shifting to systematic analysis of these large data sets. A key topic is to predict protein complexes in PPI networks by identifying clusters that are densely connected within themselves but sparsely connected with the rest of the network. In this paper, we present a new topology-based algorithm, HKC, to detect protein complexes in genome-scale PPI networks. HKC mainly uses the concepts of highest k-core and cohesion to predict protein complexes by identifying overlapping clusters. The experiments on two data sets and two benchmarks show that our algorithm has a relatively high F-measure and exhibits better performance compared with some other methods.
Xiaomin Wang, Zhengzhi Wang, and Jun Ye Copyright © 2011 Xiaomin Wang et al. All rights reserved.
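The "highest k-core" concept mentioned in the HKC abstract can be illustrated with a small sketch. This is not the HKC algorithm itself (the abstract does not give its details, and cohesion and overlapping clusters are not modeled here); it only shows the standard peeling procedure for core numbers on an unweighted, undirected graph. The function names and the toy edge list are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected graph given as edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree.
        node = min(remaining, key=lambda n: degree[n])
        # The core number never decreases along the peeling order.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


def highest_k_core(edges):
    """Return (k_max, nodes) for the k-core with the largest k."""
    core = core_numbers(edges)
    if not core:
        return 0, set()
    k_max = max(core.values())
    return k_max, {node for node, c in core.items() if c == k_max}


# Toy interaction network (hypothetical): a 4-clique A-B-C-D with a tail D-E-F.
toy_edges = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"),
]
k_max, members = highest_k_core(toy_edges)
print(k_max, sorted(members))
```

In the toy network the tail nodes E and F are peeled first, leaving the clique as the densest core. The `min`-based peeling is quadratic in the number of nodes, which is fine for a sketch; a bucket-based implementation would be preferable for genome-scale networks.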