BioMed Research International: Bioinformatics http://www.hindawi.com The latest articles from Hindawi Publishing Corporation © 2014, Hindawi Publishing Corporation. All rights reserved. The Mcm2-7 Replicative Helicase: A Promising Chemotherapeutic Target Thu, 28 Aug 2014 15:15:54 +0000 http://www.hindawi.com/journals/bmri/2014/549719/ Numerous eukaryotic replication factors have served as chemotherapeutic targets. One replication factor that has largely escaped drug development is the Mcm2-7 replicative helicase. This heterohexameric complex forms the licensing system that assembles the replication machinery at origins during initiation, as well as the catalytic core of the CMG (Cdc45-Mcm2-7-GINS) helicase that unwinds DNA during elongation. Emerging evidence suggests that Mcm2-7 is also part of the replication checkpoint, a quality control system that monitors and responds to DNA damage. As the only replication factor required for both licensing and DNA unwinding, Mcm2-7 is a major cellular regulatory target with likely cancer relevance. Mutations in at least one of the six MCM genes are particularly prevalent in squamous cell carcinomas of the lung, head and neck, and prostate, and MCM mutations have been shown to cause cancer in mouse models. Moreover, various cellular regulatory proteins, including the Rb tumor suppressor family members, bind Mcm2-7 and inhibit its activity. As a preliminary step toward drug development, several small molecule inhibitors that target Mcm2-7 have been recently discovered. Both its structural complexity and essential role at the interface between DNA replication and its regulation make Mcm2-7 a potential chemotherapeutic target. Nicholas E. Simon and Anthony Schwacha Copyright © 2014 Nicholas E. Simon and Anthony Schwacha. All rights reserved. 
Crystal Structure of a Conserved Hypothetical Protein MJ0927 from Methanocaldococcus jannaschii Reveals a Novel Quaternary Assembly in the Nif3 Family Thu, 28 Aug 2014 15:06:43 +0000 http://www.hindawi.com/journals/bmri/2014/171263/ A Nif3 family protein of Methanocaldococcus jannaschii, MJ0927, is highly conserved from bacteria to humans. Although several structures of bacterial Nif3 proteins are known, no structure representing archaeal Nif3 has yet been reported. The crystal structure of Methanocaldococcus jannaschii MJ0927 was determined at 2.47 Å resolution to understand the structural differences between the bacterial and archaeal Nif3 proteins. Intriguingly, MJ0927 is found to adopt an unusual assembly comprising a trimer of dimers that forms a cage-like architecture. Electrophoretic mobility-shift assays indicate that MJ0927 binds to both single-stranded and double-stranded DNA. Structural analysis of MJ0927 reveals a positively charged region that can potentially explain its DNA-binding capability. Taken together, these data suggest that MJ0927 adopts a novel quaternary architecture that could play various DNA-binding roles in Methanocaldococcus jannaschii. Sheng-Chia Chen, Chi-Hung Huang, Chia Shin Yang, Shu-Min Kuan, Ching-Ting Lin, Shan-Ho Chou, and Yeh Chen Copyright © 2014 Sheng-Chia Chen et al. All rights reserved. Relationship between CCR and NT-proBNP in Chinese HF Patients, and Their Correlations with Severity of HF Thu, 28 Aug 2014 09:42:10 +0000 http://www.hindawi.com/journals/bmri/2014/106252/ Aim. To evaluate the relationship between creatinine clearance rate (CCR) and the level of N-terminal pro-B-type natriuretic peptide (NT-proBNP) in heart failure (HF) patients and their correlations with HF severity. Methods and Results. Two hundred and one Chinese patients were grouped according to the New York Heart Association (NYHA) classification into NYHA 1-2 and NYHA 3-4 groups, and 135 subjects without heart failure served as the control group. 
The following variables were compared among these three groups: age, sex, body mass index (BMI), smoking status, hypertension, diabetes, NT-proBNP, creatinine (Cr), uric acid (UA), left ventricular end-diastolic diameter (LVEDD), and CCR. NT-proBNP, Cr, UA, LVEDD, and CCR varied significantly among the three groups, and these variables were positively correlated with the NYHA classification. The levels of NT-proBNP and CCR were closely related to the occurrence of HF and were independent risk factors for HF. At the same time, there was a significant negative correlation between the levels of NT-proBNP and CCR. The area under the receiver operating characteristic curve suggested that NT-proBNP and CCR have high accuracy for diagnosis of HF and have clinical diagnostic value. Conclusion. NT-proBNP and CCR may be important biomarkers in evaluating the severity of HF. Zhigang Lu, Bo Wang, Yunliang Wang, Xueqing Qian, Wei Zheng, and Meng Wei Copyright © 2014 Zhigang Lu et al. All rights reserved. Establishing Standards for Studying Renal Function in Mice through Measurements of Body Size-Adjusted Creatinine and Urea Levels Wed, 27 Aug 2014 12:35:10 +0000 http://www.hindawi.com/journals/bmri/2014/872827/ Strategies for obtaining reliable results are increasingly implemented in order to reduce errors in the analysis of human and veterinary samples; however, further data are required for murine samples. Here, we determined an average factor from the murine body surface area for the calculation of biochemical renal parameters, assessed the effects of storage and freeze-thawing of C57BL/6 mouse samples on plasma and urinary urea, and evaluated the effects of using two different urea-measurement techniques. After obtaining 24 h urine samples, blood was collected, and body weight and length were established. The samples were evaluated after collection or stored at −20°C and −70°C. 
At different time points (0, 4, and 90 days), these samples were thawed, the creatinine and/or urea concentrations were analyzed, and the samples were returned to storage at these temperatures for further measurements. We show that creatinine clearance measurements should be adjusted according to the body surface area, which was calculated based on the weight and length of the animal. Repeated freeze-thawing cycles negatively affected the urea concentration; the urea concentration was more reproducible when using the modified Berthelot reaction rather than the ultraviolet method. Our findings will facilitate standardization and optimization of methodology as well as understanding of renal and other biochemical data obtained from mice. Wellington Francisco Rodrigues, Camila Botelho Miguel, Marcelo Henrique Napimoga, Carlo Jose Freire Oliveira, and Javier Emilio Lazo-Chica Copyright © 2014 Wellington Francisco Rodrigues et al. All rights reserved. Identification and Analysis of Driver Missense Mutations Using Rotation Forest with Feature Selection Wed, 27 Aug 2014 12:02:00 +0000 http://www.hindawi.com/journals/bmri/2014/905951/ Identifying cancer-associated mutations (driver mutations) is critical for understanding the cellular function of the cancer genome that leads to activation of oncogenes or inactivation of tumor suppressor genes. Many proposed approaches use supervised machine learning techniques for prediction, with features obtained from databases. However, it is often unknown which features are important for driver mutation prediction. In this study, we propose a novel feature selection method (called DX) that selects from a set of 126 candidate features. To obtain the best performance, the rotation forest algorithm was adopted for the experiments. 
On the training dataset, which was collected from the COSMIC and Swiss-Prot databases, we obtained high prediction performance with 88.03% accuracy, 93.9% precision, and 81.35% recall when the 11 top-ranked features were used. Comparison with various other techniques on the TP53, EGFR, and Cosmic2plus datasets shows the generality of our method. Xiuquan Du and Jiaxing Cheng Copyright © 2014 Xiuquan Du and Jiaxing Cheng. All rights reserved. Crystal Structure of Deinococcus radiodurans RecQ Helicase Catalytic Core Domain: The Interdomain Flexibility Wed, 27 Aug 2014 08:21:26 +0000 http://www.hindawi.com/journals/bmri/2014/342725/ RecQ DNA helicases are key enzymes in the maintenance of genome integrity, and they have functions in DNA replication, recombination, and repair. In contrast to most RecQs, RecQ from Deinococcus radiodurans (DrRecQ) possesses an unusual domain architecture that is crucial for its remarkable ability to repair DNA. Here, we determined the crystal structures of the DrRecQ helicase catalytic core and its ADP-bound form, revealing interdomain flexibility in its first RecA-like and winged-helix (WH) domains. Additionally, the WH domain of DrRecQ is positioned in a different orientation from that of the E. coli RecQ (EcRecQ). These results suggest that the orientation of the protein during DNA binding differs significantly between DrRecQ and EcRecQ. Sheng-Chia Chen, Chi-Hung Huang, Chia Shin Yang, Tzong-Der Way, Ming-Chung Chang, and Yeh Chen Copyright © 2014 Sheng-Chia Chen et al. All rights reserved. Characterization of Putative cis-Regulatory Elements in Genes Preferentially Expressed in Arabidopsis Male Meiocytes Wed, 27 Aug 2014 08:05:05 +0000 http://www.hindawi.com/journals/bmri/2014/708364/ Meiosis is essential for plant reproduction because it is the process during which homologous chromosome pairing, synapsis, and meiotic recombination occur. 
The meiotic transcriptome is difficult to investigate because of the size of meiocytes and the confines of anther lobes. The recent development of isolation techniques has enabled the characterization of transcriptional profiles in male meiocytes of Arabidopsis. Gene expression in male meiocytes shows unique features. The direct interaction of transcription factors (TFs) with DNA regulatory sequences forms the basis for the specificity of transcriptional regulation. Here, we identified putative cis-regulatory elements (CREs) associated with male meiocyte-expressed genes using in silico tools. The upstream regions (1 kb) of the top 50 genes preferentially expressed in Arabidopsis meiocytes possessed conserved motifs. These motifs are putative binding sites of TFs, some of which share common functions, such as roles in cell division. In combination with cell-type-specific analysis, our findings could be a substantial aid for the identification and experimental verification of the protein-DNA interactions for the specific TFs that drive gene expression in meiocytes. Junhua Li, Jinhong Yuan, and Mingjun Li Copyright © 2014 Junhua Li et al. All rights reserved. Function Formula Oriented Construction of Bayesian Inference Nets for Diagnosis of Cardiovascular Disease Wed, 27 Aug 2014 06:47:48 +0000 http://www.hindawi.com/journals/bmri/2014/376378/ An intelligent cardiovascular disease (CVD) diagnosis system using hemodynamic parameters (HDPs) derived from sphygmogram (SPG) signal is presented to support the emerging patient-centric healthcare models. To replicate clinical approach of diagnosis through a staged decision process, the Bayesian inference nets (BIN) are adapted. New approaches to construct a hierarchical multistage BIN using defined function formulas and a method employing fuzzy logic (FL) technology to quantify inference nodes with dynamic values of statistical parameters are proposed. 
The suggested methodology is validated by constructing hierarchical Bayesian fuzzy inference nets (HBFIN) to diagnose various heart pathologies from the deduced HDPs. The preliminary diagnostic results show that the proposed methodology has salient validity and effectiveness in the diagnosis of cardiovascular disease. Booma Devi Sekar and Mingchui Dong Copyright © 2014 Booma Devi Sekar and Mingchui Dong. All rights reserved. High-Throughput Functional Screening of Steroid Substrates with Wild-Type and Chimeric P450 Enzymes Tue, 26 Aug 2014 10:40:59 +0000 http://www.hindawi.com/journals/bmri/2014/764102/ The promiscuity of a collection of enzymes consisting of 31 wild-type and synthetic variants of CYP1A enzymes was evaluated using a series of 14 steroids and 2 steroid-like chemicals, namely, nootkatone, a terpenoid, and mifepristone, a drug. For each enzyme-substrate couple, the initial steady-state velocity of metabolite formation was determined at a substrate saturating concentration. For that, a high-throughput approach was designed involving automatized incubations in 96-well microplate with sixteen 6-point kinetics per microplate and data acquisition using LC/MS system accepting 96-well microplate for injections. The resulting dataset was used for multivariate statistics aimed at sorting out the correlations existing between tested enzyme variants and ability to metabolize steroid substrates. Functional classifications of both CYP1A enzyme variants and steroid substrate structures were obtained allowing the delineation of global structural features for both substrate recognition and regioselectivity of oxidation. Philippe Urban, Gilles Truan, and Denis Pompon Copyright © 2014 Philippe Urban et al. All rights reserved. 
Large-Scale Protein-Protein Interactions Detection by Integrating Big Biosensing Data with Computational Model Mon, 18 Aug 2014 10:52:22 +0000 http://www.hindawi.com/journals/bmri/2014/598129/ Protein-protein interactions (PPIs) are the basis of biological functions, and studying these interactions on a molecular level is of crucial importance for understanding the functionality of a living cell. During the past decade, biosensors have emerged as an important tool for the high-throughput identification of proteins and their interactions. However, the high-throughput experimental methods for identifying PPIs are both time-consuming and expensive. In addition, high-throughput PPI data are often associated with high false-positive and high false-negative rates. To address these problems, we propose a method for PPI detection by integrating biosensor-based PPI data with a novel computational model. This method is based on the extreme learning machine algorithm combined with a novel protein sequence descriptor. When applied to the large-scale human protein interaction dataset, the proposed method achieved 84.8% prediction accuracy with 84.08% sensitivity at a specificity of 85.53%. We conducted more extensive experiments to compare the proposed method with a state-of-the-art technique, the support vector machine. The achieved results demonstrate that our approach is very promising for detecting new PPIs, and it can be a helpful supplement for biosensor-based PPI data detection. Zhu-Hong You, Shuai Li, Xin Gao, Xin Luo, and Zhen Ji Copyright © 2014 Zhu-Hong You et al. All rights reserved. Drug Repositioning Discovery for Early- and Late-Stage Non-Small-Cell Lung Cancer Mon, 18 Aug 2014 07:02:32 +0000 http://www.hindawi.com/journals/bmri/2014/193817/ Drug repositioning is a popular approach in the pharmaceutical industry for identifying potential new uses for existing drugs and shortening the development time. 
Non-small-cell lung cancer (NSCLC) is one of the leading causes of death worldwide. To reduce the biological heterogeneity effects among different individuals, both normal and cancer tissues were taken from the same patient, hence allowing pairwise testing. By comparing early- and late-stage cancer patients, we can identify stage-specific NSCLC genes. Differentially expressed genes are clustered separately to form up- and downregulated communities that are used as queries to perform enrichment analysis. The results suggest that pathways for early- and late-stage cancers are different. Sets of up- and downregulated genes were submitted to the cMap web resource to identify potential drugs. To achieve high confidence drug prediction, multiple microarray experimental results were merged by performing meta-analysis. The results of a few drug findings are supported by MTT assay or clonogenic assay data. In conclusion, we were able to assess existing drugs as potential novel anticancer drugs, which may be helpful in drug repositioning discovery for NSCLC. Chien-Hung Huang, Peter Mu-Hsin Chang, Yong-Jie Lin, Cheng-Hsu Wang, Chi-Ying F. Huang, and Ka-Lok Ng Copyright © 2014 Chien-Hung Huang et al. All rights reserved. Systematic Analysis of the Association between Gut Flora and Obesity through High-Throughput Sequencing and Bioinformatics Approaches Thu, 14 Aug 2014 12:10:54 +0000 http://www.hindawi.com/journals/bmri/2014/906168/ Eighty-one stool samples from Taiwanese subjects were collected for analysis of the association between the gut flora and obesity. The supervised analysis showed that the most abundant genera of bacteria in normal samples (from people with a body mass index (BMI) 24) were Bacteroides (27.7%), Prevotella (19.4%), Escherichia (12%), Phascolarctobacterium (3.9%), and Eubacterium (3.5%). 
The most abundant genera of bacteria in case samples (with a BMI 27) were Bacteroides (29%), Prevotella (21%), Escherichia (7.4%), Megamonas (5.1%), and Phascolarctobacterium (3.8%). A principal coordinate analysis (PCoA) demonstrated that normal samples were clustered more compactly than case samples. An unsupervised analysis demonstrated that bacterial communities in the gut were clustered into two main groups: N-like and OB-like groups. Remarkably, most normal samples (78%) were clustered in the N-like group, and most case samples (81%) were clustered in the OB-like group (Fisher’s ). The results showed that bacterial communities in the gut were highly associated with obesity. This is the first study in Taiwan to investigate the association between human gut flora and obesity, and the results provide new insights into the correlation of bacteria with the rising trend in obesity. Chih-Min Chiu, Wei-Chih Huang, Shun-Long Weng, Han-Chi Tseng, Chao Liang, Wei-Chi Wang, Ting Yang, Tzu-Ling Yang, Chen-Tsung Weng, Tzu-Hao Chang, and Hsien-Da Huang Copyright © 2014 Chih-Min Chiu et al. All rights reserved. FSim: A Novel Functional Similarity Search Algorithm and Tool for Discovering Functionally Related Gene Products Tue, 12 Aug 2014 10:16:15 +0000 http://www.hindawi.com/journals/bmri/2014/509149/ Background. During the analysis of genomics data, it is often required to quantify the functional similarity of genes and their products based on the annotation information from gene ontology (GO) with hierarchical structure. A flexible and user-friendly way to estimate the functional similarity of genes utilizing GO annotation is therefore highly desired. Results. We proposed a novel algorithm using a level coefficient-weighted model to measure the functional similarity of gene products based on multiple ontologies of hierarchical GO annotations. The performance of our algorithm was evaluated and found to be superior to the other tested methods. 
We implemented the proposed algorithm in a software package, FSim, based on statistical and computing environment. It can be used to discover functionally related genes for a given gene, group of genes, or set of function terms. Conclusions. FSim is a flexible tool to analyze functional gene groups based on the GO annotation databases. Qiang Hu, ZhiGang Wang, and ZhengGuo Zhang Copyright © 2014 Qiang Hu et al. All rights reserved. Prediction of S-Nitrosylation Modification Sites Based on Kernel Sparse Representation Classification and mRMR Algorithm Tue, 12 Aug 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/438341/ Protein S-nitrosylation plays a very important role in a wide variety of cellular biological activities. To date, accurate prediction of S-nitrosylation sites remains a great challenge. In this paper, we present a framework to computationally predict S-nitrosylation sites based on kernel sparse representation classification and the minimum Redundancy Maximum Relevance (mRMR) algorithm. As many as 666 features derived from five categories of amino acid properties and one protein structure feature are used for numerical representation of proteins. A total of 529 protein sequences collected from open-access databases and the published literature are used to train and test our predictor. Computational results show that our predictor achieves Matthews’ correlation coefficients of 0.1634 and 0.2919 for the training set and the testing set, respectively, which are better than those of the k-nearest neighbor algorithm, the random forest algorithm, and the sparse representation classification algorithm. The experimental results also indicate that 134 optimal features can better represent the peptides of protein S-nitrosylation than the original 666 redundant features. Furthermore, we constructed an independent testing set of 113 protein sequences to evaluate the robustness of our predictor. 
Experimental results showed that our predictor also yielded good performance on the independent testing set, with a Matthews’ correlation coefficient of 0.2239. Guohua Huang, Lin Lu, Kaiyan Feng, Jun Zhao, Yuchao Zhang, Yaochen Xu, Ning Zhang, Bi-Qing Li, Weiping Huang, and Yu-Dong Cai Copyright © 2014 Guohua Huang et al. All rights reserved. Novel Approach for Coexpression Analysis of E2F1–3 and MYC Target Genes in Chronic Myelogenous Leukemia Sun, 10 Aug 2014 08:29:13 +0000 http://www.hindawi.com/journals/bmri/2014/439840/ Background. Chronic myelogenous leukemia (CML) is characterized by a tremendous amount of immature myeloid cells in the blood circulation. E2F1–3 and MYC are important transcription factors that form positive feedback loops by reciprocal regulation in their own transcription processes. Since genes regulated by E2F1–3 or MYC are related to cell proliferation and apoptosis, we asked whether the coexpression patterns of genes regulated concurrently by E2F1–3 and MYC differ between the normal and the CML states. Results. We proposed a method to explore the difference in the coexpression patterns of those candidate target genes between the normal and the CML groups. A disease-specific cutoff point for coexpression levels that classified the coexpressed gene pairs into strong and weak coexpression classes was identified. Our developed method effectively identified the coexpression pattern differences from the overall structure. Moreover, we found that genes related to the cell adhesion and angiogenesis properties were more likely to be coexpressed in the normal group when compared to the CML group. Conclusion. Our findings may be helpful in exploring the underlying mechanisms of CML and provide useful information in cancer treatment. Fengfeng Wang, Lawrence W. C. Chan, William C. S. Cho, Petrus Tang, Jun Yu, Chi-Ren Shyu, Nancy B. Y. Tsui, S. C. Cesar Wong, Parco M. Siu, S. P. Yip, and Benjamin Y. M. 
Yung Copyright © 2014 Fengfeng Wang et al. All rights reserved. A Genome-Wide Identification of Genes Undergoing Recombination and Positive Selection in Neisseria Sun, 10 Aug 2014 08:23:34 +0000 http://www.hindawi.com/journals/bmri/2014/815672/ Currently, there is particular interest in the molecular mechanisms of adaptive evolution in bacteria. Neisseria is a genus of Gram-negative bacteria, and there has recently been considerable focus on its two human pathogenic species N. meningitidis and N. gonorrhoeae. Until now, no genome-wide studies have attempted to scan for the genes related to adaptive evolution. For this reason, we selected 18 Neisseria genomes (14 N. meningitidis, 3 N. gonorrhoeae, and 1 commensal N. lactamica) to conduct a comparative genome analysis to obtain a comprehensive understanding of the roles of natural selection and homologous recombination throughout the history of adaptive evolution. Among the 1012 core orthologous genes, we identified 635 genes with recombination signals and 10 genes that showed significant evidence of positive selection. Further functional analyses revealed that no functional bias was found in the recombined genes. Positively selected genes are prone to DNA processing and iron uptake, which are essential for the fundamental life cycle. Overall, the results indicate that both recombination and positive selection play crucial roles in the adaptive evolution of Neisseria genomes. The positively selected genes and the corresponding amino acid sites provide us with valuable targets for further research into the detailed mechanisms of adaptive evolution in Neisseria. Dong Yu, Yuan Jin, Zhiqiu Yin, Hongguang Ren, Wei Zhou, Long Liang, and Junjie Yue Copyright © 2014 Dong Yu et al. All rights reserved. 
Gene Ontology and KEGG Enrichment Analyses of Genes Related to Age-Related Macular Degeneration Wed, 06 Aug 2014 08:37:56 +0000 http://www.hindawi.com/journals/bmri/2014/450386/ Identifying disease genes is one of the most important topics in biomedicine and may facilitate studies on the mechanisms underlying disease. Age-related macular degeneration (AMD) is a serious eye disease; it typically affects older adults and results in a loss of vision due to retina damage. In this study, we attempt to develop an effective method for distinguishing AMD-related genes. Gene ontology and KEGG enrichment analyses of known AMD-related genes were performed, and a classification system was established. In detail, each gene was encoded into a vector by extracting enrichment scores of the gene set, including it and its direct neighbors in STRING, and gene ontology terms or KEGG pathways. Then certain feature-selection methods, including minimum redundancy maximum relevance and incremental feature selection, were adopted to extract key features for the classification system. As a result, 720 GO terms and 11 KEGG pathways were deemed the most important factors for predicting AMD-related genes. Jian Zhang, ZhiHao Xing, Mingming Ma, Ning Wang, Yu-Dong Cai, Lei Chen, and Xun Xu Copyright © 2014 Jian Zhang et al. All rights reserved. C-Terminal Domain Swapping of SSB Changes the Size of the ssDNA Binding Site Mon, 04 Aug 2014 06:33:19 +0000 http://www.hindawi.com/journals/bmri/2014/573936/ Single-stranded DNA-binding protein (SSB) plays an important role in DNA metabolism, including DNA replication, repair, and recombination, and is therefore essential for cell survival. Bacterial SSB consists of an N-terminal ssDNA-binding/oligomerization domain and a flexible C-terminal protein-protein interaction domain. 
We characterized the ssDNA-binding properties of Klebsiella pneumoniae SSB (KpSSB), Salmonella enterica Serovar Typhimurium LT2 SSB (StSSB), Pseudomonas aeruginosa PAO1 SSB (PaSSB), and two chimeric KpSSB proteins, namely, KpSSBnStSSBc and KpSSBnPaSSBc. The C-terminal domain of StSSB or PaSSB was exchanged with that of KpSSB through protein chimeragenesis. By using the electrophoretic mobility shift assay, we characterized the stoichiometry of KpSSB, StSSB, PaSSB, KpSSBnStSSBc, and KpSSBnPaSSBc, complexed with a series of ssDNA homopolymers. The binding site sizes were determined to be , , , , and nucleotides (nt), respectively. Comparison of the binding site sizes of KpSSB, KpSSBnStSSBc, and KpSSBnPaSSBc showed that the C-terminal domain swapping of SSB changes the size of the binding site. Our observations suggest that not only the conserved N-terminal domain but also the C-terminal domain of SSB is an important determinant for ssDNA binding. Yen-Hua Huang and Cheng-Yang Huang Copyright © 2014 Yen-Hua Huang and Cheng-Yang Huang. All rights reserved. The Effects of the Context-Dependent Codon Usage Bias on the Structure of the nsp1α of Porcine Reproductive and Respiratory Syndrome Virus Sun, 03 Aug 2014 07:47:26 +0000 http://www.hindawi.com/journals/bmri/2014/765320/ The information about the crystal structure of porcine reproductive and respiratory syndrome virus (PRRSV) leader protease nsp1α is available to analyze the roles of tRNA abundance of pigs and codon usage of the nsp1α gene in the formation of this protease. The effects of tRNA abundance of the pigs and the synonymous codon usage and the context-dependent codon bias (CDCB) of the nsp1α on shaping the specific folding units (α-helix, β-strand, and the coil) in the nsp1α were analyzed based on the structural information about this protease from protein data bank (PDB: 3IFU) and the nsp1α of the 191 PRRSV strains. 
By mapping the overall tRNA abundance along the nsp1α, we found that there is no link between the fluctuation of the overall tRNA abundance and the specific folding units in the nsp1α, and the low translation speed of ribosome caused by the tRNA abundance exists in the nsp1α. The strong correlation between some synonymous codon usage and the specific folding units in the nsp1α was found, and the phenomenon of CDCB exists in the specific folding units of the nsp1α. These findings provide an insight into the roles of the synonymous codon usage and CDCB in the formation of PRRSV nsp1α structure. Yao-zhong Ding, Ya-nan You, Dong-jie Sun, Hao-tai Chen, Yong-lu Wang, Hui-yun Chang, Li Pan, Yu-zhen Fang, Zhong-wang Zhang, Peng Zhou, Jian-liang Lv, Xin-sheng Liu, Jun-jun Shao, Fu-rong Zhao, Tong Lin, Laszlo Stipkovits, Zygmunt Pejsak, Yong-guang Zhang, and Jie Zhang Copyright © 2014 Yao-zhong Ding et al. All rights reserved. Detecting Epistatic Interactions in Metagenome-Wide Association Studies by metaBOOST Thu, 24 Jul 2014 18:41:12 +0000 http://www.hindawi.com/journals/bmri/2014/398147/ Material and Methods. We recall the definition of epistasis and extend it for metagenomic biomarkers and then we describe the overview of our method metaBOOST and provide detailed information about each step of metaBOOST. Results. We describe the data sources for both simulation studies and real metagenomic datasets. Then, we describe the procedure of simulation studies and provide results for it. After that, we conduct real datasets studies and report the results. Conclusions and Discussion. Finally, we conclude our method and discuss some possible improvements for the future. Mengmeng Wu and Rui Jiang Copyright © 2014 Mengmeng Wu and Rui Jiang. All rights reserved. 
The N-Terminal Domain of Human DNA Helicase Rtel1 Contains a Redox Active Iron-Sulfur Cluster Thu, 24 Jul 2014 09:20:31 +0000 http://www.hindawi.com/journals/bmri/2014/285791/ Human telomere length regulator Rtel1 is a superfamily II DNA helicase and is essential for maintaining proper length of telomeres in chromosomes. Here we report that the N-terminal domain of human Rtel1 (RtelN) expressed in Escherichia coli cells produces a protein that contains a redox active iron-sulfur cluster with the redox midpoint potential of −248 ± 10 mV (pH 8.0). The iron-sulfur cluster in RtelN is sensitive to hydrogen peroxide and nitric oxide, indicating that reactive oxygen/nitrogen species may modulate the DNA helicase activity of Rtel1 via modification of its iron-sulfur cluster. Purified RtelN retains a weak binding affinity for the single-stranded (ss) and double-stranded (ds) DNA in vitro. However, modification of the iron-sulfur cluster by hydrogen peroxide or nitric oxide does not significantly affect the DNA binding activity of RtelN, suggesting that the iron-sulfur cluster is not directly involved in the DNA interaction in the N-terminal domain of Rtel1. Aaron P. Landry and Huangen Ding Copyright © 2014 Aaron P. Landry and Huangen Ding. All rights reserved. Security Mechanism Based on Hospital Authentication Server for Secure Application of Implantable Medical Devices Thu, 24 Jul 2014 07:55:14 +0000 http://www.hindawi.com/journals/bmri/2014/543051/ Since two recent security attacks against implantable medical devices (IMDs) were reported, the privacy and security risks of IMDs have been widely recognized in the medical device market and research community, because the malfunctioning of IMDs might endanger the patient’s life. During the last few years, much research has been carried out to address the security-related issues of IMDs, including privacy, safety, and accessibility issues. 
A physician accesses IMD through an external device called a programmer, for diagnosis and treatment. Hence, cryptographic key management between IMD and programmer is important to enforce a strict access control. In this paper, a new security architecture for the security of IMDs is proposed, based on a 3-Tier security model, where the programmer interacts with a Hospital Authentication Server, to get permissions to access IMDs. The proposed security architecture greatly simplifies the key management between IMDs and programmers. Also proposed is a security mechanism to guarantee the authenticity of the patient data collected from IMD and the nonrepudiation of the physician’s treatment based on it. The proposed architecture and mechanism are analyzed and compared with several previous works, in terms of security and performance. Chang-Seop Park Copyright © 2014 Chang-Seop Park. All rights reserved. An Intelligent System for Identifying Acetylated Lysine on Histones and Nonhistone Proteins Thu, 24 Jul 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/528650/ Lysine acetylation is an important and ubiquitous posttranslational modification conserved in prokaryotes and eukaryotes. This process, which is dynamically and temporally regulated by histone acetyltransferases and deacetylases, is crucial for numerous essential biological processes such as transcriptional regulation, cellular signaling, and stress response. Since the experimental identification of lysine acetylation sites within proteins is time-consuming and laboratory-intensive, several computational approaches have been developed to identify candidates for experimental validation. In this work, acetylated protein data collected from UniProtKB were categorized into histone or nonhistone proteins. Support vector machines (SVMs) were applied to build predictive models by using amino acid pair composition (AAPC) as a feature in a histone model. 
We combined BLOSUM62 and AAPC features in a nonhistone model. Furthermore, using maximal dependence decomposition (MDD) clustering can enhance the performance of the model on a fivefold cross-validation evaluation to yield a sensitivity of 0.863, specificity of 0.885, accuracy of 0.880, and MCC of 0.706. Additionally, the proposed method is evaluated using independent test sets resulting in a predictive accuracy of 74%. This indicates that the performance of our method is comparable with that of other acetylation prediction methods. Cheng-Tsung Lu, Tzong-Yi Lee, Yu-Ju Chen, and Yi-Ju Chen Copyright © 2014 Cheng-Tsung Lu et al. All rights reserved. Studying the Complex Expression Dependences between Sets of Coexpressed Genes Thu, 24 Jul 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/940821/ Organisms simplify the orchestration of gene expression by coregulating genes whose products function together in the cell. The use of clustering methods to obtain sets of coexpressed genes from expression arrays is very common; nevertheless there are no appropriate tools to study the expression networks among these sets of coexpressed genes. The aim of the developed tools is to allow studying the complex expression dependences that exist between sets of coexpressed genes. For this purpose, we start detecting the nonlinear expression relationships between pairs of genes, plus the coexpressed genes. Next, we form networks among sets of coexpressed genes that maintain nonlinear expression dependences between all of them. The expression relationship between the sets of coexpressed genes is defined by the expression relationship between the skeletons of these sets, where this skeleton represents the coexpressed genes with a well-defined nonlinear expression relationship with the skeleton of the other sets. 
As a result, we can study the nonlinear expression relationships between a target gene and other sets of coexpressed genes, or start the study from the skeleton of the sets, to study the complex relationships of activation and deactivation between the sets of coexpressed genes that carry out the different cellular processes present in the expression experiments. Mario Huerta, Oriol Casanova, Roberto Barchino, Jose Flores, Enrique Querol, and Juan Cedano Copyright © 2014 Mario Huerta et al. All rights reserved. An Efficient Parallel Algorithm for Multiple Sequence Similarities Calculation Using a Low Complexity Method Tue, 22 Jul 2014 09:07:46 +0000 http://www.hindawi.com/journals/bmri/2014/563016/ With advances in genomic research, the number of sequences involved in comparative methods has grown immensely. Among these are methods for similarity calculation, which are used by many bioinformatics applications. Due to the huge amount of data, combining low-complexity methods with parallel computing is becoming desirable. k-mer counting is a very efficient method with good biological results. In this work, the development of a parallel algorithm for multiple sequence similarities calculation using the k-mer counting method is proposed. Tests show that the algorithm presents very good scalability and a nearly linear speedup. A 12x speedup was obtained with 14 nodes. This algorithm can be used in the parallelization of some multiple sequence alignment tools, such as MAFFT and MUSCLE. Evandro A. Marucci, Geraldo F. D. Zafalon, Julio C. Momente, Leandro A. Neves, Carlo R. Valêncio, Alex R. Pinto, Adriano M. Cansian, Rogeria C. G. de Souza, Yang Shiyou, and José M. Machado Copyright © 2014 Evandro A. Marucci et al. All rights reserved. 
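The k-mer counting idea in the Marucci et al. abstract can be illustrated with a minimal serial sketch. The scoring formula here (shared k-mers divided by the number of k-mers in the shorter sequence) is a common convention and an assumption, not necessarily the one used in the paper; the function names `kmer_counts`, `kmer_similarity`, and `similarity_matrix` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_similarity(a, b, k=3):
    """Fraction of k-mers shared by a and b, in the range 0.0 to 1.0.

    Assumed scoring: shared k-mer count divided by the number of
    k-mers in the shorter sequence.
    """
    shorter = min(len(a), len(b)) - k + 1
    if shorter <= 0:
        return 0.0  # at least one sequence is too short to hold a k-mer
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    shared = sum(min(n, cb[kmer]) for kmer, n in ca.items())
    return shared / shorter


def similarity_matrix(seqs, k=3):
    """Symmetric matrix of pairwise k-mer similarities (diagonal set to 1.0)."""
    n = len(seqs)
    sim = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = kmer_similarity(seqs[i], seqs[j], k)
    return sim
```

Each pair (i, j) is scored independently of every other pair, which is what makes this kind of calculation a natural candidate for distribution across nodes; the sketch itself does no parallel work.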
Cell Type-Dependent RNA Recombination Frequency in the Japanese Encephalitis Virus Tue, 22 Jul 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/471323/ Japanese encephalitis virus (JEV) is one of approximately 70 flaviviruses, frequently causing symptoms involving the central nervous system. Mutations of its genomic RNA frequently occur during viral replication, which is believed to be a force contributing to viral evolution. Nevertheless, accumulating evidence shows that some JEV strains may have actually arisen from RNA recombination between genetically different populations of the virus. We have demonstrated that RNA recombination in JEV occurs unequally in different cell types. In the present study, viral RNA fragments transfected into mosquito cells, as well as viral RNAs synthesized in them, were shown not to be stable, especially in the early phase of infection, possibly because of cleavage by exoribonuclease. Such cleaved small RNA fragments may be further degraded through an RNA interference pathway triggered by viral double-stranded RNA during replication in mosquito cells, resulting in a lower frequency of RNA recombination in mosquito cells compared to that which occurs in mammalian cells. In fact, adjustment of viral RNA to an appropriately lower level in mosquito cells prevents overgrowth of the virus and is beneficial for cells to survive the infection. Our findings may also account for the slower evolution of arboviruses as reported previously. Wei-Wei Chiang, Ching-Kai Chuang, Mei Chao, and Wei-June Chen Copyright © 2014 Wei-Wei Chiang et al. All rights reserved. Structural Insight into the DNA-Binding Mode of the Primosomal Proteins PriA, PriB, and DnaT Mon, 21 Jul 2014 08:30:20 +0000 http://www.hindawi.com/journals/bmri/2014/195162/ The replication restart primosome is a complex dynamic system that is essential for bacterial survival. 
This system uses various proteins to reinitiate chromosomal DNA replication to maintain genetic integrity after DNA damage. The replication restart primosome in Escherichia coli is composed of PriA helicase, PriB, PriC, DnaT, DnaC, DnaB helicase, and DnaG primase. The assembly of the protein complexes within the forked DNA responsible for reloading the replicative DnaB helicase anywhere on the chromosome for genome duplication requires the coordination of transient biomolecular interactions. Over the last decade, investigations on the structure and mechanism of these nucleoproteins have provided considerable insight into primosome assembly. In this review, we summarize and discuss our current knowledge and recent advances on the DNA-binding mode of the primosomal proteins PriA, PriB, and DnaT. Yen-Hua Huang and Cheng-Yang Huang Copyright © 2014 Yen-Hua Huang and Cheng-Yang Huang. All rights reserved. Mass Spectrometry Based Proteomic Analysis of Salivary Glands of Urban Malaria Vector Anopheles stephensi Mon, 14 Jul 2014 11:31:37 +0000 http://www.hindawi.com/journals/bmri/2014/686319/ Salivary gland proteins of Anopheles mosquitoes offer attractive targets to understand interactions with sporozoites, blood feeding behavior, homeostasis, and immunological evaluation of malaria vectors and parasite interactions. To date limited studies have been carried out to elucidate salivary proteins of An. stephensi salivary glands. The aim of the present study was to provide detailed analytical attributives of functional salivary gland proteins of urban malaria vector An. stephensi. A proteomic approach combining one-dimensional electrophoresis (1DE), ion trap liquid chromatography mass spectrometry (LC/MS/MS), and computational bioinformatic analysis was adopted to provide the first direct insight into identification and functional characterization of known salivary proteins and novel salivary proteins of An. stephensi. 
Computational studies by online servers, namely, MASCOT and OMSSA algorithms, identified a total of 36 known salivary proteins and 123 novel proteins analysed by LC/MS/MS. This first report describes a baseline proteomic catalogue of 159 salivary proteins belonging to various categories of signal transduction, regulation of blood coagulation cascade, and various immune and energy pathways of An. stephensi sialotranscriptome by mass spectrometry. Our results may serve as a basis to provide a putative functional role of proteins in the context of blood feeding, biting behavior, and other aspects of vector-parasite host interactions for parasite development in anopheline mosquitoes. Sonam Vijay, Manmeet Rawat, and Arun Sharma Copyright © 2014 Sonam Vijay et al. All rights reserved. PPI Network Analysis of mRNA Expression Profile of Ezrin Knockdown in Esophageal Squamous Cell Carcinoma Mon, 14 Jul 2014 08:56:44 +0000 http://www.hindawi.com/journals/bmri/2014/651954/ Ezrin, which encodes the protein EZR that cross-links actin filaments, is overexpressed and involved in invasion, metastasis, and poor prognosis in various cancers including esophageal squamous cell carcinoma (ESCC). In our previous study, Ezrin was knocked down and the resulting mRNA expression profile was analyzed, but it has not been fully mined. In this study, we applied protein-protein interaction (PPI) network knowledge and methods to deepen our understanding of these differentially expressed genes (DEGs). PPI subnetworks showed that hundreds of DEGs interact with thousands of other proteins. Subcellular localization analyses found that the DEGs and their directly or indirectly interacting proteins distribute in multiple layers, which was applied to analyze the shortest paths between EZR and other DEGs. 
Gene ontology annotation generated a functional annotation map and found hundreds of significant terms, especially those associated with cytoskeleton organization of Ezrin protein, such as “cytoskeleton organization,” “regulation of actin filament-based process,” and “regulation of actin cytoskeleton organization.” The algorithm of Random Walk with Restart was applied to prioritize the DEGs and identified several cancer related DEGs ranked closest to EZR. These analyses based on PPI network have greatly expanded our comprehension of the mRNA expression profile of Ezrin knockdown for future examination of the roles and mechanisms of Ezrin. Bingli Wu, Jianjun Xie, Zepeng Du, Jianyi Wu, Pixian Zhang, Liyan Xu, and Enmin Li Copyright © 2014 Bingli Wu et al. All rights reserved. Identifying the Gene Signatures from Gene-Pathway Bipartite Network Guarantees the Robust Model Performance on Predicting the Cancer Prognosis Mon, 14 Jul 2014 08:20:49 +0000 http://www.hindawi.com/journals/bmri/2014/424509/ For the purpose of improving the prediction of cancer prognosis in the clinical researches, various algorithms have been developed to construct the predictive models with the gene signatures detected by DNA microarrays. Due to the heterogeneity of the clinical samples, the list of differentially expressed genes (DEGs) generated by the statistical methods or the machine learning algorithms often involves a number of false positive genes, which are not associated with the phenotypic differences between the compared clinical conditions, and subsequently impacts the reliability of the predictive models. In this study, we proposed a strategy, which combined the statistical algorithm with the gene-pathway bipartite networks, to generate the reliable lists of cancer-related DEGs and constructed the models by using support vector machine for predicting the prognosis of three types of cancers, namely, breast cancer, acute myeloma leukemia, and glioblastoma. 
Our results demonstrated that, combined with the gene-pathway bipartite networks, our proposed strategy can efficiently generate the reliable cancer-related DEG lists for constructing the predictive models. In addition, the model performance in the swap analysis was similar to that in the original analysis, indicating the robustness of the models in predicting the cancer outcomes. Li He, Yuelong Wang, Yongning Yang, Liqiu Huang, and Zhining Wen Copyright © 2014 Li He et al. All rights reserved. The Definition of a Prolonged Intensive Care Unit Stay for Spontaneous Intracerebral Hemorrhage Patients: An Application with National Health Insurance Research Database Mon, 14 Jul 2014 08:11:53 +0000 http://www.hindawi.com/journals/bmri/2014/891725/ Introduction. Length of stay (LOS) in the intensive care unit (ICU) of spontaneous intracerebral hemorrhage (sICH) patients is one of the most important issues. Disease severity, psychosocial factors, and institutional factors all influence the length of ICU stay. This study used the Taiwan National Health Insurance Research Database (NHIRD) to define the threshold of a prolonged ICU stay in sICH patients. Methods. This research collected the demographic data of sICH patients in the NHIRD from 2005 to 2009. The threshold of prolonged ICU stay was calculated using change point analysis. Results. There were 1599 sICH patients included. A prolonged ICU stay was defined as being equal to or longer than 10 days. There were 436 prolonged ICU stay cases and 1163 nonprolonged cases. Conclusion. This study showed that the threshold of a prolonged ICU stay is a good indicator of hospital utilization in ICH patients. Different hospitals have their own care strategies, which can be identified with a prolonged ICU stay. This indicator can be improved using quality control methods such as complications prevention and efficiency of ICU bed management. 
Patients’ stay in ICUs and in hospitals will be shorter if integrated care systems are established. Chien-Lung Chan, Hsien-Wei Ting, and Hsin-Tsung Huang Copyright © 2014 Chien-Lung Chan et al. All rights reserved. Incorporating Amino Acids Composition and Functional Domains for Identifying Bacterial Toxin Proteins Mon, 07 Jul 2014 08:55:16 +0000 http://www.hindawi.com/journals/bmri/2014/972692/ Aside from pathogenesis, bacterial toxins also have been used for medical purpose such as drugs for cancer and immune diseases. Correctly identifying bacterial toxins and their types (endotoxins and exotoxins) has great impact on the cell biology study and therapy development. However, experimental methods for bacterial toxins identification are time-consuming and labor-intensive, implying an urgent need for computational prediction. Thus, we are motivated to develop a method for computational identification of bacterial toxins based on amino acid sequences and functional domain information. In this study, a nonredundant dataset of 167 bacterial toxins including 77 exotoxins and 90 endotoxins is adopted to learn the predictive model by using support vector machines (SVMs). The cross-validation evaluation shows that the SVM models trained with amino acids and dipeptides composition could yield an accuracy of 96.07% and 92.50%, respectively. For discriminating endotoxins from exotoxins, the SVM models trained with amino acids and dipeptides composition have achieved an accuracy of 95.71% and 92.86%, respectively. After incorporating functional domain information, the predictive performance is further improved. The proposed method has been demonstrated to be able to more effectively identify and classify bacterial toxins than the other two features on independent dataset, which may aid in bacterial biomedical development. Min-Gang Su, Chien-Hsun Huang, Tzong-Yi Lee, Yu-Ju Chen, and Hsin-Yi Wu Copyright © 2014 Min-Gang Su et al. All rights reserved. 
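The two sequence-derived feature types named in the Su et al. abstract, amino acid composition and dipeptide composition, can be sketched as plain frequency vectors. This is a minimal illustration under assumed definitions (normalized frequencies over the 20 standard residues, nonstandard residues ignored); it is not the authors' implementation, and the SVM training step is not shown. The function names are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def aa_composition(seq):
    """20-dimensional vector of residue frequencies."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def dipeptide_composition(seq):
    """400-dimensional vector of overlapping residue-pair frequencies."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing nonstandard residues
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[p] / total if total else 0.0 for p in DIPEPTIDES]
```

Either vector, or the two concatenated, could then be passed to a classifier as one row of a feature matrix.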
Risk Factors for Mortality in Patients with Septic Acute Kidney Injury in Intensive Care Units in Beijing, China: A Multicenter Prospective Observational Study Mon, 07 Jul 2014 06:34:39 +0000 http://www.hindawi.com/journals/bmri/2014/172620/ Objective. To discover risk factors for mortality of patients with septic AKI in ICU via a multicenter study. Background. Septic AKI is a serious threat to patients in ICU, but there are few clinical studies focusing on this. Methods. This was a prospective, observational, and multicenter study conducted in 30 ICUs of 28 major hospitals in Beijing. 3,107 patients were admitted consecutively, among which 361 patients were with septic AKI. Patient clinical data were recorded daily for 10 days after admission. Kidney Disease: Improving Global Outcomes (KDIGO) criteria were used to define and stage AKI. Of the involved patients, 201 survived and 160 died. Results. The rate of septic AKI was 11.6%. Twenty-one risk factors were found, and six independent risk factors were identified: age, APACHE II score, duration of mechanical ventilation, duration of MAP <65 mmHg, time until RRT started, and progressive KDIGO stage. Admission KDIGO stages were not associated with mortality, while worst KDIGO stages were. Only progressive KDIGO stage was an independent risk factor. Conclusions. Six independent risk factors for mortality for septic AKI were identified. Progressive KDIGO stage is better than admission or the worst KDIGO for prediction of mortality. This trial is registered with ChiCTR-ONC-11001875. Xin Wang, Li Jiang, Ying Wen, Mei-Ping Wang, Wei Li, Zhi-Qiang Li, and Xiu-Ming Xi Copyright © 2014 Xin Wang et al. All rights reserved. 
Gonadal Transcriptome Analysis of Male and Female Olive Flounder (Paralichthys olivaceus) Sun, 06 Jul 2014 10:07:38 +0000 http://www.hindawi.com/journals/bmri/2014/291067/ Olive flounder (Paralichthys olivaceus) is an important commercially cultured marine flatfish in China, Korea, and Japan, in which females grow faster than males. In order to explore the molecular mechanism of flounder sex determination and development, we used RNA-seq technology to investigate transcriptomes of flounder gonads. This produced 22,253,217 and 19,777,841 qualified reads from ovary and testes, which were jointly assembled into 97,233 contigs. Among them, 23,223 contigs were mapped to known genes, of which 2,193 were predicted to be differentially expressed in ovary and 887 in testes. According to annotation information, several sex-related biological pathways including ovarian steroidogenesis and estrogen signaling pathways were found for the first time in flounder. The dimorphic expression of overall sex-related genes provides further insights into sex determination and gonadal development. Our study also provides an archive for further studies of the molecular mechanism of fish sex determination. Zhaofei Fan, Feng You, Lijuan Wang, Shenda Weng, Zhihao Wu, Jinwei Hu, Yuxia Zou, Xungang Tan, and Peijun Zhang Copyright © 2014 Zhaofei Fan et al. All rights reserved. Characteristics and Prediction of RNA Structure Sun, 06 Jul 2014 09:18:47 +0000 http://www.hindawi.com/journals/bmri/2014/690340/ RNA secondary structures with pseudoknots are often predicted by minimizing free energy, which is NP-hard. Most RNAs fold during transcription from DNA into RNA through a hierarchical pathway wherein secondary structures form prior to tertiary structures. Real RNA secondary structures often have local instead of global optimization because of kinetic reasons. The performance of RNA structure prediction may be improved by considering dynamic and hierarchical folding mechanisms. 
This study is a novel report on RNA folding that accords with the golden mean characteristic based on the statistical analysis of the real RNA secondary structures of all 480 sequences from RNA STRAND, which are validated by NMR or X-ray. The length ratios of domains in these sequences are approximately 0.382L, 0.5L, 0.618L, and L, where L is the sequence length. These points are just the important golden sections of sequence. With this characteristic, an algorithm is designed to predict RNA hierarchical structures and simulate RNA folding by dynamically folding RNA structures according to the above golden section points. The sensitivity and number of predicted pseudoknots of our algorithm are better than those of the Mfold, HotKnots, McQfold, ProbKnot, and Lhw-Zhu algorithms. Experimental results reflect the folding rules of RNA from a new angle that is close to natural folding. Hengwu Li, Daming Zhu, Caiming Zhang, Huijian Han, and Keith A. Crandall Copyright © 2014 Hengwu Li et al. All rights reserved. Microarray-Based RNA Profiling of Breast Cancer: Batch Effect Removal Improves Cross-Platform Consistency Thu, 03 Jul 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/651751/ Microarray is a powerful technique used extensively for gene expression analysis. Different technologies are available, but lack of standardization makes it challenging to compare and integrate data. Furthermore, batch-related biases within datasets are common but often not tackled. We have analyzed the same 234 breast cancers on two different microarray platforms. One dataset contained known batch-effects associated with the fabrication procedure used. The aim was to assess the significance of correcting for systematic batch-effects when integrating data from different platforms. 
We here demonstrate the importance of detecting batch-effects and how tools, such as ComBat, can be used to successfully overcome such systematic variations in order to unmask essential biological signals. Batch adjustment was found to be particularly valuable in the detection of more delicate differences in gene expression. Furthermore, our results show that proper adjustment is essential for integration of gene expression data obtained from multiple sources. We show that high-variance genes are highly reproducibly expressed across platforms making them particularly well suited as biomarkers and for building gene signatures, exemplified by prediction of estrogen-receptor status and molecular subtypes. In conclusion, the study emphasizes the importance of utilizing proper batch adjustment methods when integrating data across different batches and platforms. Martin J. Larsen, Mads Thomassen, Qihua Tan, Kristina P. Sørensen, and Torben A. Kruse Copyright © 2014 Martin J. Larsen et al. All rights reserved. MAVTgsa: An R Package for Gene Set (Enrichment) Analysis Thu, 03 Jul 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/346074/ Gene set analysis methods aim to determine whether an a priori defined set of genes shows statistically significant difference in expression on either categorical or continuous outcomes. Although many methods for gene set analysis have been proposed, a systematic analysis tool for identification of different types of gene set significance modules has not been developed previously. This work presents an R package, called MAVTgsa, which includes three different methods for integrated gene set enrichment analysis. (1) The one-sided OLS (ordinary least squares) test detects coordinated changes of genes in a gene set in one direction, either up- or downregulation. (2) The two-sided MANOVA (multivariate analysis of variance) detects changes in both directions, up- and downregulation, for studying two or more experimental conditions. 
(3) A random forests-based procedure identifies gene sets that can accurately predict samples from different experimental conditions or are associated with the continuous phenotypes. MAVTgsa computes the P values and FDR (false discovery rate) q-values for all gene sets in the study. Furthermore, MAVTgsa provides several visualization outputs to support and interpret the enrichment results. This package is available online. Chih-Yi Chien, Ching-Wei Chang, Chen-An Tsai, and James J. Chen Copyright © 2014 Chih-Yi Chien et al. All rights reserved. Combined Analysis with Copy Number Variation Identifies Risk Loci in Lung Cancer Tue, 01 Jul 2014 11:54:11 +0000 http://www.hindawi.com/journals/bmri/2014/469103/ Background. Lung cancer is the most important cause of cancer mortality worldwide, but the underlying mechanisms of this disease are not fully understood. Copy number variations (CNVs) are promising genetic variations to study because of their potential effects on cancer. Methodology/Principal Findings. Here we conducted a pilot study in which we systematically analyzed the association of CNVs in two lung cancer datasets: the Environment And Genetics in Lung cancer Etiology (EAGLE) and the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial datasets. We used a preestablished association method to test the datasets separately and conducted a combined analysis to test the association accordance between the two datasets. Finally, we identified 167 risk SNP loci and 22 CNVs associated with lung cancer and linked them with recombination hotspots. Functional annotation and biological relevance analyses implied that some of our predicted risk loci were supported by other studies and might be potential candidate loci for lung cancer studies. Conclusions/Significance. Our results further emphasized the importance of copy number variations in cancer and might be a valuable complement to current genome-wide association studies on cancer. 
Xinlei Li, Xianfeng Chen, Guohong Hu, Yang Liu, Zhenguo Zhang, Ping Wang, You Zhou, Xianfu Yi, Jie Zhang, Yufei Zhu, Zejun Wei, Fei Yuan, Guoping Zhao, Jun Zhu, Landian Hu, and Xiangyin Kong Copyright © 2014 Xinlei Li et al. All rights reserved. Target Capture and Massive Sequencing of Genes Transcribed in Mytilus galloprovincialis Mon, 30 Jun 2014 11:33:59 +0000 http://www.hindawi.com/journals/bmri/2014/538549/ Next generation sequencing (NGS) allows fast and massive production of both genome and transcriptome sequence datasets. As the genome of the Mediterranean mussel Mytilus galloprovincialis is not available at present, we have explored the possibility of reducing the whole genome sequencing efforts by using capture probes coupled with PCR amplification and high-throughput 454-sequencing to enrich selected genomic regions. The enrichment of DNA target sequences was validated by real-time PCR, whereas the efficacy of the applied strategy was evaluated by mapping the 454-output reads against reference transcript data already available for M. galloprovincialis and by measuring coverage, SNPs, number of de novo sequenced introns, and complete gene sequences. Focusing on a target size of nearly 1.5 Mbp, we obtained a target coverage which allowed the identification of more than 250 complete introns, 10,741 SNPs, and also complete gene sequences. This study confirms the transcriptome-based enrichment of gDNA regions as a good strategy to expand knowledge on specific subsets of genes also in nonmodel organisms. Umberto Rosani, Stefania Domeneghetti, Alberto Pallavicini, and Paola Venier Copyright © 2014 Umberto Rosani et al. All rights reserved. 
Identifying Hierarchical and Overlapping Protein Complexes Based on Essential Protein-Protein Interactions and “Seed-Expanding” Method Mon, 30 Jun 2014 09:43:33 +0000 http://www.hindawi.com/journals/bmri/2014/838714/ Much evidence has demonstrated that protein complexes are overlapping and hierarchically organized in PPI networks. Meanwhile, the large size of PPI networks requires complex detection methods to have low time complexity. Up to now, few methods can identify overlapping and hierarchical protein complexes in a PPI network quickly. In this paper, a novel method, called MCSE, is proposed based on -module and “seed-expanding.” First, it chooses seeds as essential PPIs or edges with high edge clustering values. Then, it identifies protein complexes by expanding each seed to a -module. MCSE is suitable for large PPI networks because of its low time complexity. MCSE can identify overlapping protein complexes naturally because a protein can be visited by different seeds. MCSE uses the parameter _th to control the range of seed expanding and can detect a hierarchical organization of protein complexes by tuning the value of _th. Experimental results of S. cerevisiae show that this hierarchical organization is similar to that of known complexes in MIPS database. The experimental results also show that MCSE outperforms other previous competing algorithms, such as CPM, CMC, Core-Attachment, Dpclus, HC-PIN, MCL, and NFC, in terms of the functional enrichment and matching with known protein complexes. Jun Ren, Wei Zhou, and Jianxin Wang Copyright © 2014 Jun Ren et al. All rights reserved. Integrating In Silico Prediction Methods, Molecular Docking, and Molecular Dynamics Simulation to Predict the Impact of ALK Missense Mutations in Structural Perspective Thu, 26 Jun 2014 12:00:41 +0000 http://www.hindawi.com/journals/bmri/2014/895831/ Over the past decade, advancements in next generation sequencing technology have placed personalized genomic medicine on the horizon. 
Classifying disease-causing mutations in complex diseases as pathogenic or neutral remains a major task, and is even infeasible in the structural context because the experiments are time-consuming and expensive. Among the various disease-causing mutations, single nucleotide polymorphisms (SNPs) play a vital role in defining an individual’s susceptibility to disease and drug response. Understanding the genotype-phenotype relationship through SNPs is the first and most important step in drug research and development. Detailed understanding of the effect of SNPs on patient drug response is a key factor in the establishment of personalized medicine. In this paper, we present a computational pipeline for SNP-centred study of anaplastic lymphoma kinase (ALK) that applies in silico prediction methods, molecular docking, and molecular dynamics simulation approaches. Combining computational methods provides a way to understand how deleterious mutations alter protein drug targets and eventually lead to variable patient drug response. We hope this rapid and cost-effective pipeline will also serve as a bridge to connect the clinicians and in silico resources in tailoring treatments to the patients’ specific genotype. C. George Priya Doss, Chiranjib Chakraborty, Luonan Chen, and Hailong Zhu Copyright © 2014 C. George Priya Doss et al. All rights reserved. SSFinder: High Throughput CRISPR-Cas Target Sites Prediction Tool Thu, 26 Jun 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/742482/ The clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated protein (Cas) system facilitates targeted genome editing in organisms. Despite high demand for this system, finding a reliable tool for the determination of specific target sites in large genomic data has remained challenging. 
Here, we report SSFinder, a Python script to perform high throughput detection of specific target sites in large nucleotide datasets. SSFinder is a user-friendly tool, compatible with Windows, Mac OS, and Linux operating systems, and freely available online. Santosh Kumar Upadhyay and Shailesh Sharma Copyright © 2014 Santosh Kumar Upadhyay and Shailesh Sharma. All rights reserved. Defining Loci in Restriction-Based Reduced Representation Genomic Data from Nonmodel Species: Sources of Bias and Diagnostics for Optimal Clustering Wed, 25 Jun 2014 07:19:47 +0000 http://www.hindawi.com/journals/bmri/2014/675158/ Next generation sequencing holds great promise for applications of phylogeography, landscape genetics, and population genomics in wild populations of nonmodel species, but the robustness of inferences hinges on careful experimental design and effective bioinformatic removal of predictable artifacts. Addressing this issue, we use published genomes from a tunicate, stickleback, and soybean to illustrate the potential for bioinformatic artifacts and introduce a protocol to minimize two sources of error expected from similarity-based de-novo clustering of stacked reads: the splitting of alleles into different clusters, which creates false homozygosity, and the grouping of paralogs into the same cluster, which creates false heterozygosity. We present an empirical application focused on Ciona savignyi, a tunicate with very high SNP heterozygosity (~0.05), because high diversity challenges the computational efficiency of most existing nonmodel pipelines while also potentially exacerbating paralog artifacts. The simulated and empirical data illustrate the advantages of using higher sequence difference clustering thresholds than is typical and demonstrate the utility of our protocol for efficiently identifying an optimum threshold from data without prior knowledge of heterozygosity. 
The empirical Ciona savignyi data also highlight null alleles as a potentially large source of false homozygosity in restriction-based reduced representation genomic data. Daniel C. Ilut, Marie L. Nydam, and Matthew P. Hare Copyright © 2014 Daniel C. Ilut et al. All rights reserved. The Human Plasma Membrane Peripherome: Visualization and Analysis of Interactions Wed, 25 Jun 2014 07:02:03 +0000 http://www.hindawi.com/journals/bmri/2014/397145/ A major part of membrane function is conducted by proteins, both integral and peripheral. Peripheral membrane proteins temporarily adhere to biological membranes, either to the lipid bilayer or to integral membrane proteins with noncovalent interactions. The aim of this study was to construct and analyze the interactions of the human plasma membrane peripheral proteins (peripherome hereinafter). For this purpose, we collected a dataset of peripheral proteins of the human plasma membrane. We also collected a dataset of experimentally verified interactions for these proteins. The interaction network created from this dataset has been visualized using Cytoscape. We grouped the proteins based on their subcellular location and clustered them using the MCL algorithm in order to detect functional modules. Moreover, functional and graph theory based analyses have been performed to assess biological features of the network. Interaction data with drug molecules show that ~10% of peripheral membrane proteins are targets for approved drugs, suggesting their potential implications in disease. In conclusion, we reveal novel features and properties regarding the protein-protein interaction network created by peripheral proteins of the human plasma membrane. Katerina C. Nastou, Georgios N. Tsaousis, Kimon E. Kremizas, Zoi I. Litou, and Stavros J. Hamodrakas Copyright © 2014 Katerina C. Nastou et al. All rights reserved. 
MPINet: Metabolite Pathway Identification via Coupling of Global Metabolite Network Structure and Metabolomic Profile Wed, 25 Jun 2014 06:50:21 +0000 http://www.hindawi.com/journals/bmri/2014/325697/ High-throughput metabolomics technology, such as gas chromatography mass spectrometry, allows the analysis of hundreds of metabolites. Understanding that these metabolites dominate the study condition from a biological pathway perspective is still a significant challenge. Pathway identification is an invaluable aid to address this issue and, thus, is urgently needed. In this study, we developed a network-based metabolite pathway identification method, MPINet, which considers the global importance of metabolites and the unique character of the metabolomic profile. By integrating the global metabolite functional network structure and the character of the metabolomic profile, MPINet provides a more accurate metabolomic pathway analysis. This integrative strategy simultaneously captures the global nonequivalence of metabolites in a pathway and the bias from metabolomic experimental technology. We then applied MPINet to four different types of metabolite datasets. In the analysis of a metastatic prostate cancer dataset, we demonstrated the effectiveness of MPINet. With the analysis of two type 2 diabetes datasets, we show that MPINet has the potential to identify novel pathways related to disease and is reliable for analyzing metabolomic data. Finally, we extensively applied MPINet to identify drug sensitivity related pathways. These results suggest MPINet’s effectiveness and reliability for analyzing metabolomic data across multiple different application fields. Feng Li, Yanjun Xu, Desi Shang, Haixiu Yang, Wei Liu, Junwei Han, Zeguo Sun, Qianlan Yao, Chunlong Zhang, Jiquan Ma, Fei Su, Li Feng, Xinrui Shi, Yunpeng Zhang, Jing Li, Qi Gu, Xia Li, and Chunquan Li Copyright © 2014 Feng Li et al. All rights reserved. 
Biomolecular Networks and Human Diseases Tue, 24 Jun 2014 08:06:49 +0000 http://www.hindawi.com/journals/bmri/2014/363717/ FangXiang Wu, Luonan Chen, Jianxin Wang, and Reda Alhajj Copyright © 2014 FangXiang Wu et al. All rights reserved. miRSeq: A User-Friendly Standalone Toolkit for Sequencing Quality Evaluation and miRNA Profiling Tue, 24 Jun 2014 06:46:17 +0000 http://www.hindawi.com/journals/bmri/2014/462135/ MicroRNAs (miRNAs) present diverse regulatory functions in a wide range of biological activities. Studies on miRNA functions generally depend on determining miRNA expression profiles between libraries by using a next-generation sequencing (NGS) platform. Currently, several online web services have been developed to provide small RNA NGS data analysis. However, the submission of large amounts of NGS data, the conversion of data formats, and the limited availability of species cause problems. In this study, we developed miRSeq to provide an alternative. To test its performance, we analyzed small RNA NGS data from four species, including human, rat, fly, and nematode, with miRSeq. The alignment results indicate that miRSeq can precisely evaluate the sequencing quality of samples regarding the percentage of self-ligation reads, read length distribution, and read category. miRSeq is a user-friendly standalone toolkit featuring a graphical user interface (GUI). After a simple installation, users can easily operate miRSeq on a PC or laptop by using a mouse. Within minutes, miRSeq yields useful miRNA data, including miRNA expression profiles, 3′ end modification patterns, and isomiR forms. Moreover, miRSeq supports the analysis of up to 105 animal species, providing higher flexibility. Cheng-Tsung Pan, Kuo-Wang Tsai, Tzu-Min Hung, Wei-Chen Lin, Chao-Yu Pan, Hong-Ren Yu, and Sung-Chou Li Copyright © 2014 Cheng-Tsung Pan et al. All rights reserved. 
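One of the quality measures named in the miRSeq abstract above, the read length distribution, can be sketched in a few lines. This is only an illustration under stated assumptions: reads are plain sequence strings already stripped of adapters, the 20 to 24 nt window is an assumed "miRNA-like" range, and the function names are hypothetical rather than part of miRSeq.

```python
from collections import Counter


def read_length_distribution(reads):
    """Fraction of reads at each length, keyed by length in ascending order."""
    counts = Counter(len(read) for read in reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {length: counts[length] / total for length in sorted(counts)}


def fraction_in_range(reads, low=20, high=24):
    """Share of reads whose length falls in [low, high] (assumed window)."""
    if not reads:
        return 0.0
    hits = sum(1 for read in reads if low <= len(read) <= high)
    return hits / len(reads)


# Synthetic toy reads (not real sequencing data): two 22-mers,
# one 21-mer, and one 4-nt fragment standing in for a degraded read.
toy_reads = ["A" * 22, "C" * 22, "G" * 21, "T" * 4]
distribution = read_length_distribution(toy_reads)
mirna_like = fraction_in_range(toy_reads)
```

A library dominated by very short fragments would show up as a low `fraction_in_range` value, which is the kind of signal a quality report could flag.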
A Graphic Method for Identification of Novel Glioma Related Genes Mon, 23 Jun 2014 07:15:21 +0000 http://www.hindawi.com/journals/bmri/2014/891945/ Glioma, as the most common and lethal intracranial tumor, is a serious disease that causes many deaths every year. Good comprehension of the mechanism underlying this disease is very helpful for designing effective treatments. However, knowledge of this disease is still limited. Uncovering its related genes is an important step toward understanding the mechanism underlying this disease. In this study, a graphic method was proposed to identify novel glioma related genes based on known glioma related genes. A weighted graph was constructed according to the protein-protein interaction information retrieved from STRING, and the well-known shortest path algorithm was employed to discover novel genes. The subsequent analysis suggests that some of them are related to the biological process of glioma, indicating that our method is effective in identifying novel glioma related genes. We hope that the proposed method can be applied to study other diseases and provide useful information to medical workers, thereby aiding the design of effective treatments for different diseases. Yu-Fei Gao, Yang Shu, Lei Yang, Yi-Chun He, Li-Peng Li, GuaHua Huang, Hai-Peng Li, and Yang Jiang Copyright © 2014 Yu-Fei Gao et al. All rights reserved. A Novel Dynamic Update Framework for Epileptic Seizure Prediction Sun, 22 Jun 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/957427/ Epileptic seizure prediction is a difficult problem in clinical applications, and it has the potential to significantly improve the daily lives of patients whose seizures cannot be controlled by either drugs or surgery. However, most current studies of epileptic seizure prediction focus on high sensitivity and low false-positive rate only and lack the flexibility for a variety of epileptic seizures and patients’ physical conditions. 
Therefore, a novel dynamic update framework for epileptic seizure prediction is proposed in this paper. In this framework, two basic sample pools are constructed and updated dynamically. Furthermore, the prediction model can be updated to be the most appropriate one for the prediction of seizures’ arrival. Mahalanobis distance is introduced in this part to solve the problem of side information, measuring the distance between two data sets. In addition, a multichannel feature extraction method based on Hilbert-Huang transform and extreme learning machine is utilized to extract the features of a patient’s preseizure state against the normal state. At last, a dynamic update epileptic seizure prediction system is built up. Simulations on Freiburg database show that the proposed system has a better performance than the one without update. The research of this paper is significantly helpful for clinical applications, especially for the exploitation of online portable devices. Min Han, Sunan Ge, Minghui Wang, Xiaojun Hong, and Jie Han Copyright © 2014 Min Han et al. All rights reserved. An Integrated Analysis of miRNA, lncRNA, and mRNA Expression Profiles Wed, 18 Jun 2014 06:38:20 +0000 http://www.hindawi.com/journals/bmri/2014/345605/ Increasing amounts of evidence indicate that noncoding RNAs (ncRNAs) have important roles in various biological processes. Here, miRNA, lncRNA, and mRNA expression profiles were analyzed in human HepG2 and L02 cells using high-throughput technologies. An integrative method was developed to identify possible functional relationships between different RNA molecules. The dominant deregulated miRNAs were prone to be downregulated in tumor cells, and the most abnormal mRNAs and lncRNAs were always upregulated. However, the genome-wide analysis of differentially expressed RNA species did not show significant bias between up- and downregulated populations. 
miRNA-mRNA interaction analysis was performed based on their regulatory relationships, and miRNA-lncRNA and mRNA-lncRNA interactions were thoroughly surveyed and identified based on their locational distributions and sequence correlations. Aberrantly expressed miRNAs were further analyzed based on their multiple isomiRs. IsomiR repertoires and expression patterns varied across miRNA loci. Several specific miRNA loci showed differences between tumor and normal cells, especially with respect to abnormally expressed miRNA species. These findings suggest that isomiR repertoires and expression patterns might contribute to tumorigenesis through different biological roles. Systematic and integrative analysis of different RNA molecules with potential cross-talk may make great contributions to the unveiling of the complex mechanisms underlying tumorigenesis. Li Guo, Yang Zhao, Sheng Yang, Hui Zhang, and Feng Chen Copyright © 2014 Li Guo et al. All rights reserved. Ultrasonographic Fetal Growth Charts: An Informatic Approach by Quantitative Analysis of the Impact of Ethnicity on Diagnoses Based on a Preliminary Report on Salentinian Population Wed, 18 Jun 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/386124/ Clear guidance on fetal growth assessment is important for reducing morbidity and mortality, given the strong links between growth restriction or macrosomia and adverse perinatal outcome. Fetal growth curves are extensively adopted to track fetal sizes from the early phases of pregnancy up to delivery. In the literature, a large variety of reference charts are reported, but they are mostly up to five decades old. Furthermore, they do not address several variables and factors (e.g., ethnicity, foods, lifestyle, smoking, and physiological and pathological variables), which are very important for a correct evaluation of fetal well-being. 
Therefore, currently adopted fetal growth charts are inadequate to support the melting pot of ethnic groups and lifestyles of our society. Customized fetal growth charts are needed to provide an accurate fetal assessment and to avoid unnecessary obstetric interventions at the time of delivery. Starting from the development of a growth chart purposely built for a specific population, the authors quantify and analyse the impact of adopting the wrong growth charts on fetal diagnoses. These results come from a preliminary evaluation of a new open service developed to produce personalized growth charts for specific ethnicity, lifestyle, and other parameters. Andrea Tinelli, Mario Alessandro Bochicchio, Lucia Vaira, and Antonio Malvasi Copyright © 2014 Andrea Tinelli et al. All rights reserved. Conformational B-Cell Epitopes Prediction from Sequences Using Cost-Sensitive Ensemble Classifiers and Spatial Clustering Tue, 17 Jun 2014 07:10:22 +0000 http://www.hindawi.com/journals/bmri/2014/689219/ B-cell epitopes are regions of the antigen surface which can be recognized by certain antibodies and elicit the immune response. Identification of epitopes for a given antigen chain finds vital applications in vaccine and drug research. Experimental identification of B-cell epitopes is time-consuming and resource intensive, so it may benefit from computational approaches to identify B-cell epitopes. In this paper, a novel cost-sensitive ensemble algorithm is proposed for predicting the antigenic determinant residues, and then a spatial clustering algorithm is adopted to identify the potential epitopes. Firstly, we explore various discriminative features from primary sequences. Secondly, a cost-sensitive ensemble scheme is introduced to deal with the imbalanced learning problem. Thirdly, we adopt a spatial clustering algorithm to tell which residues may potentially form the epitopes. 
Based on the strategies mentioned above, a new predictor, called CBEP (conformational B-cell epitopes prediction), is proposed in this study. CBEP achieves good prediction performance with mean AUC scores (AUCs) of 0.721 and 0.703 on two benchmark datasets (bound and unbound) using leave-one-out cross-validation (LOOCV). When compared with previous prediction tools, CBEP produces higher sensitivity and comparable specificity values. A web server named CBEP which implements the proposed method is available for academic use. Jian Zhang, Xiaowei Zhao, Pingping Sun, Bo Gao, and Zhiqiang Ma Copyright © 2014 Jian Zhang et al. All rights reserved. On Macroscopic Quantum Phenomena in Biomolecules and Cells: From Levinthal to Hopfield Mon, 16 Jun 2014 06:43:51 +0000 http://www.hindawi.com/journals/bmri/2014/580491/ In the context of the macroscopic quantum phenomena of the second kind, we seek a solution-in-principle to the long-standing problem of polymer folding, which was considered by Levinthal as (semi)classically intractable. To illuminate it, we applied quantum-chemical and quantum decoherence approaches to conformational transitions. Our analyses imply the existence of novel macroscopic quantum biomolecular phenomena, with biomolecular chain folding in an open environment considered as a subtle interplay between energy and conformation eigenstates of this biomolecule, governed by quantum-chemical and quantum decoherence laws. 
On the other hand, within an open biological cell, a system of all identical (noninteracting and dynamically noncoupled) biomolecular proteins might be considered as a corresponding spatial quantum ensemble of these identical biomolecular processors, providing a spatially distributed quantum solution to a single corresponding biomolecular chain folding, whose density of conformational states might also be represented as a Hopfield-like quantum-holographic associative neural network (providing an equivalent global quantum-informational alternative to the standard molecular-biology local biochemical approach in biomolecules and cells, as well as at higher hierarchical levels of the organism). Dejan Raković, Miroljub Dugić, Jasmina Jeknić-Dugić, Milenko Plavšić, Stevo Jaćimovski, and Jovan Šetrajčić Copyright © 2014 Dejan Raković et al. All rights reserved. Big Data and Network Biology Sun, 15 Jun 2014 12:51:54 +0000 http://www.hindawi.com/journals/bmri/2014/836708/ Shigehiko Kanaya, Md. Altaf-Ul-Amin, Samuel Kuria Kiboi, and Farit Mochamad Afendi Copyright © 2014 Shigehiko Kanaya et al. All rights reserved. Integrative Genomics and Computational Systems Medicine Sun, 15 Jun 2014 05:47:08 +0000 http://www.hindawi.com/journals/bmri/2014/945253/ Jason E. McDermott, Yufei Huang, Bing Zhang, Hua Xu, and Zhongming Zhao Copyright © 2014 Jason E. McDermott et al. All rights reserved. Development of Dual Inhibitors against Alzheimer’s Disease Using Fragment-Based QSAR and Molecular Docking Thu, 12 Jun 2014 10:57:36 +0000 http://www.hindawi.com/journals/bmri/2014/979606/ Alzheimer’s disease (AD) is the leading cause of dementia among elderly people. Considering the complex heterogeneous etiology of AD, there is an urgent need to develop multitargeted drugs for its suppression. β-amyloid cleavage enzyme (BACE-1) and acetylcholinesterase (AChE), being important for AD progression, have been considered as promising drug targets. 
In this study, a robust and highly predictive group-based QSAR (GQSAR) model has been developed based on the descriptors calculated for the fragments of 20 1,4-dihydropyridine (DHP) derivatives. A large combinatorial library of DHP analogues was created, the activity of each compound was predicted, and the top compounds were analyzed using refined molecular docking. A detailed interaction analysis was carried out for the top two compounds (EDC and FDC), which showed significant binding affinity for BACE-1 and AChE. This study paves the way for consideration of these lead molecules as prospective drugs for the effective dual inhibition of BACE-1 and AChE. The GQSAR model provides site-specific clues about the molecules where certain modifications can result in increased biological activity. This information could be of high value for the design and development of multifunctional drugs for combating AD. Manisha Goyal, Jaspreet Kaur Dhanjal, Sukriti Goyal, Chetna Tyagi, Rabia Hamid, and Abhinav Grover Copyright © 2014 Manisha Goyal et al. All rights reserved. Large-Scale Investigation of Human TF-miRNA Relations Based on Coexpression Profiles Mon, 09 Jun 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/623078/ Noncoding, endogenous microRNAs (miRNAs) are fairly well known for regulating gene expression rather than protein coding. Dysregulation of miRNA genes, whether upregulation or downregulation, may lead to severe diseases or oncogenesis, especially when the miRNA disorder involves significant bioreactions or pathways. Thus, how miRNA genes are transcriptionally regulated has been highlighted as well as target recognition in recent years. In this study, a large-scale investigation of novel cis- and trans-elements was undertaken to further determine TF-miRNA regulatory relations, which are necessary to unravel the transcriptional regulation of miRNA genes. 
Based on miRNA and annotated gene expression profiles, the term “coTFBS” was introduced to detect common transcription factors and the corresponding binding sites within the promoter regions of each miRNA and its coexpressed annotated genes. The computational pipeline was successfully established to filter redundancy due to short sequence motifs for TFBS pattern search. Eventually, we identified more convincing TF-miRNA regulatory relations for 225 human miRNAs. This valuable information is helpful in understanding miRNA functions and provides knowledge to evaluate the therapeutic potential in clinical research. Once most expression profiles of miRNAs in the latest database are completed, TF candidates of more miRNAs can be explored by this filtering approach in the future. Chia-Hung Chien, Yi-Fan Chiang-Hsieh, Ann-Ping Tsou, Shun-Long Weng, Wen-Chi Chang, and Hsien-Da Huang Copyright © 2014 Chia-Hung Chien et al. All rights reserved. Computational Evidence of NAGNAG Alternative Splicing in Human Large Intergenic Noncoding RNA Thu, 05 Jun 2014 12:22:48 +0000 http://www.hindawi.com/journals/bmri/2014/736798/ NAGNAG alternative splicing plays an essential role in biological processes and represents a highly adaptable system for posttranslational regulation of gene function. NAGNAG alternative splicing impacts a myriad of biological processes. Previous studies of NAGNAG largely focused on messenger RNA. To the best of our knowledge, this is the first study testing the hypothesis that NAGNAG alternative splicing is also operative in large intergenic noncoding RNA (lincRNA). The RNA-seq data sets from recent deep sequencing studies were queried to test our hypothesis. NAGNAG alternative splicing of human lincRNA was identified while querying two independent RNA-seq data sets. Within these datasets, 31 NAGNAG alternative splicing sites were identified in lincRNA. Notably, most exons of lincRNA containing NAGNAG acceptors were longer than those from protein-coding genes. 
Furthermore, presence of CAG coding appeared to participate in the splice site selection. Finally, expression of the isoforms of NAGNAG lincRNA exhibited tissue specificity. Together, this study improves our understanding of the NAGNAG alternative splicing in lincRNA. Xiaoyong Sun, Simon M. Lin, and Xiaoyan Yan Copyright © 2014 Xiaoyong Sun et al. All rights reserved. The Domain Landscape of Virus-Host Interactomes Wed, 04 Jun 2014 12:18:42 +0000 http://www.hindawi.com/journals/bmri/2014/867235/ Viral infections result in millions of deaths in the world today. A thorough analysis of virus-host interactomes may reveal insights into viral infection and pathogenic strategies. In this study, we presented a landscape of virus-host interactomes based on protein domain interaction. Compared to the analysis at protein level, this domain-domain interactome provided a unique abstraction of protein-protein interactome. Through comparisons among DNA, RNA, and retrotranscribing viruses, we identified a core of human domains, that viruses used to hijack the cellular machinery and evade the immune system, which might be promising antiviral drug targets. We showed that viruses preferentially interacted with host hub and bottleneck domains, and the degree and betweenness centrality among three categories of viruses are significantly different. Further analysis at functional level highlighted that different viruses perturbed the host cellular molecular network by common and unique strategies. Most importantly, we creatively proposed a viral disease network among viral domains, human domains and the corresponding diseases, which uncovered several unknown virus-disease relationships that needed further verification. Overall, it is expected that the findings will help to deeply understand the viral infection and contribute to the development of antiviral therapy. Lu-Lu Zheng, Chunyan Li, Jie Ping, Yanhong Zhou, Yixue Li, and Pei Hao Copyright © 2014 Lu-Lu Zheng et al. 
All rights reserved. biomvRhsmm: Genomic Segmentation with Hidden Semi-Markov Model Tue, 03 Jun 2014 12:17:37 +0000 http://www.hindawi.com/journals/bmri/2014/910390/ High-throughput technologies like tiling array and next-generation sequencing (NGS) generate continuous homogeneous segments or signal peaks in the genome that represent transcripts and transcript variants (transcript mapping and quantification), regions of deletion and amplification (copy number variation), or regions characterized by particular common features like chromatin state or DNA methylation ratio (epigenetic modifications). However, the volume and output of data produced by these technologies present challenges in analysis. Here, a hidden semi-Markov model (HSMM) is implemented and tailored to handle multiple genomic profile, to better facilitate genome annotation by assisting in the detection of transcripts, regulatory regions, and copy number variation by holistic microarray or NGS. With support for various data distributions, instead of limiting itself to one specific application, the proposed hidden semi-Markov model is designed to allow modeling options to accommodate different types of genomic data and to serve as a general segmentation engine. By incorporating genomic positions into the sojourn distribution of HSMM, with optional prior learning using annotation or previous studies, the modeling output is more biologically sensible. The proposed model has been compared with several other state-of-the-art segmentation models through simulation benchmarking, which shows that our efficient implementation achieves comparable or better sensitivity and specificity in genomic segmentation. Yang Du, Eduard Murani, Siriluck Ponsuksili, and Klaus Wimmers Copyright © 2014 Yang Du et al. All rights reserved. 
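The feature that separates a hidden semi-Markov model from an ordinary HMM, the explicit sojourn (state duration) distribution mentioned in the biomvRhsmm abstract above, can be illustrated with a small generative sketch. It is written in Python rather than R, covers only sampling of a state path (no emissions and no inference), and every state name, probability, and function name below is a hypothetical choice for illustration, not the package's API.

```python
import random


def sample_hsmm_states(n_positions, transitions, sojourn, start_state, seed=0):
    """Sample a state path in which each visit lasts an explicitly drawn duration."""
    rng = random.Random(seed)
    states = []
    state = start_state
    while len(states) < n_positions:
        # Draw how long the chain stays in the current state (the sojourn).
        durations, weights = zip(*sojourn[state].items())
        run_length = rng.choices(durations, weights=weights, k=1)[0]
        states.extend([state] * run_length)
        # Then jump to a different state; no self-transitions are defined.
        next_states, probs = zip(*transitions[state].items())
        state = rng.choices(next_states, weights=probs, k=1)[0]
    return states[:n_positions]


def to_segments(states):
    """Collapse a state path into (state, run_length) segments."""
    segments = []
    for state in states:
        if segments and segments[-1][0] == state:
            segments[-1][1] += 1
        else:
            segments.append([state, 1])
    return [tuple(segment) for segment in segments]


# Hypothetical two-state model: long background stretches, short peaks.
toy_transitions = {"background": {"peak": 1.0}, "peak": {"background": 1.0}}
toy_sojourn = {"background": {5: 0.5, 8: 0.5}, "peak": {2: 0.5, 3: 0.5}}
toy_states = sample_hsmm_states(40, toy_transitions, toy_sojourn, "background")
toy_segments = to_segments(toy_states)
```

In an ordinary HMM the run lengths would be implicitly geometric; here they are restricted to the listed durations, which is the modeling freedom the sojourn distribution provides. The final segment may be shorter because the path is truncated at `n_positions`.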
ABC and IFC: Modules Detection Method for PPI Network Mon, 02 Jun 2014 06:16:30 +0000 http://www.hindawi.com/journals/bmri/2014/968173/ Many clustering algorithms are unable to solve the clustering problem of protein-protein interaction (PPI) networks effectively. A novel clustering model which combines the optimization mechanism of artificial bee colony (ABC) with the fuzzy membership matrix is proposed in this paper. The proposed ABC-IFC clustering model contains two parts: searching for the optimum cluster centers using ABC mechanism and forming clusters using intuitionistic fuzzy clustering (IFC) method. Firstly, the cluster centers are set randomly and the initial clustering results are obtained by using fuzzy membership matrix. Then the cluster centers are updated through different functions of bees in ABC algorithm; then the clustering result is obtained through IFC method based on the new optimized cluster center. To illustrate its performance, the ABC-IFC method is compared with the traditional fuzzy C-means clustering and IFC method. The experimental results on MIPS dataset show that the proposed ABC-IFC method not only gets improved in terms of several commonly used evaluation criteria such as precision, recall, and P value, but also obtains a better clustering result. Xiujuan Lei, Fang-Xiang Wu, Jianfang Tian, and Jie Zhao Copyright © 2014 Xiujuan Lei et al. All rights reserved. iCTX-Type: A Sequence-Based Predictor for Identifying the Types of Conotoxins in Targeting Ion Channels Sun, 01 Jun 2014 06:50:38 +0000 http://www.hindawi.com/journals/bmri/2014/286419/ Conotoxins are small disulfide-rich neurotoxic peptides, which can bind to ion channels with very high specificity and modulate their activities. Over the last few decades, conotoxins have been the drug candidates for treating chronic pain, epilepsy, spasticity, and cardiovascular diseases. 
According to their functions and targets, conotoxins are generally categorized into three types: potassium-channel type, sodium-channel type, and calcium-channel type. With the avalanche of peptide sequences generated in the postgenomic age, it is urgent and challenging to develop an automated method for rapidly and accurately identifying the types of conotoxins based on their sequence information alone. To address this challenge, a new predictor, called iCTX-Type, was developed by incorporating the dipeptide occurrence frequencies of a conotoxin sequence into a 400-D (dimensional) general pseudoamino acid composition, followed by a feature optimization procedure to reduce the sample representation from a 400-D to a 50-D vector. The overall success rate achieved by iCTX-Type via a rigorous cross-validation was over 91%, outperforming its counterpart (RBF network). Besides, iCTX-Type is so far the only predictor in this area with its web-server available, and hence is particularly useful for most experimental scientists to get their desired results without the need to follow the complicated mathematics involved. Hui Ding, En-Ze Deng, Lu-Feng Yuan, Li Liu, Hao Lin, Wei Chen, and Kuo-Chen Chou Copyright © 2014 Hui Ding et al. All rights reserved. Systems Biology in the Context of Big Data and Networks Tue, 27 May 2014 12:27:40 +0000 http://www.hindawi.com/journals/bmri/2014/428570/ Science is going through two rapidly changing phenomena: one is the increasing capabilities of the computers and software tools from terabytes to petabytes and beyond, and the other is the advancement in high-throughput molecular biology producing piles of data related to genomes, transcriptomes, proteomes, metabolomes, interactomes, and so on. Biology has become a data intensive science and as a consequence biology and computer science have become complementary to each other bridged by other branches of science such as statistics, mathematics, physics, and chemistry. 
The combination of versatile knowledge has caused the advent of big-data biology, network biology, and other new branches of biology. Network biology for instance facilitates the system-level understanding of the cell or cellular components and subprocesses. It is often also referred to as systems biology. The purpose of this field is to understand organisms or cells as a whole at various levels of functions and mechanisms. Systems biology is now facing the challenges of analyzing big molecular biological data and huge biological networks. This review gives an overview of the progress in big-data biology, and data handling and also introduces some applications of networks and multivariate analysis in systems biology. Md. Altaf-Ul-Amin, Farit Mochamad Afendi, Samuel Kuria Kiboi, and Shigehiko Kanaya Copyright © 2014 Md. Altaf-Ul-Amin et al. All rights reserved. MultiRankSeq: Multiperspective Approach for RNAseq Differential Expression Analysis and Quality Control Tue, 27 May 2014 12:25:42 +0000 http://www.hindawi.com/journals/bmri/2014/248090/ Background. After a decade of microarray technology dominating the field of high-throughput gene expression profiling, the introduction of RNAseq has revolutionized gene expression research. While RNAseq provides more abundant information than microarray, its analysis has proved considerably more complicated. To date, no consensus has been reached on the best approach for RNAseq-based differential expression analysis. Not surprisingly, different studies have drawn different conclusions as to the best approach to identify differentially expressed genes based upon their own criteria and scenarios considered. Furthermore, the lack of effective quality control may lead to misleading results interpretation and erroneous conclusions. To solve these aforementioned problems, we propose a simple yet safe and practical rank-sum approach for RNAseq-based differential gene expression analysis named MultiRankSeq. 
MultiRankSeq first performs quality control assessment. For data meeting the quality control criteria, MultiRankSeq compares the study groups using several of the most commonly applied analytical methods and combines their results to generate a new rank-sum interpretation. MultiRankSeq provides a unique analysis approach to RNAseq differential expression analysis. MultiRankSeq is written in R, and it is easily applicable. Detailed graphical and tabular analysis reports can be generated with a single command line. Yan Guo, Shilin Zhao, Fei Ye, Quanhu Sheng, and Yu Shyr Copyright © 2014 Yan Guo et al. All rights reserved. Gleditsia sinensis: Transcriptome Sequencing, Construction, and Application of Its Protein-Protein Interaction Network Tue, 27 May 2014 09:02:45 +0000 http://www.hindawi.com/journals/bmri/2014/404578/ Gleditsia sinensis is a species of deciduous tree in the subfamily Caesalpinioideae, native to China, and is of great economic importance. However, despite its economic value, gene sequence information is strongly lacking. In the present study, transcriptome sequencing of G. sinensis was performed, resulting in approximately 75.5 million clean reads assembled into 142155 unique transcripts generating 58583 unigenes. The average length of the unigenes was 900 bp, with an N50 of 549 bp. The obtained unigene sequences were then compared to four protein databases, including NCBI nonredundant protein (NRDB), Swiss-prot, Kyoto Encyclopedia of Genes and Genomes (KEGG), and the Cluster of Orthologous Groups (COG). Using the BLAST procedure, 31385 unigenes (53.6%) were assigned functional annotations. Additionally, sequence homologies between identified unigenes and genes of known species in a protein-protein interaction (PPI) network facilitated G. sinensis PPI network construction. Based on this network construction, new stress resistance genes (including cold, drought, and high salinity) were predicted. 
The present study is the first investigation of genome-wide gene expression in G. sinensis with the results providing a basis for future functional genomic studies relating to this species. Liucun Zhu, Ying Zhang, Wenna Guo, and Qiang Wang Copyright © 2014 Liucun Zhu et al. All rights reserved. An Association Study between Genetic Polymorphism in the Interleukin-6 Receptor Gene and Coronary Heart Disease Mon, 26 May 2014 11:25:43 +0000 http://www.hindawi.com/journals/bmri/2014/504727/ The goal of our study is to test the association of IL6R rs7529229 polymorphism with CHD through a case-control study in Han Chinese population and a meta-analysis. Our results showed a lack of association between IL6R rs7529229 polymorphism and CHD on both genotype and allele levels in Han Chinese. However, a meta-analysis among 11678 cases and 12861 controls showed that rs7529229-C allele was significantly associated with a decreased risk of CHD, especially in Europeans (odds ratio = 0.93, 95% confidence interval = 0.89–0.96). Since there are significant differences among populations, further studies are warranted to test the contribution of rs7529229 to CHD in other ethnic populations. Jiangqing Zhou, Xiaoliang Chen, Huadan Ye, Ping Peng, Yanna Ba, Xi Yang, Xiaoyan Huang, Yae Lu, Xin Jiang, Jiangfang Lian, and Shiwei Duan Copyright © 2014 Jiangqing Zhou et al. All rights reserved. enDNA-Prot: Identification of DNA-Binding Proteins by Applying Ensemble Learning Mon, 26 May 2014 11:09:26 +0000 http://www.hindawi.com/journals/bmri/2014/294279/ DNA-binding proteins are crucial for various cellular processes, such as recognition of specific nucleotides, regulation of transcription, and regulation of gene expression. Developing an effective model for identifying DNA-binding proteins is an urgent research problem. 
Up to now, many methods have been proposed, but most of them focus on only one classifier and cannot make full use of the large number of negative samples to improve predicting performance. This study proposed a predictor called enDNA-Prot for DNA-binding protein identification by employing the ensemble learning technique. Experimental results showed that enDNA-Prot was comparable with DNA-Prot and outperformed DNAbinder and iDNA-Prot with performance improvement in the range of 3.97–9.52% in ACC and 0.08–0.19 in MCC. Furthermore, when the benchmark dataset was expanded with negative samples, enDNA-Prot outperformed the three existing methods by 2.83–16.63% in terms of ACC and 0.02–0.16 in terms of MCC. These results indicate that enDNA-Prot is an effective method for DNA-binding protein identification and that expanding the training dataset with negative samples can improve its performance. For the convenience of the vast majority of experimental scientists, we developed a user-friendly web-server for enDNA-Prot which is freely accessible to the public. Ruifeng Xu, Jiyun Zhou, Bin Liu, Lin Yao, Yulan He, Quan Zou, and Xiaolong Wang Copyright © 2014 Ruifeng Xu et al. All rights reserved. A De Novo Genome Assembly Algorithm for Repeats and Nonrepeats Sun, 25 May 2014 08:46:24 +0000 http://www.hindawi.com/journals/bmri/2014/736473/ Background. Next generation sequencing platforms can generate shorter reads, deeper coverage, and higher throughput than those of the Sanger sequencing. These short reads may be assembled de novo before some specific genome analyses. Up to now, current assemblers perform very poorly when assembling repeats. Results. 
To address this problem, we proposed a new genome assembly algorithm, named SWA, which has four properties: (1) assembling repeats and nonrepeats; (2) adopting a new overlapping extension strategy to extend each seed; (3) adopting sliding window to filter out the sequencing bias; and (4) proposing a compensational mechanism for low coverage datasets. SWA was evaluated and validated in both simulations and real sequencing datasets. The accuracy of assembling repeats and estimating the copy numbers is up to 99% and 100%, respectively. Finally, extensive comparisons with eight other leading assemblers show that SWA outperformed the others in terms of completeness and correctness of assembling repeats and nonrepeats. Conclusions. This paper proposed a new de novo genome assembly method for resolving complex repeats. SWA can not only detect where repeats and nonrepeats are but also assemble them completely from NGS data, which is particularly valuable for repeats. This is its advantage over other assemblers. Shuaibin Lian, Qingyan Li, Zhiming Dai, Qian Xiang, and Xianhua Dai Copyright © 2014 Shuaibin Lian et al. All rights reserved. iMethyl-PseAAC: Identification of Protein Methylation Sites via a Pseudo Amino Acid Composition Approach Thu, 22 May 2014 11:45:29 +0000 http://www.hindawi.com/journals/bmri/2014/947416/ Before becoming native proteins during biosynthesis, polypeptide chains created by the ribosome’s translation of mRNA undergo a series of “product-forming” steps, such as cutting, folding, and posttranslational modification (PTM). Knowledge of PTMs in proteins is crucial for dynamic proteome analysis of various human diseases and epigenetic inheritance. One of the most important PTMs is the Arg- or Lys-methylation that occurs on arginine or lysine, respectively. Given a protein, which site of its Arg (or Lys) can be methylated, and which site cannot? 
This is the first important problem for understanding the methylation mechanism and drug development in depth. With the avalanche of protein sequences generated in the postgenomic age, its urgency has become self-evident. To address this problem, we proposed a new predictor, called iMethyl-PseAAC. In the prediction system, a peptide sample was formulated by a 346-dimensional vector, formed by incorporating its physicochemical, sequence evolution, biochemical, and structural disorder information into the general form of pseudo amino acid composition. It was observed by the rigorous jackknife test and independent dataset test that iMethyl-PseAAC was superior to any of the existing predictors in this area. Wang-Ren Qiu, Xuan Xiao, Wei-Zhong Lin, and Kuo-Chen Chou Copyright © 2014 Wang-Ren Qiu et al. All rights reserved. Modelling Arterial Pressure Waveforms Using Gaussian Functions and Two-Stage Particle Swarm Optimizer Tue, 20 May 2014 11:09:31 +0000 http://www.hindawi.com/journals/bmri/2014/923260/ Changes of arterial pressure waveform characteristics have been accepted as risk indicators of cardiovascular diseases. Waveform modelling using Gaussian functions has been used to decompose arterial pressure pulses into different numbers of subwaves and hence quantify waveform characteristics. However, the fitting accuracy and computation efficiency of current modelling approaches need to be improved. This study aimed to develop a novel two-stage particle swarm optimizer (TSPSO) to determine optimal parameters of Gaussian functions. The evaluation was performed on carotid and radial artery pressure waveforms (CAPW and RAPW) which were simultaneously recorded from twenty normal volunteers. The fitting accuracy and calculation efficiency of our TSPSO were compared with three published optimization methods: the Nelder-Mead, the modified PSO (MPSO), and the dynamic multiswarm particle swarm optimizer (DMS-PSO). 
The results showed that TSPSO achieved the best fitting accuracy with a mean absolute error (MAE) of 1.1% for CAPW and 1.0% for RAPW, in comparison with 4.2% and 4.1% for Nelder-Mead, 2.0% and 1.9% for MPSO, and 1.2% and 1.1% for DMS-PSO. In addition, to achieve a target MAE of 2.0%, the computation time of TSPSO was only 1.5 s, which was only 20% and 30% of that for MPSO and DMS-PSO, respectively. Chengyu Liu, Tao Zhuang, Lina Zhao, Faliang Chang, Changchun Liu, Shoushui Wei, Qiqiang Li, and Dingchang Zheng Copyright © 2014 Chengyu Liu et al. All rights reserved. AmalgamScope: Merging Annotations Data across the Human Genome Tue, 20 May 2014 09:25:06 +0000 http://www.hindawi.com/journals/bmri/2014/893501/ The past years have shown an enormous advancement in sequencing and array-based technologies, producing supplementary or alternative views of the genome stored in various formats and databases. Their sheer volume and different data scope pose a challenge to jointly visualize and integrate diverse data types. We present AmalgamScope, a new interactive software tool that assists scientists with the annotation of the human genome and particularly the integration of the annotation files from multiple data types, using gene identifiers and genomic coordinates. Supported platforms include next-generation sequencing and microarray technologies. The available features of AmalgamScope range from the annotation of diverse data types across the human genome to integration of the data based on the annotational information and visualization of the merged files within chromosomal regions or the whole genome. Additionally, users can define custom transcriptome library files for any species and use the tool's options for exchanging files with a distant server. Georgia Tsiliki, Konstantinos Tsaramirsis, and Sophia Kossida Copyright © 2014 Georgia Tsiliki et al. All rights reserved. 
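As a rough illustration of the coordinate-based integration described in the AmalgamScope abstract above, the following Python sketch pairs annotation records from two sources when their genomic intervals overlap. The record layout, the function names, and the sample data are all hypothetical and are not taken from AmalgamScope itself.

```python
from collections import namedtuple

# Hypothetical record layout; not AmalgamScope's actual file format.
Annotation = namedtuple("Annotation", ["chrom", "start", "end", "label"])


def overlaps(a, b):
    """Return True if both records lie on the same chromosome and their
    closed intervals [start, end] intersect."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def merge_by_coordinates(left, right):
    """Pair each record in `left` with every overlapping record in `right`.

    Each result holds the chromosome, the shared sub-interval, and both labels.
    This is a simple quadratic scan, adequate only for small inputs.
    """
    merged = []
    for a in left:
        for b in right:
            if overlaps(a, b):
                merged.append(
                    (a.chrom, max(a.start, b.start), min(a.end, b.end), a.label, b.label)
                )
    return merged


# Made-up example data standing in for sequencing and microarray annotations.
ngs_peaks = [Annotation("chr1", 100, 200, "peak_1"), Annotation("chr2", 50, 80, "peak_2")]
array_probes = [Annotation("chr1", 150, 250, "probe_A"), Annotation("chr2", 90, 120, "probe_B")]

merged_records = merge_by_coordinates(ngs_peaks, array_probes)
```

For genome-scale inputs, an interval tree or a sorted sweep would likely replace the nested loop; the quadratic scan is used here only to keep the idea visible.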
Bioinformatic Prediction of WSSV-Host Protein-Protein Interaction Mon, 19 May 2014 13:08:27 +0000 http://www.hindawi.com/journals/bmri/2014/416543/ WSSV is one of the most dangerous pathogens in shrimp aquaculture. However, the molecular mechanism by which WSSV interacts with shrimp is still not very clear. In the present study, bioinformatic approaches were used to predict interactions between proteins from WSSV and shrimp. The genome data of WSSV (NC_003225.1) and the constructed transcriptome data of F. chinensis were used to screen potentially interacting proteins by searching in protein interaction databases, including STRING, Reactome, and DIP. Forty-four pairs of proteins were suggested to have interactions between WSSV and the shrimp. Gene ontology analysis revealed that 6 pairs of these interacting proteins were classified into “extracellular region” or “receptor complex” GO-terms. KEGG pathway analysis showed that they were involved in the “ECM-receptor interaction pathway.” In the 6 pairs of interacting proteins, an envelope protein called “collagen-like protein” (WSSV-CLP) encoded by an early virus gene “wsv001” in WSSV interacted with 6 deduced proteins from the shrimp, including three integrin alpha (ITGA), two integrin beta (ITGB), and one syndecan (SDC). Sequence analysis on WSSV-CLP, ITGA, ITGB, and SDC revealed that they possessed the sequence features for protein-protein interactions. This study might provide new insights into the interaction mechanisms between WSSV and shrimp. Zheng Sun, Shihao Li, Fuhua Li, and Jianhai Xiang Copyright © 2014 Zheng Sun et al. All rights reserved. A Priori Knowledge and Probability Density Based Segmentation Method for Medical CT Image Sequences Mon, 19 May 2014 06:00:45 +0000 http://www.hindawi.com/journals/bmri/2014/769751/ This paper briefly introduces a novel segmentation strategy for CT image sequences. 
As first step of our strategy, we extract a priori intensity statistical information from object region which is manually segmented by radiologists. Then we define a search scope for object and calculate probability density for each pixel in the scope using a voting mechanism. Moreover, we generate an optimal initial level set contour based on a priori shape of object of previous slice. Finally the modified distance regularity level set method utilizes boundaries feature and probability density to conform final object. The main contributions of this paper are as follows: a priori knowledge is effectively used to guide the determination of objects and a modified distance regularization level set method can accurately extract actual contour of object in a short time. The proposed method is compared to other seven state-of-the-art medical image segmentation methods on abdominal CT image sequences datasets. The evaluated results demonstrate our method performs better and has the potential for segmentation in CT image sequences. Huiyan Jiang, Hanqing Tan, and Benqiang Yang Copyright © 2014 Huiyan Jiang et al. All rights reserved. High-Dimensional Additive Hazards Regression for Oral Squamous Cell Carcinoma Using Microarray Data: A Comparative Study Mon, 19 May 2014 05:42:13 +0000 http://www.hindawi.com/journals/bmri/2014/393280/ Microarray technology results in high-dimensional and low-sample size data sets. Therefore, fitting sparse models is substantial because only a small number of influential genes can reliably be identified. A number of variable selection approaches have been proposed for high-dimensional time-to-event data based on Cox proportional hazards where censoring is present. 
The present study applied three sparse variable selection techniques (Lasso, smoothly clipped absolute deviation, and the smooth integration of counting and absolute deviation) to gene expression survival time data using the additive risk model, which is adopted when the absolute effects of multiple predictors on the hazard function are of interest. The performance of the techniques was evaluated by time dependent ROC curve and bootstrap .632+ prediction error curves. The genes selected by all methods were highly significant. The Lasso showed the maximum median area under the ROC curve over time (0.95) and smoothly clipped absolute deviation showed the lowest prediction error (0.105). It was observed that the genes selected by all methods improved the prediction of the purely clinical model, indicating the valuable information contained in the microarray features. It was therefore concluded that the approaches used can satisfactorily predict survival based on selected gene expression measurements. Omid Hamidi, Lily Tapak, Aarefeh Jafarzadeh Kohneloo, and Majid Sadeghifar Copyright © 2014 Omid Hamidi et al. All rights reserved. Identification of Influenza A/H7N9 Virus Infection-Related Human Genes Based on Shortest Paths in a Virus-Human Protein Interaction Network Sun, 18 May 2014 13:12:13 +0000 http://www.hindawi.com/journals/bmri/2014/239462/ The recently emerging Influenza A/H7N9 virus is reported to be able to infect humans and cause mortality. However, viral and host factors associated with the infection are poorly understood. It is suggested by the “guilt by association” rule that interacting proteins share the same or similar functions and hence may be involved in the same pathway. In this study, we developed a computational method to identify Influenza A/H7N9 virus infection-related human genes based on this rule from the shortest paths in a virus-human protein interaction network. 
Finally, we screened out the most significant 20 human genes, which could be the potential infection related genes, providing guidelines for further experimental validation. Analysis of the 20 genes showed that they were enriched in protein binding, saccharide or polysaccharide metabolism related pathways and oxidative phosphorylation pathways. We also compared the results with those from human rhinovirus (HRV) and respiratory syncytial virus (RSV) by the same method. It was indicated that saccharide or polysaccharide metabolism related pathways might be especially associated with the H7N9 infection. These results could shed some light on the understanding of the virus infection mechanism, providing basis for future experimental biology studies and for the development of effective strategies for H7N9 clinical therapies. Ning Zhang, Min Jiang, Tao Huang, and Yu-Dong Cai Copyright © 2014 Ning Zhang et al. All rights reserved. Identifying Dynamic Protein Complexes Based on Gene Expression Profiles and PPI Networks Sun, 18 May 2014 06:33:00 +0000 http://www.hindawi.com/journals/bmri/2014/375262/ Identification of protein complexes from protein-protein interaction networks has become a key problem for understanding cellular life in postgenomic era. Many computational methods have been proposed for identifying protein complexes. Up to now, the existing computational methods are mostly applied on static PPI networks. However, proteins and their interactions are dynamic in reality. Identifying dynamic protein complexes is more meaningful and challenging. In this paper, a novel algorithm, named DPC, is proposed to identify dynamic protein complexes by integrating PPI data and gene expression profiles. According to Core-Attachment assumption, these proteins which are always active in the molecular cycle are regarded as core proteins. The protein-complex cores are identified from these always active proteins by detecting dense subgraphs. 
Final protein complexes are extended from the protein-complex cores by adding attachments based on a topological character of “closeness” and dynamic meaning. The protein complexes produced by our algorithm DPC contain two parts: static core expressed in all the molecular cycle and dynamic attachments short-lived. The proposed algorithm DPC was applied on the data of Saccharomyces cerevisiae and the experimental results show that DPC outperforms CMC, MCL, SPICi, HC-PIN, COACH, and Core-Attachment based on the validation of matching with known complexes and hF-measures. Min Li, Weijie Chen, Jianxin Wang, Fang-Xiang Wu, and Yi Pan Copyright © 2014 Min Li et al. All rights reserved. A Network Biology Approach to Discover the Molecular Biomarker Associated with Hepatocellular Carcinoma Wed, 14 May 2014 09:12:48 +0000 http://www.hindawi.com/journals/bmri/2014/278956/ In recent years, high throughput technologies such as microarray platform have provided a new avenue for hepatocellular carcinoma (HCC) investigation. Traditionally, gene sets enrichment analysis of survival related genes is commonly used to reveal the underlying functional mechanisms. However, this approach usually produces too many candidate genes and cannot discover detailed signaling transduction cascades, which greatly limits their clinical application such as biomarker development. In this study, we have proposed a network biology approach to discover novel biomarkers from multidimensional omics data. This approach effectively combines clinical survival data with topological characteristics of human protein interaction networks and patients expression profiling data. It can produce novel network based biomarkers together with biological understanding of molecular mechanism. We have analyzed eighty HCC expression profiling arrays and identified that extracellular matrix and programmed cell death are the main themes related to HCC progression. 
Compared with traditional enrichment analysis, this approach can provide concrete and testable hypothesis on functional mechanism. Furthermore, the identified subnetworks can potentially be used as suitable targets for therapeutic intervention in HCC. Liwei Zhuang, Yun Wu, Jiwu Han, Xiaohua Ling, Liguo Wang, Chengyan Zhu, and Yili Fu Copyright © 2014 Liwei Zhuang et al. All rights reserved. Breast Cancer Prognosis Risk Estimation Using Integrated Gene Expression and Clinical Data Wed, 14 May 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/459203/ Background. Novel prognostic markers are needed so newly diagnosed breast cancer patients do not undergo any unnecessary therapy. Various microarray gene expression datasets based studies have generated gene signatures to predict the prognosis outcomes, while ignoring the large amount of information contained in established clinical markers. Nevertheless, small sample sizes in individual microarray datasets remain a bottleneck in generating robust gene signatures that show limited predictive power. The aim of this study is to achieve high classification accuracy for the good prognosis group and then achieve high classification accuracy for the poor prognosis group. Methods. We propose a novel algorithm called the IPRE (integrated prognosis risk estimation) algorithm. We used integrated microarray datasets from multiple studies to increase the sample sizes (∼2,700 samples). The IPRE algorithm consists of a virtual chromosome for the extraction of the prognostic gene signature that has 79 genes, and a multivariate logistic regression model that incorporates clinical data along with expression data to generate the risk score formula that accurately categorizes breast cancer patients into two prognosis groups. Results. The evaluation on two testing datasets showed that the IPRE algorithm achieved high classification accuracies of 82% and 87%, which was far greater than any existing algorithms. 
Ashish Saini, Jingyu Hou, and Wanlei Zhou Copyright © 2014 Ashish Saini et al. All rights reserved. Local Alignment Tool Based on Hadoop Framework and GPU Architecture Wed, 14 May 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/541490/ With the rapid growth of next generation sequencing technologies, such as Slex, more and more data have been discovered and published. To analyze such huge data the computational performance is an important issue. Recently, many tools, such as SOAP, have been implemented on Hadoop and GPU parallel computing architectures. BLASTP is an important tool, implemented on GPU architectures, for biologists to compare protein sequences. To deal with the big biology data, it is hard to rely on single GPU. Therefore, we implement a distributed BLASTP by combining Hadoop and multi-GPUs. The experimental results present that the proposed method can improve the performance of BLASTP on single GPU, and also it can achieve high availability and fault tolerance. Che-Lun Hung and Guan-Jie Hua Copyright © 2014 Che-Lun Hung and Guan-Jie Hua. All rights reserved. Meta-Analysis of Low Density Lipoprotein Receptor (LDLR) rs2228671 Polymorphism and Coronary Heart Disease Mon, 12 May 2014 14:08:07 +0000 http://www.hindawi.com/journals/bmri/2014/564940/ Low density lipoprotein receptor (LDLR) can regulate cholesterol metabolism by removing the excess low density lipoprotein cholesterol (LDL-C) in blood. Since cholesterol metabolism is often disrupted in coronary heart disease (CHD), LDLR as a candidate gene of CHD has been intensively studied. The goal of our study is to evaluate the overall contribution of LDLR rs2228671 polymorphism to the risk of CHD by combining the genotyping data from multiple case-control studies. Our meta-analysis is involved with 8 case-control studies among 7588 cases and 9711 controls to test the association between LDLR rs2228671 polymorphism and CHD. 
In addition, we performed a case-control study of LDLR rs2228671 polymorphism and the risk of CHD in a Chinese population. Our meta-analysis showed that rs2228671-T allele was significantly associated with a reduced risk of CHD (odds ratio (OR) = 0.83 and 95% confidence interval (95% CI) = 0.75–0.92). However, rs2228671-T allele frequency was rare (1%) and was not associated with CHD in Han Chinese, suggesting an ethnic difference of LDLR rs2228671 polymorphism. Meta-analysis has established rs2228671 as a protective factor of CHD in Europeans. The lack of association in Chinese reflects an ethnic difference of this genetic variant between Chinese and European populations. Huadan Ye, Qianlei Zhao, Yi Huang, Lingyan Wang, Haibo Liu, Chunming Wang, Dongjun Dai, Leiting Xu, Meng Ye, and Shiwei Duan Copyright © 2014 Huadan Ye et al. All rights reserved. Integration of Residue Attributes for Sequence Diversity Characterization of Terpenoid Enzymes Sun, 11 May 2014 13:35:58 +0000 http://www.hindawi.com/journals/bmri/2014/753428/ Progress in the “omics” fields such as genomics, transcriptomics, proteomics, and metabolomics has engendered a need for innovative analytical techniques to derive meaningful information from the ever increasing molecular data. KNApSAcK motorcycle DB is a popular database for enzymes related to secondary metabolic pathways in plants. One of the challenges in analyses of protein sequence data in such repositories is the standard notation of sequences as strings of alphabetical characters. This notation lacks a natural underlying metric that would make the sequences amenable to computation. To address this, we applied a novel integration of selected biochemical and physical attributes of amino acids, derived from the amino acid index and quantified on a numerical scale, to examine the diversity of peptide sequences of terpenoid synthases accumulated in KNApSAcK motorcycle DB. We initially generated a reduced amino acid index table. 
This is a set of biochemical and physical properties obtained by random forest feature selection of important indices from the amino acid index. Principal component analysis was then applied for characterization of enzymes involved in synthesis of terpenoids. The variance explained was increased by incorporation of residue attributes for analyses. Nelson Kibinge, Shun Ikeda, Naoaki Ono, Md. Altaf-Ul-Amin, and Shigehiko Kanaya Copyright © 2014 Nelson Kibinge et al. All rights reserved. Topography Prediction of Helical Transmembrane Proteins by a New Modification of the Sliding Window Method Sun, 11 May 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/921218/ Protein functions are specified by its three-dimensional structure, which is usually obtained by X-ray crystallography. Due to difficulty of handling membrane proteins experimentally to date the structure has only been determined for a very limited part of membrane proteins (<4%). Nevertheless, investigation of structure and functions of membrane proteins is important for medicine and pharmacology and, therefore, is of significant interest. Methods of computer modeling based on the data on the primary protein structure or the symbolic amino acid sequence have become an actual alternative to the experimental method of X-ray crystallography for investigating the structure of membrane proteins. Here we presented the results of the study of 35 transmembrane proteins, mainly GPCRs, using the novel method of cascade averaging of hydrophobicity function within the limits of a sliding window. The proposed method allowed revealing 139 transmembrane domains out of 140 (or 99.3%) identified by other methods. Also 236 transmembrane domain boundary positions out of 280 (or 84%) were predicted correctly by the proposed method with deviation from the predictions made by other methods that does not exceed the detection error of this method. Maria N. Simakova and Nikolai N. Simakov Copyright © 2014 Maria N. 
Simakova and Nikolai N. Simakov. All rights reserved. Network of microRNAs-mRNAs Interactions in Pancreatic Cancer Wed, 07 May 2014 13:18:55 +0000 http://www.hindawi.com/journals/bmri/2014/534821/ Background. MicroRNAs are small RNA molecules that regulate the expression of certain genes through interaction with mRNA targets and are mainly involved in human cancer. This study was conducted to build the network of miRNA-mRNA interactions in pancreatic cancer, the fourth leading cause of cancer death. Methods. 56 miRNAs that were exclusively expressed and 1176 genes that were downregulated or silenced in pancreatic cancer were extracted from previous investigations. MiRNA–mRNA interaction data and related networks were explored using the MAGIA tool and Cytoscape 3 software. Functional annotations of candidate genes in pancreatic cancer were identified by the DAVID annotation tool. Results. This network is made of 217 nodes for mRNA, 15 nodes for miRNA, and 241 edges that show 241 regulations between 15 miRNAs and 217 target genes. The miR-24 was the most significantly powerful miRNA that regulated a series of important genes. ACVR2B, GFRA1, and MTHFR were significant target genes that were downregulated. Conclusion. Although the previously collected data seem to be a treasure trove, no study had analyzed miRNA and mRNA interactions simultaneously. A network of miRNA-mRNA interactions will help to corroborate experimental observations and could be used to refine miRNA target predictions for developing new therapeutic approaches. Elnaz Naderi, Mehdi Mostafaei, Akram Pourshams, and Ashraf Mohamadkhani Copyright © 2014 Elnaz Naderi et al. All rights reserved. Multiple Regression Analysis of mRNA-miRNA Associations in Colorectal Cancer Pathway Wed, 07 May 2014 12:20:42 +0000 http://www.hindawi.com/journals/bmri/2014/676724/ Background. MicroRNA (miRNA) is a short and endogenous RNA molecule that regulates posttranscriptional gene expression. 
It is an important factor for tumorigenesis of colorectal cancer (CRC), and a potential biomarker for diagnosis, prognosis, and therapy of CRC. Our objective is to identify the related miRNAs and their associations with genes frequently involved in CRC microsatellite instability (MSI) and chromosomal instability (CIN) signaling pathways. Results. A regression model was adopted to identify the significantly associated miRNAs targeting a set of candidate genes frequently involved in colorectal cancer MSI and CIN pathways. Multiple linear regression analysis was used to construct the model and find the significant mRNA-miRNA associations. We identified three significantly associated mRNA-miRNA pairs: BCL2 was positively associated with miR-16 and SMAD4 was positively associated with miR-567 in the CRC tissue, while MSH6 was positively associated with miR-142-5p in the normal tissue. As for the whole model, BCL2 and SMAD4 models were not significant, and MSH6 model was significant. The significant associations were different in the normal and the CRC tissues. Conclusion. Our results have laid down a solid foundation in exploration of novel CRC mechanisms, and identification of miRNA roles as oncomirs or tumor suppressor mirs in CRC. Fengfeng Wang, S. C. Cesar Wong, Lawrence W. C. Chan, William C. S. Cho, S. P. Yip, and Benjamin Y. M. Yung Copyright © 2014 Fengfeng Wang et al. All rights reserved. Double-Bottom Chaotic Map Particle Swarm Optimization Based on Chi-Square Test to Determine Gene-Gene Interactions Wed, 07 May 2014 11:02:38 +0000 http://www.hindawi.com/journals/bmri/2014/172049/ Gene-gene interaction studies focus on the investigation of the association between the single nucleotide polymorphisms (SNPs) of genes for disease susceptibility. Statistical methods are widely used to search for a good model of gene-gene interaction for disease analysis, and the previously determined models have successfully explained the effects between SNPs and diseases. 
However, the huge numbers of potential combinations of SNP genotypes limit the use of statistical methods for analysing high-order interaction, and finding an available high-order model of gene-gene interaction remains a challenge. In this study, an improved particle swarm optimization with double-bottom chaotic maps (DBM-PSO) was applied to assist statistical methods in the analysis of associated variations to disease susceptibility. A big data set was simulated using the published genotype frequencies of 26 SNPs amongst eight genes for breast cancer. Results showed that the proposed DBM-PSO successfully determined two- to six-order models of gene-gene interaction for the risk association with breast cancer (odds ratio > 1.0; value ). Analysis results supported that the proposed DBM-PSO can identify good models and provide higher chi-square values than conventional PSO. This study indicates that DBM-PSO is a robust and precise algorithm for determination of gene-gene interaction models for breast cancer. Cheng-Hong Yang, Yu-Da Lin, Li-Yeh Chuang, and Hsueh-Wei Chang Copyright © 2014 Cheng-Hong Yang et al. All rights reserved. Pathway-Driven Discovery of Rare Mutational Impact on Cancer Sun, 04 May 2014 12:48:09 +0000 http://www.hindawi.com/journals/bmri/2014/171892/ Identifying driver mutation is important in understanding disease mechanism and future application of custom tailored therapeutic decision. Functional analysis of mutational impact usually focuses on the gene expression level of the mutated gene itself. However, complex regulatory network may cause differential gene expression among functional neighbors of the mutated gene. 
We suggest a new approach for discovering rare mutations that have real impact in the context of pathway; the philosophy of our method is iteratively combining rare mutations until no more mutations can be added under the condition that the combined mutational event can statistically discriminate pathway level mRNA expression between groups with and without mutational events. Breast cancer patients with somatic mutation and mRNA expression were analyzed by our approach. Our approach is shown to sensitively capture mutations that change pathway level mRNA expression, concurrently discovering important mutations previously reported in breast cancer such as TP53, PIK3CA, and RB1. In addition, out of 15,819 genes considered in breast cancer, our approach identified mutational events of 32 genes showing pathway level mRNA expression differences. TaeJin Ahn and Taesung Park Copyright © 2014 TaeJin Ahn and Taesung Park. All rights reserved. Mining Seasonal Marine Microbial Pattern with Greedy Heuristic Clustering and Symmetrical Nonnegative Matrix Factorization Sun, 27 Apr 2014 09:56:31 +0000 http://www.hindawi.com/journals/bmri/2014/189590/ With the development of high-throughput and low-cost sequencing technology, a large number of marine microbial sequences were generated. The association patterns between marine microbial species and environment factors are hidden in these large amount sequences. Mining these association patterns is beneficial to exploit the marine resources. However, very few marine microbial association patterns are well investigated in this field. The present study reports the development of a novel method called HC-sNMF to detect the marine microbial association patterns. 
The results show that the four seasonal marine microbial association networks have characters of complex networks, the same environmental factor influences different species in the four seasons, and the correlative relationships are stronger between OTUs (taxa) than with environmental factors in the four seasons detecting community. Fei Liu, Shao-Wu Zhang, Ze-Gang Wei, Wei Chen, and Chen Zhou Copyright © 2014 Fei Liu et al. All rights reserved. OWL Reasoning Framework over Big Biological Knowledge Network Sun, 27 Apr 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/272915/ Recently, huge amounts of data are generated in the domain of biology. Embedded with domain knowledge from different disciplines, the isolated biological resources are implicitly connected. Thus it has shaped a big network of versatile biological knowledge. Faced with such massive, disparate, and interlinked biological data, providing an efficient way to model, integrate, and analyze the big biological network becomes a challenge. In this paper, we present a general OWL (web ontology language) reasoning framework to study the implicit relationships among biological entities. A comprehensive biological ontology across traditional Chinese medicine (TCM) and western medicine (WM) is used to create a conceptual model for the biological network. Then corresponding biological data is integrated into a biological knowledge network as the data model. Based on the conceptual model and data model, a scalable OWL reasoning method is utilized to infer the potential associations between biological entities from the biological network. In our experiment, we focus on the association discovery between TCM and WM. The derived associations are quite useful for biologists to promote the development of novel drugs and TCM modernization. The experimental results show that the system achieves high efficiency, accuracy, scalability, and effectivity. 
Huajun Chen, Xi Chen, Peiqin Gu, Zhaohui Wu, and Tong Yu Copyright © 2014 Huajun Chen et al. All rights reserved. Novel Design Strategy for Checkpoint Kinase 2 Inhibitors Using Pharmacophore Modeling, Combinatorial Fusion, and Virtual Screening Wed, 23 Apr 2014 09:23:00 +0000 http://www.hindawi.com/journals/bmri/2014/359494/ Checkpoint kinase 2 (Chk2) has a great effect on DNA-damage and plays an important role in response to DNA double-strand breaks and related lesions. In this study, we will concentrate on Chk2 and the purpose is to find the potential inhibitors by the pharmacophore hypotheses (PhModels), combinatorial fusion, and virtual screening techniques. Applying combinatorial fusion into PhModels and virtual screening techniques is a novel design strategy for drug design. We used combinatorial fusion to analyze the prediction results and then obtained the best correlation coefficient of the testing set () with the value 0.816 by combining the and prediction results. The potential inhibitors were selected from NCI database by screening according to + prediction results and molecular docking with CDOCKER docking program. Finally, the selected compounds have high interaction energy between a ligand and a receptor. Through these approaches, 23 potential inhibitors for Chk2 are retrieved for further study. Chun-Yuan Lin and Yen-Ling Wang Copyright © 2014 Chun-Yuan Lin and Yen-Ling Wang. All rights reserved. Syn-Lethality: An Integrative Knowledge Base of Synthetic Lethality towards Discovery of Selective Anticancer Therapies Tue, 22 Apr 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/196034/ Synthetic lethality (SL) is a novel strategy for anticancer therapies, whereby mutations of two genes will kill a cell but mutation of a single gene will not. Therefore, a cancer-specific mutation combined with a drug-induced mutation, if they have SL interactions, will selectively kill cancer cells. 
While numerous SL interactions have been identified in yeast, only a few have been known in human. There is a pressing need to systematically discover and understand SL interactions specific to human cancer. In this paper, we present Syn-Lethality, the first integrative knowledge base of SL that is dedicated to human cancer. It integrates experimentally discovered and verified human SL gene pairs into a network, associated with annotations of gene function, pathway, and molecular mechanisms. It also includes yeast SL genes from high-throughput screenings which are mapped to orthologous human genes. Such an integrative knowledge base, organized as a relational database with user interface for searching and network visualization, will greatly expedite the discovery of novel anticancer drug targets based on synthetic lethality interactions. The database can be downloaded as a stand-alone Java application. Xue-juan Li, Shital K. Mishra, Min Wu, Fan Zhang, and Jie Zheng Copyright © 2014 Xue-juan Li et al. All rights reserved. Using the Sadakane Compressed Suffix Tree to Solve the All-Pairs Suffix-Prefix Problem Wed, 16 Apr 2014 15:52:01 +0000 http://www.hindawi.com/journals/bmri/2014/745298/ The all-pairs suffix-prefix matching problem is a basic problem in string processing. It has an application in the de novo genome assembly task, which is one of the major bioinformatics problems. Due to the large size of the input data, it is crucial to use fast and space efficient solutions. In this paper, we present a space-economical solution to this problem using the generalized Sadakane compressed suffix tree. Furthermore, we present a parallel algorithm to provide more speed for shared memory computers. Our sequential and parallel algorithms are optimized by exploiting features of the Sadakane compressed index data structure. 
Experimental results show that our solution based on Sadakane’s compressed index consumes significantly less space than those based on noncompressed data structures such as the suffix tree and the enhanced suffix array. They also show that our parallel algorithm is efficient and scales well with an increasing number of processors. Maan Haj Rachid, Qutaibah Malluhi, and Mohamed Abouelhoda Copyright © 2014 Maan Haj Rachid et al. All rights reserved. A Knowledge-Driven Approach to Extract Disease-Related Biomarkers from the Literature Wed, 16 Apr 2014 15:51:54 +0000 http://www.hindawi.com/journals/bmri/2014/253128/ The biomedical literature represents a rich source of biomarker information. However, both the size of literature databases and their lack of standardization hamper the automatic exploitation of the information contained in these resources. Text mining approaches have proven to be useful for the exploitation of information contained in scientific publications. Here, we show that a knowledge-driven text mining approach can exploit a large literature database to extract a dataset of biomarkers related to diseases covering all therapeutic areas. Our methodology takes advantage of the annotation of MEDLINE publications pertaining to biomarkers with MeSH terms, narrowing the search to specific publications and, therefore, minimizing the false positive ratio. It is based on a dictionary-based named entity recognition system and a relation extraction module. The application of this methodology resulted in the identification of 131,012 disease-biomarker associations between 2,803 genes and 2,751 diseases and represents a valuable knowledge base for those interested in disease-related biomarkers. Additionally, we present a bibliometric analysis of the journals reporting biomarker-related information during the last 40 years. À. Bravo, M. Cases, N. Queralt-Rosinach, F. Sanz, and L. I. Furlong Copyright © 2014 À. Bravo et al. All rights reserved. 
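The dictionary-based named entity recognition step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not describe the relation extraction module, so sentence-level co-occurrence is swapped in here purely as a stand-in; the dictionaries, identifiers, and function names are hypothetical and are not taken from the authors' system.

```python
import re

def find_mentions(sentence, dictionary):
    """Return the IDs of all dictionary terms that occur as whole words in the sentence."""
    found = set()
    for term, concept_id in dictionary.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.add(concept_id)
    return found

def cooccurrence_pairs(text, gene_dict, disease_dict):
    """Pair every gene with every disease mentioned in the same sentence.

    Assumption: co-occurrence is a crude substitute for a real relation
    extraction module and will overgenerate associations.
    """
    pairs = set()
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        genes = find_mentions(sentence, gene_dict)
        diseases = find_mentions(sentence, disease_dict)
        pairs.update((g, d) for g in genes for d in diseases)
    return pairs

# Hypothetical toy dictionaries mapping surface terms to concept IDs.
gene_dict = {"BRCA1": "GENE:BRCA1", "TP53": "GENE:TP53"}
disease_dict = {"breast cancer": "DIS:breast_cancer"}
text = "BRCA1 is a biomarker for breast cancer. TP53 was also measured."
pairs = cooccurrence_pairs(text, gene_dict, disease_dict)
```

In this toy input only the first sentence mentions both a gene and a disease, so TP53 is not paired with breast cancer. A production system would also need synonym handling and disambiguation, which this sketch omits.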
Integrated Analysis of Gene Network in Childhood Leukemia from Microarray and Pathway Databases Tue, 15 Apr 2014 14:07:22 +0000 http://www.hindawi.com/journals/bmri/2014/278748/ Glucocorticoids (GCs) have been used as therapeutic agents for children with acute lymphoblastic leukaemia (ALL) for over 50 years. However, much remains to be understood about the molecular mechanism of GCs actions in ALL subtypes. In this study, we delineate differential responses of ALL subtypes, B- and T-ALL, to GCs treatment at systems level by identifying the differences among biological processes, molecular pathways, and interaction networks that emerge from the action of GCs through the use of a selected number of available bioinformatics methods and tools. We provide biological insight into GC-regulated genes, their related functions, and their networks specific to the ALL subtypes. We show that differentially expressed GC-regulated genes participate in distinct underlying biological processes affected by GCs in B-ALL and T-ALL with little to no overlap. These findings provide the opportunity towards identifying new therapeutic targets. Amphun Chaiboonchoe, Sandhya Samarasinghe, Don Kulasiri, and Kourosh Salehi-Ashtiani Copyright © 2014 Amphun Chaiboonchoe et al. All rights reserved. A Novel Algorithm for Detecting Protein Complexes with the Breadth First Search Thu, 10 Apr 2014 11:03:26 +0000 http://www.hindawi.com/journals/bmri/2014/354539/ Most biological processes are carried out by protein complexes. A substantial number of false positives of the protein-protein interaction (PPI) data can compromise the utility of the datasets for complexes reconstruction. In order to reduce the impact of such discrepancies, a number of data integration and affinity scoring schemes have been devised. The methods encode the reliabilities (confidence) of physical interactions between pairs of proteins. 
The challenge now is to identify novel and meaningful protein complexes from the weighted PPI network. To address this problem, a novel protein complex mining algorithm ClusterBFS (Cluster with Breadth-First Search) is proposed. Based on the weighted density, ClusterBFS detects protein complexes of the weighted network by the breadth first search algorithm, which originates from a given seed protein used as starting-point. The experimental results show that ClusterBFS performs significantly better than the other computational approaches in terms of the identification of protein complexes. Xiwei Tang, Jianxin Wang, Min Li, Yiming He, and Yi Pan Copyright © 2014 Xiwei Tang et al. All rights reserved. Gene Expression Correlation for Cancer Diagnosis: A Pilot Study Wed, 09 Apr 2014 14:12:08 +0000 http://www.hindawi.com/journals/bmri/2014/253804/ Poor prognosis for late-stage, high-grade, and recurrent cancers has been motivating cancer researchers to search for more efficient biomarkers to identify the onset of cancer. Recent advances in constructing and dynamically analyzing biomolecular networks for different types of cancer have provided a promising novel strategy to detect tumorigenesis and metastasis. The observation of different biomolecular networks associated with normal and cancerous states led us to hypothesize that correlations for gene expressions could serve as valid indicators of early cancer development. In this pilot study, we tested our hypothesis by examining whether the mRNA expressions of three randomly selected cancer-related genes PIK3C3, PIM3, and PTEN were correlated during cancer progression and the correlation coefficients could be used for cancer diagnosis. 
Strong correlations were observed between PIK3C3 and PIM3 in breast cancer, between PIK3C3 and PTEN in breast and ovarian cancers, and between PIM3 and PTEN in breast, kidney, liver, and thyroid cancers during disease progression, implying that the correlations for cancer network gene expressions could serve as a supplement to current clinical biomarkers, such as cancer antigens, for early cancer diagnosis. Binbing Ling, Lifeng Chen, Qiang Liu, and Jian Yang Copyright © 2014 Binbing Ling et al. All rights reserved. Computational Systems Biology Methods in Molecular Biology, Chemistry Biology, Molecular Biomedicine, and Biopharmacy Wed, 09 Apr 2014 13:17:43 +0000 http://www.hindawi.com/journals/bmri/2014/746814/ Yudong Cai, Julio Vera González, Zengrong Liu, and Tao Huang Copyright © 2014 Yudong Cai et al. All rights reserved. Tools and Databases of the KOMICS Web Portal for Preprocessing, Mining, and Dissemination of Metabolomics Data Wed, 09 Apr 2014 12:35:01 +0000 http://www.hindawi.com/journals/bmri/2014/194812/ A metabolome—the collection of comprehensive quantitative data on metabolites in an organism—has been increasingly utilized for applications such as data-intensive systems biology, disease diagnostics, biomarker discovery, and assessment of food quality. A considerable number of tools and databases have been developed to date for the analysis of data generated by various combinations of chromatography and mass spectrometry. We report here a web portal named KOMICS (The Kazusa Metabolomics Portal), where the tools and databases that we developed are available for free to academic users. KOMICS includes the tools and databases for preprocessing, mining, visualization, and publication of metabolomics data. Improvements in the annotation of unknown metabolites and dissemination of comprehensive metabolomic data are the primary aims behind the development of this portal. 
For this purpose, PowerGet and FragmentAlign include a manual curation function for the results of metabolite feature alignments. A metadata-specific wiki-based database, Metabolonote, functions as a hub of web resources related to the submitters' work. This feature is expected to increase citation of the submitters' work, thereby promoting data publication. As an example of the practical use of KOMICS, a workflow for a study on Jatropha curcas is presented. The tools and databases available at KOMICS should contribute to enhanced production, interpretation, and utilization of metabolomic Big Data. Nozomu Sakurai, Takeshi Ara, Mitsuo Enomoto, Takeshi Motegi, Yoshihiko Morishita, Atsushi Kurabayashi, Yoko Iijima, Yoshiyuki Ogata, Daisuke Nakajima, Hideyuki Suzuki, and Daisuke Shibata Copyright © 2014 Nozomu Sakurai et al. All rights reserved. An Infrastructure to Mine Molecular Descriptors for Ligand Selection on Virtual Screening Wed, 09 Apr 2014 11:34:08 +0000 http://www.hindawi.com/journals/bmri/2014/325959/ The receptor-ligand interaction evaluation is one important step in rational drug design. The databases that provide the structures of the ligands are growing on a daily basis. This makes it impossible to test all the ligands for a target receptor. Hence, a ligand selection before testing the ligands is needed. One possible approach is to evaluate a set of molecular descriptors. With the aim of describing the characteristics of promising compounds for a specific receptor we introduce a data warehouse-based infrastructure to mine molecular descriptors for virtual screening (VS). We performed experiments that consider as target the receptor HIV-1 protease and different compounds for this protein. A set of 9 molecular descriptors are taken as the predictive attributes and the free energy of binding is taken as a target attribute. By applying the J48 algorithm over the data we obtain decision tree models that achieved up to 84% of accuracy. 
The models indicate which molecular descriptors, and which of their values, are associated with favorable free energy of binding (FEB) results. Using their rules we performed ligand selection on the ZINC database. Our results show a substantial reduction in the number of ligands selected for VS experiments; for instance, the best selection model picked only 0.21% of the total number of drug-like ligands. Vinicius Rosa Seus, Giovanni Xavier Perazzo, Ana T. Winck, Adriano V. Werhli, and Karina S. Machado Copyright © 2014 Vinicius Rosa Seus et al. All rights reserved. An Intelligent Clinical Decision Support System for Patient-Specific Predictions to Improve Cervical Intraepithelial Neoplasia Detection Wed, 09 Apr 2014 08:12:50 +0000 http://www.hindawi.com/journals/bmri/2014/341483/ Nowadays, there are molecular biology techniques providing information related to cervical cancer and its cause, the human papillomavirus (HPV), including DNA microarrays identifying HPV subtypes, mRNA techniques such as nucleic acid based amplification or flow cytometry identifying E6/E7 oncogenes, and immunocytochemistry techniques such as overexpression of p16. Each of these techniques has its own performance, limitations, and advantages; thus, a combinatorial approach via computational intelligence methods could exploit the benefits of each method and produce more accurate results. In this article we propose a clinical decision support system (CDSS), composed of artificial neural networks, that intelligently combines the results of classic and ancillary techniques to improve diagnostic accuracy. We evaluated this method on 740 cases with complete series of cytological assessment, molecular tests, and colposcopy examination. The CDSS demonstrated high sensitivity (89.4%), high specificity (97.1%), high positive predictive value (89.4%), and high negative predictive value (97.1%), for detecting cervical intraepithelial neoplasia grade 2 or worse (CIN2+). 
In comparison to the tests involved in this study and their combinations, the CDSS produced the most balanced results in terms of sensitivity, specificity, PPV, and NPV. The proposed system may reduce the referral rate for colposcopy and guide personalised management and therapeutic interventions. Panagiotis Bountris, Maria Haritou, Abraham Pouliakis, Niki Margari, Maria Kyrgiou, Aris Spathis, Asimakis Pappas, Ioannis Panayiotides, Evangelos A. Paraskevaidis, Petros Karakitsos, and Dimitrios-Dionyssios Koutsouris Copyright © 2014 Panagiotis Bountris et al. All rights reserved. Supervised Clustering Based on DPClusO: Prediction of Plant-Disease Relations Using Jamu Formulas of KNApSAcK Database Mon, 07 Apr 2014 14:04:55 +0000 http://www.hindawi.com/journals/bmri/2014/831751/ Indonesia has the largest medicinal plant species in the world and these plants are used as Jamu medicines. Jamu medicines are popular traditional medicines from Indonesia and we need to systemize the formulation of Jamu and develop basic scientific principles of Jamu to meet the requirement of Indonesian Healthcare System. We propose a new approach to predict the relation between plant and disease using network analysis and supervised clustering. At the preliminary step, we assigned 3138 Jamu formulas to 116 diseases of International Classification of Diseases (ver. 10) which belong to 18 classes of disease from National Center for Biotechnology Information. The correlation measures between Jamu pairs were determined based on their ingredient similarity. Networks are constructed and analyzed by selecting highly correlated Jamu pairs. Clusters were then generated by using the network clustering algorithm DPClusO. By using matching score of a cluster, the dominant disease and high frequency plant associated to the cluster are determined. 
The plant to disease relations predicted by our method were evaluated in the context of previously published results and were found to produce around 90% successful predictions. Sony Hartono Wijaya, Husnawati Husnawati, Farit Mochamad Afendi, Irmanida Batubara, Latifah K. Darusman, Md. Altaf-Ul-Amin, Tetsuo Sato, Naoaki Ono, Tadao Sugiura, and Shigehiko Kanaya Copyright © 2014 Sony Hartono Wijaya et al. All rights reserved. Combining Haar Wavelet and Karhunen Loeve Transforms for Medical Images Watermarking Mon, 07 Apr 2014 08:14:41 +0000 http://www.hindawi.com/journals/bmri/2014/313078/ This paper presents a novel watermarking method, applied to the medical imaging domain, used to embed the patient’s data into the corresponding image or set of images used for the diagnosis. The main objective behind the proposed technique is to perform the watermarking of the medical images in such a way that the three main attributes of the hidden information (i.e., imperceptibility, robustness, and integration rate) can be jointly ameliorated as much as possible. These attributes determine the effectiveness of the watermark, resistance to external attacks, and increase the integration rate. In order to improve the robustness, a combination of the characteristics of Discrete Wavelet and Karhunen Loeve Transforms is proposed. The Karhunen Loeve Transform is applied on the subblocks (sized ) of the different wavelet coefficients (in the HL2, LH2, and HH2 subbands). In this manner, the watermark will be adapted according to the energy values of each of the Karhunen Loeve components, with the aim of ensuring a better watermark extraction under various types of attacks. For the correct identification of inserted data, the use of an Errors Correcting Code (ECC) mechanism is required for the check and, if possible, the correction of errors introduced into the inserted data. 
Concerning the enhancement of the imperceptibility factor, the main goal is to determine the optimal value of the visibility factor, which depends on several parameters of the DWT and the KLT transforms. As a first step, a Fuzzy Inference System (FIS) has been set up and then applied to determine an initial visibility factor value. Several features extracted from the co-occurrence matrix are used as input to the FIS to determine an initial visibility factor for each block; these values are subsequently reweighted as a function of the eigenvalues extracted from each subblock. Regarding the integration rate, previous works insert one bit per coefficient. In our proposal, the data to be hidden are integrated at 3 bits per coefficient, which increases the integration rate by a factor of 3. Mohamed Ali Hajjaji, El-Bay Bourennane, Abdessalem Ben Abdelali, and Abdellatif Mtibaa Copyright © 2014 Mohamed Ali Hajjaji et al. All rights reserved. A Novel Feature Selection Strategy for Enhanced Biomedical Event Extraction Using the Turku System Sun, 06 Apr 2014 07:51:12 +0000 http://www.hindawi.com/journals/bmri/2014/205239/ Feature selection is of paramount importance for text-mining classifiers with high-dimensional features. The Turku Event Extraction System (TEES) is the best performing tool in the GENIA BioNLP 2009/2011 shared tasks and relies heavily on high-dimensional features. This paper describes research which, based on an implementation of an accumulated effect evaluation (AEE) algorithm applying the greedy search strategy, analyses the contribution of every single feature class in TEES with a view to identifying important features and modifying the feature set accordingly. With an updated feature set, a new system is acquired with enhanced performance which achieves an increased F-score of 53.27% up from 51.21% for Task 1 under strict evaluation criteria and 57.24% according to the approximate span and recursive criterion. 
Jingbo Xia, Alex Chengyu Fang, and Xing Zhang Copyright © 2014 Jingbo Xia et al. All rights reserved. A Novel Bioinformatics Method for Efficient Knowledge Discovery by BLSOM from Big Genomic Sequence Data Thu, 03 Apr 2014 13:31:48 +0000 http://www.hindawi.com/journals/bmri/2014/765648/ With remarkable increase of genomic sequence data of a wide range of species, novel tools are needed for comprehensive analyses of the big sequence data. Self-Organizing Map (SOM) is an effective tool for clustering and visualizing high-dimensional data such as oligonucleotide composition on one map. By modifying the conventional SOM, we have previously developed Batch-Learning SOM (BLSOM), which allows classification of sequence fragments according to species, solely depending on the oligonucleotide composition. In the present study, we introduce the oligonucleotide BLSOM used for characterization of vertebrate genome sequences. We first analyzed pentanucleotide compositions in 100 kb sequences derived from a wide range of vertebrate genomes and then the compositions in the human and mouse genomes in order to investigate an efficient method for detecting differences between the closely related genomes. BLSOM can recognize the species-specific key combination of oligonucleotide frequencies in each genome, which is called a “genome signature,” and the specific regions specifically enriched in transcription-factor-binding sequences. Because the classification and visualization power is very high, BLSOM is an efficient powerful tool for extracting a wide range of information from massive amounts of genomic sequences (i.e., big sequence data). Yu Bai, Yuki Iwasaki, Shigehiko Kanaya, Yue Zhao, and Toshimichi Ikemura Copyright © 2014 Yu Bai et al. All rights reserved. 
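The oligonucleotide composition that BLSOM takes as input in the abstract above can be sketched as a vector of k-mer frequencies per sequence fragment (k = 5 for pentanucleotides). This is a minimal illustration under stated assumptions: the function name is hypothetical, windows containing ambiguous bases are simply skipped, and any further preprocessing the authors apply before map training is not reproduced here.

```python
from collections import Counter
from itertools import product

def kmer_frequencies(seq, k=5):
    """Return the relative frequency of every possible k-mer over A, C, G, T.

    Windows containing any other character (e.g. N) are skipped.
    The result always has 4**k entries, so fragments are comparable.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if set(window) <= set("ACGT"):
            counts[window] += 1
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}

# Toy example with dinucleotides; a real analysis would use k=5 on 100 kb fragments.
freqs = kmer_frequencies("ACGTACGTAC", k=2)
```

Each fragment then becomes one fixed-length vector (1,024 dimensions for pentanucleotides), which is the kind of high-dimensional input a self-organizing map clusters and visualizes.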
msiDBN: A Method of Identifying Critical Proteins in Dynamic PPI Networks Wed, 02 Apr 2014 12:56:21 +0000 http://www.hindawi.com/journals/bmri/2014/138410/ The dynamics of protein-protein interactions (PPIs) reveal the recondite principles of biological processes inside a cell. A wealth of studies has shown that just a small group of proteins, rather than the majority, play more essential roles at crucial points of biological processes. The present work focuses on identifying these critical proteins exhibiting dramatic structural changes in dynamic PPI networks. First, a comprehensive way of modeling the dynamic PPIs is presented which simultaneously analyzes the activity of proteins and assembles the dynamic coregulation correlation between proteins at each time point. Second, a novel method is proposed, named msiDBN, which models a common representation of multiple PPI networks using a deep belief network framework and analyzes the reconstruction errors and the variabilities across the time courses in the biological process. Experiments were performed on yeast cell cycle data. We evaluated our network construction method by comparing the functional representations of the derived networks with those of two traditional construction methods. The ranking results of critical proteins in msiDBN were compared with the results from the baseline methods. The comparison showed that msiDBN had a better reconstruction rate and identified more proteins of critical value to the yeast cell cycle process. Yuan Zhang, Nan Du, Kang Li, Jinchao Feng, Kebin Jia, and Aidong Zhang Copyright © 2014 Yuan Zhang et al. All rights reserved. Applied Graph-Mining Algorithms to Study Biomolecular Interaction Networks Wed, 02 Apr 2014 11:57:36 +0000 http://www.hindawi.com/journals/bmri/2014/439476/ Protein-protein interaction (PPI) networks carry vital information on the organization of molecular interactions in cellular systems. 
The identification of functionally relevant modules in PPI networks is one of the most important applications of biological network analysis. Computational analysis is becoming an indispensable tool to understand large-scale biomolecular interaction networks. Several types of computational methods have been developed and employed for the analysis of PPI networks. Of these computational methods, graph comparison and module detection are the two most commonly used strategies. This review summarizes current literature on graph kernel and graph alignment methods for graph comparison strategies, as well as module detection approaches including seed-and-extend, hierarchical clustering, optimization-based, probabilistic, and frequent subgraph methods. Herein, we provide a comprehensive review of the major algorithms employed under each theme, including our recently published frequent subgraph method, for detecting functional modules commonly shared across multiple cancer PPI networks. Ru Shen and Chittibabu Guda Copyright © 2014 Ru Shen and Chittibabu Guda. All rights reserved. An Unsupervised Approach to Predict Functional Relations between Genes Based on Expression Data Mon, 31 Mar 2014 07:16:16 +0000 http://www.hindawi.com/journals/bmri/2014/154594/ This work presents a novel approach to predict functional relations between genes using gene expression data. Genes may have various types of relations between them, for example, regulatory relations, or they may be concerned with the same protein complex or metabolic/signaling pathways and obviously gene expression data should contain some clues to such relations. The present approach first digitizes the log-ratio type gene expression data of S. cerevisiae to a matrix consisting of 1, 0, and −1 indicating highly expressed, no major change, and highly suppressed conditions for genes, respectively. For each gene pair, a probability density mass function table is constructed indicating nine joint probabilities. 
Then gene pairs were selected based on the linear and probabilistic relation between their profiles, indicated by the sum of probability density masses at selected points. The selected gene pairs share many Gene Ontology terms. Furthermore, a network is constructed by selecting a large number of gene pairs based on FDR analysis, and the clustering of the network generates many modules rich in genes of similar function. Also, the promoters of the gene sets in many modules are rich in binding sites of known transcription factors, indicating the effectiveness of the proposed approach in predicting regulatory relations. Md. Altaf-Ul-Amin, Tetsuo Katsuragi, Tetsuo Sato, Naoaki Ono, and Shigehiko Kanaya Copyright © 2014 Md. Altaf-Ul-Amin et al. All rights reserved. Protein Sequence Classification with Improved Extreme Learning Machine Algorithms Sun, 30 Mar 2014 09:04:21 +0000 http://www.hindawi.com/journals/bmri/2014/103054/ Precisely classifying a protein sequence from a large protein sequence database plays an important role in developing competitive pharmacological products. Conventional methods, which compare the unseen sequence with all the identified protein sequences and return the category index of the protein with the highest similarity score, are usually time-consuming. Therefore, it is urgent and necessary to build an efficient protein sequence classification system. In this paper, we study the performance of protein sequence classification using single hidden layer feedforward neural networks (SLFNs). The recent efficient extreme learning machine (ELM) and its variants are utilized as the training algorithms. The optimal pruned ELM (OP-ELM) is employed for protein sequence classification for the first time in this paper. To further enhance the performance, an ensemble based SLFNs structure is constructed where multiple SLFNs with the same number of hidden nodes and the same activation function are used as ensembles. For each ensemble, the same training algorithm is adopted. 
The final category index is derived using the majority voting method. Two approaches, namely, the basic ELM and the OP-ELM, are adopted for the ensemble based SLFNs. The performance is analyzed and compared with several existing methods using datasets obtained from the Protein Information Resource center. The experimental results show the superiority of the proposed algorithms. Jiuwen Cao and Lianglin Xiong Copyright © 2014 Jiuwen Cao and Lianglin Xiong. All rights reserved. Association between ε2/ε3/ε4, Promoter Polymorphism (−491A/T, −427T/C, and −219T/G) at the Apolipoprotein E Gene, and Mental Retardation in Children from an Iodine Deficiency Area, China Tue, 25 Mar 2014 12:55:09 +0000 http://www.hindawi.com/journals/bmri/2014/236702/ Background. Several common single-nucleotide polymorphisms (SNPs) at apolipoprotein E (ApoE) have been linked with late onset sporadic Alzheimer’s disease and declining normative cognitive ability in elderly people, but their relationship with cognition in children is unclear. Results. We studied the −491A/T, −427T/C, and −219T/G promoter polymorphisms and ε2/ε3/ε4 at ApoE among children with mental retardation (MR, ), borderline MR (), and controls () from an iodine deficiency area in China. The allelic and genotypic distributions of the individual loci did not differ significantly among the three groups by the Mantel-Haenszel test (). However, frequencies of the haplotype /// were distributed as MR > borderline MR > controls (P uncorrected = 0.004), indicating that the presence of this haplotype may increase the risk of disease. Conclusions. In this large population-based study in children, we did not find any significant association between any single locus of the four common ApoE polymorphisms (−491A/T, −427T/C, −219T/G, and ε2/ε3/ε4) and MR or borderline MR. However, we found that the presence of the ATT haplotype was associated with an increased risk of MR and borderline MR. 
Our present work may help enlarge our knowledge of the cognitive role of ApoE across the lifespan and the mechanisms of human cognition. Jun Li, Fuchang Zhang, Yunliang Wang, Yan Wang, Wei Qin, Qinghe Xing, Xueqing Qian, Tingwei Guo, Xiaocai Gao, Lin He, and Jianjun Gao Copyright © 2014 Jun Li et al. All rights reserved. Survey of Network-Based Approaches to Research of Cardiovascular Diseases Thu, 20 Mar 2014 08:21:55 +0000 http://www.hindawi.com/journals/bmri/2014/527029/ Cardiovascular diseases (CVDs) are the leading health problem worldwide. Investigating causes and mechanisms of CVDs calls for an integrative approach that takes into account their complex etiology. Biological networks generated from available data on biomolecular interactions are an excellent platform for understanding the interconnectedness of all processes within a living cell, including processes that underlie diseases. Consequently, the topology of biological networks has successfully been used for identifying genes, pathways, and modules that govern molecular actions underlying various complex diseases. Here, we review approaches that explore and use relationships between topological properties of biological networks and mechanisms underlying CVDs. Anida Sarajlić and Nataša Pržulj Copyright © 2014 Anida Sarajlić and Nataša Pržulj. All rights reserved. New Strategies for Evaluation and Analysis of SELEX Experiments Wed, 19 Mar 2014 13:58:14 +0000 http://www.hindawi.com/journals/bmri/2014/849743/ Aptamers are an interesting alternative to antibodies in pharmaceutics and biosensorics, because they are able to bind to a multitude of possible target molecules with high affinity. Therefore the process of finding such aptamers, which is commonly a SELEX screening process, becomes crucial. The standard SELEX procedure validates selected aptamers via binding experiments, which does not yield a detailed picture of the aptamer enrichment during the screening. 
To analyze the enrichment within the SELEX library in more depth, we used sequence information gathered by next-generation sequencing techniques in addition to the standard SELEX procedure. Because sequence motifs are one way to describe enrichment, it becomes necessary to find the recurring sequence motifs that correspond to substructures within the aptamers fitted to specific binding sites of the target. In this paper a motif search algorithm is presented, which helps to describe the aptamer enrichment in more detail. The extensive characterization of target and binding aptamers may later reveal a functional connection between these molecules, which can be modeled and used to optimize future SELEX runs when target-specific starting libraries are generated. Rico Beier, Elke Boschke, and Dirk Labudde Copyright © 2014 Rico Beier et al. All rights reserved. Essential Functional Modules for Pathogenic and Defensive Mechanisms in Candida albicans Infections Tue, 18 Mar 2014 12:20:46 +0000 http://www.hindawi.com/journals/bmri/2014/136130/ The clinical and biological significance of the study of the fungal pathogen Candida albicans (C. albicans) has markedly increased. However, the explicit pathogenic and invasive mechanisms of such host-pathogen interactions have not yet been fully elucidated. Therefore, the essential functional modules involved in C. albicans-zebrafish interactions were investigated in this study. Adopting a systems biology approach, the early-stage and late-stage protein-protein interaction (PPI) networks for both C. albicans and zebrafish were constructed. By comparing PPI networks at the early and late stages of the infection process, several critical functional modules were identified in both pathogenic and defensive mechanisms. Functional modules in C. 
albicans, like those involved in hyphal morphogenesis, ion and small molecule transport, protein secretion, and shifts in carbon utilization, were seen to play important roles in pathogen invasion and damage caused to host cells. Moreover, the functional modules in zebrafish, such as those involved in immune response, apoptosis mechanisms, ion transport, protein secretion, and hemostasis-related processes, were found to be significant as defensive mechanisms during C. albicans infection. The essential functional modules thus determined could provide insights into the molecular mechanisms of host-pathogen interactions during the infection process and thereby help devise potential therapeutic strategies to treat C. albicans infection. Yu-Chao Wang, I-Chun Tsai, Che Lin, Wen-Ping Hsieh, Chung-Yu Lan, Yung-Jen Chuang, and Bor-Sen Chen Copyright © 2014 Yu-Chao Wang et al. All rights reserved. A Diverse Stochastic Search Algorithm for Combination Therapeutics Wed, 12 Mar 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/873436/ Background. Design of drug combination cocktails to maximize sensitivity for individual patients presents a challenge in terms of minimizing the number of experiments to attain the desired objective. The enormous number of possible drug combinations constrains exhaustive experimentation approaches, and personal variations in genetic diseases restrict the use of prior knowledge in optimization. Results. We present a stochastic search algorithm that consists of a parallel experimentation phase followed by a combination of focused and diversified sequential search. We evaluated our approach on seven synthetic examples, four of which were evaluated twice with different parameters, and on two biological examples of bacterial and lung cancer cell inhibition response to combination drugs. 
The performance of our approach was superior to that of the recently proposed adaptive reference update approach for all the examples considered, achieving an average reduction of 45% in the number of experimental iterations. Conclusions. As the results illustrate, the proposed diverse stochastic search algorithm can produce optimized combinations in a relatively small number of iterative steps. This approach can be combined with available knowledge on the genetic makeup of the patient to design optimal selection of drug cocktails. Mehmet Umut Caglar and Ranadip Pal Copyright © 2014 Mehmet Umut Caglar and Ranadip Pal. All rights reserved. Visualization of Genome Signatures of Eukaryote Genomes by Batch-Learning Self-Organizing Map with a Special Emphasis on Drosophila Genomes Tue, 11 Mar 2014 09:27:17 +0000 http://www.hindawi.com/journals/bmri/2014/985706/ A strategy of evolutionary studies that can compare vast numbers of genome sequences is becoming increasingly important with the remarkable progress of high-throughput DNA sequencing methods. We previously established a sequence alignment-free clustering method “BLSOM” for di-, tri-, and tetranucleotide compositions in genome sequences, which can characterize sequence characteristics (genome signatures) of a wide range of species. In the present study, we generated BLSOMs for tetra- and pentanucleotide compositions in approximately one million sequence fragments derived from 101 eukaryotes, for which almost complete genome sequences were available. BLSOM recognized phylotype-specific characteristics (e.g., key combinations of oligonucleotide frequencies) in the genome sequences, permitting phylotype-specific clustering of the sequences without any information regarding the species. In our detailed examination of 12 Drosophila species, the correlation between their phylogenetic classification and the classification on the BLSOMs was observed to visualize oligonucleotides diagnostic for species-specific clustering. 
Takashi Abe, Yuta Hamano, and Toshimichi Ikemura Copyright © 2014 Takashi Abe et al. All rights reserved. Exact and Heuristic Methods for Network Completion for Time-Varying Genetic Networks Sun, 09 Mar 2014 11:48:52 +0000 http://www.hindawi.com/journals/bmri/2014/684014/ Robustness in biological networks can be regarded as an important feature of living systems. A system maintains its functions against internal and external perturbations, leading to topological changes in the network with varying delays. To understand the flexibility of biological networks, we propose a novel approach to analyze time-dependent networks, based on the framework of network completion, which aims to make the minimum amount of modifications to a given network so that the resulting network is most consistent with the observed data. We have developed a novel network completion method for time-varying networks by extending our previous method for the completion of stationary networks. In particular, we introduce a double dynamic programming technique to identify change time points and required modifications. Although this extended method allows us to guarantee the optimality of the solution, this method has relatively low computational efficiency. In order to resolve this difficulty, we developed a heuristic method for speeding up the calculation of minimum least squares errors. We demonstrate the effectiveness of our proposed methods through computational experiments using synthetic data and real microarray gene expression data. The results indicate that our methods exhibit good performance in terms of completing and inferring gene association networks with time-varying structures. Natsu Nakajima and Tatsuya Akutsu Copyright © 2014 Natsu Nakajima and Tatsuya Akutsu. All rights reserved. 
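The network completion abstract above mentions dynamic programming to identify change time points while minimizing least squares error. The following is only a minimal sketch of that general idea, not the authors' double dynamic programming method: it assumes a one-dimensional series and a piecewise-constant fit, and the names `segment_cost` and `best_change_points` are hypothetical.

```python
from typing import List, Tuple


def segment_cost(prefix: List[float], prefix_sq: List[float], i: int, j: int) -> float:
    """Sum of squared errors when y[i:j] (j exclusive) is fitted by its mean."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def best_change_points(y: List[float], num_changes: int) -> Tuple[float, List[int]]:
    """Return (minimum squared error, change-point indices) for a
    piecewise-constant fit of y with exactly num_changes change points.

    Illustrative assumption: a change point k means a new segment starts at y[k].
    """
    n = len(y)
    if not 0 <= num_changes < n:
        raise ValueError("num_changes must satisfy 0 <= num_changes < len(y)")

    # Prefix sums let each segment cost be computed in constant time.
    prefix, prefix_sq = [0.0], [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    inf = float("inf")
    segments = num_changes + 1
    # dp[s][j]: minimum error when y[:j] is split into s segments.
    dp = [[inf] * (n + 1) for _ in range(segments + 1)]
    back = [[0] * (n + 1) for _ in range(segments + 1)]
    dp[0][0] = 0.0
    for s in range(1, segments + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                if dp[s - 1][i] == inf:
                    continue
                cost = dp[s - 1][i] + segment_cost(prefix, prefix_sq, i, j)
                if cost < dp[s][j]:
                    dp[s][j] = cost
                    back[s][j] = i

    # Walk the back-pointers to recover where each segment starts.
    change_points: List[int] = []
    j = n
    for s in range(segments, 0, -1):
        i = back[s][j]
        if s > 1:
            change_points.append(i)
        j = i
    change_points.reverse()
    return dp[segments][n], change_points
```

The sketch runs in time cubic in the series length for a fixed number of segments, which mirrors the abstract's remark that an exact approach can be computationally expensive and motivates a faster heuristic.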
Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks Thu, 06 Mar 2014 13:34:51 +0000 http://www.hindawi.com/journals/bmri/2014/240403/ Biomedical Named Entity Recognition (BNER), which extracts important entities such as genes and proteins, is a crucial step of natural language processing in the biomedical domain. Various machine learning-based approaches have been applied to BNER tasks and have shown good performance. In this paper, we systematically investigated three different types of word representation (WR) features for BNER, including clustering-based representation, distributional representation, and word embeddings. We selected one algorithm from each of the three types of WR features and applied them to the JNLPBA and BioCreAtIvE II BNER tasks. Our results showed that all three WR algorithms were beneficial to machine learning-based BNER systems. Moreover, combining these different types of WR features further improved BNER performance, indicating that they are complementary to each other. By combining all three types of WR features, the improvements in F-measure on the BioCreAtIvE II GM and JNLPBA corpora were 3.75% and 1.39%, respectively, when compared with the systems using baseline features. To the best of our knowledge, this is the first study to systematically evaluate the effect of three different types of WR features for BNER tasks. Buzhou Tang, Hongxin Cao, Xiaolong Wang, Qingcai Chen, and Hua Xu Copyright © 2014 Buzhou Tang et al. All rights reserved. Identifying Gastric Cancer Related Genes Using the Shortest Path Algorithm and Protein-Protein Interaction Network Wed, 05 Mar 2014 16:35:58 +0000 http://www.hindawi.com/journals/bmri/2014/371397/ Gastric cancer, as one of the leading causes of cancer related deaths worldwide, causes about 800,000 deaths per year. Up to now, the mechanism underlying this disease is still not fully understood. 
Identification of genes related to this disease is an important step that can help to understand its underlying mechanism and thereby to design effective treatments. In this study, some novel gastric cancer related genes were discovered based on the knowledge of known gastric cancer related ones. These genes were searched by applying the shortest path algorithm in a protein-protein interaction network. The analysis results suggest that some of them are indeed involved in the biological process of gastric cancer, which indicates that they are, with high probability, genuine gastric cancer related genes. It is hoped that the findings in this study may help promote the study of this disease and that the methods can provide new insights for studying various diseases. Yang Jiang, Yang Shu, Ying Shi, Li-Peng Li, Fei Yuan, and Hui Ren Copyright © 2014 Yang Jiang et al. All rights reserved. TF2LncRNA: Identifying Common Transcription Factors for a List of lncRNA Genes from ChIP-Seq Data Tue, 04 Mar 2014 07:37:38 +0000 http://www.hindawi.com/journals/bmri/2014/317642/ High-throughput genomic technologies like lncRNA microarray and RNA-Seq often generate a set of lncRNAs of interest, yet little is known about the transcriptional regulation of the set of lncRNA genes. Here, based on ChIP-Seq peak lists of transcription factors (TFs) from ENCODE and annotated human lncRNAs from GENCODE, we developed a web-based interface titled “TF2LncRNA,” where TF peaks from each ChIP-Seq experiment are crossed with the genomic coordinates of a set of input lncRNAs, to identify which TFs present a statistically significant number of binding sites (peaks) within the regulatory region of the input lncRNA genes. The input can be a set of coexpressed lncRNA genes or any other cluster of lncRNA genes. Users can thus infer which TFs are likely to be common transcription regulators of the set of lncRNAs. 
In addition, users can retrieve all lncRNAs potentially regulated by a specific TF in a specific cell line of interest or retrieve all TFs that have one or more binding sites in the regulatory region of a given lncRNA in the specific cell line. TF2LncRNA is an efficient and easy-to-use web-based tool. Qinghua Jiang, Jixuan Wang, Yadong Wang, Rui Ma, Xiaoliang Wu, and Yu Li Copyright © 2014 Qinghua Jiang et al. All rights reserved. Comparative Metagenomic Analysis of Human Gut Microbiome Composition Using Two Different Bioinformatic Pipelines Tue, 25 Feb 2014 09:12:11 +0000 http://www.hindawi.com/journals/bmri/2014/325340/ Technological advances in next-generation sequencing-based approaches have greatly impacted the analysis of microbial community composition. In particular, 16S rRNA-based methods have been widely used to analyze the whole set of bacteria present in a target environment. As a consequence, several specific bioinformatic pipelines have been developed to manage these data. MetaGenome Rapid Annotation using Subsystem Technology (MG-RAST) and Quantitative Insights Into Microbial Ecology (QIIME) are two freely available tools for metagenomic analyses that have been used in a wide range of studies. Here, we report the comparative analysis of the same dataset with both QIIME and MG-RAST in order to evaluate their accuracy in taxonomic assignment and in diversity analysis. We found that taxonomic assignment was more accurate with QIIME, which, at the family level, assigned a significantly higher number of reads. Thus, QIIME generated a more accurate BIOM file, which in turn improved the diversity analysis output. Finally, although informatics skills are needed to install QIIME, it offers a wide range of metrics that are useful for downstream applications and, no less important, it is not dependent on server times. Valeria D’Argenio, Giorgio Casaburi, Vincenza Precone, and Francesco Salvatore Copyright © 2014 Valeria D’Argenio et al. All rights reserved. 
Approaches for Recognizing Disease Genes Based on Network Mon, 24 Feb 2014 07:40:35 +0000 http://www.hindawi.com/journals/bmri/2014/416323/ Diseases are closely related to genes, thus indicating that genetic abnormalities may lead to certain diseases. The recognition of disease genes has long been a goal in biology, which may contribute to the improvement of health care and to the understanding of gene functions, pathways, and interactions. However, few large-scale gene-gene association datasets, disease-disease association datasets, and gene-disease association datasets are available. A number of machine learning methods have been used to recognize disease genes based on networks. This paper describes the relationship between diseases and genes, summarizes the network-based approaches used to recognize disease genes, analyzes the core problems and challenges of the methods, and outlines future research directions. Quan Zou, Jinjin Li, Chunyu Wang, and Xiangxiang Zeng Copyright © 2014 Quan Zou et al. All rights reserved. Predicting Glycerophosphoinositol Identities in Lipidomic Datasets Using VaLID (Visualization and Phospholipid Identification)—An Online Bioinformatic Search Engine Thu, 20 Feb 2014 09:59:05 +0000 http://www.hindawi.com/journals/bmri/2014/818670/ The capacity to predict and visualize all theoretically possible glycerophospholipid molecular identities present in lipidomic datasets is currently limited. To address this issue, we expanded the search-engine and compositional databases of the online Visualization and Phospholipid Identification (VaLID) bioinformatic tool to include the glycerophosphoinositol superfamily. 
VaLID v1.0.0 originally allowed exact and average mass libraries of 736,584 individual species from eight phospholipid classes: glycerophosphates, glyceropyrophosphates, glycerophosphocholines, glycerophosphoethanolamines, glycerophosphoglycerols, glycerophosphoglycerophosphates, glycerophosphoserines, and cytidine 5′-diphosphate 1,2-diacyl-sn-glycerols to be searched for any mass to charge value (with adjustable tolerance levels) under a variety of mass spectrometry conditions. Here, we describe an update that now includes all possible glycerophosphoinositols, glycerophosphoinositol monophosphates, glycerophosphoinositol bisphosphates, and glycerophosphoinositol trisphosphates. This update expands the total number of lipid species represented in the VaLID v2.0.0 database to 1,473,168 phospholipids. Each phospholipid can be generated in skeletal representation. A subset of species curated by the Canadian Institutes of Health Research Training Program in Neurodegenerative Lipidomics (CTPNL) team is provided as an array of high-resolution structures. VaLID is freely available and responds to all users through the CTPNL resources web site. Graeme S. V. McDowell, Alexandre P. Blanchard, Graeme P. Taylor, Daniel Figeys, Stephen Fai, and Steffany A. L. Bennett Copyright © 2014 Graeme S. V. McDowell et al. All rights reserved. Integrative Analysis of miRNA-mRNA and miRNA-miRNA Interactions Wed, 12 Feb 2014 15:17:43 +0000 http://www.hindawi.com/journals/bmri/2014/907420/ MicroRNAs (miRNAs) are small, noncoding regulatory molecules. They are involved in many essential biological processes and act by suppressing gene expression. The present work reports an integrative analysis of miRNA-mRNA and miRNA-miRNA interactions and their regulatory patterns using high-throughput miRNA and mRNA datasets. Aberrantly expressed miRNA and mRNA profiles were obtained based on fold change analysis, and qRT-PCR was used for further validation of deregulated miRNAs. 
miRNAs and target mRNAs were found to show various expression patterns. miRNA-miRNA interactions and clustered/homologous miRNAs were also found to contribute to the flexible and selective regulatory network. Interacting miRNAs (e.g., miR-103a and miR-103b) showed more pronounced differences in expression, which suggests the potential “restricted interaction” in the miRNA world. miRNAs from the same gene clusters (e.g., miR-23b gene cluster) or gene families (e.g., miR-10 gene family) always showed the same types of deregulation patterns, although they sometimes differed in expression levels. These clustered and homologous miRNAs may have close functional relationships, which may indicate collaborative interactions between miRNAs. The integrative analysis of miRNA-mRNA based on biological characteristics of miRNA will further enrich miRNA study. Li Guo, Yang Zhao, Sheng Yang, Hui Zhang, and Feng Chen Copyright © 2014 Li Guo et al. All rights reserved. Network-Assisted Prediction of Potential Drugs for Addiction Sun, 09 Feb 2014 12:25:55 +0000 http://www.hindawi.com/journals/bmri/2014/258784/ Drug addiction is a chronic and complex brain disease that places a heavy burden on the community. Though numerous efforts have been made to identify effective treatments, it is necessary to find more novel therapeutics for this complex disease. As network pharmacology has become a promising approach for drug repurposing, we proposed to apply the approach to drug addiction, which might provide new clues for the development of effective addiction treatment drugs. We first extracted 44 addictive drugs from the NIDA and their targets from DrugBank. Then, we constructed two networks: an addictive drug-target network and an expanded addictive drug-target network by adding other drugs that have at least one common target with these addictive drugs. By performing network analyses, we found that those addictive drugs with similar actions tended to cluster together. 
Additionally, we predicted 94 nonaddictive drugs with potential pharmacological functions similar to those of the addictive drugs. By examining the PubMed data, we found that 51 drugs cooccurred with addictive keywords significantly more often than expected. Thus, the network analyses provide a list of candidate drugs for further investigation of their potential in addiction treatment or risk. Jingchun Sun, Liang-Chin Huang, Hua Xu, and Zhongming Zhao Copyright © 2014 Jingchun Sun et al. All rights reserved. Erratum to “New Optical Methods for Liveness Detection on Fingers” Sun, 02 Feb 2014 13:42:02 +0000 http://www.hindawi.com/journals/bmri/2014/252790/ Martin Drahansky, Michal Dolezel, Jan Vana, Eva Brezinova, Jaegeol Yim, and Kyubark Shim Copyright © 2014 Martin Drahansky et al. All rights reserved. A Novel Approach for Discovering Condition-Specific Correlations of Gene Expressions within Biological Pathways by Using Cloud Computing Technology Wed, 22 Jan 2014 17:16:42 +0000 http://www.hindawi.com/journals/bmri/2014/763237/ Microarrays are widely used to assess gene expressions. Most microarray studies focus primarily on identifying differential gene expressions between conditions (e.g., cancer versus normal cells), for discovering the major factors that cause diseases. Because previous studies have not identified the correlations of differential gene expression between conditions, crucial but abnormal regulations that cause diseases might have been disregarded. This paper proposes an approach for discovering the condition-specific correlations of gene expressions within biological pathways. Because analyzing gene expression correlations is time consuming, an Apache Hadoop cloud computing platform was implemented. Three microarray data sets of breast cancer were collected from the Gene Expression Omnibus, and pathway information from the Kyoto Encyclopedia of Genes and Genomes was applied for discovering meaningful biological correlations. 
The results showed that adopting the Hadoop platform considerably decreased the computation time. Several correlations of differential gene expressions were discovered between the relapse and nonrelapse breast cancer samples, and most of them were involved in cancer regulation and cancer-related pathways. The results showed that breast cancer recurrence might be highly associated with the abnormal regulations of these gene pairs, rather than with their individual expression levels. The proposed method was computationally efficient and reliable, and stable results were obtained when different data sets were used. The proposed method is effective in identifying meaningful biological regulation patterns between conditions. Tzu-Hao Chang, Shih-Lin Wu, Wei-Jen Wang, Jorng-Tzong Horng, and Cheng-Wei Chang Copyright © 2014 Tzu-Hao Chang et al. All rights reserved. Microsatellites in the Genome of the Edible Mushroom, Volvariella volvacea Sun, 19 Jan 2014 00:00:00 +0000 http://www.hindawi.com/journals/bmri/2014/281912/ Using bioinformatics software and database, we have characterized the microsatellite pattern in the V. volvacea genome and compared it with microsatellite patterns found in the genomes of four other edible fungi: Coprinopsis cinerea, Schizophyllum commune, Agaricus bisporus, and Pleurotus ostreatus. A total of 1346 microsatellites have been identified, with mono-nucleotides being the most frequent motif. The relative abundance of microsatellites was lower in coding regions with 21 No./Mb. However, the microsatellites in the V. volvacea gene models showed a greater tendency to be located in the CDS regions. There was also a higher preponderance of trinucleotide repeats, especially in the kinase genes, which implied a possible role in phenotypic variation. Among the five fungal genomes, microsatellite abundance appeared to be unrelated to genome size. 
Furthermore, the short motifs (mono- to tri-nucleotides) outnumbered other categories although these differed in proportion. Data analysis indicated a possible relationship between the most frequent microsatellite types and the genetic distance between the five fungal genomes. Ying Wang, Mingjie Chen, Hong Wang, Jing-Fang Wang, and Dapeng Bao Copyright © 2014 Ying Wang et al. All rights reserved. Integration of High-Volume Molecular and Imaging Data for Composite Biomarker Discovery in the Study of Melanoma Thu, 16 Jan 2014 16:36:04 +0000 http://www.hindawi.com/journals/bmri/2014/145243/ In this work the effects of simple imputations are studied, regarding the integration of multimodal data originating from different patients. Two separate datasets of cutaneous melanoma are used, an image analysis (dermoscopy) dataset together with a transcriptomic one, specifically DNA microarrays. Each modality is related to a different set of patients, and four imputation methods are employed to form a unified, integrative dataset. The application of backward selection together with ensemble classifiers (random forests), followed by principal components analysis and linear discriminant analysis, illustrates the implications of the imputations for feature selection and dimensionality reduction methods. The results suggest that the expansion of the feature space through data integration, achieved by the exploitation of imputation schemes in general, aids the classification task and imparts stability to the derivation of putative classifiers. In particular, although the biased imputation methods significantly increase the predictive performance and the class discrimination of the datasets, they still contribute to the study of prominent features and their relations. 
The fusion of separate datasets, which provide a multimodal description of the same pathology, represents an innovative, promising avenue, enhancing robust composite biomarker derivation and promoting the interpretation of the biomedical problem studied. Konstantinos Moutselos, Ilias Maglogiannis, and Aristotelis Chatziioannou Copyright © 2014 Konstantinos Moutselos et al. All rights reserved. Network Analysis of Neurodegenerative Disease Highlights a Role of Toll-Like Receptor Signaling Thu, 16 Jan 2014 13:33:49 +0000 http://www.hindawi.com/journals/bmri/2014/686505/ Despite significant advances in the study of the molecular mechanisms altered in the development and progression of neurodegenerative diseases (NDs), the etiology is still enigmatic and the distinctions between diseases are not always entirely clear. We present an efficient computational method based on the protein-protein interaction (PPI) network to model the functional network of NDs. The aim of this work is fourfold: (i) reconstruction of a PPI network relating to the NDs, (ii) construction of an association network between diseases based on proximity in the disease PPI network, (iii) quantification of disease associations, and (iv) inference of potential molecular mechanisms involved in the diseases. The functional links of diseases not only showed overlap with the traditional classification in clinical settings, but also offered new insight into connections between diseases with limited clinical overlap. To gain an expanded view of the molecular mechanisms involved in NDs, both direct and indirect connector proteins were investigated. The method uncovered molecular relationships that are common to apparently distinct diseases and provided important insight into the molecular networks implicated in disease pathogenesis. In particular, the current analysis highlighted the Toll-like receptor signaling pathway as a potential candidate pathway to be targeted by therapy in neurodegeneration. 
Thanh-Phuong Nguyen, Laura Caberlotto, Melissa J. Morine, and Corrado Priami Copyright © 2014 Thanh-Phuong Nguyen et al. All rights reserved. Computational Analysis of Transcriptional Circuitries in Human Embryonic Stem Cells Reveals Multiple and Independent Networks Thu, 09 Jan 2014 14:26:11 +0000 http://www.hindawi.com/journals/bmri/2014/725780/ It has been known that three core transcription factors (TFs), NANOG, OCT4, and SOX2, collaborate to form a transcriptional circuitry to regulate pluripotency and self-renewal of human embryonic stem (ES) cells. Similarly, MYC also plays an important role in regulating pluripotency and self-renewal of human ES cells. However, the precise mechanism by which the transcriptional regulatory networks control the activity of ES cells remains unclear. In this study, we reanalyzed an extended core network, which includes the set of genes that are cobound by the three core TFs and additional TFs that also bind to these cobound genes. Our results show that beyond the core transcriptional network, additional transcriptional networks are potentially important in the regulation of the fate of human ES cells. Several gene families that encode TFs play a key role in the transcriptional circuitry of ES cells. We also demonstrate that MYC acts independently of the core module in the regulation of the fate of human ES cells, consistent with the established argument. We find that TP53 is a key connecting molecule between the core-centered and MYC-centered modules. This study provides additional insights into the underlying regulatory mechanisms involved in the fate determination of human ES cells. Xiaosheng Wang and Chittibabu Guda Copyright © 2014 Xiaosheng Wang and Chittibabu Guda. All rights reserved. 
De Novo Assembly and Characterization of Sophora japonica Transcriptome Using RNA-seq Thu, 02 Jan 2014 11:42:06 +0000 http://www.hindawi.com/journals/bmri/2014/750961/ Sophora japonica Linn (Chinese Scholar Tree) is a shrub species belonging to the subfamily Faboideae of the pea family Fabaceae. In this study, RNA sequencing of the S. japonica transcriptome was performed to produce large expression datasets for functional genomic analysis. Approximately 86.1 million high-quality clean reads were generated and assembled de novo into 143010 unique transcripts and 57614 unigenes. The average length of unigenes was 901 bps with an N50 of 545 bps. Four public databases, including the NCBI nonredundant protein (NR), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG), and the Cluster of Orthologous Groups (COG), were used to annotate unigenes through the NCBI BLAST procedure. A total of 27541 of 57614 unigenes (47.8%) were annotated for gene descriptions, conserved protein domains, or gene ontology. Moreover, an interaction network of unigenes in S. japonica was predicted based on known protein-protein interactions of putative orthologs of well-studied plant genomes. The transcriptome data of S. japonica reported here represent the first genome-scale investigation of gene expression in Faboideae plants. We expect that our study will provide a useful resource for further studies on gene expression, genomics, functional genomics, and protein-protein interaction in S. japonica. Liucun Zhu, Ying Zhang, Wenna Guo, Xin-Jian Xu, and Qiang Wang Copyright © 2014 Liucun Zhu et al. All rights reserved. Application of Systems Biology and Bioinformatics Methods in Biochemistry and Biomedicine Tue, 31 Dec 2013 11:31:30 +0000 http://www.hindawi.com/journals/bmri/2013/651968/ Yudong Cai, Tao Huang, Lei Chen, and Bin Niu Copyright © 2013 Yudong Cai et al. All rights reserved. 
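The Sophora japonica entry above reports an N50 for its assembled unigenes. As a hedged illustration of how that statistic is usually defined (the length L such that sequences of length L or longer account for at least half of the total assembled bases), here is a short sketch; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths.

    Lengths are sorted from longest to shortest and accumulated; the N50 is
    the length at which the running total first reaches half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running total to avoid fractional halves.
        if 2 * running >= total:
            return length
    return ordered[-1]  # Unreachable for positive lengths; kept as a safeguard.


if __name__ == "__main__":
    # Toy example with made-up lengths (not data from the study).
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because N50 weights sequences by their length rather than counting them equally, it can differ noticeably from the mean length of the same assembly.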
HGF Accelerates Wound Healing by Promoting the Dedifferentiation of Epidermal Cells through -Integrin/ILK Pathway Mon, 30 Dec 2013 13:52:26 +0000 http://www.hindawi.com/journals/bmri/2013/470418/ Skin wound healing is a critical and complex biological process after trauma. This process is activated by signaling pathways of both epithelial and nonepithelial cells, which release a myriad of different cytokines and growth factors. Hepatocyte growth factor (HGF) is a cytokine known to play multiple roles during the various stages of wound healing. This study evaluated the benefits of HGF on reepithelialization during wound healing and investigated its mechanisms of action. Gross and histological results showed that HGF significantly accelerated reepithelialization in diabetic (DB) rats. HGF increased the expression of the cell adhesion molecule -integrin and the cytoskeleton remodeling protein integrin-linked kinase (ILK) in epidermal cells in vivo and in vitro. Silencing of ILK gene expression by RNA interference reduced expression of -integrin, ILK, and c-met in epidermal cells, concomitantly decreasing the proliferation and migration ability of epidermal cells. -Integrin can be an important marker of poorly differentiated epidermal cells. Therefore, these data demonstrate that epidermal cells enter a poorly differentiated state and regain some characteristics of epidermal stem cells under the influence of HGF after wounding. Taken together, the results provide evidence that HGF can accelerate reepithelialization in skin wound healing by dedifferentiation of epidermal cells in a manner related to the -integrin/ILK pathway. Jin-Feng Li, Hai-Feng Duan, Chu-Tse Wu, Da-Jin Zhang, Youping Deng, Hong-Lei Yin, Bing Han, Hui-Cui Gong, Hong-Wei Wang, and Yun-Liang Wang Copyright © 2013 Jin-Feng Li et al. All rights reserved. 
Prediction of Substrate-Enzyme-Product Interaction Based on Molecular Descriptors and Physicochemical Properties Sun, 22 Dec 2013 18:10:05 +0000 http://www.hindawi.com/journals/bmri/2013/674215/ It is important to correctly and efficiently predict substrate-enzyme interactions and their products in metabolic pathways. In this work, a novel approach was introduced to encode substrate/product and enzyme molecules with molecular descriptors and physicochemical properties, respectively. Based on this encoding method, KNN was adopted to build the substrate-enzyme-product interaction network. After selecting the optimal features that are able to represent the main factors of substrate-enzyme-product interaction in our prediction, a total of 160 of the 290 features were obtained, which can be clustered into ten categories: elemental analysis, geometry, chemistry, amino acid composition, predicted secondary structure, hydrophobicity, polarizability, solvent accessibility, normalized van der Waals volume, and polarity. As a result, our prediction model achieved an MCC of 0.423 and an overall prediction accuracy of 89.1% in a 10-fold cross-validation test. Bing Niu, Guohua Huang, Linfeng Zheng, Xueyuan Wang, Fuxue Chen, Yuhui Zhang, and Tao Huang Copyright © 2013 Bing Niu et al. All rights reserved. Identification of Age-Related Macular Degeneration Related Genes by Applying Shortest Path Algorithm in Protein-Protein Interaction Network Wed, 18 Dec 2013 12:38:15 +0000 http://www.hindawi.com/journals/bmri/2013/523415/ This study attempted to find novel age-related macular degeneration (AMD) related genes based on 36 known AMD genes. The well-known shortest path algorithm, Dijkstra’s algorithm, was applied to find the shortest path connecting each pair of known AMD related genes in a protein-protein interaction (PPI) network. The genes occurring in any shortest path were considered as candidate AMD related genes. 
As a result, 125 novel AMD genes were predicted. The further analysis based on betweenness and permutation test indicates that there are 10 genes involved in the formation or development of AMD and may be the actual AMD related genes with high probability. We hope that this contribution would promote the study of age-related macular degeneration and discovery of novel effective treatments. Jian Zhang, Min Jiang, Fei Yuan, Kai-Yan Feng, Yu-Dong Cai, Xun Xu, and Lei Chen Copyright © 2013 Jian Zhang et al. All rights reserved. Biometrics and Biosecurity 2013 Tue, 10 Dec 2013 13:42:09 +0000 http://www.hindawi.com/journals/bmri/2013/734642/ Tai-hoon Kim, Sabah Mohammed, and Wai-Chi Fang Copyright © 2013 Tai-hoon Kim et al. All rights reserved. iEzy-Drug: A Web Server for Identifying the Interaction between Enzymes and Drugs in Cellular Networking Tue, 26 Nov 2013 18:00:45 +0000 http://www.hindawi.com/journals/bmri/2013/701317/ With the features of extremely high selectivity and efficiency in catalyzing almost all the chemical reactions in cells, enzymes play vitally important roles for the life of an organism and hence have become frequent targets for drug design. An essential step in developing drugs by targeting enzymes is to identify drug-enzyme interactions in cells. It is both time-consuming and costly to do this purely by means of experimental techniques alone. Although some computational methods were developed in this regard based on the knowledge of the three-dimensional structure of enzyme, unfortunately their usage is quite limited because three-dimensional structures of many enzymes are still unknown. 
Here, we reported a sequence-based predictor, called “iEzy-Drug,” in which each drug compound was formulated by a molecular fingerprint with 258 feature components, each enzyme by Chou’s pseudo amino acid composition generated via incorporating sequential evolution information and physicochemical features derived from its sequence, and the prediction engine was operated by the fuzzy K-nearest neighbor algorithm. The overall success rate achieved by iEzy-Drug via rigorous cross-validations was about 91%. Moreover, to maximize the convenience for the majority of experimental scientists, a user-friendly web server was established, by which users can easily obtain their desired results. Jian-Liang Min, Xuan Xiao, and Kuo-Chen Chou Copyright © 2013 Jian-Liang Min et al. All rights reserved. Multiple Biomarker Panels for Early Detection of Breast Cancer in Peripheral Blood Tue, 26 Nov 2013 14:26:09 +0000 http://www.hindawi.com/journals/bmri/2013/781618/ Detecting breast cancer at early stages can be challenging. Traditional mammography and tissue microarrays, which have been studied for early breast cancer detection and prediction, have many drawbacks. There is therefore a need for more reliable diagnostic tools for early detection of breast cancer. In this paper, we present a five-marker panel approach based on SVM for early detection of breast cancer in peripheral blood and show how to use SVM to model the classification and prediction problem of early detection of breast cancer in peripheral blood. We found that the five-marker panel can improve the prediction performance (area under curve) in the testing data set from 0.5826 to 0.7879. Further pathway analysis showed that the top four five-marker panels are associated with signaling, steroid hormones, metabolism, immune system, and hemostasis, which are consistent with previous findings. 
Our prediction model can serve as a general model for multibiomarker panel discovery in early detection of other cancers. Fan Zhang, Youping Deng, and Renee Drabier Copyright © 2013 Fan Zhang et al. All rights reserved. Gene Prioritization of Resistant Rice Gene against Xanthomas oryzae pv. oryzae by Using Text Mining Technologies Mon, 25 Nov 2013 16:01:48 +0000 http://www.hindawi.com/journals/bmri/2013/853043/ To effectively assess the possibility of the unknown rice protein resistant to Xanthomonas oryzae pv. oryzae, a hybrid strategy is proposed to enhance gene prioritization by combining text mining technologies with a sequence-based approach. The text mining technique of term frequency inverse document frequency is used to measure the importance of distinguished terms which reflect biomedical activity in rice before candidate genes are screened and vital terms are produced. Afterwards, a built-in classifier under the chaos games representation algorithm is used to sieve the best possible candidate gene. Our experiment results show that the combination of these two methods achieves enhanced gene prioritization. Jingbo Xia, Xing Zhang, Daojun Yuan, Lingling Chen, Jonathan Webster, and Alex Chengyu Fang Copyright © 2013 Jingbo Xia et al. All rights reserved. QSBR Study of Bitter Taste of Peptides: Application of GA-PLS in Combination with MLR, SVM, and ANN Approaches Mon, 25 Nov 2013 08:38:58 +0000 http://www.hindawi.com/journals/bmri/2013/501310/ Detailed information about the relationships between structures and properties/activities of peptides as drugs and nutrients is useful in the development of drugs and functional foods containing peptides as active compounds. The bitterness of the peptides is an undesirable property which should be reduced during drug/nutrient production, and quantitative structure bitter taste relationship (QSBR) studies can help researchers to design less bitter peptides with higher target efficiency. 
Calculated structural parameters were used to develop three different QSBR models (i.e., multiple linear regression, support vector machine, and artificial neural network) to predict the bitterness of 229 peptides (containing 2–12 amino acids, obtained from the literature). The developed models were validated using internal and external validation methods, and the prediction errors were checked using mean percentage deviation and absolute average error values. All developed models predicted the activities successfully (with prediction errors less than experimental error values), whereas the prediction errors for nonlinear methods were lower than those for linear methods. The selected structural descriptors successfully differentiated between bitter and nonbitter peptides. Somaieh Soltani, Hossein Haghaei, Ali Shayanfar, Javad Vallipour, Karim Asadpour Zeynali, and Abolghasem Jouyban Copyright © 2013 Somaieh Soltani et al. All rights reserved. Expression Sensitivity Analysis of Human Disease Related Genes Sun, 24 Nov 2013 11:16:16 +0000 http://www.hindawi.com/journals/bmri/2013/637424/ Background. Genome-wide association studies (GWAS) have shown their revolutionary power in identifying loci that genetically influence complex diseases. Thousands of replicated loci for common traits are helpful in disease risk assessment. However, it is currently still difficult to elucidate the variations in these loci that directly cause susceptibility to diseases by disrupting the expression or function of a protein. Results. We evaluate the expression features of disease related genes and find that genes related to different diseases show different expression perturbation sensitivities under various conditions. It is worth noting that the expression of some robust disease-genes does not show significant change in their corresponding diseases; these genes might be easily overlooked in expression profile analysis. Conclusion. 
Gene ontology enrichment analysis indicates that robust disease-genes perform essential functions in comparison with sensitive disease-genes. The diseases associated with robust genes seem to be relatively lethal, like cancer and aging. On the other hand, the diseases associated with sensitive genes are apparently nonlethal, like psychiatric and chemical dependency diseases. Liang-Xiao Ma, Ya-Jun Wang, Jing-Fang Wang, Xuan Li, and Pei Hao Copyright © 2013 Liang-Xiao Ma et al. All rights reserved. Translational Biomedical Informatics and Computational Systems Medicine Thu, 21 Nov 2013 14:39:08 +0000 http://www.hindawi.com/journals/bmri/2013/237465/ Zhongming Zhao, Bairong Shen, Xinghua Lu, and Wanwipa Vongsangnak Copyright © 2013 Zhongming Zhao et al. All rights reserved. An Improved Biometrics-Based Remote User Authentication Scheme with User Anonymity Thu, 21 Nov 2013 13:09:31 +0000 http://www.hindawi.com/journals/bmri/2013/491289/ The authors review the biometrics-based user authentication scheme proposed by An in 2012. The authors show that there exist loopholes in the scheme which are detrimental to its security. Therefore the authors propose an improved scheme eradicating the flaws of An’s scheme. Then a detailed security analysis of the proposed scheme is presented, followed by its efficiency comparison. The proposed scheme not only withstands the security problems found in An’s scheme but also provides some extra features with the addition of only two hash operations. The proposed scheme allows users to freely change their passwords and also provides user anonymity with untraceability. Muhammad Khurram Khan and Saru Kumari Copyright © 2013 Muhammad Khurram Khan and Saru Kumari. All rights reserved. 
Prediction of Drugs Target Groups Based on ChEBI Ontology Wed, 20 Nov 2013 17:06:28 +0000 http://www.hindawi.com/journals/bmri/2013/132724/ Most drugs have beneficial as well as adverse effects and exert their biological functions by adjusting and altering the functions of their target proteins. Thus, knowledge of drug target proteins is essential for the improvement of therapeutic effects and mitigation of undesirable side effects. In this study, we proposed a novel prediction method based on drug/compound ontology information extracted from ChEBI to identify drug target groups, from which the kind of functions of a drug may be deduced. By collecting data in KEGG, a benchmark dataset consisting of 876 drugs, categorized into four target groups, was constructed. To evaluate the method more thoroughly, the benchmark dataset was divided into a training dataset and an independent test dataset. The jackknife test showed that the overall prediction accuracy on the training dataset was 83.12%, while it was 87.50% on the test dataset; the predictor exhibited excellent generalization. The good performance of the method indicates that the ontology information of the drugs contains rich information about their target groups, and the study may become an inspiration to solve problems of this sort and bridge the gap between ChEBI ontology and drug target groups. Yu-Fei Gao, Lei Chen, Guo-Hua Huang, Tao Zhang, Kai-Yan Feng, Hai-Peng Li, and Yang Jiang Copyright © 2013 Yu-Fei Gao et al. All rights reserved. Identifying Breast Cancer Subtype Related miRNAs from Two Constructed miRNAs Interaction Networks in Silico Method Wed, 20 Nov 2013 08:32:57 +0000 http://www.hindawi.com/journals/bmri/2013/798912/ Background. It has been known that microRNAs (miRNAs) regulate the expression of multiple proteins and therefore are likely to emerge as more effective targets of selective therapeutic modalities for breast cancer. 
Although recent lines of evidence have shown that miRNAs are associated with the most common molecular breast cancer subtypes, the miRNAs related to these subtypes have not been well characterized. Objectives. In this study, we propose an in silico method to identify breast cancer subtype related miRNAs based on two constructed miRNAs interaction networks using miRNA-mRNA dual expression profiling data arising from the same samples. Methods. Firstly, we used a new mutual information estimation method to construct two miRNAs interaction networks based on miRNA-mRNA dual expression profiling data. Secondly, we compared and analyzed the topological properties of these two networks. Finally, miRNAs showing the outstanding topological properties in both of the two networks were identified. Results. Further functional analysis and literature evidence confirm that the identified potential breast cancer subtype related miRNAs are essential to unraveling their biological function. Conclusions. This study provides a new in silico method to predict candidate miRNAs of breast cancer subtype at a systems biology level and can support functional studies of important breast cancer subtype related miRNAs. Lin Hua, Lin Li, and Ping Zhou Copyright © 2013 Lin Hua et al. All rights reserved. DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis Sun, 17 Nov 2013 10:21:45 +0000 http://www.hindawi.com/journals/bmri/2013/856325/ Analysis of genome-scale gene networks (GNs) using large-scale gene expression data provides unprecedented opportunities to uncover gene interactions and regulatory networks involved in various biological processes and developmental programs, leading to accelerated discovery of novel knowledge of various biological processes, pathways and systems. 
The widely used context likelihood of relatedness (CLR) method based on the mutual information (MI) for scoring the similarity of gene pairs is one of the accurate methods currently available for inferring GNs. However, the MI-based reverse engineering method can achieve satisfactory performance only when the sample size exceeds one hundred. This in turn limits its application to GN construction from expression data sets with small sample sizes. We developed a high performance web server, DeGNServer, to reverse engineer and decipher genome-scale networks. It extended the CLR method by integration of different correlation methods that are suitable for analyzing data sets ranging from moderate to large scale, such as expression profiles with tens to hundreds of microarray hybridizations, and implemented all analysis algorithms using parallel computing techniques to infer gene-gene association at extraordinary speed. In addition, we integrated the SNBuilder and GeNa algorithms for subnetwork extraction and functional module discovery. DeGNServer is publicly and freely available online. Jun Li, Hairong Wei, and Patrick Xuechun Zhao Copyright © 2013 Jun Li et al. All rights reserved. A Systems’ Biology Approach to Study MicroRNA-Mediated Gene Regulatory Networks Sun, 17 Nov 2013 09:00:43 +0000 http://www.hindawi.com/journals/bmri/2013/703849/ MicroRNAs (miRNAs) are potent effectors in gene regulatory networks where aberrant miRNA expression can contribute to human diseases such as cancer. For a better understanding of the regulatory role of miRNAs in coordinating gene expression, we here present a systems biology approach combining data-driven modeling and model-driven experiments. Such an approach is characterized by an iterative process, including biological data acquisition and integration, network construction, mathematical modeling and experimental validation. 
To demonstrate the application of this approach, we adopt it to investigate mechanisms of collective repression on p21 by multiple miRNAs. We first construct a p21 regulatory network based on data from the literature and further expand it using algorithms that predict molecular interactions. Based on the network structure, a detailed mechanistic model is established and its parameter values are determined using data. Finally, the calibrated model is used to study the effect of different miRNA expression profiles and cooperative target regulation on p21 expression levels in different biological contexts. Xin Lai, Animesh Bhattacharya, Ulf Schmitz, Manfred Kunz, Julio Vera, and Olaf Wolkenhauer Copyright © 2013 Xin Lai et al. All rights reserved. Novel Natural Structure Corrector of ApoE4 for Checking Alzheimer’s Disease: Benefits from High Throughput Screening and Molecular Dynamics Simulations Wed, 13 Nov 2013 08:27:06 +0000 http://www.hindawi.com/journals/bmri/2013/620793/ A major genetic suspect for Alzheimer’s disease is the pathological conformation assumed by apolipoprotein E4 (ApoE4) through intramolecular interaction. In the present study, a large library of natural compounds was screened against ApoE4 to identify novel therapeutic molecules that can prevent ApoE4 from being converted to its pathological conformation. We report two such natural compounds PHC and IAH that bound to the active site of ApoE4 during the docking process. The binding analysis suggested that they have a strong mechanistic ability to correct the pathological structural orientation of ApoE4 by preventing repulsion between Arg 61 and Arg 112, thus inhibiting the formation of a salt bridge between Arg 61 and Glu 255. However, when the molecular dynamics simulations were carried out, structural changes in the PHC-bound complex forced PHC to move out of the cavity thus destabilizing the complex. 
However, IAH was structurally stable inside the binding pocket throughout the simulation trajectory. Our simulation results indicate that the initial receptor-ligand interaction observed after docking could be limited due to the receptor rigid docking algorithm and that the conformations and interactions observed after simulation runs are more energetically favored and should be better representations of derivative poses in the receptor. Manisha Goyal, Sonam Grover, Jaspreet Kaur Dhanjal, Sukriti Goyal, Chetna Tyagi, Sajeev Chacko, and Abhinav Grover Copyright © 2013 Manisha Goyal et al. All rights reserved. Efficient Haplotype Block Partitioning and Tag SNP Selection Algorithms under Various Constraints Mon, 11 Nov 2013 14:36:46 +0000 http://www.hindawi.com/journals/bmri/2013/984014/ Patterns of linkage disequilibrium play a central role in genome-wide association studies aimed at identifying genetic variation responsible for common human diseases. These patterns in human chromosomes show a block-like structure, and regions of high linkage disequilibrium are called haplotype blocks. A small subset of SNPs, called tag SNPs, is sufficient to capture the haplotype patterns in each haplotype block. Previously developed algorithms completely partition a haplotype sample into blocks while attempting to minimize the number of tag SNPs. However, when resource limitations prevent genotyping all the tag SNPs, it is desirable to restrict their number. We propose two dynamic programming algorithms, incorporating many diversity evaluation functions, for haplotype block partitioning using a limited number of tag SNPs. We use the proposed algorithms to partition the chromosome 21 haplotype data. When the sample is fully partitioned into blocks by our algorithms, the 2,266 blocks and 3,260 tag SNPs are fewer than those identified by previous studies. 
We also demonstrate that our algorithms find the optimal solution by exploiting the nonmonotonic property of a common haplotype-evaluation function. Wen-Pei Chen, Che-Lun Hung, and Yaw-Ling Lin Copyright © 2013 Wen-Pei Chen et al. All rights reserved. QPLOT: A Quality Assessment Tool for Next Generation Sequencing Data Mon, 11 Nov 2013 11:16:47 +0000 http://www.hindawi.com/journals/bmri/2013/865181/ Background. Next generation sequencing (NGS) is being widely used to identify genetic variants associated with human disease. Although the approach is cost effective, the underlying data is susceptible to many types of error. Importantly, since NGS technologies and protocols are rapidly evolving, with constantly changing steps ranging from sample preparation to data processing software updates, it is important to enable researchers to routinely assess the quality of sequencing and alignment data prior to downstream analyses. Results. Here we describe QPLOT, an automated tool that can facilitate the quality assessment of sequencing run performance. Taking standard sequence alignments as input, QPLOT generates a series of diagnostic metrics summarizing run quality and produces convenient graphical summaries for these metrics. QPLOT is computationally efficient, generates webpages for interactive exploration of detailed results, and can handle the joint output of many sequencing runs. Conclusion. QPLOT is an automated tool that facilitates assessment of sequence run quality. We routinely apply QPLOT to ensure quick detection of sequencing run problems. We hope that QPLOT will be useful to the community as well. Bingshan Li, Xiaowei Zhan, Mary-Kate Wing, Paul Anderson, Hyun Min Kang, and Goncalo R. Abecasis Copyright © 2013 Bingshan Li et al. All rights reserved. 
A Comparative Analysis of Biomarker Selection Techniques Sun, 10 Nov 2013 09:15:14 +0000 http://www.hindawi.com/journals/bmri/2013/387673/ Feature selection has become an essential step in biomarker discovery from high-dimensional genomics data. It is recognized that different feature selection techniques may result in different sets of biomarkers, that is, different groups of genes highly correlated to a given pathological condition, but few direct comparisons exist which quantify these differences in a systematic way. In this paper, we propose a general methodology for comparing the outcomes of different selection techniques in the context of biomarker discovery. The comparison is carried out along two dimensions: (i) measuring the similarity/dissimilarity of selected gene sets; (ii) evaluating the implications of these differences in terms of both predictive performance and stability of selected gene sets. As a case study, we considered three benchmarks deriving from DNA microarray experiments and conducted a comparative analysis among eight selection methods, representatives of different classes of feature selection techniques. Our results show that the proposed approach can provide useful insight about the pattern of agreement of biomarker discovery techniques. Nicoletta Dessì, Emanuele Pascariello, and Barbara Pes Copyright © 2013 Nicoletta Dessì et al. All rights reserved. Prediction of Gene Phenotypes Based on GO and KEGG Pathway Enrichment Scores Thu, 07 Nov 2013 14:53:49 +0000 http://www.hindawi.com/journals/bmri/2013/870795/ Observing what phenotype the overexpression or knockdown of a gene can cause is the basic method of investigating gene functions. Many advanced biotechnologies, such as RNAi, were developed to study gene phenotypes, but there are still many limitations. Besides the time and cost, the knockdown of some genes may be lethal, which makes the observation of other phenotypes impossible. 
Due to ethical and technological reasons, the knockdown of genes in complex species, such as mammals, is extremely difficult. Thus, we proposed a new sequence-based computational method, called the kNNA-based method, for gene phenotype prediction. Unlike traditional sequence-based computational methods, our method treats the multiple phenotypes as a whole network, which can rank the possible phenotypes associated with the query protein and show a more comprehensive view of the protein's biological effects. From the prediction results for yeast, we also find that some features, including GO and KEGG information, contribute more to identifying protein phenotypes. This method can be applied in gene phenotype prediction in other species. Tao Zhang, Min Jiang, Lei Chen, Bing Niu, and Yudong Cai Copyright © 2013 Tao Zhang et al. All rights reserved. ASPic-GeneID: A Lightweight Pipeline for Gene Prediction and Alternative Isoforms Detection Thu, 07 Nov 2013 13:15:40 +0000 http://www.hindawi.com/journals/bmri/2013/502827/ New genomes are being sequenced at an increasingly rapid rate, far outpacing the rate at which manual gene annotation can be performed. Automated genome annotation is thus necessitated by this growth in genome projects; however, full-fledged annotation systems are usually home-grown and customized to a particular genome. There is thus a renewed need for accurate ab initio gene prediction methods. However, it is apparent that fully ab initio methods fall short of the required level of sensitivity and specificity for a quality annotation. Evidence in the form of expressed sequences gives the single biggest improvement in accuracy when used to inform gene predictions. Here, we present a lightweight pipeline for first-pass gene prediction on newly sequenced genomes. 
The two main components are ASPic, a program that derives highly accurate, albeit not necessarily complete, EST-based transcript annotations from EST alignments, and GeneID, a standard gene prediction program, which we have modified to take intron annotations as evidence. The introns output by ASPic CDS predictions are given to GeneID to constrain the exon-chaining process and produce predictions consistent with the underlying EST alignments. The pipeline was successfully tested on the entire C. elegans genome and the 44 ENCODE human pilot regions. Tyler Alioto, Ernesto Picardi, Roderic Guigó, and Graziano Pesole Copyright © 2013 Tyler Alioto et al. All rights reserved. Comparative Study of Exome Copy Number Variation Estimation Tools Using Array Comparative Genomic Hybridization as Control Mon, 04 Nov 2013 14:18:32 +0000 http://www.hindawi.com/journals/bmri/2013/915636/ Exome sequencing using next-generation sequencing technologies is a cost-efficient approach to selectively sequencing coding regions of the human genome for detection of disease variants. One of the lesser known yet important applications of exome sequencing data is to identify copy number variation (CNV). There have been many exome CNV tools developed over the last few years, but the performance and accuracy of these programs have not been thoroughly evaluated. In this study, we systematically compared four popular exome CNV tools (CoNIFER, cn.MOPS, exomeCopy, and ExomeDepth) and evaluated their effectiveness against array comparative genome hybridization (array CGH) platforms. We found that exome CNV tools are capable of identifying CNVs, but they can have problems such as high false positives, low sensitivity, and duplication bias when compared to array CGH platforms. While exome CNV tools do serve their purpose for data mining, careful evaluation and additional validation are highly recommended. 
Based on all these results, we recommend CoNIFER and cn.MOPS for nonpaired exome CNV detection over the other two tools due to a low false-positive rate, although none of the four exome CNV tools performed at an outstanding level when compared to array CGH. Yan Guo, Quanghu Sheng, David C. Samuels, Brian Lehmann, Joshua A. Bauer, Jennifer Pietenpol, and Yu Shyr Copyright © 2013 Yan Guo et al. All rights reserved. Enabling Large-Scale Biomedical Analysis in the Cloud Thu, 31 Oct 2013 09:08:49 +0000 http://www.hindawi.com/journals/bmri/2013/185679/ Recent progress in high-throughput instrumentation has led to an astonishing growth in both volume and complexity of biomedical data collected from various sources. The planet-size data bring serious challenges to storage and computing technologies. Cloud computing is one way to address these challenges because it provides both storage and high-performance computing for large-scale data. This work briefly introduces the data intensive computing system and summarizes existing cloud-based resources in bioinformatics. These developments and applications would help biomedical research make the vast amount of diverse data meaningful and usable. Ying-Chih Lin, Chin-Sheng Yu, and Yen-Jen Lin Copyright © 2013 Ying-Chih Lin et al. All rights reserved. Classifying Human Voices by Using Hybrid SFX Time-Series Preprocessing and Ensemble Feature Selection Tue, 29 Oct 2013 15:28:36 +0000 http://www.hindawi.com/journals/bmri/2013/720834/ Voice biometrics relies on a physiological characteristic, the voice, which is different for each individual person. Due to this uniqueness, voice classification has found useful applications in classifying speakers’ gender, mother tongue or ethnicity (accent), emotion states, identity verification, verbal command control, and so forth. 
In this paper, we adopt a new preprocessing method named Statistical Feature Extraction (SFX) for extracting important features in training a classification model, based on piecewise transformation treating an audio waveform as a time-series. Using SFX we can faithfully remodel statistical characteristics of the time-series; together with spectral analysis, a substantial number of features are extracted in combination. An ensemble is utilized in selecting only the influential features to be used in classification model induction. We focus on the comparison of effects of various popular data mining algorithms on multiple datasets. Our experiment consists of classification tests over four typical categories of human voice data, namely, Female and Male, Emotional Speech, Speaker Identification, and Language Recognition. The experiments yield encouraging results supporting the fact that heuristically choosing significant features from both time and frequency domains indeed produces better performance in voice classification than traditional signal processing techniques alone, like wavelets and LPC-to-CC. Simon Fong, Kun Lan, and Raymond Wong Copyright © 2013 Simon Fong et al. All rights reserved. New aQTL SNPs for the CYP2D6 Identified by a Novel Mediation Analysis of Genome-Wide SNP Arrays, Gene Expression Arrays, and CYP2D6 Activity Tue, 22 Oct 2013 09:11:04 +0000 http://www.hindawi.com/journals/bmri/2013/493019/ Background. Genome-wide association studies (GWAS) have been successful during the last few years. A key challenge is that the interpretation of the results is not straightforward, especially for trans-acting SNPs. Integration of transcriptome data into GWAS may provide clues elucidating the mechanisms by which a genetic variant leads to a disease. Methods. 
Here, we developed a novel mediation analysis approach to identify new expression quantitative trait loci (eQTL) driving CYP2D6 activity by combining genotype, gene expression, and enzyme activity data. Results. 389,573 and 1,214,416 SNP-transcript-CYP2D6 activity trios are found strongly associated (, % and 11.7%) for two different genotype platforms, namely, Affymetrix and Illumina, respectively. The majority of eQTLs are trans-SNPs. A single polymorphism leads to widespread downstream changes in the expression of distant genes by affecting major regulators or transcription factors (TFs), which would be visible as an eQTL hotspot and can lead to large and consistent biological effects. Overlapped eQTL hotspots with the mediators lead to the discovery of 64 TFs. Conclusions. Our mediation analysis is a powerful approach in identifying the trans-QTL-phenotype associations. It improves our understanding of the functional genetic variations for the liver metabolism mechanisms. Guanglong Jiang, Arindom Chakraborty, Zhiping Wang, Malaz Boustani, Yunlong Liu, Todd Skaar, and Lang Li Copyright © 2013 Guanglong Jiang et al. All rights reserved. A Quantitative Analysis of the Impact on Chromatin Accessibility by Histone Modifications and Binding of Transcription Factors in DNase I Hypersensitive Sites Tue, 22 Oct 2013 08:36:13 +0000 http://www.hindawi.com/journals/bmri/2013/914971/ It is known that chromatin features such as histone modifications and the binding of transcription factors exert a significant impact on the “openness” of chromatin. In this study, we present a quantitative analysis of the genome-wide relationship between chromatin features and chromatin accessibility in DNase I hypersensitive sites. We found that these features show distinct preference to localize in open chromatin. 
In order to elucidate the exact impact, we derived quantitative models to directly predict the “openness” of chromatin using histone modification features and transcription factor binding features, respectively. We show that these two types of features are highly predictive for chromatin accessibility from a statistical viewpoint. Moreover, our results indicate that these features are highly redundant and only a small number of features are needed to achieve a very high predictive power. Our study provides new insights into the true biological phenomena and the combinatorial effects of chromatin features on differential DNase I hypersensitivity. Peng Cui, Jing Li, Bo Sun, Menghuan Zhang, Baofeng Lian, Yixue Li, and Lu Xie Copyright © 2013 Peng Cui et al. All rights reserved. A Review for Detecting Gene-Gene Interactions Using Machine Learning Methods in Genetic Epidemiology Mon, 21 Oct 2013 14:59:30 +0000 http://www.hindawi.com/journals/bmri/2013/432375/ Currently, the greatest statistical and computational challenge in genetic epidemiology is to identify and characterize genes that interact with other genes and environmental factors to affect complex multifactorial diseases. These gene-gene interactions are also denoted as epistasis, a phenomenon that cannot be resolved by traditional statistical methods because of the high dimensionality of the data and the occurrence of multiple polymorphisms. Hence, several machine learning methods, namely, neural networks (NNs), support vector machines (SVMs), and random forests (RFs), have been applied to identify susceptibility genes in such common, multifactorial diseases. This paper gives an overview of machine learning methods, describing the methodology of each method and its application in detecting gene-gene and gene-environment interactions. 
Lastly, this paper discusses each machine learning method and presents its strengths and weaknesses in detecting gene-gene interactions in complex human disease. Ching Lee Koo, Mei Jing Liew, Mohd Saberi Mohamad, and Abdul Hakim Mohamed Salleh Copyright © 2013 Ching Lee Koo et al. All rights reserved. Systems Approaches to Modeling Chronic Mucosal Inflammation Mon, 21 Oct 2013 09:18:54 +0000 http://www.hindawi.com/journals/bmri/2013/505864/ The respiratory mucosa is a major coordinator of the inflammatory response in chronic airway diseases, including asthma and chronic obstructive pulmonary disease (COPD). Signals produced by the chronic inflammatory process induce epithelial mesenchymal transition (EMT) that dramatically alters the epithelial cell phenotype. Although the effects of EMT on epigenetic reprogramming and the activation of transcriptional networks are known, its effects on the innate inflammatory response are underexplored. We used a multiplex gene expression profiling platform to investigate the perturbations of the innate pathways induced by TGFβ in a primary airway epithelial cell model of EMT. EMT had dramatic effects on the induction of the innate pathway and the coupling interval of the canonical and noncanonical NF-κB pathways. Simulation experiments demonstrate that rapid, coordinated cap-independent translation of TRAF-1 and NF-κB2 is required to reduce the noncanonical pathway coupling interval. Experiments using amantadine confirmed the prediction that TRAF-1 and NF-κB2/p100 production is mediated by an IRES-dependent mechanism. These data indicate that the epigenetic changes produced by EMT induce dynamic state changes of the innate signaling pathway. Further applications of systems approaches will provide understanding of this complex phenotype through deterministic modeling and multidimensional (genomic and proteomic) profiling. Mridul Kalita, Bing Tian, Boning Gao, Sanjeev Choudhary, Thomas G. Wood, Joseph R. 
Carmical, Istvan Boldogh, Sankar Mitra, John D. Minna, and Allan R. Brasier Copyright © 2013 Mridul Kalita et al. All rights reserved. Statistical Fractal Models Based on GND-PCA and Its Application on Classification of Liver Diseases Wed, 09 Oct 2013 17:37:31 +0000 http://www.hindawi.com/journals/bmri/2013/656391/ A new method is proposed to establish a statistical fractal model for liver disease classification. First, fractal theory is used to construct a high-order tensor; then Generalized N-dimensional Principal Component Analysis (GND-PCA) is used to establish the statistical fractal model and select features from the liver region, with different features assigned different weights; finally, a Support Vector Machine optimized by an Ant Colony algorithm (ACO-SVM) is used to establish the classifier for the recognition of liver disease. In order to verify the effectiveness of the proposed method, the PCA eigenface method and the normal SVM method are chosen as the contrast methods. The experimental results show that the proposed method can reconstruct liver volume better and improve the classification accuracy of liver diseases. Huiyan Jiang, Tianjiao Feng, Di Zhao, Benqiang Yang, Libo Zhang, and Yenwei Chen Copyright © 2013 Huiyan Jiang et al. All rights reserved. Reducing the Complexity of Complex Gene Coexpression Networks by Coupling Multiweighted Labeling with Topological Analysis Mon, 07 Oct 2013 18:38:01 +0000 http://www.hindawi.com/journals/bmri/2013/676328/ Undirected gene coexpression networks obtained from experimental expression data coupled with efficient computational procedures are increasingly used to identify potentially relevant biological information (e.g., biomarkers) for a particular disease. However, coexpression networks built from experimental expression data are in general large highly connected networks with an elevated number of false-positive interactions (nodes and edges). 
In order to infer relevant information, the network must be properly filtered and its complexity reduced. Given the complexity and the multivariate nature of the information contained in the network, this requires the development and application of efficient feature selection algorithms to be able to exploit the topological characteristics of the network to identify relevant nodes and edges. This paper proposes an efficient multivariate filtering designed to analyze the topological properties of a coexpression network in order to identify potential relevant genes for a given disease. The algorithm has been tested on three datasets for three well known and studied diseases: acute myeloid leukemia, breast cancer, and diffuse large B-cell lymphoma. Results have been validated resorting to bibliographic data automatically mined using the ProteinQuest literature mining tool. Alfredo Benso, Paolo Cornale, Stefano Di Carlo, Gianfranco Politano, and Alessandro Savino Copyright © 2013 Alfredo Benso et al. All rights reserved. Reconstruction and Analysis of Human Kidney-Specific Metabolic Network Based on Omics Data Sat, 05 Oct 2013 14:29:08 +0000 http://www.hindawi.com/journals/bmri/2013/187509/ With the advent of the high-throughput data production, recent studies of tissue-specific metabolic networks have largely advanced our understanding of the metabolic basis of various physiological and pathological processes. However, for kidney, which plays an essential role in the body, the available kidney-specific model remains incomplete. This paper reports the reconstruction and characterization of the human kidney metabolic network based on transcriptome and proteome data. In silico simulations revealed that house-keeping genes were more essential than kidney-specific genes in maintaining kidney metabolism. Importantly, a total of 267 potential metabolic biomarkers for kidney-related diseases were successfully explored using this model. 
Furthermore, we found that the discrepancies in the metabolic processes of different tissues correspond directly to the tissues' functions. Finally, the phenotypes of the differentially expressed genes in diabetic kidney disease were characterized, suggesting that these genes may affect disease development through altering kidney metabolism. Thus, the human kidney-specific model constructed in this study may provide valuable information for the metabolism of kidney and offer excellent insights into complex kidney diseases. Ai-Di Zhang, Shao-Xing Dai, and Jing-Fei Huang Copyright © 2013 Ai-Di Zhang et al. All rights reserved. A Guide RNA Sequence Design Platform for the CRISPR/Cas9 System for Model Organism Genomes Thu, 03 Oct 2013 15:34:20 +0000 http://www.hindawi.com/journals/bmri/2013/270805/ Cas9/CRISPR has been reported to efficiently induce targeted gene disruption and homologous recombination in both prokaryotic and eukaryotic cells. Thus, we developed a Guide RNA Sequence Design Platform for the Cas9/CRISPR silencing system for model organisms. The platform is easy to use for gRNA design with input query sequences. It finds potential targets by PAM and ranks them according to factors including uniqueness, SNP, RNA secondary structure, and AT content. The platform allows users to upload and share their experimental results. In addition, most guide RNA sequences from published papers have been put into our database. Ming Ma, Adam Y. Ye, Weiguo Zheng, and Lei Kong Copyright © 2013 Ming Ma et al. All rights reserved. Systems Approaches Evaluating the Perturbation of Xenobiotic Metabolism in Response to Cigarette Smoke Exposure in Nasal and Bronchial Tissues Thu, 03 Oct 2013 11:51:13 +0000 http://www.hindawi.com/journals/bmri/2013/512086/ Capturing the effects of exposure in a specific target organ is a major challenge in risk assessment. Exposure to cigarette smoke (CS) implicates the field of tissue injury in the lung as well as nasal and airway epithelia. 
Xenobiotic metabolism in particular becomes an attractive tool for chemical risk assessment because of its responsiveness to toxic compounds, including those present in CS. This study describes an efficient integration from transcriptomic data to quantitative measures, which reflect the responses to xenobiotics that are captured in a biological network model. We show here that our novel systems approach can quantify the perturbation in the network model of xenobiotic metabolism. We further show that this approach efficiently compares the perturbation upon CS exposure in in vivo bronchial and nasal epithelial cell samples obtained from smokers. Our observations suggest that the xenobiotic responses in the bronchial and nasal epithelial cells of smokers were similar to those observed in their respective organotypic models exposed to CS. Furthermore, the results suggest that nasal tissue is a reliable surrogate to measure xenobiotic responses in bronchial tissue. Anita R. Iskandar, Florian Martin, Marja Talikka, Walter K. Schlage, Radina Kostadinova, Carole Mathis, Julia Hoeng, and Manuel C. Peitsch Copyright © 2013 Anita R. Iskandar et al. All rights reserved. Biocloud: Cloud Computing for Biological, Genomics, and Drug Design Wed, 02 Oct 2013 15:53:48 +0000 http://www.hindawi.com/journals/bmri/2013/909470/ Ching-Hsien Hsu, Chun-Yuan Lin, Ming Ouyang, and Yi Ke Guo Copyright © 2013 Ching-Hsien Hsu et al. All rights reserved. An Accurate Method for Prediction of Protein-Ligand Binding Site on Protein Surface Using SVM and Statistical Depth Function Mon, 30 Sep 2013 15:09:38 +0000 http://www.hindawi.com/journals/bmri/2013/409658/ Since proteins carry out their functions through interactions with other molecules, accurately identifying the protein-ligand binding site plays an important role in protein functional annotation and rational drug discovery. In the past two decades, many algorithms have been presented to predict the protein-ligand binding site. 
In this paper, we introduce a statistical depth function to define negative samples and propose an SVM-based method which integrates sequence and structural information to predict binding sites. The results show that the present method performs better than the existing ones. The accuracy, sensitivity, and specificity on the training set are 77.55%, 56.15%, and 87.96%, respectively; on the independent test set, the accuracy, sensitivity, and specificity are 80.36%, 53.53%, and 92.38%, respectively. Kui Wang, Jianzhao Gao, Shiyi Shen, Jack A. Tuszynski, Jishou Ruan, and Gang Hu Copyright © 2013 Kui Wang et al. All rights reserved. Highly Ordered Architecture of MicroRNA Cluster Mon, 30 Sep 2013 09:44:09 +0000 http://www.hindawi.com/journals/bmri/2013/463168/ Although it is known that the placement of genes in a cluster may be critical for proper expression patterns, it remains largely unclear whether the order of members in an miRNA cluster has biological significance. By investigating the relationship between expression and order for miRNAs from the oncogenic miR-17-92 cluster, we observed a highly ordered architecture in this cluster. A significant correlation between miRNA expression level and its placement was revealed. More importantly, the placement of these miRNAs is associated with their dysregulation in cancer. Here, we present the opinion that miRNA clusters are not arranged randomly but show highly ordered architectures, which may have critical roles in physiology and pathology. Bing Shi, Mingxuan Zhu, Shuang Liu, and Mandun Zhang Copyright © 2013 Bing Shi et al. All rights reserved. Genome-Wide Analysis of Human MicroRNA Stability Sat, 28 Sep 2013 12:12:40 +0000 http://www.hindawi.com/journals/bmri/2013/368975/ A growing number of studies have shown that microRNA (miRNA) stability plays important roles in physiology. However, the global picture of miRNA stability remains largely unknown. 
Here, we analyzed genome-wide miRNA stability across 10 diverse cell types using miRNA arrays. We found that miRNA stability shows high dynamics and diversity both within individual cells and across cell types. Strikingly, we observed a negative correlation between miRNA stability and miRNA expression level, which differs from current findings on other biological molecules such as proteins and mRNAs, which show positive rather than negative correlations between stability and expression level. This finding indicates that miRNA has a distinct action mode, which we called “rapid production, rapid turnover; slow production, slow turnover.” This mode further suggests that highly expressed miRNAs normally degrade fast and may endow the cell with special properties that facilitate cellular status-transition. Moreover, we revealed that the stability of miRNAs is affected by cohorts of factors that include miRNA targets, transcription factors, nucleotide content, evolution, associated disease, and environmental factors. Together, our results provide an extensive description of the global landscape, dynamics, and distinct mode of human miRNA stability, which will help in investigating their functions in physiology and pathophysiology. Yang Li, Zhixin Li, Shixin Zhou, Jinhua Wen, Bin Geng, Jichun Yang, and Qinghua Cui Copyright © 2013 Yang Li et al. All rights reserved. Computer-Assisted System with Multiple Feature Fused Support Vector Machine for Sperm Morphology Diagnosis Thu, 26 Sep 2013 14:23:31 +0000 http://www.hindawi.com/journals/bmri/2013/687607/ Sperm morphology analysis is an important technique for assessing the health of sperm. In this paper we present a new system and novel approaches to classify different kinds of sperm images in order to assess their health. Our approach mainly relies on a one-dimensional feature which is extracted from the sperm’s contour with gray level information. Our approach can handle rotation and scaling of the image. 
Moreover, it is fused with SVM classification to improve its accuracy. In our evaluation, our method has better performance than the existing approaches to sperm classification. Kuo-Kun Tseng, Yifan Li, Chih-Yu Hsu, Huang-Nan Huang, Ming Zhao, and Mingyue Ding Copyright © 2013 Kuo-Kun Tseng et al. All rights reserved. Enzyme Reaction Annotation Using Cloud Techniques Thu, 26 Sep 2013 12:13:04 +0000 http://www.hindawi.com/journals/bmri/2013/140237/ An understanding of the activities of enzymes could help to elucidate the metabolic pathways of thousands of chemical reactions that are catalyzed by enzymes in living systems. Sophisticated applications such as drug design and metabolic reconstruction could be developed using accurate enzyme reaction annotation. Because accurate enzyme reaction annotation methods create potential for enhanced production capacity in these applications, they have received greater attention in the global market. We propose the enzyme reaction prediction (ERP) method as a novel tool to deduce enzyme reactions from domain architecture. We used several frequency relationships between architectures and reactions to enhance the annotation rates for single and multiple catalyzed reactions. The deluge of information which arose from high-throughput techniques in the postgenomic era has improved our understanding of biological data, although it presents obstacles in the data-processing stage. The high computational capacity provided by cloud computing has resulted in an exponential growth in the volume of incoming data. Cloud services also relieve the requirement for large-scale memory space required by this approach to analyze enzyme kinetic data. Our tool is designed as a single execution file; thus, it could be applied to any cloud platform in which multiple queries are supported. Chuan-Ching Huang, Chun-Yuan Lin, Cheng-Wen Chang, and Chuan Yi Tang Copyright © 2013 Chuan-Ching Huang et al. All rights reserved. 
The Quantitative Overhead Analysis for Effective Task Migration in Biosensor Networks Thu, 26 Sep 2013 10:16:33 +0000 http://www.hindawi.com/journals/bmri/2013/965318/ We present a quantitative overhead analysis for effective task migration in biosensor networks. A biosensor network is the key technology which can automatically provide accurate and specific parameters of a human in real time. Biosensor nodes are typically very small devices, so their use of computing resources is restricted. Due to these limitations of the nodes, the biosensor network is vulnerable to external attacks that aim to exhaust system availability. Since biosensor nodes generally deal with sensitive and private data, their malfunction can bring unexpected damage to the system. Therefore, we have to use a task migration process to avoid the malfunction of particular biosensor nodes. Also, it is essential to accurately analyze overhead to apply a proper migration process. In this paper, we calculated the task processing time of nodes to analyze system overhead and compared the task processing time under a migration process with that under a general method. We focused on the cluster ratio and the differences in processing time between biosensor nodes in our simulation environment. The results of the performance evaluation show that task execution time is greatly influenced by the cluster ratio and the differences in processing time of biosensor nodes. In the results, the proposed algorithm reduces total task execution time in a migration process. Sung-Min Jung, Tae-Kyung Kim, Jung-Ho Eom, and Tai-Myoung Chung Copyright © 2013 Sung-Min Jung et al. All rights reserved. Mixing Energy Models in Genetic Algorithms for On-Lattice Protein Structure Prediction Wed, 25 Sep 2013 11:32:53 +0000 http://www.hindawi.com/journals/bmri/2013/924137/ Protein structure prediction (PSP) is computationally a very challenging problem. 
The challenge largely comes from the fact that the energy function that needs to be minimised in order to obtain the native structure of a given protein is not clearly known. A high resolution energy model could better capture the behaviour of the actual energy function than a low resolution energy model such as hydrophobic polar. However, the fine grained details of the high resolution interaction energy matrix are often not very informative for guiding the search. In contrast, a low resolution energy model could effectively bias the search towards certain promising directions. In this paper, we develop a genetic algorithm that mainly uses a high resolution energy model for protein structure evaluation but uses a low resolution HP energy model in focussing the search towards exploring structures that have hydrophobic cores. We experimentally show that this mixing of energy models leads to significantly lower-energy structures compared to the state-of-the-art results. Mahmood A. Rashid, M. A. Hakim Newton, Md. Tamjidul Hoque, and Abdul Sattar Copyright © 2013 Mahmood A. Rashid et al. All rights reserved. Advanced Systems Biology Methods in Drug Discovery and Translational Biomedicine Thu, 19 Sep 2013 13:38:59 +0000 http://www.hindawi.com/journals/bmri/2013/742835/ Systems biology has developed rapidly in recent years and has been widely utilized in biomedicine to better understand the molecular basis of human disease and the mechanism of drug action. Here, we discuss the fundamental concept of systems biology and its two commonly used computational methods, that is, network analysis and dynamical modeling. The applications of systems biology in elucidating human disease are highlighted, consisting of human disease networks, treatment response prediction, investigation of disease mechanisms, and disease-associated gene prediction. 
In addition, important advances in drug discovery, to which systems biology makes significant contributions, are discussed, including drug-target networks, prediction of drug-target interactions, investigation of drug adverse effects, drug repositioning, and drug combination prediction. The systems biology methods and applications covered in this review provide a framework for addressing disease mechanism and approaching drug discovery, which will facilitate the translation of research findings into clinical benefits such as novel biomarkers and promising therapies. Jun Zou, Ming-Wu Zheng, Gen Li, and Zhi-Guang Su Copyright © 2013 Jun Zou et al. All rights reserved. Evaluation of Stream Mining Classifiers for Real-Time Clinical Decision Support System: A Case Study of Blood Glucose Prediction in Diabetes Therapy Thu, 19 Sep 2013 10:05:03 +0000 http://www.hindawi.com/journals/bmri/2013/274193/ Earlier on, a conceptual design on the real-time clinical decision support system (rt-CDSS) with data stream mining was proposed and published. The new system is introduced that can analyze medical data streams and can make real-time prediction. This system is based on a stream mining algorithm called VFDT. The VFDT is extended with the capability of using pointers to allow the decision tree to remember the mapping relationship between leaf nodes and the history records. In this paper, which is a sequel to the rt-CDSS design, several popular machine learning algorithms are investigated for their suitability to be a candidate in the implementation of classifier at the rt-CDSS. A classifier essentially needs to accurately map the events inputted to the system into one of the several predefined classes of assessments, such that the rt-CDSS can follow up with the prescribed remedies being recommended to the clinicians. 
For a real-time system like rt-CDSS, the major technological challenges lie in the capability of the classifier to process, analyze, and classify the dynamic input data quickly and with utmost reliability. An experimental comparison is conducted. This paper contributes to the insight of choosing and embedding a stream mining classifier into rt-CDSS with a case study of diabetes therapy. Simon Fong, Yang Zhang, Jinan Fiaidhi, Osama Mohammed, and Sabah Mohammed Copyright © 2013 Simon Fong et al. All rights reserved. New Optical Methods for Liveness Detection on Fingers Wed, 18 Sep 2013 19:13:30 +0000 http://www.hindawi.com/journals/bmri/2013/197925/ This paper is devoted to new optical methods intended for liveness detection on fingers. First we describe the basics of fake finger use in the fingerprint recognition process and the possibilities of liveness detection. Then we introduce three new liveness detection methods, which we developed and tested in the scope of our research activities: the first one is based on measurement of the pulse, the second one on variations of optical characteristics caused by pressure change, and the last one on the reaction of skin to illumination with different wavelengths. The last part deals with the influence of skin diseases on fingerprint recognition, especially on liveness detection. Martin Drahansky, Michal Dolezel, Jan Vana, Eva Brezinova, Jaegeol Yim, and Kyubark Shim Copyright © 2013 Martin Drahansky et al. All rights reserved. CADe System Integrated within the Electronic Health Record Tue, 17 Sep 2013 13:34:59 +0000 http://www.hindawi.com/journals/bmri/2013/219407/ The latest technological advances and information support systems for clinics and hospitals produce a wide range of possibilities in the storage and retrieval of an ever-growing amount of clinical information as well as in detection and diagnosis. 
In this work, an Electronic Health Record (EHR) combined with a Computer Aided Detection (CADe) system for breast cancer diagnosis has been implemented. Our objective is to provide to radiologists a comprehensive working environment that facilitates the integration, the image visualization, and the use of aided tools within the EHR. For this reason, a development methodology based on hardware and software system features in addition to system requirements must be present during the whole development process. This will lead to a complete environment for displaying, editing, and reporting results not only for the patient information but also for their medical images in standardised formats such as DICOM and DICOM-SR. As a result, we obtain a CADe system which helps in detecting breast cancer using mammograms and is completely integrated into an EHR. Noelia Vállez, Gloria Bueno, Óscar Déniz, María del Milagro Fernández, Carlos Pastor, Miguel Ángel Rienda, Pablo Esteve, and María Arias Copyright © 2013 Noelia Vállez et al. All rights reserved. Study of MicroRNAs Related to the Liver Regeneration of the Whitespotted Bamboo Shark, Chiloscyllium plagiosum Tue, 17 Sep 2013 09:55:52 +0000 http://www.hindawi.com/journals/bmri/2013/795676/ To understand the mechanisms of liver regeneration better to promote research examining liver diseases and marine biology, normal and regenerative liver tissues of Chiloscyllium plagiosum were harvested 0 h and 24 h after partial hepatectomy (PH) and used to isolate small RNAs for miRNA sequencing. In total, 91 known miRNAs and 166 putative candidate (PC) miRNAs were identified for the first time in Chiloscyllium plagiosum. Through target prediction and GO analysis, 46 of 91 known miRNAs were screened specially for cellular proliferation and growth. 
Differential expression levels of three miRNAs (xtr-miR-125b, fru-miR-204, and hsa-miR-142-3p_R-1) related to cellular proliferation and apoptosis were measured in normal and regenerating liver tissues at 0 h, 6 h, 12 h, and 24 h using real-time PCR. The expression of these miRNAs showed a rising trend in regenerative liver tissues at 6 h and 12 h but exhibited a downward trend compared to normal levels at 24 h. Differentially expressed genes were screened in normal and regenerating liver tissues at 24 h by DDRT-PCR, and ten sequences were identified. This study provided information regarding the function of genes related to liver regeneration, deepened the understanding of mechanisms of liver regeneration, and resulted in the addition of a significant number of novel miRNA sequences to GenBank. Conger Lu, Jie Zhang, Zuoming Nie, Jian Chen, Wenping Zhang, Xiaoyuan Ren, Wei Yu, Lili Liu, Caiying Jiang, Yaozhou Zhang, Jiangfeng Guo, Wutong Wu, Jianhong Shu, and Zhengbing Lv Copyright © 2013 Conger Lu et al. All rights reserved. A Study on User Authentication Methodology Using Numeric Password and Fingerprint Biometric Information Tue, 17 Sep 2013 08:35:22 +0000 http://www.hindawi.com/journals/bmri/2013/427542/ The prevalence of computers and the development of the Internet have made it easy to access information. As people are concerned about user information security, interest in user authentication methods is growing. The most common computer authentication method is the use of alphanumerical usernames and passwords. The password authentication systems currently in use are easy to use, but they are vulnerable because anyone who knows the password can authenticate. User authentication using fingerprints offers strong security because only the user possesses the information specific to the authentication. However, it has disadvantages; for example, the user cannot change the authentication key. 
In this study, we proposed an authentication methodology that combines a numeric password with a fingerprint-based biometric authentication system. The authentication keys use the information in the user's fingerprint to obtain security. Also, because a numeric password can easily be changed, the authentication keys were designed to provide flexibility. Seung-hwan Ju, Hee-suk Seo, Sung-hyu Han, Jae-cheol Ryou, and Jin Kwak Copyright © 2013 Seung-hwan Ju et al. All rights reserved. Biomarker Selection and Classification of “-Omics” Data Using a Two-Step Bayes Classification Framework Wed, 11 Sep 2013 11:40:11 +0000 http://www.hindawi.com/journals/bmri/2013/148014/ Identification of suitable biomarkers for accurate prediction of phenotypic outcomes is a goal for personalized medicine. However, current machine learning approaches are either too complex or perform poorly. Here, a novel two-step machine-learning framework is presented to address this need. First, a Naïve Bayes estimator is used to rank features from which the top-ranked will most likely contain the most informative features for prediction of the underlying biological classes. The top-ranked features are then used in a Hidden Naïve Bayes classifier to construct a classification prediction model from these filtered attributes. In order to obtain the minimum set of the most informative biomarkers, the bottom-ranked features are successively removed from the Naïve Bayes-filtered feature list one at a time, and the classification accuracy of the Hidden Naïve Bayes classifier is checked for each pruned feature set. The performance of the proposed two-step Bayes classification framework was tested on different types of -omics datasets including gene expression microarray, single nucleotide polymorphism microarray (SNParray), and surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) proteomic data. 
The proposed two-step Bayes classification framework was equal to and, in some cases, outperformed other classification methods in terms of prediction accuracy, minimum number of classification markers, and computational time. Anunchai Assawamakin, Supakit Prueksaaroon, Supasak Kulawonganunchai, Philip James Shaw, Vara Varavithya, Taneth Ruangrajitpakorn, and Sissades Tongsima Copyright © 2013 Anunchai Assawamakin et al. All rights reserved. Cloud Infrastructures for In Silico Drug Discovery: Economic and Practical Aspects Tue, 10 Sep 2013 08:15:34 +0000 http://www.hindawi.com/journals/bmri/2013/138012/ Cloud computing opens new perspectives for small-medium biotechnology laboratories that need to perform bioinformatics analysis in a flexible and effective way. This seems particularly true for hybrid clouds that couple the scalability offered by general-purpose public clouds with the greater control and ad hoc customizations supplied by the private ones. A hybrid cloud broker, acting as an intermediary between users and public providers, can support customers in the selection of the most suitable offers, optionally adding the provisioning of dedicated services with higher levels of quality. This paper analyses some economic and practical aspects of exploiting cloud computing in a real research scenario for the in silico drug discovery in terms of requirements, costs, and computational load based on the number of expected users. In particular, our work is aimed at supporting both the researchers and the cloud broker delivering an IaaS cloud infrastructure for biotechnology laboratories exposing different levels of nonfunctional requirements. Daniele D'Agostino, Andrea Clematis, Alfonso Quarati, Daniele Cesini, Federica Chiappori, Luciano Milanesi, and Ivan Merelli Copyright © 2013 Daniele D'Agostino et al. All rights reserved. 
Characterization of Schizophrenia Adverse Drug Interactions through a Network Approach and Drug Classification Mon, 09 Sep 2013 17:58:21 +0000 http://www.hindawi.com/journals/bmri/2013/458989/ Antipsychotic drugs are medications commonly used for schizophrenia (SCZ) treatment and fall into two groups: typical and atypical. SCZ patients have multiple comorbidities, and the coadministration of drugs is quite common. This may result in adverse drug-drug interactions, which are events that occur when the effect of a drug is altered by the coadministration of another drug. Therefore, it is important to provide a comprehensive view of these interactions for further coadministration improvement. Here, we extracted SCZ drugs and their adverse drug interactions from the DrugBank and compiled an SCZ-specific adverse drug interaction network. This network included 28 SCZ drugs, 241 non-SCZ drugs, and 991 interactions. By integrating the Anatomical Therapeutic Chemical (ATC) classification with the network analysis, we characterized those interactions. Our results indicated that SCZ drugs tended to have more adverse drug interactions than other drugs. Furthermore, SCZ typical drugs had significant interactions with drugs of the “alimentary tract and metabolism” category while SCZ atypical drugs had significant interactions with drugs of the categories “nervous system” and “antiinfectives for systemic uses.” This study is the first to characterize the adverse drug interactions in the course of SCZ treatment and might provide useful information for future SCZ treatment. Jingchun Sun, Min Zhao, Ayman H. Fanous, and Zhongming Zhao Copyright © 2013 Jingchun Sun et al. All rights reserved. 
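The comparison of interaction counts between SCZ drugs and other drugs described in the abstract above can be illustrated with a small sketch. This is only an assumption about how such a degree comparison might look, not the authors' analysis: the drug names, the `build_network` and `mean_degree` helpers, and the toy interaction list are all hypothetical, and no DrugBank data is used.

```python
from collections import defaultdict


def build_network(interactions):
    """Build an undirected adjacency map from (drug_a, drug_b) pairs."""
    adj = defaultdict(set)
    for a, b in interactions:
        if a != b:  # ignore self-interactions
            adj[a].add(b)
            adj[b].add(a)
    return adj


def mean_degree(adj, drugs):
    """Average number of interaction partners over the given drugs."""
    drugs = list(drugs)
    if not drugs:
        return 0.0
    return sum(len(adj.get(d, ())) for d in drugs) / len(drugs)


# Hypothetical toy data, for illustration only.
toy_interactions = [
    ("scz_A", "drug_1"),
    ("scz_A", "drug_2"),
    ("scz_A", "scz_B"),
    ("scz_B", "drug_1"),
    ("drug_2", "drug_3"),
]
scz_drugs = {"scz_A", "scz_B"}

adj = build_network(toy_interactions)
non_scz_drugs = set(adj) - scz_drugs

scz_mean = mean_degree(adj, scz_drugs)
non_scz_mean = mean_degree(adj, non_scz_drugs)
print(f"mean degree, SCZ drugs: {scz_mean:.2f}")
print(f"mean degree, other drugs: {non_scz_mean:.2f}")
```

A real analysis would also need a significance test on the degree difference and the ATC category of each partner drug; both are omitted here.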
Structural and Sequence Similarities of Hydra Xeroderma Pigmentosum A Protein to Human Homolog Suggest Early Evolution and Conservation Thu, 05 Sep 2013 15:39:25 +0000 http://www.hindawi.com/journals/bmri/2013/854745/ Xeroderma pigmentosum group A (XPA) is a protein that binds to damaged DNA, verifies presence of a lesion, and recruits other proteins of the nucleotide excision repair (NER) pathway to the site. Though its homologs from yeast, Drosophila, humans, and so forth are well studied, XPA has not so far been reported from protozoa and lower animal phyla. Hydra is a fresh-water cnidarian with a remarkable capacity for regeneration and apparent lack of organismal ageing. Cnidarians are among the first metazoa with a defined body axis, tissue grade organisation, and nervous system. We report here for the first time presence of XPA gene in hydra. Putative protein sequence of hydra XPA contains nuclear localization signal and bears the zinc-finger motif. It contains two conserved Pfam domains and various characterized features of XPA proteins like regions for binding to excision repair cross-complementing protein-1 (ERCC1) and replication protein A 70 kDa subunit (RPA70) proteins. Hydra XPA shows a high degree of similarity with vertebrate homologs and clusters with deuterostomes in phylogenetic analysis. Homology modelling corroborates the very close similarity between hydra and human XPA. The protein thus most likely functions in hydra in the same manner as in other animals, indicating that it arose early in evolution and has been conserved across animal phyla. Apurva Barve, Saroj Ghaskadbi, and Surendra Ghaskadbi Copyright © 2013 Apurva Barve et al. All rights reserved. In Silico Determination and Validation of Baumannii Acinetobactin Utilization A Structure and Ligand Binding Site Thu, 05 Sep 2013 15:17:04 +0000 http://www.hindawi.com/journals/bmri/2013/172784/ Acinetobacter baumannii is a deadly nosocomial pathogen. 
Iron is an essential element for the pathogen. Under iron-restricted conditions, the bacterium expresses iron-regulated outer membrane proteins (IROMPs). Baumannii acinetobactin utilization (BauA) is the most important member of IROMPs in A. baumannii. Determination of its tertiary structure could help deduce its functions and its interactions with ligands. The present study unveils the BauA 3D structure via in silico approaches. Apart from ab initio, other rational methods such as homology modeling and threading were invoked to achieve the purpose. For homology modeling, BLAST was run on the sequence in order to find the best template. The template was then used to model the 3D structure. All the models built were evaluated qualitatively. The best model predicted by LOMETS was selected for analyses. Refinement of the 3D structure as well as determination of its clefts and ligand binding sites was carried out on the structure. In contrast to the typical trimeric arrangement found in porins, BauA is monomeric. The barrel is formed by 22 antiparallel transmembrane β-strands. There are short periplasmic turns and longer surface-located loops. An N-terminal domain referred to either as the cork, the plug, or the hatch domain occludes the β-barrel. Fatemeh Sefid, Iraj Rasooli, and Abolfazl Jahangiri Copyright © 2013 Fatemeh Sefid et al. All rights reserved. Prediction of Effective Drug Combinations by Chemical Interaction, Protein Interaction and Target Enrichment of KEGG Pathways Thu, 05 Sep 2013 11:22:39 +0000 http://www.hindawi.com/journals/bmri/2013/723780/ Drug combinatorial therapy could be more effective in treating some complex diseases than single agents due to better efficacy and reduced side effects. Although some drug combinations are being used, their underlying molecular mechanisms are still poorly understood. Therefore, it is of great interest to deduce novel drug combinations from their molecular mechanisms in a robust and rigorous way. 
This paper attempts to predict effective drug combinations by a combined consideration of: (1) chemical interaction between drugs, (2) protein interactions between drugs’ targets, and (3) target enrichment of KEGG pathways. A benchmark dataset was constructed, consisting of 121 confirmed effective combinations and 605 random combinations. Each drug combination was represented by 465 features derived from the aforementioned three properties. Some feature selection techniques, including Minimum Redundancy Maximum Relevance and Incremental Feature Selection, were adopted to extract the key features. A random forest model was built, and its performance was evaluated by 5-fold cross-validation. As a result, 55 key features providing the best prediction result were selected. These important features may help to gain insights into the mechanisms of drug combinations, and the proposed prediction model could become a useful tool for screening possible drug combinations. Lei Chen, Bi-Qing Li, Ming-Yue Zheng, Jian Zhang, Kai-Yan Feng, and Yu-Dong Cai Copyright © 2013 Lei Chen et al. All rights reserved. Predicting Drugs Side Effects Based on Chemical-Chemical Interactions and Protein-Chemical Interactions Wed, 04 Sep 2013 08:31:26 +0000 http://www.hindawi.com/journals/bmri/2013/485034/ A drug side effect is an undesirable effect which occurs in addition to the intended therapeutic effect of the drug. The unexpected side effects that many patients suffer from are the major causes of large-scale drug withdrawal. To address the problem, the pharmaceutical industry has a strong demand for computational methods that predict the side effects of drugs. In this study, a novel computational method was developed to predict the side effects of drug compounds by hybridizing the chemical-chemical and protein-chemical interactions. Unlike most previous works, our method can rank the potential side effects for any query drug according to their predicted level of risk. 
A training dataset and test datasets were constructed from the benchmark dataset that contains 835 drug compounds to evaluate the method. By a jackknife test on the training dataset, the 1st order prediction accuracy was 86.30%, while it was 89.16% on the test dataset. It is expected that the new method may become a useful tool for drug design, and that the findings obtained by hybridizing various interactions in a network system may provide useful insights for conducting in-depth pharmacological research as well, particularly at the level of systems biomedicine. Lei Chen, Tao Huang, Jian Zhang, Ming-Yue Zheng, Kai-Yan Feng, Yu-Dong Cai, and Kuo-Chen Chou Copyright © 2013 Lei Chen et al. All rights reserved. Information Content-Based Gene Ontology Semantic Similarity Approaches: Toward a Unified Framework Theory Mon, 02 Sep 2013 14:28:42 +0000 http://www.hindawi.com/journals/bmri/2013/292063/ Several approaches have been proposed for computing term information content (IC) and semantic similarity scores within the gene ontology (GO) directed acyclic graph (DAG). These approaches contributed to improving protein analyses at the functional level. Considering the recent proliferation of these approaches, a unified theory in a well-defined mathematical framework is necessary in order to provide a theoretical basis for validating these approaches. We review the existing IC-based ontological similarity approaches developed in the context of biomedical and bioinformatics fields to propose a general framework and unified description of all these measures. We have conducted an experimental evaluation to assess the impact of IC approaches, different normalization models, and correction factors on the performance of a functional similarity metric. Results reveal that considering only parents or only children of terms when assessing information content or semantic similarity scores negatively impacts the approach under consideration. 
This study produces a unified framework for current and future GO semantic similarity measures and provides theoretical basics for comparing different approaches. The experimental evaluation of different approaches based on different term information content models paves the way towards a solution to the issue of scoring a term’s specificity in the GO DAG. Gaston K. Mazandu and Nicola J. Mulder Copyright © 2013 Gaston K. Mazandu and Nicola J. Mulder. All rights reserved. Network-Based Inference Framework for Identifying Cancer Genes from Gene Expression Data Sun, 01 Sep 2013 13:24:33 +0000 http://www.hindawi.com/journals/bmri/2013/401649/ Great efforts have been devoted to alleviate uncertainty of detected cancer genes as accurate identification of oncogenes is of tremendous significance and helps unravel the biological behavior of tumors. In this paper, we present a differential network-based framework to detect biologically meaningful cancer-related genes. Firstly, a gene regulatory network construction algorithm is proposed, in which a boosting regression based on likelihood score and informative prior is employed for improving accuracy of identification. Secondly, with the algorithm, two gene regulatory networks are constructed from case and control samples independently. Thirdly, by subtracting the two networks, a differential-network model is obtained and then used to rank differentially expressed hub genes for identification of cancer biomarkers. Compared with two existing gene-based methods (t-test and lasso), the method has a significant improvement in accuracy both on synthetic datasets and two real breast cancer datasets. Furthermore, identified six genes (TSPYL5, CD55, CCNE2, DCK, BBC3, and MUC1) susceptible to breast cancer were verified through the literature mining, GO analysis, and pathway functional enrichment analysis. 
Among these oncogenes, TSPYL5 and CCNE2 have been already known as prognostic biomarkers in breast cancer, CD55 has been suspected of playing an important role in breast cancer prognosis from literature evidence, and other three genes are newly discovered breast cancer biomarkers. More generally, the differential-network schema can be extended to other complex diseases for detection of disease associated-genes. Bo Yang, Junying Zhang, Yaling Yin, and Yuanyuan Zhang Copyright © 2013 Bo Yang et al. All rights reserved. Secure Encapsulation and Publication of Biological Services in the Cloud Computing Environment Sun, 01 Sep 2013 11:35:18 +0000 http://www.hindawi.com/journals/bmri/2013/170580/ Secure encapsulation and publication for bioinformatics software products based on web service are presented, and the basic function of biological information is realized in the cloud computing environment. In the encapsulation phase, the workflow and function of bioinformatics software are conducted, the encapsulation interfaces are designed, and the runtime interaction between users and computers is simulated. In the publication phase, the execution and management mechanisms and principles of the GRAM components are analyzed. The functions such as remote user job submission and job status query are implemented by using the GRAM components. The services of bioinformatics software are published to remote users. Finally the basic prototype system of the biological cloud is achieved. Weizhe Zhang, Xuehui Wang, Bo Lu, and Tai-hoon Kim Copyright © 2013 Weizhe Zhang et al. All rights reserved. Selecting Summary Statistics in Approximate Bayesian Computation for Calibrating Stochastic Models Sun, 01 Sep 2013 09:47:51 +0000 http://www.hindawi.com/journals/bmri/2013/210646/ Approximate Bayesian computation (ABC) is an approach for using measurement data to calibrate stochastic computer models, which are common in biology applications. 
ABC is becoming the “go-to” option when the data and/or parameter dimension is large because it relies on user-chosen summary statistics rather than the full data and is therefore computationally feasible. One technical challenge with ABC is that the quality of the approximation to the posterior distribution of model parameters depends on the user-chosen summary statistics. In this paper, the user requirement to choose effective summary statistics in order to accurately estimate the posterior distribution of model parameters is investigated and illustrated by example, using a model and corresponding real data of mitochondrial DNA population dynamics. We show that for some choices of summary statistics, the posterior distribution of model parameters is closely approximated and for other choices of summary statistics, the posterior distribution is not closely approximated. A strategy to choose effective summary statistics is suggested in cases where the stochastic computer model can be run at many trial parameter settings, as in the example. Tom Burr and Alexei Skurikhin Copyright © 2013 Tom Burr and Alexei Skurikhin. All rights reserved. Molecular Dynamic Simulation and Inhibitor Prediction of Cysteine Synthase Structured Model as a Potential Drug Target for Trichomoniasis Sun, 01 Sep 2013 08:07:54 +0000 http://www.hindawi.com/journals/bmri/2013/390920/ In the present research, we attempted to predict the 3D model for cysteine synthase (A2GMG5_TRIVA) using homology-modeling approaches. To investigate the predicted structure further, we performed a molecular dynamics simulation for 10 ns and carried out several supporting analyses of structural properties, such as RMSF, radius of gyration, and total energy, to support the predicted structural model of cysteine synthase. The present findings led us to conclude that the proposed model is stereochemically stable. 
The overall PROCHECK G factor for the homology-modeled structure was −0.04. On the basis of the virtual screening for cysteine synthase against the NCI subset II molecule, we present the molecule 1-N, 4-N-bis [3-(1H-benzimidazol-2-yl) phenyl] benzene-1,4-dicarboxamide (ZINC01690699) having the minimum energy score (−13.0 Kcal/Mol) and a log P value of 6 as a potential inhibitory molecule used to inhibit the growth of T. vaginalis infection. Satendra Singh, Gaurav Sablok, Rohit Farmer, Atul Kumar Singh, Budhayash Gautam, and Sunil Kumar Copyright © 2013 Satendra Singh et al. All rights reserved. Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data Thu, 29 Aug 2013 15:03:53 +0000 http://www.hindawi.com/journals/bmri/2013/409062/ Phylogenetic trees are used to represent the evolutionary relationship among various groups of species. In this paper, a novel method for inferring prokaryotic phylogenies using multiple genomic information is proposed. The method is called CGCPhy and based on the distance matrix of orthologous gene clusters between whole-genome pairs. CGCPhy comprises four main steps. First, orthologous genes are determined by sequence similarity, genomic function, and genomic structure information. Second, genes involving potential HGT events are eliminated, since such genes are considered to be the highly conserved genes across different species and the genes located on fragments with abnormal genome barcode. Third, we calculate the distance of the orthologous gene clusters between each genome pair in terms of the number of orthologous genes in conserved clusters. Finally, the neighbor-joining method is employed to construct phylogenetic trees across different species. CGCPhy has been examined on different datasets from 617 complete single-chromosome prokaryotic genomes and achieved applicative accuracies on different species sets in agreement with Bergey's taxonomy in quartet topologies. 
Simulation results show that CGCPhy achieves high average accuracy and has a low standard deviation on different datasets, so it has an applicative potential for phylogenetic analysis. Wei Du, Zhongbo Cao, Yan Wang, Ying Sun, Enrico Blanzieri, and Yanchun Liang Copyright © 2013 Wei Du et al. All rights reserved. SeedSeq: Off-Target Transcriptome Database Thu, 29 Aug 2013 08:34:55 +0000 http://www.hindawi.com/journals/bmri/2013/905429/ Detection of potential cross-reaction between a short oligonucleotide sequence and a longer (unintended) sequence is crucial for many biological applications, such as high content screening (HCS), microarray nucleotide probes, or short interfering RNAs (siRNAs). However, owing to a tolerance for mismatches and gaps in base-pairing with target transcripts, siRNAs could have up to hundreds of potential target sequences in a genome, and some small RNAs in mammalian systems have been shown to affect the levels of many messenger RNAs (off-targets) besides their intended target transcripts (on-targets). The reference sequence (RefSeq) collection aims to provide a comprehensive, integrated, nonredundant, well-annotated set of sequences, including mRNA transcripts. We performed a detailed off-target analysis of three most commonly used kinome siRNA libraries based on the latest RefSeq version. To simplify the access to off-target transcripts, we created a SeedSeq database, a new unique format to store off-target information. Shaoli Das, Suman Ghosal, Jayprokas Chakrabarti, and Karol Kozak Copyright © 2013 Shaoli Das et al. All rights reserved. Position-Specific Analysis and Prediction of Protein Pupylation Sites Based on Multiple Features Mon, 26 Aug 2013 14:43:31 +0000 http://www.hindawi.com/journals/bmri/2013/109549/ Pupylation is one of the most important posttranslational modifications of proteins; accurate identification of pupylation sites will facilitate the understanding of the molecular mechanism of pupylation. 
Besides the conventional experimental approaches, computational prediction of pupylation sites is highly desirable for its convenience and speed. In this study, we developed a novel predictor of pupylation sites. First, the maximum relevance minimum redundancy (mRMR) and incremental feature selection methods were applied to five kinds of features to select the optimal feature set. Then the prediction model was built on the optimal feature set with the assistance of the support vector machine algorithm. As a result, the overall jackknife success rate of the new predictor on a newly constructed benchmark dataset was 0.764, and the Matthews correlation coefficient was 0.522, indicating a good prediction. Feature analysis showed that all feature types contributed to the prediction of protein pupylation sites. Further site-specific feature analysis revealed that the features of sites surrounding the central lysine contributed more to the determination of pupylation sites than the other sites. Xiaowei Zhao, Jiangyan Dai, Qiao Ning, Zhiqiang Ma, Minghao Yin, and Pingping Sun Copyright © 2013 Xiaowei Zhao et al. All rights reserved. Recognition of Multiple Imbalanced Cancer Types Based on DNA Microarray Data Using Ensemble Classifiers Mon, 26 Aug 2013 13:41:52 +0000 http://www.hindawi.com/journals/bmri/2013/239628/ DNA microarray technology can measure the activities of tens of thousands of genes simultaneously, which provides an efficient way to diagnose cancer at the molecular level. Although this strategy has attracted significant research attention, most studies neglect an important problem, namely, that most DNA microarray datasets are skewed, which causes traditional learning algorithms to produce inaccurate results. Some studies have considered this problem, yet they focus only on binary-class problems. In this paper, we dealt with the multiclass imbalanced classification problem, as encountered in cancer DNA microarray data, by using ensemble learning. 
We utilized one-against-all coding strategy to transform multiclass to multiple binary classes, each of them carrying out feature subspace, which is an evolving version of random subspace that generates multiple diverse training subsets. Next, we introduced one of two different correction technologies, namely, decision threshold adjustment or random undersampling, into each training subset to alleviate the damage of class imbalance. Specifically, support vector machine was used as base classifier, and a novel voting rule called counter voting was presented for making a final decision. Experimental results on eight skewed multiclass cancer microarray datasets indicate that unlike many traditional classification approaches, our methods are insensitive to class imbalance. Hualong Yu, Shufang Hong, Xibei Yang, Jun Ni, Yuanyuan Dan, and Bin Qin Copyright © 2013 Hualong Yu et al. All rights reserved. SubMito-PSPCP: Predicting Protein Submitochondrial Locations by Hybridizing Positional Specific Physicochemical Properties with Pseudoamino Acid Compositions Wed, 21 Aug 2013 11:35:47 +0000 http://www.hindawi.com/journals/bmri/2013/263829/ Knowing the submitochondrial location of a mitochondrial protein is an important step in understanding its function. We developed a new method for predicting protein submitochondrial locations by introducing a new concept: positional specific physicochemical properties. With the framework of general form pseudoamino acid compositions, our method used only about 100 features to represent protein sequences, which is much simpler than the existing methods. On the dataset of SubMito, our method achieved over 93% overall accuracy, with 98.60% for inner membrane, 93.90% for matrix, and 70.70% for outer membrane, which are comparable to all state-of-the-art methods. As our method can be used as a general method to upgrade all pseudoamino-acid-composition-based methods, it should be very useful in future studies. 
We implement our method as an online service: SubMito-PSPCP. Pufeng Du and Yuan Yu Copyright © 2013 Pufeng Du and Yuan Yu. All rights reserved. An Approach for Identifying Cytokines Based on a Novel Ensemble Classifier Wed, 21 Aug 2013 10:26:33 +0000 http://www.hindawi.com/journals/bmri/2013/686090/ In biology, it is meaningful and important to identify cytokines and investigate their various functions and biochemical mechanisms. However, several issues remain, including the large scale of benchmark datasets, serious imbalance of data, and discovery of new gene families. In this paper, we employ a machine learning approach based on a novel ensemble classifier to predict cytokines. We directly selected amino acid sequences as research objects. First, we pretreated the benchmark data accurately. Next, we analyzed the physicochemical properties and distribution of whole amino acids and then extracted a group of 120-dimensional (120D) valid features to represent sequences. Third, in view of the serious imbalance in benchmark datasets, we utilized a sampling approach based on the synthetic minority oversampling technique algorithm and K-means clustering undersampling algorithm to rebuild the training set. Finally, we built a library for dynamic selection and circulating combination based on clustering (LibD3C) and employed the new training set to realize cytokine classification. Experiments showed that the geometric mean of sensitivity and specificity obtained through our approach is as high as 93.3%, which indicates that our approach is effective for identifying cytokines. Quan Zou, Zhen Wang, Xinjun Guan, Bin Liu, Yunfeng Wu, and Ziyu Lin Copyright © 2013 Quan Zou et al. All rights reserved. 
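The cytokine abstract above reports its result as the geometric mean of sensitivity and specificity, a metric often preferred for imbalanced data because it stays low when either class is poorly recognized. A minimal sketch of that metric follows; the function name `geometric_mean_score` and the toy labels are assumptions made for illustration and are not taken from the paper.

```python
import math


def geometric_mean_score(y_true, y_pred, positive=1):
    """Return sqrt(sensitivity * specificity) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against classes that are absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical toy labels: 2 positives, 4 negatives.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
g_mean = geometric_mean_score(y_true, y_pred)
print(f"G-mean: {g_mean:.3f}")
```

In this toy case sensitivity is 1/2 and specificity is 3/4, so the score is the square root of 0.375, noticeably lower than the plain accuracy of 4/6.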
Optimal Control of Gene Regulatory Networks with Effectiveness of Multiple Drugs: A Boolean Network Approach Wed, 21 Aug 2013 09:10:43 +0000 http://www.hindawi.com/journals/bmri/2013/246761/ Developing control theory of gene regulatory networks is one of the significant topics in the field of systems biology, and it is expected to apply the obtained results to gene therapy technologies in the future. In this paper, a control method using a Boolean network (BN) is studied. A BN is widely used as a model of gene regulatory networks, and gene expression is expressed by a binary value (0 or 1). In the control problem, we assume that the concentration level of a part of genes is arbitrarily determined as the control input. However, there are cases that no gene satisfying this assumption exists, and it is important to consider structural control via external stimuli. Furthermore, these controls are realized by multiple drugs, and it is also important to consider multiple effects such as duration of effect and side effects. In this paper, we propose a BN model with two types of the control inputs and an optimal control method with duration of drug effectiveness. First, a BN model and duration of drug effectiveness are discussed. Next, the optimal control problem is formulated and is reduced to an integer linear programming problem. Finally, numerical simulations are shown. Koichi Kobayashi and Kunihiko Hiraishi Copyright © 2013 Koichi Kobayashi and Kunihiko Hiraishi. All rights reserved. Image Analysis of Endosocopic Ultrasonography in Submucosal Tumor Using Fuzzy Inference Mon, 19 Aug 2013 08:30:54 +0000 http://www.hindawi.com/journals/bmri/2013/329046/ Endoscopists usually make a diagnosis in the submucosal tumor depending on the subjective evaluation about general images obtained by endoscopic ultrasonography. 
In this paper, we propose a method to extract areas of gastrointestinal stromal tumor (GIST) and lipoma automatically from the ultrasonic image to assist those specialists. We also propose an algorithm to differentiate GIST from non-GIST by fuzzy inference from such images after applying ROC curve with mean and standard deviation of brightness information. In experiments using real images that medical specialists use, we verify that our method is sufficiently helpful for such specialists for efficient classification of submucosal tumors. Kwang Baek Kim and Gwang Ha Kim Copyright © 2013 Kwang Baek Kim and Gwang Ha Kim. All rights reserved. An Efficient Ensemble Learning Method for Gene Microarray Classification Wed, 14 Aug 2013 09:09:15 +0000 http://www.hindawi.com/journals/bmri/2013/478410/ The gene microarray analysis and classification have demonstrated an effective way for the effective diagnosis of diseases and cancers. However, it has been also revealed that the basic classification techniques have intrinsic drawbacks in achieving accurate gene classification and cancer diagnosis. On the other hand, classifier ensembles have received increasing attention in various applications. Here, we address the gene classification issue using RotBoost ensemble methodology. This method is a combination of Rotation Forest and AdaBoost techniques which in turn preserve both desirable features of an ensemble architecture, that is, accuracy and diversity. To select a concise subset of informative genes, 5 different feature selection algorithms are considered. To assess the efficiency of the RotBoost, other nonensemble/ensemble techniques including Decision Trees, Support Vector Machines, Rotation Forest, AdaBoost, and Bagging are also deployed. Experimental results have revealed that the combination of the fast correlation-based feature selection method with ICA-based RotBoost ensemble is highly effective for gene classification. 
In fact, the proposed method can create ensemble classifiers which outperform not only the classifiers produced by conventional machine learning but also the classifiers generated by two widely used conventional ensemble learning methods, that is, Bagging and AdaBoost. Alireza Osareh and Bita Shadgar Copyright © 2013 Alireza Osareh and Bita Shadgar. All rights reserved. Designing a Bioengine for Detection and Analysis of Base String on an Affected Sequence in High-Concentration Regions Tue, 13 Aug 2013 11:38:18 +0000 http://www.hindawi.com/journals/bmri/2013/372646/ We design an algorithm for a bioengine. The program searches for optimal alignments between two sequences, a host sequence (normal plant) and a query sequence (virus). Searching for homologues has become a routine operation on biological sequences, performed here in 4 × 4 combinations with different subsequences (word sizes). The program takes advantage of the high degree of homology between such sequences to construct an alignment of the matching regions. Its main aim is to detect overlapping reading frames. The program also makes it possible to select highly infected clones, the highest-matching regions with minimal gap or mismatch zones, and unique virus clone matches. It is a small, portable, interactive, front-end program intended for finding the regions of matching between a host sequence and query subsequences. All operations are carried out in fractions of a second, depending on the required task and on the sequence length. Debnath Bhattacharyya, Bijoy Kumar Mandal, and Tai-hoon Kim Copyright © 2013 Debnath Bhattacharyya et al. All rights reserved. Prediction and Analysis of Retinoblastoma Related Genes through Gene Ontology and KEGG Tue, 13 Aug 2013 10:15:12 +0000 http://www.hindawi.com/journals/bmri/2013/304029/ One of the most important and challenging problems in biomedicine is how to predict cancer-related genes. 
Retinoblastoma (RB) is the most common primary intraocular malignancy, usually occurring in childhood. Early detection of RB could reduce morbidity and improve the probability of disease-free survival. Therefore, it is of great importance to identify RB genes. In this study, we developed a computational method to predict RB-related genes based on Dagging, with the maximum relevance minimum redundancy (mRMR) method followed by incremental feature selection (IFS). 119 RB genes were compiled from two previous RB-related studies, while 5,500 non-RB genes were randomly selected from Ensembl genes. Ten datasets were constructed based on all these RB and non-RB genes. Each gene was encoded with a 13,126-dimensional vector including 12,887 Gene Ontology enrichment scores and 239 KEGG enrichment scores. Finally, an optimal feature set including 1061 GO terms and 8 KEGG pathways was obtained. Analysis showed that these features were closely related to RB. It is anticipated that the method can be applied to predict other cancer-related genes as well. Zhen Li, Bi-Qing Li, Min Jiang, Lei Chen, Jian Zhang, Lin Liu, and Tao Huang Copyright © 2013 Zhen Li et al. All rights reserved.
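The retinoblastoma abstract above selects its optimal feature set by incremental feature selection (IFS) over an mRMR-ranked list. The general idea of IFS can be sketched as follows, under stated assumptions: the ranking is taken as given, and the `evaluate` callback is a hypothetical stand-in for whatever cross-validated classifier score is used (the paper uses Dagging; this sketch does not implement it). The toy scoring function and feature names are invented for illustration.

```python
def incremental_feature_selection(ranked_features, evaluate):
    """Score growing prefixes of a ranked feature list and keep the best.

    ranked_features: features ordered from most to least relevant.
    evaluate: callable taking a list of features and returning a score
              (higher is better); a placeholder for a classifier's
              cross-validated performance.
    """
    best_score = float("-inf")
    best_subset = []
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        score = evaluate(subset)
        if score > best_score:  # ties keep the smaller subset
            best_score = score
            best_subset = list(subset)
    return best_subset, best_score


# Hypothetical toy scorer: reward informative features, penalize noise.
INFORMATIVE = {"f1", "f2", "f3"}


def toy_evaluate(subset):
    hits = sum(1 for f in subset if f in INFORMATIVE)
    noise = len(subset) - hits
    return hits - 0.1 * noise


ranked = ["f1", "f2", "noise_a", "f3", "noise_b"]
subset, score = incremental_feature_selection(ranked, toy_evaluate)
print(subset, round(score, 2))
```

Because only prefixes of the ranking are tried, the cost is one evaluation per feature rather than one per possible subset, which is what makes the approach practical for vectors with thousands of dimensions.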