It is widely acknowledged that complex diseases or disorders (e.g., cancer, AIDS, and obesity) stem from the dysfunction of biomolecular networks, not only from that of their isolated components (e.g., genes, proteins, and metabolites). Biomolecular networks typically include gene regulatory networks, protein-protein interaction networks, metabolic networks, and signal transduction networks. With advances in high-throughput measurement techniques such as microarrays, RNA-seq, ChIP-chip, yeast two-hybrid analysis, and mass spectrometry, large-scale biological data have been and will continue to be produced. Such data contain valuable information for understanding the mechanisms of molecular biological systems and have proved useful in diagnosis, treatment, and drug design for complex diseases or disorders. For this special issue, we invited researchers to contribute original studies on the modeling/construction, analysis, synthesis, and control of complex disease-related biomolecular networks. After rigorous review, eleven articles were accepted for inclusion. We introduce each of them below with a short description.

Existing studies have shown that microRNAs (miRNAs) are involved in the development and progression of various complex diseases. Experimental identification of miRNA-disease associations is expensive and time-consuming, and thus it is appealing to design efficient algorithms to identify novel miRNA-disease associations. In the paper “miRNA-Disease Association Prediction with Collaborative Matrix Factorization,” Z. Shen et al. developed a computational method, Collaborative Matrix Factorization for miRNA-Disease Association (CMFMDA) prediction, to identify potential miRNA-disease associations by integrating miRNA functional similarity, disease semantic similarity, and experimentally verified miRNA-disease associations. Experiments verified that CMFMDA achieved its intended purpose and showed practical value, with a short running time and high prediction accuracy. In addition, CMFMDA was applied to reveal potentially related miRNAs of Esophageal Neoplasms and Kidney Neoplasms.

The identification of target molecules associated with specific complex diseases is the basis of modern drug discovery and development. Computational methods provide a low-cost and efficient way to predict drug-target interactions (DTIs) from biomolecular networks. In the paper “SDTRLS: Predicting Drug-Target Interactions for Complex Diseases Based on Chemical Substructures,” C. Yan et al. proposed a method (called SDTRLS) for predicting DTIs through the RLS-Kron model with chemical substructure similarity fusion and Gaussian Interaction Profile (GIP) kernels. Their computational experiments showed that SDTRLS outperformed state-of-the-art methods such as SDTNBI.
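
To make the GIP kernel mentioned above concrete, the following is a minimal sketch of how such a kernel is commonly computed from a binary drug-target interaction matrix. It is not the SDTRLS implementation: the function name `gip_kernel`, the `bandwidth` parameter, and the toy matrix `Y` are assumptions made for illustration, and the substructure similarity fusion and RLS-Kron steps are not shown.

```python
import numpy as np


def gip_kernel(profiles, bandwidth=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    Each row is the binary interaction profile of one entity (a drug or a
    target). The kernel width is normalized by the mean squared norm of the
    profiles, which is the usual convention for GIP kernels.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = bandwidth / mean_sq
    # Squared Euclidean distance between every pair of profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-gamma * sq_dists)


# Hypothetical toy data: 4 drugs (rows) x 3 targets (columns).
Y = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
])

K_drug = gip_kernel(Y)      # drug-drug kernel from the rows of Y
K_target = gip_kernel(Y.T)  # target-target kernel from the columns of Y
```

In this sketch, drugs with similar known interaction profiles receive kernel values close to 1, and a kernel of this kind can then be combined with other similarity measures before a regularized least squares model is fitted.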

Vaccines represent one of the most effective interventions to control infectious diseases. Despite the many successes, effective vaccines against current global pandemics such as HIV, malaria, and tuberculosis are still missing. In the paper “Exploring the Limitations of Peripheral Blood Transcriptional Biomarkers in Predicting Influenza Vaccine Responsiveness,” L. Marchetti et al. applied systems biology to vaccinology and employed SCUDO, a recently established algorithm for signature-based clustering of expression profiles, to provide new insights into why blood-derived transcriptome biomarkers often fail to predict the seroresponse to influenza vaccination. Their analysis revealed that composite measures provided a more accurate assessment of the seroresponse to multicomponent influenza vaccines.

Protein complexes are involved in multiple biological processes, and thus the detection of protein complexes is essential to the understanding of complex diseases. In the paper “Predicting Protein Complexes in Weighted Dynamic PPI Networks Based on ICSC,” J. Zhao et al. proposed a novel algorithm, named the improved Cuckoo Search Clustering (ICSC) algorithm, for detecting protein complexes in weighted dynamic protein-protein interaction (PPI) networks. The experimental results on both the DIP and Krogan datasets demonstrated that the ICSC algorithm was more effective in identifying protein complexes than the competing methods.

With the development of gene sequencing and other gene detection technologies, huge amounts of gene data have been generated. Differentially expressed genes identified from gene expression data play an important role in cancer diagnosis and classification. In the paper “Robust Nonnegative Matrix Factorization via Joint Graph Laplacian and Discriminative Information for Identifying Differentially Expressed Genes,” L.-Y. Dai et al. proposed a novel constrained method, named robust nonnegative matrix factorization via joint graph Laplacian and discriminative information (GLD-RNMF), for identifying differentially expressed genes, in which manifold learning and discriminative label information are incorporated into the traditional nonnegative matrix factorization model to train the objective matrix. The experimental results on two publicly available cancer datasets demonstrated that GLD-RNMF was an effective method for identifying differentially expressed genes.
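
As background for the “traditional nonnegative matrix factorization model” that GLD-RNMF extends, the sketch below shows plain NMF with the classical multiplicative updates for the Frobenius-norm objective. It is an illustration only and not the authors' method: the graph Laplacian term, the label information, and the robust loss are all omitted, and the function name `nmf`, its parameters, and the toy matrix `X` are assumptions.

```python
import numpy as np


def nmf(X, rank, n_iter=200, seed=0, eps=1e-10):
    """Factor a nonnegative matrix X (genes x samples) as X ~ W @ H.

    Uses the classical multiplicative update rules for the Frobenius-norm
    objective. Returns W (genes x rank), H (rank x samples), and the
    reconstruction error recorded after each iteration.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n_genes, n_samples = X.shape
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates keep every entry of W and H nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        errors.append(float(np.linalg.norm(X - W @ H)))
    return W, H, errors


# Hypothetical toy expression matrix: 6 genes x 4 samples with two
# underlying nonnegative expression patterns.
basis = np.array([[3, 0], [2, 0], [1, 0], [0, 1], [0, 2], [0, 3]], dtype=float)
coeff = np.array([[1, 1, 0, 0], [0, 0, 1, 1]], dtype=float)
X = basis @ coeff

W, H, errors = nmf(X, rank=2)
```

Methods in this family typically score genes from the learned basis matrix `W`; how GLD-RNMF does so is described in the paper itself.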

Many genetic changes are neutral variations that do not contribute to cancer development; these are called passenger mutations. Only a few alterations are causally implicated in the process of oncogenesis; these are referred to as driver mutations. Although some large-scale cancer genomics projects have produced various omics data, it is still a major challenge to distinguish pathogenic driver mutations from randomly occurring passenger mutations. In the paper “DriverFinder: A Gene Length-Based Network Method to Identify Cancer Driver Genes,” P.-J. Wei et al. presented a gene length-based network method, named DriverFinder, to identify driver genes by integrating somatic mutations, copy number variations, a gene-gene interaction network, tumor expression data, and normal expression data. Their computational experiments demonstrated the effectiveness of the proposed method.

Although genome-wide association studies (GWAS) have identified a large number of genetic variations related to complex traits, these variations explain only a small part of the mechanisms underlying complex diseases, a problem known as “missing heritability.” In the paper “FAACOSE: A Fast Adaptive Ant Colony Optimization Algorithm for Detecting SNP Epistasis,” L. Yuan et al. presented a unified fast framework that integrates an adaptive ant colony optimization algorithm with multiobjective functions for detecting SNP epistasis in GWAS datasets. Their experimental results on a late-onset Alzheimer’s disease dataset showed that the proposed method outperformed other methods in epistasis detection and could contribute to research on the mechanisms underlying the disease.

Clinical disorders of the human brain, such as Alzheimer’s disease (AD), schizophrenia (SCZ), and Parkinson’s disease (PD), are among the most complex diseases and therapeutically intractable health problems. In recent years, brain regions and their interactions have been modeled as complex brain networks, which describe highly efficient information transmission in the brain. Many brain disorders have been found to be associated with abnormal topological structures of brain networks. In the paper “Complex Brain Network Analysis and Its Applications to Brain Disorders: A Survey,” J. Liu et al. provided a comprehensive overview of complex brain network analysis and its applications to brain disorders.

Colorectal cancer (CRC) ranks fourth in cancer incidence and accounts for approximately 8–10% of cancer-related deaths, and its 5-year survival rate (40–50%) remains unsatisfactory. Identifying “high risk” populations is critical for early diagnosis and for improving the overall survival rate. In the paper “Building Up a Robust Risk Mathematical Platform to Predict Colorectal Cancer,” L. Zhang et al. collected relatively complete information on genetic variations and environmental exposure for both CRC patients and cancer-free controls and developed a multimethod ensemble model for CRC-risk prediction, using these big data to train and test the model. Their results demonstrated that (1) the explored genetic and environmental biomarkers were validated as being connected to CRC by biological function-based or population-based evidence, (2) the model could efficiently predict the risk of CRC after parameter optimization on the big CRC-related data, and (3) their heterogeneous ensemble learning model (HELM) and generalized kernel recursive maximum correntropy (GKRMC) algorithm had high predictive power.

Osteoporosis is a systemic skeletal disease characterized by reduced bone mass and microarchitectural deterioration of bone tissue, leading to loss of strength and an increased risk of fractures. In past decades, a number of genes and SNPs associated with osteoporosis have been found through GWAS. In the paper “Identifying the Risky SNP of Osteoporosis with ID3-PEP Decision Tree Algorithm,” J. Yang et al. proposed a computational method for identifying suspected risky SNPs of osteoporosis based on the known osteoporosis GWAS-associated SNPs. The experimental results showed that their method was feasible and could provide a more convenient way to identify suspected risky SNPs associated with osteoporosis.

Major depressive disorder (MDD) is a global mental disorder that has an unfavorable influence on physical and psychological health. Studies have demonstrated that MDD is characterized by alterations in brain functional connections, which are also identifiable during the brain’s “resting state.” However, the existing approaches to constructing functional connectivity are often biased, and as a result the clustering partition of nodes is unclear. In the paper “A Resting-State Brain Functional Network Study in MDD Based on Minimum Spanning Tree Analysis and the Hierarchical Clustering,” X. Li et al. applied minimum spanning tree (MST) analysis and hierarchical clustering to study depression. With resting-state electroencephalograms (EEG) from 15 healthy and 23 major depressive subjects, their findings suggested that there was a stronger brain interaction in the MDD group and a left-right functional imbalance in the frontal regions for MDD controls.
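
The idea behind MST analysis of a functional network can be sketched as follows: the strongest connections are kept such that all nodes are linked without cycles, which yields a backbone that does not depend on an arbitrary threshold. The code below is a minimal illustration using Prim's algorithm and is not the authors' pipeline: the function names, the distance definition (one minus absolute connectivity), the leaf fraction metric, and the toy 4-channel matrix are all assumptions.

```python
import numpy as np


def minimum_spanning_tree_edges(connectivity):
    """Return the MST edges of a symmetric connectivity matrix (Prim's algorithm).

    Strong connections are treated as short distances (1 - |connectivity|),
    so the tree retains the strongest links that connect all nodes.
    """
    conn = np.asarray(connectivity, dtype=float)
    n_nodes = conn.shape[0]
    dist = 1.0 - np.abs(conn)
    in_tree = [0]
    remaining = set(range(1, n_nodes))
    edges = []
    while remaining:
        # Cheapest edge that joins a tree node to a node outside the tree.
        _, i, j = min((dist[i, j], i, j) for i in in_tree for j in remaining)
        edges.append((i, j))
        in_tree.append(j)
        remaining.remove(j)
    return edges


def leaf_fraction(edges, n_nodes):
    """Fraction of nodes with exactly one MST edge (a common tree metric)."""
    degree = np.zeros(n_nodes, dtype=int)
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return float(np.sum(degree == 1)) / n_nodes


# Hypothetical connectivity between 4 EEG channels (symmetric, values in [0, 1]).
conn = np.array([
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.3],
    [0.2, 0.8, 1.0, 0.7],
    [0.1, 0.3, 0.7, 1.0],
])

mst_edges = minimum_spanning_tree_edges(conn)
leaves = leaf_fraction(mst_edges, n_nodes=conn.shape[0])
```

For this toy matrix the tree is a simple chain through the four channels. Tree-level metrics of this kind can then be compared between subject groups.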

In summary, this special issue reports recent progress in the study of biomolecular networks for complex diseases. We hope that readers of this special issue will benefit from these newly developed methods.

Fang-Xiang Wu
Jianxin Wang
Min Li
Haiying Wang