Special Issue: Scalable Data Mining Algorithms in Computational Biology and Biomedicine
Research Article | Open Access
Zhijun Liao, Yong Huang, Xiaodong Yue, Huijuan Lu, Ping Xuan, Ying Ju, "In Silico Prediction of Gamma-Aminobutyric Acid Type-A Receptors Using Novel Machine-Learning-Based SVM and GBDT Approaches", BioMed Research International, vol. 2016, Article ID 2375268, 12 pages, 2016. https://doi.org/10.1155/2016/2375268
In Silico Prediction of Gamma-Aminobutyric Acid Type-A Receptors Using Novel Machine-Learning-Based SVM and GBDT Approaches
Gamma-aminobutyric acid type-A receptors (GABAARs) belong to the multisubunit, membrane-spanning ligand-gated ion channels (LGICs), which act as the principal mediators of rapid inhibitory synaptic transmission in the human brain. Predicting the category of GABAARs from the protein amino acid sequence alone would therefore be very helpful for the recognition and study of novel receptors. Based on the proteins’ physicochemical properties and on amino acid composition and position, a classifier was first constructed using a 188-dimensional (188D) algorithm at 90% cd-hit identity and compared with the pseudo-amino acid composition (PseAAC) and ProtrWeb web-based algorithms for human GABAAR proteins. Then, four classifiers, namely, gradient boosting decision tree (GBDT), random forest (RF), a library for support vector machine (libSVM), and k-nearest neighbor (k-NN), were compared on the dataset at a low cd-hit identity of 40%. This work obtained the highest correctly classified rate, 96.8%, and the highest specificity, 99.29%. However, its sensitivity, accuracy, and Matthews correlation coefficient were slightly lower than those of PseAAC and ProtrWeb; on the second dataset, GBDT and libSVM performed slightly better than RF and k-NN. In conclusion, a GABAAR classifier was successfully constructed using only the protein sequence information.
1. Introduction
Gamma-aminobutyric acid (GABA) is a major inhibitory neurotransmitter in the human brain and plays a principal role in the regulation of pituitary gland function. GABA has a flexible four-carbon skeleton (Figure 1), which can adopt a number of conformations when interacting with its many macromolecular receptor targets. This flexibility makes it possible to obtain many selective ligands by producing conformationally restricted analogues. GABA is mainly synthesized in the hypothalamus and within the pituitary gland and is stored in the cells of the anterior and intermediate lobes. The GABA-synthesizing enzyme is glutamic acid decarboxylase (GAD), which is relevant to the TCA cycle, and its direct substrate is glutamate (Figure 2). In addition to GAD, the GABA level is also related to glutamine-glutamate (Gln-Glu) cycling, in which glutaminase and glutamine synthetase play a key role in keeping the cycle in balance. In the cycle, Gln is first converted to Glu and then to GABA, or Glu is directly catalyzed to GABA; this process is known to play a significant role in the regulation of neurogenesis, and GABA is mainly released from Purkinje cells in the cerebellar cortex via a special regulatory mechanism [5–7].
GABA can specifically interact with postsynaptic GABA receptors in the human central nervous system (CNS); the specific binding of GABA to synaptic membrane fractions is saturable. Three types of GABA receptors are expressed in humans, namely, the ionotropic GABAA receptor (GABAAR), the metabotropic GABAB receptor (GABABR, a G protein-coupled receptor), and another ionotropic receptor, the GABAC receptor (GABACR); among them, GABAAR is relevant to epilepsy. GABAARs belong to the Cys-loop superfamily of ligand-gated ion channels (LGICs) and exhibit a long (about 200 a.a.) extracellular amino terminus, which is thought to be responsible for ligand-channel interactions. Each subunit consists of this amino terminus, which forms the agonist or antagonist binding sites; four transmembrane (TM) domains; and a large intracellular domain between TM3 and TM4 for regulation by phosphorylation and localization at synapses. The five TM2 domains, arranged in a ring, form the lining of the ion channel (Figure 3). The extracellular amino terminus contains a conserved motif called the Cys-loop, a disulfide loop characterized by 2 cysteine residues separated by 13 other amino acid residues; the amino terminus also incorporates binding sites for neurotransmitters and some modulators. For example, the extracellular domain of the subunits contains the sequence “CMMDLRRYPLDEQNC” (C stands for cysteine). For the structural details of Cys-loop receptors, see the cited review.
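The Cys-loop pattern described here (two cysteines separated by 13 residues) can be checked with a short script. The following is only a minimal sketch: the function name `find_cys_loops` is hypothetical, and treating any uppercase letter as a residue is an assumption of this example rather than part of the original analysis.

```python
import re

# Two cysteines separated by exactly 13 residues, as described in the text.
# Assumption: any uppercase letter counts as a residue code.
# The lookahead lets overlapping candidate windows be reported as well.
_CYS_LOOP = re.compile(r"(?=(C[A-Z]{13}C))")


def find_cys_loops(sequence):
    """Return (0-based start, matched window) for each Cys-loop-like window."""
    return [(m.start(), m.group(1)) for m in _CYS_LOOP.finditer(sequence.upper())]


# The example peptide quoted in the text.
hits = find_cys_loops("CMMDLRRYPLDEQNC")
```

A match only shows that the spacing is compatible with a Cys-loop; it does not show that the two cysteines actually form a disulfide bond.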
GABAARs form pentameric chloride channels assembled from various combinations of eight classes of subunits (α, β, γ, δ, ε, θ, π, and ρ), each of which comprises several subtypes. These receptors belong to a superfamily of pentameric ligand-gated ion channels (pLGICs) with five-membered ring structures; pLGICs are also known as Cys-loop receptors and include two classes: the cation-selective (e.g., nicotinic acetylcholine receptors and serotonin type 3 receptors) and the anion-selective (e.g., glycine receptors (GlyRs) and GABAARs). According to their extracellular domain, pentameric receptors can be further divided into those containing only one conserved Cys-loop and those containing an additional disulfide bond that links the β9-β10 strands in Loop C. Human GABAAR subunits are encoded by 19 different genes, namely, α1–6, β1–3, γ1–3, δ, ε, θ, π, and ρ1–3; among these subunits, crystallization shows that the human subunit is unique to eukaryotic Cys-loop receptors. The α1–α6 subunits are encoded by the GABRA1 to GABRA6 genes; the α1 subtype is widely expressed throughout the brain, whereas the α2, α3, α4, α5, and α6 subtypes are expressed in specific brain areas. Most pentameric GABAARs in the human brain are composed of two α subunits, two β subunits, and one γ subunit, and the GABA binding sites are located at the α-β subunit interface. The α1, β2, and γ2 subunits are the most abundantly expressed in the human brain, and subunit variants may thus influence ion channel gating, expression, and trafficking of the GABA receptor to the cell surface. The GABRA1 and GABRA6 genes are located on human chromosome 5, GABRA2 and GABRA4 are located on chromosome 4, and GABRA3 and GABRA5 are located on chromosome X and chromosome 15, respectively. These genes have been proposed to affect certain drug targets and the regulation of neuronal activities in the human brain.
Several antiepileptic drugs (AEDs), such as phenobarbital and gabapentin, bind to GABAARs in the CNS within a confined area of distribution, and alterations in GABAAR subunits may regulate the responses elicited by AEDs. Several AEDs exert agonistic effects on GABAARs. AEDs may react with GABAARs comprising distinct subunits in diverse manners, and the composition and function of α subunits may influence the treatment efficacy of different AEDs. Targeted proteins of AEDs are involved in the regulation of extracellular K+ and intracellular homeostasis, cell volume, and pH, all of which are important for maintaining normal brain activity.
GABAAR subunit mutations or genetic variations can lead to receptor dysfunction, which is thought to participate in the pathomechanisms of epilepsy: multiple epilepsy mutations result in protein misfolding and may cause degradation or retention of the protein molecules in cells. Kang et al. found that mutant subunits accumulated and aggregated intracellularly, activated caspase-3, and caused widespread, age-dependent neurodegeneration; these findings suggest that the epilepsy-associated mutant γ2 subunit plays an important role in neurodegeneration. Mutations or genetic variations of the α1, α6, β2, β3, γ2, or δ subunit genes (GABRA1, GABRA6, GABRB2, GABRB3, GABRG2, and GABRD, resp.) compromise hyperpolarization through GABAARs, and these variations have been associated with human epilepsy with or without febrile seizures.
Support vector machine (SVM) is a supervised machine learning algorithm that has been broadly applied to classification and regression analysis [27–32]. It is a type of sparse kernel machine that learns from labeled data to predict unknown class labels and can use a linear or nonlinear model as a binary classifier [33–35]. Random forest (RF) is an ensemble machine learning technique based on random decision trees for classification and other tasks; according to its features, a data point is assigned to a category as its prediction. RF has been broadly applied in novel protein and target identification [36, 37] because it combines the merits of bagging and feature selection. Another decision-tree-based method is the gradient boosting decision tree (GBDT), which has been applied very successfully in many fields, such as the smart city concept; its major advantage is the ability to find nonlinear interactions automatically through decision tree learning with minimal error. GBDT is generally regarded as one of the best out-of-the-box classifiers: it generalizes well and combines weak learners into a single strong learner. It has gradually gained popularity in machine learning, although it still has many disadvantages [40–43].
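The idea of combining weak learners into a strong learner can be illustrated with a toy gradient boosting loop. This is only a sketch under simplifying assumptions: one numeric feature, depth-1 trees (stumps), squared-error loss, and hypothetical function names and hyperparameters (`n_rounds`, `learning_rate`). It is not the GBDT program used in this work.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree on one feature by least squares."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must leave samples on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_gbdt(xs, ys, n_rounds=50, learning_rate=0.1):
    """Each round fits a stump to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict_gbdt(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Toy data (hypothetical): label 0 for small x, label 1 for large x.
model = fit_gbdt([1, 2, 3, 4, 5, 6, 7, 8], [0, 0, 0, 0, 1, 1, 1, 1])
```

Each stump on its own is a weak predictor; the shrunken sum of many stumps approaches the step-shaped target, which is the sense in which boosting builds a strong learner.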
Here, we performed an in silico analysis of GABAARs based on sequence information and other physicochemical features, including hydrophobicity, normalized van der Waals volume, polarity, polarizability, charge, surface tension, secondary structure, and solvent accessibility. The twenty natural amino acids can be divided into 3 different groups according to each of the above eight properties, and thus 188-dimensional (188D) feature vectors of proteins were constructed with an ensemble classifier, which performed well in membrane protein prediction. We employed the PseAAC and ProtrWeb methods for human GABAARs only, to stay within the web servers’ limits on the number of sequences; we also applied libSVM, RF, GBDT, and the widely used k-nearest neighbor (k-NN) algorithms to compare their performance on the dataset obtained with rigorous cd-hit filtration.
A motif, a conserved short pattern of a protein, is one of the fundamental functional units of molecular evolution. In DNA, a motif may act as a protein-binding site; in proteins, a motif may directly correspond to the active site of an enzyme or a structural unit of the protein. Therefore, we also conducted motif analysis.
2. Materials and Methods
2.1. Data Retrieval and Treatment
All the primary sequences of both GABAARs and the control Pfam proteins (in FASTA files) were retrieved from the UniProt database (http://www.uniprot.org/); the raw data were preprocessed with the cd-hit program (http://cd-hit.org) to merge similar sequences and reduce complexity. To avoid bias in the classifier, we set the identity threshold at 90% and obtained 2353 sequences as the positive dataset; the negative samples were obtained from the control proteins after the positive ones were deleted, giving 10652 entries as the negative dataset. To measure the performance of the four classifiers, cd-hit was set at a rigorous 40% identity, which yielded 360 GABAARs and 9598 non-GABAARs.
2.2. Prediction Analysis for Potential GABAAR Proteins
Machine learning is often employed in bioinformatics and proteomics problems. Several important techniques facilitate protein classification and identification, such as imbalanced classification strategies, ensemble learning [49–51], sample selection strategies [52, 53], and feature reduction and ranking methods [54–56].
To predict potential GABAARs from amino acid sequences, we constructed a classifier based on protein features. First, we extracted feature vectors from the positive and negative protein sequence datasets by using a novel machine-learning-based method developed by our group. We transformed all the positive and negative sequences into the corresponding protein family (Pfam) information, and the obtained features included sequence evolutionary information, the k-skip-n-gram model, physicochemical properties, and local PsePSSM. Altogether, we assembled 188D feature vectors. Afterward, the resulting feature vectors were imported into Weka (http://www.cs.waikato.ac.nz/ml/weka/), a machine learning workbench used for automatic classification with visualization and cross-validation analysis [58, 59]. After several preliminary trials with the same dataset, we selected the random forest method with default parameters.
2.3. Conserved Motif Analysis of Human GABAAR Proteins
Conserved motif analyses were implemented using the online MEME Suite (http://meme-suite.org/, 4.11.1 version), a powerful motif-based sequence analysis tool, which integrated a set of web-based tools including Gene Ontology database for studying sequence motifs in proteins, DNA, and RNA . Currently, the MEME Suite has added six new tools and reached thirteen since the “Nucleic Acids Research” Web Server Issue in 2009. Human sequences in FASTA format were used as a file input. The maximum motif width, minimal motif width, and maximum number of motifs were set to 50, 6, and 9, respectively. The remaining parameters were set as default values.
2.4. Pseudo-Amino Acid Composition and ProtrWeb Analysis
Chou et al. [61–63] proposed the concept of PseAAC in 2001 to describe global or long-range sequence-order information of proteins; the original design objective was to improve the prediction of protein subcellular localization and membrane protein type. Since then, the PseAAC approach, alone or combined with other properties, has rapidly penetrated many areas of computational proteomics. As the most intuitive features for protein biochemical reactions, the physicochemical properties of amino acids significantly influence protein classification. Features that incorporate appropriate physicochemical properties can contain much valuable information for improving the performance of predictors. Feature extraction with our method alone inevitably has its own shortcomings and does not perform well in all circumstances. Thus, we also used PseAAC and ProtrWeb (http://protrweb.scbdd.com/) to construct feature vectors for human GABAAR proteins (58 entries) and other proteins (58 entries) in this study.
PseAAC is a web server that can generate numerous pseudo-amino acid compositions, which include sequence-order information in addition to the conventional 20D amino acid composition. The approach is based on the amino acid composition and physicochemical characteristics of proteins; the server was designed in a flexible way so that various kinds of pseudo-amino acid composition can be generated for a given protein sequence by selecting different parameters and their combinations. PseAAC provides three PseAA modes and six amino acid characters for the user to choose from. ProtrWeb is also a web server, based on the R package protr, the first version of which was developed in November 2013. This server is dedicated to calculating protein sequence-derived structural and physicochemical descriptors such as amino acid composition. The n-gram and k-skip descriptors are based on permutation and combination theory. ProtrWeb can be applied in various protein prediction studies, including protein structural and functional classes, protein subcellular locations, protein-protein interactions, and receptor-ligand interactions. ProtrWeb offers 12 types of commonly used descriptors on the web, such as amino acid composition, dipeptide composition, and pseudo-amino acid composition. Recently, some studies have shown that the long-range sequence-order effects of DNA can improve the performance of computational predictors.
To extract features from the physicochemical properties of proteins by using PseAAC, we considered all six physicochemical properties: hydrophobicity, hydrophilicity, mass, pK1 (alpha-COOH), pK2 (NH3), and pI (at 25°C). We selected the type 2 PseAA mode, set the Lambda parameter at 10, and left the weight factor at its default. The results were 80-dimensional (80D) data for each protein. For ProtrWeb, we chose amino acid composition (20 Dim) and pseudo-amino acid composition (50 Dim), in keeping with the restricted parameter settings.
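The feature dimensions quoted here can be reproduced with simple arithmetic. The sketch below assumes that the type 2 PseAA mode appends Lambda correlation terms per selected property to the 20 amino acid composition terms, which is consistent with the reported 80D output; the function names are hypothetical.

```python
def pseaac_type2_dim(n_properties, lam, n_amino_acids=20):
    """Assumed layout: 20 composition terms plus lam terms per property."""
    return n_amino_acids + n_properties * lam


def concatenated_dim(*block_dims):
    """Dimension of a feature vector built by concatenating descriptor blocks."""
    return sum(block_dims)


pseaac_dim = pseaac_type2_dim(n_properties=6, lam=10)  # six properties, Lambda = 10
protrweb_dim = concatenated_dim(20, 50)  # amino acid composition + pseudo-amino acid composition
```

With the settings in the text, this gives 80 features from PseAAC and 70 features from ProtrWeb per protein.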
2.5. Prediction Ability Comparison of Four Classifiers on the 40% Identity cd-Hit Filtration Data
We extracted 188D feature vectors from the 360 GABAARs and 9598 non-GABAARs as input to Weka, which performed the classification via the RF, k-NN, and SVM algorithms; the SVM was implemented using libSVM. The GBDT classifier was run with a Python program developed by ourselves. All four classifiers were used with their default parameters.
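Four common measurements were used to illustrate the performance quality of the predictor more intuitively. Sensitivity (Sn), specificity (Sp), accuracy (Acc), and Matthews correlation coefficient (MCC) were adopted to evaluate the above three methods and four classifiers. These measures are formulated as follows:
$$
\begin{aligned}
\mathrm{Sn} &= \frac{TP}{TP + FN}, \\
\mathrm{Sp} &= \frac{TN}{TN + FP}, \\
\mathrm{Acc} &= \frac{TP + TN}{TP + TN + FP + FN}, \\
\mathrm{MCC} &= \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}},
\end{aligned}
$$
where TP, TN, FP, and FN stand for the numbers of true positive, true negative, false positive, and false negative, respectively.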
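As a minimal sketch, Sn, Sp, Acc, and MCC can be computed from the four confusion-matrix counts using their standard definitions. The function name and the example counts are hypothetical and are not results from this study; returning 0.0 for an empty denominator is a convention chosen for this example.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return Sn, Sp, Acc, and MCC from confusion-matrix counts."""
    def ratio(numerator, denominator):
        # Convention for this sketch: an undefined ratio is reported as 0.0.
        return numerator / denominator if denominator else 0.0

    sn = ratio(tp, tp + fn)                      # sensitivity (recall of positives)
    sp = ratio(tn, tn + fp)                      # specificity (recall of negatives)
    acc = ratio(tp + tn, tp + tn + fp + fn)      # overall accuracy
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_denominator)
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Hypothetical counts, for illustration only.
example = classification_metrics(tp=50, tn=40, fp=10, fn=0)
```

On strongly imbalanced data, such as a few hundred positives against several thousand negatives, Acc can stay high even when Sn is poor, which is one reason to report MCC alongside it.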
3. Results
3.1. Searching the Protein Family Number
To determine the Pfam families of GABAARs, we ran the program with the positive and negative protein sequences (GABAARs versus non-GABAARs) and obtained nonredundant Pfam numbers after merging identical ones (Table 1). The negative group was very large; thus, we list only the positive ones.
3.2. Reclassification of Positive and Negative Proteins
We obtained the 188D (this work), 80D (from PseAAC), and 70D (from ProtrWeb) feature vector dataset from both positive and negative groups and used them as input to the Weka explorer (RF algorithm). The results showed that the correctly classified rates were 96.8%, 95.7%, and 94.8%. The confusion matrix is shown in Table 2, and the four common measurement values are illustrated in Figure 4.
3.3. Four Classifiers’ Prediction Ability Comparison
All four classifiers performed well, with correctly classified rates above 96%, but GBDT and libSVM performed slightly better than RF and k-NN on all the indicators (Table 3).
3.4. Conserved Motif Analysis of Human GABAAR
To reveal the evolutionary relationships of GABAARs from their conserved motifs, 92 human GABAAR protein sequences were analyzed by using the MEME software. The nine most significant conserved motifs are shown in Figure 5 and Table 4.
4. Discussion
The primary structures of amino acid sequences are often the basis for understanding the three-dimensional conformation and functional properties of proteins, which exhibit an intimate relationship between primary structure and function. Twenty natural α-amino acids commonly constitute the primary sequences of proteins [69, 70]. Amino acids are biologically important organic nitrogenous compounds in the natural world. These compounds contain amine (-NH2) and carboxylic acid (-COOH) functional groups linked to the same carbon atom, called the α-carbon, usually along with a side-chain (called the R group) specific to each amino acid. Carbon, hydrogen, oxygen, and nitrogen are the essential elements of an amino acid, though other elements are found in the R group. Amino acids can be classified in many ways, for example, according to the core structure and the properties of the side-chain group. However, the 20 standard, genetically encoded α-amino acids are usually classified into five main groups on the basis of biochemistry, namely, hydrophobic, if the side-chain is nonpolar; hydrophilic, if it is polar but uncharged; aromatic, if it includes an aromatic ring; acidic, if it is negatively charged; and basic, if it is positively charged.
Previous research has extracted protein feature information according to composition, position, or physicochemical properties. In our work, we adopted the 188D algorithm to extract feature vectors by combining amino acid composition with physicochemical properties in a protein functional classifier. This 188D method includes amino acid composition (20D) and eight types of physicochemical properties, that is, hydrophobicity (21D), normalized van der Waals volume (21D), polarity (21D), polarizability (21D), charge (21D), surface tension (21D), secondary structure (21D), and solvent accessibility (21D). The CTD model was employed to describe global information about the protein sequence. Taking hydrophobicity as an example, C (composition) represents the percentage of each hydrophobicity class of amino acids in the sequence, T (transition) represents the frequency with which an amino acid of one hydrophobicity class is followed by an amino acid of a different class, and D (distribution) represents the first, 25%, 50%, 75%, and last position in the sequence of the amino acids belonging to a given class; for details, see the cited reference. In addition to this 188D feature vector extraction method, we used two web-based servers, PseAAC and ProtrWeb, for 80D and 70D feature vectors, respectively. Because the web servers limit the number of sequences, only human GABAARs and the corresponding non-GABAARs could be analyzed with the last two methods.
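The CTD description above (3 composition values, 3 transition values, and 3 × 5 distribution values per property) can be sketched for a single property as follows. The three-class hydrophobicity grouping used here is a commonly used CTD grouping and is an assumption of this example, since the grouping is not stated in the text; the function name and the handling of unknown residues are likewise hypothetical.

```python
# Assumed three-class hydrophobicity grouping (not specified in this article).
HYDROPHOBICITY_GROUPS = {"1": "RKEDQN", "2": "GASTPHY", "3": "CLVIMFW"}


def ctd_features(sequence, groups=HYDROPHOBICITY_GROUPS):
    """Return the 21 CTD values (3 C + 3 T + 15 D) for one property."""
    class_of = {aa: label for label, members in groups.items() for aa in members}
    # Residues outside the 20 standard amino acids are skipped in this sketch.
    encoded = [class_of[aa] for aa in sequence.upper() if aa in class_of]
    n = len(encoded)
    if n == 0:
        raise ValueError("sequence contains no standard amino acids")
    labels = sorted(groups)

    # C: fraction of residues in each class.
    composition = [encoded.count(label) / n for label in labels]

    # T: fraction of adjacent residue pairs that switch between two given classes.
    transitions = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            pair = {labels[i], labels[j]}
            count = sum(1 for a, b in zip(encoded, encoded[1:]) if {a, b} == pair)
            transitions.append(count / (n - 1) if n > 1 else 0.0)

    # D: relative position of the first, 25%, 50%, 75%, and last residue of each class.
    distribution = []
    for label in labels:
        positions = [i + 1 for i, c in enumerate(encoded) if c == label]
        if not positions:
            distribution.extend([0.0] * 5)
            continue
        for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
            index = max(int(fraction * len(positions)), 1) - 1
            distribution.append(positions[index] / n)

    return composition + transitions + distribution


features = ctd_features("CMMDLRRYPLDEQNC")
```

Repeating this for the eight properties and adding the 20 amino acid composition values gives 20 + 8 × 21 = 188 features, matching the 188D vector.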
Abnormalities of GABAARs are associated with the pathology and progression of several neurological and psychiatric diseases, such as autism, schizophrenia, and alcoholism, and particularly epilepsy [75–79], as well as Dravet syndrome, asthma, breast cancer, some psychiatric diseases, Alzheimer disease, and other neurodegenerative diseases. It was recently reported that GABAARs may be involved in apoptosis in preeclampsia. The analyses of conserved motifs in human GABAARs indicate that motifs 1, 3, and 6 form the frame of the neurotransmitter-gated ion channel transmembrane region, in which the transmembrane helices build the ion channel for anion transport. Motifs 2, 4, and 5 belong to the neurotransmitter-gated ion channel extracellular ligand binding domain, which links closely and forms a pentameric arrangement in the structure. Various GABA receptor genes are associated with many mental-disorder-related phenotypes. Alterations in GABAergic inhibitory actions, such as the subunit amount, composition, and gene expression of GABAARs, may have neurophysiologic and functional consequences related to mental disorders. Because of the importance of GABAARs, some studies on GABAAR protein prediction using Chou’s method were reported in 2011. However, similar studies on GABAARs have rarely been reported since then.
The current results showed that our method achieved the highest rate of correctly classified instances, 96.8%, which suggests that our 188D algorithm performed well for classification and could correctly discriminate positive from negative samples with relatively high specificity. However, the Sn, Acc, and MCC values were lower than those of the PseAAC and ProtrWeb methods; this might be due to the large dataset size in our work. Nevertheless, the lowest value was higher than 85%. Overall, our approach, which is mainly based on physicochemical properties, can reflect the characteristics of protein sequences and can be applied to the prediction of GABAAR classification. More precise methods based on 188D still need to be developed.
Competing Interests
The authors declare that there are no competing interests.
Acknowledgments
The work was supported by the Natural Science Foundation of Fujian Province of China (no. 2016J01152) and National Natural Science Foundation of China (no. 61573235, no. 61272315, and no. 61302139).
References
- K. E. S. Locock, I. Yamamoto, P. Tran et al., “γ-aminobutyric acid(C) (GABAC) selective antagonists derived from the bioisosteric modification of 4-aminocyclopent-1-enecarboxylic acid: amides and hydroxamates,” Journal of Medicinal Chemistry, vol. 56, no. 13, pp. 5626–5630, 2013.
- A. Mayerhofer, B. Höhne-Zell, K. Gamel-Didelon et al., “Gamma-aminobutyric acid (GABA): a para- and/or autocrine hormone in the pituitary,” The FASEB Journal, vol. 15, no. 6, pp. 1089–1091, 2001.
- N. Okai, C. Takahashi, K. Hatada, C. Ogino, and A. Kondo, “Disruption of pknG enhances production of gamma-aminobutyric acid by Corynebacterium glutamicum expressing glutamate decarboxylase,” AMB Express, vol. 4, no. 1, article 20, pp. 1–8, 2014.
- F. C. Pereira, M. R. Rolo, E. Marques et al., “Acute increase of the glutamate-glutamine cycling in discrete brain areas after administration of a single dose of amphetamine,” Annals of the New York Academy of Sciences, vol. 1139, pp. 212–221, 2008.
- M. Rigby, S. G. Cull-Candy, and M. Farrant, “Transmembrane AMPAR regulatory protein γ-2 is required for the modulation of GABA release by presynaptic AMPARs,” The Journal of Neuroscience, vol. 35, no. 10, pp. 4203–4214, 2015.
- M. Zonouzi, J. Scafidi, P. Li et al., “GABAergic regulation of cerebellar NG2 cell development is altered in perinatal white matter injury,” Nature Neuroscience, vol. 18, no. 5, pp. 674–682, 2015.
- T. Irie, R. Kikura-Hanajiri, M. Usami, N. Uchiyama, Y. Goda, and Y. Sekino, “MAM-2201, a synthetic cannabinoid drug of abuse, suppresses the synaptic input to cerebellar Purkinje cells via activation of presynaptic CB1 receptors,” Neuropharmacology, vol. 95, pp. 479–491, 2015.
- S. R. Zukin, A. B. Young, and S. H. Snyder, “Gamma-aminobutyric acid binding to receptor sites in the rat central nervous system,” Proceedings of the National Academy of Sciences of the United States of America, vol. 71, no. 12, pp. 4802–4807, 1974.
- R. W. Olsen and W. Sieghart, “International union of pharmacology. LXX. Subtypes of γ-aminobutyric acidA receptors: classification on the basis of subunit composition, pharmacology, and function. Update,” Pharmacological Reviews, vol. 60, no. 3, pp. 243–260, 2008.
- J.-M. Fritschy, “Epilepsy, E/I balance and GABAA receptor plasticity,” Frontiers in Molecular Neuroscience, vol. 1, article 5, 2008.
- H. Mohabatkar, M. Mohammad Beigi, and A. Esmaeili, “Prediction of GABAA receptor proteins using the concept of Chou's pseudo-amino acid composition and support vector machine,” Journal of Theoretical Biology, vol. 281, no. 1, pp. 18–23, 2011.
- P. S. Miller and T. G. Smart, “Binding, activation and modulation of Cys-loop receptors,” Trends in Pharmacological Sciences, vol. 31, no. 4, pp. 161–174, 2010.
- A. Pöltl, B. Hauer, K. Fuchs, V. Tretter, and W. Sieghart, “Subunit composition and quantitative importance of GABAA receptor subtypes in the cerebellum of mouse and rat,” Journal of Neurochemistry, vol. 87, no. 6, pp. 1444–1455, 2003.
- G. Grenningloh, E. Gundelfinger, B. Schmitt et al., “Glycine vs GABA receptors,” Nature, vol. 330, no. 6143, pp. 25–26, 1987.
- P. S. Miller and A. R. Aricescu, “Crystal structure of a human GABAA receptor,” Nature, vol. 512, no. 7514, pp. 270–275, 2014.
- W. Sieghart and G. Sperk, “Subunit composition, distribution and function of GABA(A) receptor subtypes,” Current Topics in Medicinal Chemistry, vol. 2, no. 8, pp. 795–816, 2002.
- P. H. Torkkeli, H. Liu, and A. S. French, “Transcriptome analysis of the central and peripheral nervous systems of the spider Cupiennius salei reveals multiple putative Cys-loop ligand gated ion channel subunits and an acetylcholine binding protein,” PLoS ONE, vol. 10, no. 9, Article ID e0138068, 2015.
- C. A. Reid, S. F. Berkovic, and S. Petrou, “Mechanisms of human inherited epilepsies,” Progress in Neurobiology, vol. 87, no. 1, pp. 41–57, 2009.
- J. Simon, H. Wakimoto, N. Fujita, M. Lalande, and E. A. Barnard, “Analysis of the set of GABAA receptor genes in the human genome,” The Journal of Biological Chemistry, vol. 279, no. 40, pp. 41422–41435, 2004.
- I.-C. Chou, C.-C. Lee, C.-H. Tsai et al., “Association of GABRG2 polymorphisms with idiopathic generalized epilepsy,” Pediatric Neurology, vol. 36, no. 1, pp. 40–44, 2007.
- K. Bethmann, J.-M. Fritschy, C. Brandt, and W. Löscher, “Antiepileptic drug resistant rats differ from drug responsive rats in GABAA receptor subunit expression in a model of temporal lobe epilepsy,” Neurobiology of Disease, vol. 31, no. 2, pp. 169–187, 2008.
- M. SidAhmed-Mezi, I. Kurcewicz, C. Rose et al., “Mass spectrometric detection and characterization of atypical membrane-bound zinc-sensitive phosphatases modulating GABAA receptors,” PLoS ONE, vol. 9, no. 6, Article ID e100612, 2014.
- J. Uwera, S. Nedergaard, and M. Andreasen, “A novel mechanism for the anticonvulsant effect of furosemide in rat hippocampus in vitro,” Brain Research, vol. 1625, pp. 1–8, 2015.
- J. L. Fisher, “The anti-convulsant stiripentol acts directly on the GABAA receptor as a positive allosteric modulator,” Neuropharmacology, vol. 56, no. 1, pp. 190–197, 2009.
- J.-Q. Kang, W. Shen, C. Zhou, D. Xu, and R. L. Macdonald, “The human epilepsy mutation GABRG2(Q390X) causes chronic subunit accumulation and neurodegeneration,” Nature Neuroscience, vol. 18, no. 7, pp. 988–996, 2015.
- S. Hirose, “Mutant GABA(A) receptor subunits in genetic (idiopathic) epilepsy,” Progress in Brain Research, vol. 213, pp. 55–85, 2014.
- H. Ding, S.-H. Guo, E.-Z. Deng et al., “Prediction of Golgi-resident protein types by using feature selection technique,” Chemometrics and Intelligent Laboratory Systems, vol. 124, pp. 9–13, 2013.
- W.-C. Li, E.-Z. Deng, H. Ding, W. Chen, and H. Lin, “iORI-PseKNC: a predictor for identifying origin of replication with pseudo k-tuple nucleotide composition,” Chemometrics and Intelligent Laboratory Systems, vol. 141, pp. 100–106, 2015.
- H. Lin, W. Chen, and H. Ding, “AcalPred: a sequence-based tool for discriminating between acidic and alkaline enzymes,” PLoS ONE, vol. 8, no. 10, Article ID e75726, 2013.
- L.-F. Yuan, C. Ding, S.-H. Guo, H. Ding, W. Chen, and H. Lin, “Prediction of the types of ion channel-targeted conotoxins based on radial basis function network,” Toxicology in Vitro, vol. 27, no. 2, pp. 852–856, 2013.
- B. Liu, D. Zhang, R. Xu et al., “Combining evolutionary information extracted from frequency profiles with sequence-based kernels for protein remote homology detection,” Bioinformatics, vol. 30, no. 4, pp. 472–479, 2014.
- J. Chen, X. Wang, and B. Liu, “IMiRNA-SSF: improving the identification of MicroRNA precursors by combining negative sets with different distributions,” Scientific Reports, vol. 6, Article ID 19062, 2016.
- A. Besga, I. Gonzalez, E. Echeburua et al., “Discrimination between Alzheimer’s disease and late onset bipolar disorder using multivariate analysis,” Frontiers in Aging Neuroscience, vol. 7, article 231, 2015.
- Q. Yang, H.-Y. Zou, Y. Zhang et al., “Multiplex protein pattern unmixing using a non-linear variable-weighted support vector machine as optimized by a particle swarm optimization algorithm,” Talanta, vol. 147, pp. 609–614, 2016.
- R. Wang, Y. Xu, and B. Liu, “Recombination spot identification Based on gapped k-mers,” Scientific Reports, vol. 6, article 23934, 2016.
- A. K. Sharma, S. Kumar, K. Harish, D. B. Dhakan, and V. K. Sharma, “Prediction of peptidoglycan hydrolases—a new class of antibacterial proteins,” BMC Genomics, vol. 17, no. 1, article 411, 2016.
- Z. C. Li, M. H. Huang, W. Q. Zhong et al., “Identification of drug-target interaction from interactome network with ‘guilt-by-association’ principle and topology features,” Bioinformatics, vol. 32, no. 7, pp. 1057–1064, 2016.
- J. J. Jones, B. E. Wilcox, R. W. Benz et al., “A plasma-based protein marker panel for colorectal cancer detection identified by multiplex targeted mass spectrometry,” Clinical Colorectal Cancer, vol. 15, no. 2, pp. 186–194.e13, 2016.
- I. Semanjski and S. Gautama, “Smart city mobility application—gradient boosting trees for mobility prediction and analysis based on crowdsourced data,” Sensors, vol. 15, no. 7, pp. 15974–15987, 2015.
- R. Johnson and T. Zhang, “Learning nonlinear functions using regularized greedy forest,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 5, pp. 942–954, 2014.
- K. P. Singh and S. Gupta, “In silico prediction of toxicity of non-congeneric industrial chemicals using ensemble learning based modeling approaches,” Toxicology and Applied Pharmacology, vol. 275, no. 3, pp. 198–212, 2014.
- Y. Chen, Z. Jia, D. Mercola, and X. Xie, “A gradient boosting algorithm for survival analysis via direct optimization of concordance index,” Computational and Mathematical Methods in Medicine, vol. 2013, Article ID 873595, 8 pages, 2013.
- A. Decruyenaere, P. Decruyenaere, P. Peeters, F. Vermassen, T. Dhaene, and I. Couckuyt, “Prediction of delayed graft function after kidney transplantation: comparison between logistic regression and machine learning methods,” BMC Medical Informatics and Decision Making, vol. 15, article 83, 2015.
- C. Lin, Y. Zou, J. Qin et al., “Hierarchical classification of protein folds using a novel ensemble classifier,” PLoS ONE, vol. 8, no. 2, Article ID e56499, 2013.
- Q. Zou, X. Li, Y. Jiang, Y. Zhao, and G. Wang, “BinMemPredict: a web server and software for predicting membrane protein types,” Current Proteomics, vol. 10, no. 1, pp. 2–9, 2013.
- Y. Huang, B. Niu, Y. Gao, L. Fu, and W. Li, “CD-HIT Suite: a web server for clustering and comparing biological sequences,” Bioinformatics, vol. 26, no. 5, pp. 680–682, 2010.
- C. Liu, P. Su, R. Li et al., “Molecular cloning, expression pattern, and molecular evolution of the spleen tyrosine kinase in lamprey, Lampetra japonica,” Development Genes and Evolution, vol. 225, no. 2, pp. 113–120, 2015.
- L. Song, D. Li, X. Zeng, Y. Wu, L. Guo, and Q. Zou, “nDNA-prot: identification of DNA-binding proteins based on unbalanced classification,” BMC Bioinformatics, vol. 15, article 298, 2014.
- Q. Zou, J. Guo, Y. Ju, M. Wu, X. Zeng, and Z. Hong, “Improving tRNAscan-SE annotation results via ensemble classifiers,” Molecular Informatics, vol. 34, no. 11-12, pp. 761–770, 2015.
- C. Lin, W. Chen, C. Qiu, Y. Wu, S. Krishnan, and Q. Zou, “LibD3C: ensemble classifiers with a clustering and dynamic selection strategy,” Neurocomputing, vol. 123, pp. 424–435, 2014.
- Q. Zou, Z. Wang, X. Guan, B. Liu, Y. Wu, and Z. Lin, “An approach for identifying cytokines based on a novel ensemble classifier,” BioMed Research International, vol. 2013, Article ID 686090, 11 pages, 2013.
- L. Wei, M. Liao, Y. Gao, R. Ji, Z. He, and Q. Zou, “Improved and promising identification of human microRNAs by incorporating a high-quality negative set,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 11, no. 1, pp. 192–201, 2014.
- X. Zeng, S. Yuan, X. Huang, and Q. Zou, “Identification of cytokine via an improved genetic algorithm,” Frontiers of Computer Science, vol. 9, no. 4, pp. 643–651, 2015.
- Q. Zou, J. Zeng, L. Cao, and R. Ji, “A novel features ranking metric with application to scalable visual and bioinformatics data classification,” Neurocomputing, vol. 173, pp. 346–354, 2016.
- H. Ding, Z. Y. Liang, F. B. Guo, J. Huang, W. Chen, and H. Lin, “Predicting bacteriophage proteins located in host cell with feature selection technique,” Computers in Biology and Medicine, vol. 71, pp. 156–161, 2016.
- H. Tang, W. Chen, and H. Lin, “Identification of immunoglobulins using Chou's pseudo amino acid composition with feature selection technique,” Molecular BioSystems, vol. 12, no. 4, pp. 1269–1275, 2016.
- L. Wei, Q. Zou, M. Liao, H. Lu, and Y. Zhao, “A novel machine learning method for cytokine-receptor interaction prediction,” Combinatorial Chemistry & High Throughput Screening, vol. 19, no. 2, pp. 144–152, 2016.
- E. Frank, M. Hall, L. Trigg, G. Holmes, and I. H. Witten, “Data mining in bioinformatics using Weka,” Bioinformatics, vol. 20, no. 15, pp. 2479–2481, 2004.
- T. C. Smith and E. Frank, “Introducing machine learning concepts with WEKA,” in Statistical Genomics, E. Mathé and S. Davis, Eds., vol. 1418 of Methods in Molecular Biology, pp. 353–378, Springer, Berlin, Germany, 2016.
- T. L. Bailey, J. Johnson, C. E. Grant, and W. S. Noble, “The MEME Suite,” Nucleic Acids Research, vol. 43, no. W1, pp. W39–W49, 2015.
- K.-C. Chou, “Prediction of protein cellular attributes using pseudo-amino acid composition,” Proteins, vol. 43, no. 3, pp. 246–255, 2001.
- H.-B. Shen and K.-C. Chou, “PseAAC: a flexible web server for generating various kinds of protein pseudo amino acid composition,” Analytical Biochemistry, vol. 373, no. 2, pp. 386–388, 2008.
- B. Liu, F. Liu, X. Wang, J. Chen, L. Fang, and K.-C. Chou, “Pse-in-One: a web server for generating various modes of pseudo components of DNA, RNA, and protein sequences,” Nucleic Acids Research, vol. 43, no. W1, pp. W65–W71, 2015.
- N. Xiao, D.-S. Cao, M.-F. Zhu, and Q.-S. Xu, “Protr/ProtrWeb: R package and web server for generating various numerical representation schemes of protein sequences,” Bioinformatics, vol. 31, no. 11, pp. 1857–1859, 2015.
- W. Chen, P.-M. Feng, E.-Z. Deng, H. Lin, and K.-C. Chou, “iTIS-PseTNC: a sequence-based predictor for identifying translation initiation site in human genes using pseudo trinucleotide composition,” Analytical Biochemistry, vol. 462, pp. 76–83, 2014.
- B. Liu, F. L. Liu, L. Y. Fang, X. L. Wang, and K.-C. Chou, “repDNA: a Python package to generate various modes of feature vectors for DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects,” Bioinformatics, vol. 31, no. 8, pp. 1307–1309, 2015.
- L. Au and D. F. Green, “Direct calculation of protein fitness landscapes through computational protein design,” Biophysical Journal, vol. 110, no. 1, pp. 75–84, 2016.
- J. T. S. Hopper and C. V. Robinson, “Mass spectrometry quantifies protein interactions-from molecular chaperones to membrane porins,” Angewandte Chemie—International Edition, vol. 53, no. 51, pp. 14002–14015, 2014.
- K. Kržišnik and T. Urbic, “Amino acid correlation functions in protein structures,” Acta Chimica Slovenica, vol. 62, no. 3, pp. 574–581, 2015.
- A. Olivera-Nappa, B. A. Andrews, and J. A. Asenjo, “Mutagenesis Objective Search and Selection Tool (MOSST): an algorithm to predict structure-function related mutations in proteins,” BMC Bioinformatics, vol. 12, article 122, 2011.
- C. B. Pinheiro, M. Shah, E. L. Soares et al., “Proteome analysis of plastids from developing seeds of Jatropha curcas L.,” Journal of Proteome Research, vol. 12, no. 11, pp. 5137–5145, 2013.
- C. Z. Cai, L. Y. Han, Z. L. Ji, X. Chen, and Y. Z. Chen, “SVM-Prot: web-based support vector machine software for functional classification of a protein from its primary sequence,” Nucleic Acids Research, vol. 31, no. 13, pp. 3692–3697, 2003.
- J. R. Glausier and D. A. Lewis, “Selective pyramidal cell reduction of GABA(A) receptor α1 subunit messenger RNA expression in schizophrenia,” Neuropsychopharmacology, vol. 36, no. 10, pp. 2103–2110, 2011.
- N. Onori, C. Turchi, G. Solito, R. Gesuita, L. Buscemi, and A. Tagliabracci, “GABRA2 and alcohol use disorders: no evidence of an association in an Italian case-control study,” Alcoholism: Clinical and Experimental Research, vol. 34, no. 4, pp. 659–668, 2010.
- H. Yuan, C.-M. Low, O. A. Moody, A. Jenkins, and S. F. Traynelis, “Ionotropic GABA and glutamate receptor mutations and human neurologic diseases,” Molecular Pharmacology, vol. 88, no. 1, pp. 203–217, 2015.
- J. Richetto, M. A. Labouesse, M. M. Poe et al., “Behavioral effects of the benzodiazepine-positive allosteric modulator SH-053-2′F-S-CH3 in an immune-mediated neurodevelopmental disruption model,” The International Journal of Neuropsychopharmacology, vol. 18, no. 4, pp. 1–11, 2014.
- R. J. Hatch, C. A. Reid, and S. Petrou, “Enhanced in vitro CA1 network activity in a sodium channel β1(C121W) subunit model of genetic epilepsy,” Epilepsia, vol. 55, no. 4, pp. 601–608, 2014.
- R. Kumari, R. Lakhan, J. Kalita, R. K. Garg, U. K. Misra, and B. Mittal, “Potential role of GABAA receptor subunit; GABRA6, GABRB2 and GABRR2 gene polymorphisms in epilepsy susceptibility and pharmacotherapy in North Indian population,” Clinica Chimica Acta, vol. 412, no. 13-14, pp. 1244–1248, 2011.
- Y. L. Murashima and M. Yoshii, “New therapeutic approaches for epilepsies, focusing on reorganization of the GABAA receptor subunits by neurosteroids,” Epilepsia, vol. 51, no. 3, pp. 131–134, 2010.
- C. Chiron, “Current therapeutic procedures in Dravet syndrome,” Developmental Medicine and Child Neurology, vol. 53, supplement 2, pp. 16–18, 2011.
- G. Gallos, P. Yim, S. Chang et al., “Targeting the restricted α-subunit repertoire of airway smooth muscle GABAA receptors augments airway smooth muscle relaxation,” American Journal of Physiology—Lung Cellular and Molecular Physiology, vol. 302, no. 2, pp. L248–L256, 2012.
- G. M. Sizemore, S. T. Sizemore, D. D. Seachrist, and R. A. Keri, “GABA(A) receptor Pi (GABRP) stimulates basal-like breast cancer cell migration through activation of extracellular-regulated kinase 1/2 (ERK1/2),” Journal of Biological Chemistry, vol. 289, no. 35, pp. 24102–24113, 2014.
- L. I. Sinclair, P. T. Dineen, and A. L. Malizia, “Modulation of ion channels in clinical psychopharmacology: adults and younger people,” Expert Review of Clinical Pharmacology, vol. 3, no. 3, pp. 397–416, 2010.
- A. S. Al Mansouri, D. E. Lorke, S. M. Nurulain et al., “Methylene blue inhibits the function of α7-nicotinic acetylcholine receptors,” CNS and Neurological Disorders—Drug Targets, vol. 11, no. 6, pp. 791–800, 2012.
- J. Lu, Q. Zhang, D. Tan et al., “GABA A receptor π subunit promotes apoptosis of HTR-8/SVneo trophoblastic cells: implications in preeclampsia,” International Journal of Molecular Medicine, vol. 38, no. 1, pp. 105–112, 2016.
- A. P. Hanek, H. A. Lester, and D. A. Dougherty, “Photochemical proteolysis of an unstructured linker of the GABAAR extracellular domain prevents GABA but not pentobarbital activation,” Molecular Pharmacology, vol. 78, no. 1, pp. 29–35, 2010.
Copyright © 2016 Zhijun Liao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.