BioMed Research International
Volume 2016, Article ID 2375268, 12 pages
http://dx.doi.org/10.1155/2016/2375268
Research Article

In Silico Prediction of Gamma-Aminobutyric Acid Type-A Receptors Using Novel Machine-Learning-Based SVM and GBDT Approaches

1Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, Fujian 350122, China
2College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, Henan 471023, China
3School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
4College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang 310018, China
5School of Computer Science and Technology, Heilongjiang University, Harbin, Heilongjiang 150080, China
6School of Information Science and Technology, Xiamen University, Xiamen, Fujian 361005, China

Received 24 April 2016; Revised 8 June 2016; Accepted 19 June 2016

Academic Editor: Yungang Xu

Copyright © 2016 Zhijun Liao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Gamma-aminobutyric acid type-A receptors (GABAARs) are multisubunit, membrane-spanning ligand-gated ion channels (LGICs) that act as the principal mediators of rapid inhibitory synaptic transmission in the human brain. Predicting the GABAAR category from the protein amino acid sequence alone would therefore help in recognizing and studying novel receptors. Based on the proteins’ physicochemical properties and their amino acid composition and position, a classifier was first constructed using a 188-dimensional (188D) algorithm on a dataset filtered at 90% cd-hit identity and compared with the pseudo-amino acid composition (PseAAC) and ProtrWeb web-based algorithms for human proteins. Then, four classifiers, namely gradient boosting decision tree (GBDT), random forest (RF), a library for support vector machine (libSVM), and k-nearest neighbor (k-NN), were compared on a second dataset filtered at a low cd-hit identity of 40%. This work obtained the highest correctly classified rate (96.8%) and the highest specificity (99.29%). However, its sensitivity, accuracy, and Matthews correlation coefficient were slightly lower than those of PseAAC and ProtrWeb; on the second dataset, GBDT and libSVM performed slightly better than RF and k-NN. In conclusion, a classifier was successfully constructed using only the protein sequence information.
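The evaluation measures named in the abstract (sensitivity, specificity, accuracy, and the Matthews correlation coefficient) can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper's data or code.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification measures from confusion-matrix counts.

    tp, fp, tn, fn: numbers of true positives, false positives,
    true negatives, and false negatives (illustrative inputs only).
    """
    total = tp + fp + tn + fn
    # Sensitivity (recall): fraction of actual positives predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of actual negatives predicted negative.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Accuracy: fraction of all samples classified correctly.
    accuracy = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to illustrate the call.
example = binary_metrics(tp=40, fp=5, tn=45, fn=10)
```

Because the Matthews correlation coefficient uses all four counts, it stays informative when positive and negative classes are imbalanced, which is one reason it is commonly reported alongside sensitivity and specificity.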