Research Article

An Intelligent System for Identifying Acetylated Lysine on Histones and Nonhistone Proteins

Table 3

Fivefold cross-validation results for the histone and nonhistone models trained with various features.

| Dataset | Training features | Sn | Sp | Acc | MCC |
| --- | --- | --- | --- | --- | --- |
| Histone | 20D binary code | 0.725 | 0.743 | 0.740 | 0.370 |
| | BLOSUM62 | 0.745 | 0.758 | 0.756 | 0.400 |
| | Amino acid composition (AAC) | 0.750 | 0.761 | 0.386 | 0.407 |
| | Amino acid pair composition (AAPC) | 0.790 | 0.802 | 0.800 | 0.483 |
| | Accessible surface area (ASA) | 0.645 | 0.663 | 0.660 | 0.236 |
| | Position weight matrix (PWM) | 0.700 | 0.721 | 0.718 | 0.329 |
| | Position-specific scoring matrix (PSSM) | 0.710 | 0.724 | 0.721 | 0.339 |
| Nonhistone | 20D binary code | 0.698 | 0.714 | 0.710 | 0.366 |
| | BLOSUM62 | 0.697 | 0.723 | 0.712 | 0.375 |
| | Amino acid composition (AAC) | 0.619 | 0.640 | 0.635 | 0.226 |
| | Amino acid pair composition (AAPC) | 0.628 | 0.660 | 0.652 | 0.253 |
| | Accessible surface area (ASA) | 0.562 | 0.620 | 0.605 | 0.159 |
| | Position weight matrix (PWM) | 0.606 | 0.625 | 0.602 | 0.200 |
| | Position-specific scoring matrix (PSSM) | 0.665 | 0.695 | 0.688 | 0.319 |
| | BLOSUM62 + AAPC | 0.706 | 0.718 | 0.715 | 0.377 |

A total of 3525 lysine sequences were used in the positive and negative data. Sn, sensitivity; Sp, specificity; Acc, accuracy; MCC, Matthews correlation coefficient.
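The four measures reported in the table can be computed from the confusion-matrix counts of a binary classifier. The sketch below uses the standard definitions and is not taken from the article. The function name `binary_metrics` and the example counts are hypothetical, and the code assumes that both the positive and the negative class contain at least one sample.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (Sn, Sp, Acc, MCC) from confusion-matrix counts.

    tp/fn: positive (acetylated) sites predicted correctly/incorrectly.
    tn/fp: negative sites predicted correctly/incorrectly.
    Assumes tp + fn > 0 and tn + fp > 0.
    """
    sn = tp / (tp + fn)                    # sensitivity
    sp = tn / (tn + fp)                    # specificity
    acc = (tp + tn) / (tp + fn + tn + fp)  # accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report MCC as 0 when a marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, acc, mcc


# Hypothetical counts, for illustration only (not data from the article).
sn, sp, acc, mcc = binary_metrics(tp=80, fn=20, tn=70, fp=30)
print(f"Sn={sn:.3f} Sp={sp:.3f} Acc={acc:.3f} MCC={mcc:.3f}")
```

Because Acc is a sample-weighted average of Sn and Sp, it always lies between the two values for a given confusion matrix.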