Long Noncoding RNA Identification: Comparing Machine Learning Based Tools for Long Noncoding Transcripts Discrimination
Table 2
Summary of the features used by each selected method.
| Method | ORF | Codon | Sequence structure | Ribosome interaction | Alignment | Protein conservation |
| --- | --- | --- | --- | --- | --- | --- |
| CPC | Quality; coverage; integrity | No | No | No | BLASTX | Number and E-value of hits; distribution of hits |
| CPAT | Length; coverage | Hexamer frequency | Content of the bases; position of the bases | No | No | No |
| CNCI | No | ANT matrix; codon-bias | MLCDS | No | No | No |
| PLEK | No | No | Improved k-mer scheme | No | No | No |
| lncRNA-MFDL | Length; coverage | No | k-mer scheme; secondary structure; MLCDS | No | No | No |
| LncRNA-ID | Length; coverage | No | Kozak motif | Ribosome release signal; changes of binding energy | Profile HMM based alignment | Score of HMMER; length of the profile; length of aligned region |
| lncRScan-SVM | No | Distribution of stop codon | Score of txCdsPredict; length of transcripts; length and count of exon | No | Phylo-HMM based alignment | Average PhastCons scores |
| LncRNApred | Length; coverage | No | Length of the sequence; signal to noise ratio; k-mer scheme; G + C content | No | No | No |
All features are categorized into six groups according to their similarity or underlying principles. Thus, some items in the table may not correspond one-to-one with the feature names given in the corresponding published references.
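
To make the simplest feature families in Table 2 concrete, the following is a minimal Python sketch of how ORF length, ORF coverage, G + C content, and k-mer frequencies could be computed from a transcript sequence. It is an illustration under stated assumptions, not the implementation used by any of the listed tools: the function names are hypothetical, only the three forward reading frames are scanned, an ORF is taken to run from the first ATG to the next in-frame stop codon, and coverage is assumed to mean ORF length divided by transcript length.

```python
from collections import Counter

# Standard DNA stop codons (assumption: sequences are given as DNA, not RNA).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq: str) -> int:
    """Length (nt, stop codon included) of the longest ATG..stop ORF
    found in the three forward reading frames."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def orf_coverage(seq: str) -> float:
    """Fraction of the transcript covered by its longest ORF."""
    return longest_orf_length(seq) / len(seq) if seq else 0.0


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def kmer_frequencies(seq: str, k: int) -> dict:
    """Relative frequency of each overlapping k-mer observed in seq."""
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}
```

With `k = 6`, `kmer_frequencies` yields raw hexamer frequencies; the published tools typically go further, for example by comparing such frequencies between coding and noncoding training sets, which this sketch does not attempt.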