BioMed Research International

Review Article

Long Noncoding RNA Identification: Comparing Machine Learning Based Tools for Long Noncoding Transcripts Discrimination

Table 2

Summary of the features used by each selected method.


Method	ORF	Codon	Sequence structure	Ribosome interaction	Alignment	Protein conservation

CPC	Quality; coverage; integrity	No	No	No	BLASTX	Number and E-value of hits; distribution of hits
CPAT	Length; coverage	Hexamer frequency	Content of the bases; position of the bases	No	No	No
CNCI	No	ANT matrix; Codon-bias	MLCDS	No	No	No
PLEK	No	No	Improved k-mer scheme	No	No	No
lncRNA-MFDL	Length; coverage	No	k-mer scheme; secondary structure; MLCDS	No	No	No
LncRNA-ID	Length; coverage	No	Kozak motif	Ribosome release signal; changes of binding energy	Profile HMM based alignment	Score of HMMER; length of the profile; length of aligned region
lncRScan-SVM	No	Distribution of stop codon	Score of txCdsPredict; length of transcripts; length and count of exon	No	Phylo-HMM based alignment	Average PhastCons scores
LncRNApred	Length; coverage	No	Length of the sequence; signal to noise ratio; k-mer scheme; G + C content	No	No	No

All features are grouped into six categories according to their similarity or underlying principles. Consequently, some entries in the table may not correspond one-to-one with the feature names given in the corresponding published references.
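
To make some of the sequence-intrinsic entries in the table concrete, the following is a minimal Python sketch of how ORF length, ORF coverage, G + C content, and k-mer frequencies might be computed from a transcript. It is an illustration only and does not reproduce the implementation of any tool listed above. The function names (`longest_orf`, `gc_content`, `kmer_frequencies`, `sequence_features`) are hypothetical, the ORF search is assumed to be forward-strand only with an ATG start and an in-frame stop codon, and ORF coverage is assumed to mean ORF length divided by transcript length.

```python
from collections import Counter
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG...stop ORF on the forward strand.

    Assumption: only complete ORFs (start and in-frame stop) are considered.
    Returns (0, 0) when no such ORF exists.
    """
    seq = seq.upper()
    best = (0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_frequencies(seq, k):
    """Relative frequency of every A/C/G/T k-mer, counted with a sliding window."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}


def sequence_features(seq, k=3):
    """Collect a few table-style features for one transcript (illustrative only)."""
    start, end = longest_orf(seq)
    orf_length = end - start
    return {
        "orf_length": orf_length,
        "orf_coverage": orf_length / len(seq) if seq else 0.0,
        "gc_content": gc_content(seq),
        "kmer_frequencies": kmer_frequencies(seq, k),
    }
```

A feature vector built this way would then be passed to a classifier; the published tools differ in which features they combine and in how each feature is defined, so the original references remain the authority on the exact definitions.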