Review Article

Long Noncoding RNA Identification: Comparing Machine Learning Based Tools for Long Noncoding Transcripts Discrimination

Table 4

Priority of employing different methods in different situations.

CPC    CPAT    CNCI    PLEK    LncRNA-ID    lncRScan-SVM

Coding potential assessment
Human lncRNAs
Mouse lncRNAs
Other species¹
Testing data with sequencing errors²
Lack of annotation
Massive-scale data³
Trained by users⁴
Web interface

This table presents only the preferred methods for each situation: a tick indicates that the method can achieve better performance under that circumstance.
¹Only CPAT, LncRNA-ID, and lncRScan-SVM provide a model for mouse. For other species, CPAT provides models for fly and zebrafish; CNCI and PLEK can predict sequences from vertebrates and plants. CPAT, PLEK, and LncRNA-ID can build a new model from users’ datasets.
²Users can choose CNCI for incomplete sequences and CPC or PLEK for transcripts with indel errors.
³CPAT is the most efficient method. Although lncRScan-SVM needs more time than CPAT and LncRNA-ID, its running time is still acceptable.
⁴LncRNA-ID can handle imbalanced training data. Training PLEK on users’ own datasets may be time-consuming.
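The guidance in the footnotes can be read as a simple lookup from situation to candidate tools. The following is a minimal Python sketch of that reading, not part of any of the reviewed tools: the dictionary keys, the name `FOOTNOTE_HINTS`, and the function `candidate_tools` are hypothetical, and the mapping encodes only what the footnotes state, not the tick marks of the table itself.

```python
# Hypothetical lookup derived only from the footnotes of Table 4.
# Keys are illustrative labels; the table's tick marks are NOT encoded here.
FOOTNOTE_HINTS = {
    "mouse": ["CPAT", "LncRNA-ID", "lncRScan-SVM"],      # footnote 1
    "fly_or_zebrafish": ["CPAT"],                        # footnote 1
    "vertebrates_or_plants": ["CNCI", "PLEK"],           # footnote 1
    "user_trained_model": ["CPAT", "PLEK", "LncRNA-ID"], # footnote 1
    "incomplete_sequences": ["CNCI"],                    # footnote 2
    "indel_errors": ["CPC", "PLEK"],                     # footnote 2
    "massive_scale": ["CPAT"],                           # footnote 3 (most efficient)
    "imbalanced_training": ["LncRNA-ID"],                # footnote 4
}


def candidate_tools(*situations):
    """Return the tools mentioned in the footnotes for every given situation.

    The result keeps the order of the first situation's list and is empty
    when no single tool is mentioned for all of the situations.
    """
    if not situations:
        return []
    lists = [FOOTNOTE_HINTS[s] for s in situations]
    return [tool for tool in lists[0]
            if all(tool in other for other in lists[1:])]


if __name__ == "__main__":
    print(candidate_tools("mouse", "massive_scale"))
    print(candidate_tools("indel_errors", "user_trained_model"))
```

An empty result, as for `candidate_tools("incomplete_sequences", "mouse")`, means only that the footnotes name no single tool for both situations; the full table and the tools' own documentation should still be consulted.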