Research Article
Utilizing Selected Di- and Trinucleotides of siRNA to Predict RNAi Activity
Algorithm 1
The calculation process of the threshold.
Input: A data set comprising the feature set extracted from the siRNA sequences and the experimentally determined siRNA activities. The features are first sorted by variable importance in descending order. The initial values of [...] and [...] are 1 and [...], respectively.
Output: The optimal features.

First step: The data set is divided into ten parts. Nine parts are used as the training set and the remaining part is used as the testing set. A Random Forest model is built using the feature set and the training set, and the testing siRNAs are then predicted with the model. The prediction accuracy is the correlation coefficient between the observed and predicted siRNA activities.

while [...] and [...] do
    Calculate the prediction accuracy using [...] according to the first step.
    if [...] then
        [...]
    else
        [...]
    end if
    [...]
end while
[...].
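The evaluation step described in the algorithm (split into ten parts, train on nine, predict the rest, correlate observed with predicted activities) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names `pearson`, `knn_fit_predict`, `cv_correlation` and `best_feature_count` are hypothetical. A simple k-nearest-neighbour regressor stands in for the Random Forest so that the sketch runs with the standard library only; any Random Forest regressor could be passed in through `fit_predict`. The sketch also assumes that each of the ten parts serves once as the testing set and that the pooled predictions are correlated with the observations. Because the loop conditions of the algorithm did not survive in the text, `best_feature_count` simply scans every feature count and keeps the best one.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def knn_fit_predict(train_X, train_y, test_X, k=3):
    """Stand-in regressor (k-nearest neighbours), NOT the Random Forest of the text."""
    preds = []
    for x in test_X:
        # Squared Euclidean distance from x to every training row, nearest first.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, row)), target)
            for row, target in zip(train_X, train_y)
        )
        nearest = dists[:k]
        preds.append(sum(target for _, target in nearest) / len(nearest))
    return preds


def cv_correlation(X, y, n_features, fit_predict=knn_fit_predict, folds=10, seed=0):
    """Correlation between observed and predicted activities using only the
    first n_features columns (columns assumed pre-sorted by importance)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    predicted = [0.0] * len(X)
    for f in range(folds):
        test_idx = idx[f::folds]  # one part is held out for testing
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]  # the other nine parts
        train_X = [X[i][:n_features] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        test_X = [X[i][:n_features] for i in test_idx]
        for i, p in zip(test_idx, fit_predict(train_X, train_y, test_X)):
            predicted[i] = p
    return pearson(y, predicted)


def best_feature_count(X, y, max_features, **kwargs):
    """Assumed selection rule: try k = 1..max_features and keep the k with
    the highest cross-validated correlation."""
    scores = {k: cv_correlation(X, y, k, **kwargs) for k in range(1, max_features + 1)}
    best_k = max(scores, key=scores.get)
    return best_k, scores
```

Here `X` is a list of feature rows whose columns are already ordered by importance, and `y` holds the measured activities. Passing the model in as a callable keeps the cross-validation logic independent of the particular regressor.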