Research Article

Utilizing Selected Di- and Trinucleotides of siRNA to Predict RNAi Activity

Algorithm 1

The calculation process of threshold .
Input: A data set , where is the feature set extracted from siRNA
sequences and is the experimentally determined siRNA activities. The features of are first sorted
by variable importance in descending order. The initial values of and are 1 and ,
respectively.
Output: optimal features .
  The data set is divided into ten parts. Nine parts are used as the training set and the remaining part is used as
the testing set. We build a Random Forest model using the feature set and the training set and then
predict the activities of the testing siRNAs with the model. The correlation coefficient between the observed and predicted
siRNA activities is .
while and   do
  Calculate the prediction accuracy using according to the first step.
  If then
    
    
  else
  end if
  
end while
  .
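
The procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`pearson`, `ten_fold_indices`, `cross_validated_correlation`, `select_threshold`) are hypothetical, and `fit_predict` is a placeholder for "train a Random Forest on the training set and predict the testing set". The sketch further assumes that each of the ten parts serves once as the testing set with predictions pooled before computing the correlation, that the columns of `X` are already sorted by variable importance in descending order, and that the search keeps the number of top-ranked features giving the highest correlation coefficient.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Matrix = List[List[float]]
# Hypothetical model interface: (X_train, y_train, X_test) -> predictions.
FitPredict = Callable[[Matrix, List[float], Matrix], List[float]]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def ten_fold_indices(n_samples: int, n_folds: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle the sample indices and deal them into n_folds disjoint parts."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[k::n_folds] for k in range(n_folds)]


def cross_validated_correlation(X: Matrix, y: List[float], fit_predict: FitPredict,
                                n_folds: int = 10, seed: int = 0) -> float:
    """Correlation between observed and predicted activities.

    Assumption: every part is used once as the testing set and the
    predictions from all parts are pooled before correlating.
    """
    folds = ten_fold_indices(len(y), n_folds, seed)
    observed: List[float] = []
    predicted: List[float] = []
    for k, test_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        preds = fit_predict([X[i] for i in train_idx],
                            [y[i] for i in train_idx],
                            [X[i] for i in test_idx])
        observed.extend(y[i] for i in test_idx)
        predicted.extend(preds)
    return pearson(observed, predicted)


def select_threshold(X: Matrix, y: List[float],
                     fit_predict: FitPredict) -> Tuple[int, float]:
    """Return (number of top-ranked features kept, best correlation).

    Assumption: the update rule keeps the prefix length with the highest
    correlation; ties are resolved in favour of the smaller prefix.
    """
    best_k, best_r = 0, -math.inf
    for k in range(1, len(X[0]) + 1):
        X_k = [row[:k] for row in X]  # first k features by importance
        r = cross_validated_correlation(X_k, y, fit_predict)
        if r > best_r:
            best_k, best_r = k, r
    return best_k, best_r
```

Passing the model as a callable keeps the sketch independent of any particular Random Forest library; any regressor that can be wrapped as `fit_predict(X_train, y_train, X_test)` can be plugged in.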