Research Article

Efficient Data Mining Algorithms for Screening Potential Proteins of Drug Target

Algorithm 2

EM for Negative Selection Algorithm.
Input: The unlabeled dataset U, the positive dataset P, the number of samples to select L
Initialization: the reliable negative set RN = NULL
Run EM on the mixture model using U and P to derive the mixture probability distributions
             
For each sample in U:
  Compute the probability that the sample is assigned to the negative class
       
Rank the samples in U by the above probability in decreasing order
Select the top L samples and append them to RN
Output: The reliable negative samples RN
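
The steps of Algorithm 2 can be illustrated with a minimal Python sketch. The source does not specify the form of the mixture model, so the sketch assumes a two-component Gaussian mixture with diagonal covariance, keeps the samples in P fixed in the positive component, and starts with every sample in U treated as negative. The function names `negative_posteriors` and `select_reliable_negatives`, the parameter `n_iter`, and the clamping constant `eps` are hypothetical and chosen only for illustration; P is assumed to be non-empty.

```python
import math

def _log_gauss(x, mean, var):
    # Log-density of a diagonal-covariance Gaussian at point x.
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def _weighted_fit(points, weights, floor=1e-6):
    # Weighted mean and per-feature variance; the variance is floored for stability.
    total = sum(weights)
    dim = len(points[0])
    mean = [sum(w * p[d] for p, w in zip(points, weights)) / total
            for d in range(dim)]
    var = [max(sum(w * (p[d] - mean[d]) ** 2
                   for p, w in zip(points, weights)) / total, floor)
           for d in range(dim)]
    return mean, var, total

def negative_posteriors(U, P, n_iter=50, eps=1e-9):
    """Run EM on an assumed two-component Gaussian mixture.

    Returns the estimated probability of the negative class for each sample in U.
    """
    if not P or not U:
        raise ValueError("U and P must both be non-empty")
    data = list(P) + list(U)
    # Responsibility of the positive component: fixed at 1 for P,
    # initialised to 0 for U (assumption: unlabeled starts as negative).
    r_pos = [1.0] * len(P) + [0.0] * len(U)
    for _ in range(n_iter):
        # M-step: re-estimate both components and the mixing proportions.
        mean_p, var_p, n_p = _weighted_fit(data, r_pos)
        mean_n, var_n, n_n = _weighted_fit(data, [1.0 - r for r in r_pos])
        log_prior_p = math.log(n_p / len(data))
        log_prior_n = math.log(n_n / len(data))
        # E-step: update responsibilities for the unlabeled samples only.
        for i in range(len(P), len(data)):
            lp = log_prior_p + _log_gauss(data[i], mean_p, var_p)
            ln = log_prior_n + _log_gauss(data[i], mean_n, var_n)
            m = max(lp, ln)  # subtract the max to avoid underflow
            ep, en = math.exp(lp - m), math.exp(ln - m)
            # Clamp so that neither component ever loses all of its weight.
            r_pos[i] = min(max(ep / (ep + en), eps), 1.0 - eps)
    return [1.0 - r for r in r_pos[len(P):]]

def select_reliable_negatives(U, P, L, n_iter=50):
    # Rank the samples in U by negative-class probability (decreasing)
    # and keep the top L as the reliable negative set RN.
    probs = negative_posteriors(U, P, n_iter)
    ranked = sorted(range(len(U)), key=lambda i: probs[i], reverse=True)
    return [U[i] for i in ranked[:L]]
```

With feature vectors given as tuples of floats, a call such as `select_reliable_negatives(U, P, L=100)` would return the L samples of U that this assumed model considers most likely to be negative. A different mixture family (for example, a naive Bayes model over discrete features) would change only the density and fitting helpers, not the ranking and selection steps.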