Research Article

Virtual Screening of Drug Proteins Based on Imbalance Data Mining

Algorithm 1

GA_SMOTE (T, n, k, x).
Input: number of minority class samples T; number of attributes n; number of nearest neighbors k
Output: attribute coding
(1) while x < n
(2)  Randomize the minority class samples
(3)  k = number of nearest neighbors
(4)  : k minority classes around
(5)  : k majority classes around
(6)  Classes around
(7)  if majority classes > minority classes
(8)   continue
(9)  End
(10)    = Average distance from to
(11)    = Average distance from to
(12)   
(13)   Randomly delete an attribute
(14)    = Average distance from to
(15)    = Average distance from to
(16)   
(17)  if
(18)   then the attribute is representative of the majority classes
(19)  End
(20)  The attribute is representative of the majority classes
(21) End
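
The distance steps of Algorithm 1 can be read in more than one way, because the symbols in steps (4)-(6), (10)-(12), and (14)-(16) and the test in step (17) are not legible in the listing. The Python sketch below is one possible reading, not the authors' implementation. It computes only the quantities the listing names: the minority and majority neighbors of a sample, the skip rule in steps (7)-(8), and the average distances before and after one attribute is deleted. The comparison that decides whether an attribute is representative is left to the caller. The following are assumptions: Euclidean distance, "classes around" taken as the k nearest points of either class, the same neighbor sets reused after the attribute is deleted, and all function names (`k_nearest`, `mean_distance`, `drop_attribute`, `distance_shift`), which are hypothetical.

```python
import math


def k_nearest(point, candidates, k):
    """Return the k candidates closest to point (Euclidean distance, assumed)."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def mean_distance(point, neighbours):
    """Average Euclidean distance from point to each neighbour."""
    return sum(math.dist(point, q) for q in neighbours) / len(neighbours)


def drop_attribute(row, attr):
    """Copy of row without the attribute at index attr."""
    return row[:attr] + row[attr + 1:]


def distance_shift(sample, minority, majority, attr, k):
    """One assumed reading of steps (3)-(15) for a single minority sample.

    Returns None when the sample is skipped (steps 7-8); otherwise returns
    (d_min_before, d_maj_before, d_min_after, d_maj_after).
    """
    others = [m for m in minority if m is not sample]
    near_min = k_nearest(sample, others, k)    # step (4): minority neighbours
    near_maj = k_nearest(sample, majority, k)  # step (5): majority neighbours

    # Step (6), assumed: the k nearest points regardless of class.
    labelled = [(p, 0) for p in others] + [(p, 1) for p in majority]
    around = sorted(labelled, key=lambda t: math.dist(sample, t[0]))[:k]
    n_maj = sum(label for _, label in around)

    # Steps (7)-(8): skip samples surrounded mostly by majority points.
    if n_maj > len(around) - n_maj:
        return None

    # Steps (10)-(11): average distances with all attributes present.
    d_min_before = mean_distance(sample, near_min)
    d_maj_before = mean_distance(sample, near_maj)

    # Step (13): delete one attribute (chosen by the caller in this sketch).
    reduced = drop_attribute(sample, attr)
    red_min = [drop_attribute(p, attr) for p in near_min]
    red_maj = [drop_attribute(p, attr) for p in near_maj]

    # Steps (14)-(15): average distances without the deleted attribute.
    d_min_after = mean_distance(reduced, red_min)
    d_maj_after = mean_distance(reduced, red_maj)
    return d_min_before, d_maj_before, d_min_after, d_maj_after
```

For example, with minority points (0, 0), (1, 0), (0, 1), majority points (5, 5), (6, 5), (5, 6), and k = 2, deleting attribute 1 for the sample (0, 0) changes the average minority distance from 1.0 to 0.5. The average majority distance after deletion is 5.5. A caller could compare these four values to apply whatever test step (17) specifies.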