Mathematical Problems in Engineering

Research Article

Virtual Screening of Drug Proteins Based on Imbalance Data Mining

GA_SMOTE (T, n, k, x).

	Input: number of minority class samples T; number of attributes n; number of nearest neighbors k
	Output: attribute coding
(1)	while x < n
(2)	Randomly select a minority class sample
(3)	k = number of nearest neighbors
(4)	: k minority classes around
(5)	: k majority classes around
(6)	Classes around
(7)	if majority class samples > minority class samples
(8)	continue
(9)	End
(10)	= Average distance from to
(11)	= Average distance from to
(12)
(13)	Randomly delete an attribute
(14)	= Average distance from to
(15)	= Average distance from to
(16)
(17)	if
(18)	then the attribute is representative of the majority classes
(19)	End
(20)	the attribute is representative of the majority classes
(21)	End
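
The distance steps of the listing can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`avg_dist`, `nearest`, `majority_dominated`, `attribute_effect`), the use of Euclidean distance, and the choice to keep the same neighbors after an attribute is deleted are all assumptions made for illustration. The sketch covers the neighborhood check in steps (6) to (8) and the average distances before and after deleting an attribute in steps (10) to (15). The comparison in step (17) is left out, since the listing does not state it.

```python
import numpy as np


def avg_dist(sample, others, drop=None):
    """Mean Euclidean distance (assumed metric) from `sample` to each row
    of `others`, optionally ignoring the attribute at column `drop`."""
    diff = np.asarray(others, dtype=float) - np.asarray(sample, dtype=float)
    if drop is not None:
        diff = np.delete(diff, drop, axis=1)
    return float(np.linalg.norm(diff, axis=1).mean())


def nearest(sample, pool, k):
    """Return the k rows of `pool` closest to `sample`."""
    pool = np.asarray(pool, dtype=float)
    order = np.argsort(np.linalg.norm(pool - sample, axis=1))
    return pool[order[:k]]


def majority_dominated(sample, minority, majority, k):
    """True if, among the k nearest samples of either class, majority
    samples outnumber minority samples (assumed reading of steps 6 to 8)."""
    pool = np.vstack([minority, majority]).astype(float)
    is_major = np.concatenate(
        [np.zeros(len(minority), dtype=bool), np.ones(len(majority), dtype=bool)]
    )
    order = np.argsort(np.linalg.norm(pool - sample, axis=1))[:k]
    n_major = int(is_major[order].sum())
    return n_major > len(order) - n_major


def attribute_effect(sample, minority, majority, k, attr):
    """Average distances from `sample` to its k nearest minority and majority
    neighbors, before and after deleting attribute `attr`.
    Assumption: the neighbors are chosen once, on the full attribute set."""
    near_min = nearest(sample, minority, k)
    near_maj = nearest(sample, majority, k)
    before = (avg_dist(sample, near_min), avg_dist(sample, near_maj))
    after = (avg_dist(sample, near_min, attr), avg_dist(sample, near_maj, attr))
    return before, after
```

In this reading, `minority` should not contain `sample` itself, otherwise the sample is returned as its own nearest neighbor at distance zero. How the before and after distances are combined into an attribute coding would depend on the missing condition in step (17).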