Research Article

Prediction of Breast Cancer from Imbalance Respect Using Cluster-Based Undersampling Method

Algorithm 1

The clustering-based undersampling procedure.
Input: the data set.
Notation: the set of D-dimensional points; the centroid point of each cluster; the average distance of each member to the centroid of the same cluster; the input element of a member; the input element of the centroid; the total number of data points; the dimension of an input vector (D); the number of clusters; and a factor that controls the number of training samples.
Output: the final training data, consisting of informative samples.
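
The quantities listed above (cluster centroids, the average member-to-centroid distance within each cluster, and a factor controlling how many samples are kept) suggest a procedure along the following lines. This is only a minimal sketch under stated assumptions, not the algorithm as published: the function names `kmeans` and `cluster_undersample`, the parameter name `factor`, the use of plain k-means with Euclidean distance, and the rule "keep a member if its distance to its centroid is at most `factor` times the cluster's average distance" are all hypothetical choices made for illustration.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means on a list of equal-length numeric tuples (illustrative only)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points (an assumption of this sketch).
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_assign = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assign == assign:
            break  # assignments are stable, so centroids are already consistent
        assign = new_assign
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign


def cluster_undersample(points, k, factor=1.0, seed=0):
    """Keep, per cluster, the members lying within factor * (average distance
    to the centroid). This selection rule is an assumption of the sketch."""
    centroids, assign = kmeans(points, k, seed=seed)
    kept = []
    for j, centroid in enumerate(centroids):
        members = [p for p, a in zip(points, assign) if a == j]
        if not members:
            continue
        dists = [math.dist(p, centroid) for p in members]
        avg = sum(dists) / len(dists)
        kept.extend(p for p, d in zip(members, dists) if d <= factor * avg)
    return kept


if __name__ == "__main__":
    # Toy two-dimensional data with two well-separated groups (made up for the demo).
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
            (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
    reduced = cluster_undersample(data, k=2, factor=1.0)
    print(len(data), "->", len(reduced), "samples kept")
```

In this sketch a smaller `factor` keeps fewer samples (only those closest to their centroid), while a very large `factor` keeps every sample. In an imbalanced setting one would typically apply such a reduction to the majority class only and then merge the result with the untouched minority class, but that step is likewise an assumption here.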