(1) Input: positive sample set , negative sample set ;

(2) Output: the new negative set ;

(3) Count the samples in each of the two sets, to , to ;

(4) Randomly select ( is defined as the under-sampling ratio, ) samples from set as the initial cluster centroids, in our paper;

(5) Repeat;

(6) Calculate the Euclidean distance from each sample to every cluster centroid;

(7) Assign each sample to the cluster of its nearest centroid;

(8) Compute the new centroid of each updated cluster;

(9) Until every cluster is stable;

(10) Define the final centroids as ;

(11) Output .
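
The steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `kmeans_undersample` and the parameters `n_clusters`, `max_iter`, and `seed` are hypothetical; the number of clusters is passed in directly because the formula relating it to the under-sampling ratio is not recoverable from the listing; the initial centroids are assumed to be drawn from the negative set; and "stable" is interpreted as "no sample changes cluster between two consecutive iterations".

```python
import math
import random


def kmeans_undersample(negatives, n_clusters, max_iter=100, seed=0):
    """Return k-means centroids of `negatives` as the reduced negative set.

    Illustrative sketch: `n_clusters`, `max_iter`, and `seed` are assumed
    parameters, not values taken from the source.
    """
    rng = random.Random(seed)
    # Step (4): pick distinct samples at random as the initial centroids
    # (assumed here to come from the negative set).
    centroids = [list(p) for p in rng.sample(list(negatives), n_clusters)]
    labels = None
    for _ in range(max_iter):  # steps (5)-(9)
        # Steps (6)-(7): assign each sample to its nearest centroid,
        # measured by Euclidean distance.
        new_labels = [
            min(range(n_clusters), key=lambda j: math.dist(p, centroids[j]))
            for p in negatives
        ]
        # Step (9): stop once no assignment changes (assumed stability test).
        if new_labels == labels:
            break
        labels = new_labels
        # Step (8): recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        for j in range(n_clusters):
            members = [p for p, lab in zip(negatives, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids  # steps (10)-(11): centroids form the new negative set
```

Calling `kmeans_undersample(negatives, n_clusters=k)` with a list of equal-length numeric feature vectors returns `k` centroid vectors. Only the negative set is needed by the function itself; the positive set would enter only through the choice of `k`, which is left to the caller here.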