The Scientific World Journal

Research Article

A Novel Algorithm for Imbalance Data Classification Based on Neighborhood Hypergraph

Neighbor hypergraph (N-HyperGraph).

Input: the training sample set X and the radius of neighborhood.
Output: the hyper-edge set.
Step 1. (Initialization)
Set the hyper-edge set to the empty set; //Initialize the hyper-edge set.
Calculate the radius of each sample according to Formulas (2) and (14).
FOR each sample in X DO
Set the number of hyper-edges generated by the sample to zero; //Count the number of hyper-edges generated by each sample.
WHILE (the sample has generated fewer than five hyper-edges) //Each sample generates five hyper-edges.
Generate a hyper-edge from the sample, so that seven tenths of the hyper-edge's attributes inherit the
attribute values of the sample;
Calculate the distance between the hyper-edge and each sample according to Formula (2);
Calculate and respectively, according to Definitions 10 and 11;
IF THEN ;
ELSE ; // is a random number in .
Calculate and respectively, according to Definition 10;
IF THEN ;
ELSE ;
END IF
END IF
Increment the number of hyper-edges generated by the sample;
END WHILE
END FOR
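The generation loop of Step 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `generate_hyperedge`, `generate_hyperedges`, `attr_ranges`, and `inherit_fraction` are hypothetical; filling the non-inherited attributes with values drawn uniformly from each attribute's observed range is an assumption; and so is attaching the source sample's class label to the hyper-edge. The neighborhood checks of Definitions 10 and 11 are left out of the sketch.

```python
import random


def generate_hyperedge(sample, attr_ranges, inherit_fraction=0.7, rng=random):
    """Build one hyper-edge from `sample` (a list of numeric attribute values).

    Seven tenths of the attributes inherit the sample's values.
    ASSUMPTION: the remaining attributes are drawn uniformly from the
    (low, high) range of that attribute in the training set.
    """
    n = len(sample)
    n_inherit = round(inherit_fraction * n)
    inherited = set(rng.sample(range(n), n_inherit))
    return [
        sample[j] if j in inherited else rng.uniform(*attr_ranges[j])
        for j in range(n)
    ]


def generate_hyperedges(samples, labels, edges_per_sample=5, seed=0):
    """Generate `edges_per_sample` hyper-edges for every training sample."""
    rng = random.Random(seed)
    n_attrs = len(samples[0])
    # Observed value range of each attribute over the training set.
    attr_ranges = [
        (min(s[j] for s in samples), max(s[j] for s in samples))
        for j in range(n_attrs)
    ]
    hyperedges = []
    for sample, label in zip(samples, labels):
        for _ in range(edges_per_sample):
            edge = generate_hyperedge(sample, attr_ranges, rng=rng)
            # ASSUMPTION: the hyper-edge carries its source sample's label.
            hyperedges.append((edge, label))
    return hyperedges
```

With 10 attributes per sample, each generated hyper-edge shares at least 7 attribute values with the sample it was built from, and a training set of n samples yields 5n hyper-edges.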
Step 2. (Training Set Classification)
Calculate of hyper-edge set according to Definition 7 and Formula (6);
FOR each sample in X DO
Calculate and according to Definition 12;
IF THEN ;
ELSE ;
END IF
END FOR
Calculate the classification accuracy on the training data set, Train-accuracy.
IF Train-accuracy > 0.95 THEN GOTO Step 4;
ELSE GOTO Step 3;
END IF
Step 3. (Hyper-edge Replacement)
Set the number of hyper-edges to be replaced to zero; //Number of hyper-edges that should be replaced.
FOR each hyper-edge in the hyper-edge set DO
Calculate the confidence-degree of the hyper-edge according to Definition 8 and Formula (8);
IF THEN ++;
END IF
END FOR
WHILE (hyper-edges remain to be replaced)
Generate a new hyper-edge according to Step 1;
Decrement the number of hyper-edges to be replaced;
END WHILE
END WHILE
GOTO Step 2;
Step 4. (Return Hypergraph)
RETURN the hyper-edge set.
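The control flow of Steps 2 to 4 (classify, check the 0.95 accuracy threshold, replace hyper-edges, repeat) can be sketched in Python as below. It is a hedged outline rather than the published procedure: `train_hypergraph`, `make_edge`, `accuracy_of`, and `confidence_of` are hypothetical names supplied by the caller; the replacement rule "confidence-degree below `min_confidence`" and its default value are assumptions, since the listing does not show the condition; and `max_rounds` is a safety guard that is not part of the algorithm.

```python
def train_hypergraph(hyperedges, make_edge, accuracy_of, confidence_of,
                     accuracy_target=0.95, min_confidence=0.5, max_rounds=100):
    """Alternate classification (Step 2) and replacement (Step 3).

    hyperedges    -- initial hyper-edge collection produced by Step 1
    make_edge     -- callable returning one freshly generated hyper-edge
    accuracy_of   -- callable: list of hyper-edges -> training accuracy
    confidence_of -- callable: hyper-edge -> confidence-degree
    """
    edges = list(hyperedges)
    for _ in range(max_rounds):  # ASSUMPTION: guard against endless looping
        # Step 2: stop as soon as the training accuracy exceeds the target.
        if accuracy_of(edges) > accuracy_target:
            break  # Step 4: return the hypergraph
        # Step 3: count the hyper-edges to replace, then regenerate that many.
        # ASSUMPTION: a hyper-edge is replaced when its confidence-degree
        # falls below min_confidence.
        kept = [e for e in edges if confidence_of(e) >= min_confidence]
        n_replace = len(edges) - len(kept)
        edges = kept + [make_edge() for _ in range(n_replace)]
    return edges
```

Passing the scoring functions in as callables keeps the sketch independent of Definitions 7, 8, and 12, which determine how accuracy and confidence-degree are actually computed.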