Research Article

Random Fuzzy Granular Decision Tree

Algorithm 2

Construct RFGDT.
Input: instance set and threshold
Output: root of fuzzy granular decision tree
(1) Normalize instances into .
(2) Calculate cluster center set (see Algorithm 1).
(3) and // Parallel distributed fuzzy granulation.
(4) // This is a parallel process. Here, take as an example.
 For to
  
  For to
  , instance is fuzzy granulated as
  End for
  Build a fuzzy granular array ;
  Get label of , ;
  A rule can be built. ;
 End for
(5) is randomly divided into sub fuzzy granular array sets .
(6) Map stage. Each Map function emits a key-value pair in which the key is a feature and the value is the information gain ratio of that feature, namely,
(7) If for , its classification is , then set as a single-node tree, take as the label of the node, and return .
(8) Reduce stage. Between the Map stage and the Reduce stage, the Hadoop distributed system first groups the outputs of all Map functions by key.
 These grouped intermediate results then serve as the input to the Reduce stage and take the following form.
(9) If , set as a single-node tree, use the label with the largest number of fuzzy granular arrays in as the label of the node, and return .
 Otherwise, , calculate the information gain ratio of to , namely , and select the feature with the largest sum of information gain ratios .
(10) If the information gain of satisfies , then set as a single-node tree, use the label with the largest number of fuzzy granular arrays in as the class of the node, and return .
 Otherwise, for each possible value of , according to , divide into nonempty subsets , take the label with the largest number of fuzzy granular arrays in as the label, construct subnodes, form the tree from the node and its subnodes, and return .
(11) For the node , use as the training set and as the feature set, recursively apply steps (5) to (10) to obtain the subtree , and return .
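
The Map and Reduce steps (6) to (9) can be sketched in plain Python, without Hadoop. This is only a minimal illustration under assumptions that the algorithm above does not state: each feature takes discrete values, the information gain ratio follows the standard C4.5 definition (information gain divided by split information), and each subset is a pair of (rows, labels). The names entropy, gain_ratio, map_stage, reduce_stage, and select_feature are hypothetical and are not taken from the article.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(rows, labels, feature):
    """C4.5-style gain ratio of one discrete feature (assumed definition)."""
    n = len(rows)
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[feature]].append(label)
    # Conditional entropy of the labels given the feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    # Split information: entropy of the partition induced by the feature.
    split_info = -sum(
        len(g) / n * math.log2(len(g) / n) for g in groups.values()
    )
    return gain / split_info if split_info > 0 else 0.0


def map_stage(subset):
    """Emit (feature index, gain ratio) pairs for one subset."""
    rows, labels = subset
    return [(f, gain_ratio(rows, labels, f)) for f in range(len(rows[0]))]


def reduce_stage(pairs):
    """Group the mapped pairs by feature key and sum their gain ratios."""
    totals = defaultdict(float)
    for feature, value in pairs:
        totals[feature] += value
    return dict(totals)


def select_feature(subsets):
    """Return the feature with the largest summed gain ratio, plus all sums."""
    pairs = [pair for subset in subsets for pair in map_stage(subset)]
    totals = reduce_stage(pairs)
    best = max(totals, key=totals.get)
    return best, totals
```

For example, with two hypothetical subsets in which feature 0 determines the label and feature 1 carries no information, select_feature picks feature 0. In an actual Hadoop job the grouping by key would be done by the shuffle phase; here reduce_stage performs it in memory.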