Research Article

Diabetes Mellitus Disease Prediction Using Machine Learning Classifiers with Oversampling and Feature Augmentation

Algorithm 5

Decision tree (DT).
Input: n-dimensional data, X1 ∈ R1^n1, and target output value, Y1 ∈ R1
Output: The pp, P1 ∈ [0, 1], of unseen test data, x, with C1 = 2 classes (diabetes (C11) or no diabetes (C12))
(1)Divide the data at the node into two subsets that depend on a candidate split, θ = (j1, tm1); θ contains a feature, j1, and a threshold, tm1
(2)Calculate the impurity (i) at the kth node using an impurity function, H1, to obtain the split quality, G1(Q′1, θ)
(3)Reduce the impurity (i) by selecting the parameters that minimize it, θ* = argmin_θ G1(Q′1, θ)
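(4)Repeat the process for the two subsets produced by the split until the maximum depth is reached, the number of samples at the node is < min samples, or the number of samples at the node is = 1.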
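
The steps above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes Gini impurity as the impurity function (the algorithm box does not say which one is used), binary labels with 1 meaning diabetes, and candidate thresholds taken at midpoints between sorted feature values. The function names, the toy data, and the extra stop on pure nodes are all hypothetical choices made for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed impurity function)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def partition(X, y, j, t):
    """Step 1: split the node data on feature j at threshold t."""
    left = [(xi, yi) for xi, yi in zip(X, y) if xi[j] <= t]
    right = [(xi, yi) for xi, yi in zip(X, y) if xi[j] > t]
    return left, right


def split_quality(X, y, j, t):
    """Step 2: size-weighted impurity of the two subsets."""
    left, right = partition(X, y, j, t)
    n = len(y)
    return (len(left) / n) * gini([yi for _, yi in left]) + \
           (len(right) / n) * gini([yi for _, yi in right])


def best_split(X, y):
    """Step 3: return the (feature, threshold) pair with the lowest quality value."""
    best, best_g = None, float("inf")
    for j in range(len(X[0])):
        values = sorted({xi[j] for xi in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint threshold (an assumption)
            g = split_quality(X, y, j, t)
            if g < best_g:
                best, best_g = (j, t), g
    return best  # None when no feature can separate the samples


def build_tree(X, y, depth=0, max_depth=3, min_samples=2):
    """Step 4: recurse until a stopping rule holds; leaves store P(class 1)."""
    theta = None
    # Stop at max depth, below min_samples, or (extra assumption) on a pure node.
    if depth < max_depth and len(y) >= min_samples and len(set(y)) > 1:
        theta = best_split(X, y)
    if theta is None:
        return {"p": sum(y) / len(y)}
    j, t = theta
    left, right = partition(X, y, j, t)
    return {
        "j": j,
        "t": t,
        "left": build_tree([x for x, _ in left], [c for _, c in left],
                           depth + 1, max_depth, min_samples),
        "right": build_tree([x for x, _ in right], [c for _, c in right],
                            depth + 1, max_depth, min_samples),
    }


def predict_proba(tree, x):
    """Walk from the root to a leaf and return the stored probability of class 1."""
    while "p" not in tree:
        tree = tree["left"] if x[tree["j"]] <= tree["t"] else tree["right"]
    return tree["p"]


# Hypothetical toy data: [glucose, BMI] per row; label 1 = diabetes.
X_toy = [[85, 22], [90, 30], [150, 28], [160, 35]]
y_toy = [0, 0, 1, 1]
toy_tree = build_tree(X_toy, y_toy)
print(predict_proba(toy_tree, [155, 30]))
```

With min_samples set to 2, the single-sample stopping rule is already covered by the minimum-samples check. A production model would normally rely on a library implementation rather than this exhaustive threshold search.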