Advances in Human-Computer Interaction

Research Article

Diabetes Mellitus Disease Prediction Using Machine Learning Classifiers with Oversampling and Feature Augmentation

Decision tree (DT).

	Input: training data (n-dimensional), X1 ∈ R1n1, and output values (target), Y1 ∈ R1
	Output: the pp, P1 ∈ [0, 1], of unseen test data, x, for C1 = 2 classes (diabetes (C11) or not (C12))
(1)	For a candidate split θ = (j1, tm1), where j1 is a feature and tm1 is a threshold, divide the data at the node into left and right subsets
(2)	Evaluate the candidate split at the kth node using an impurity (i) function, H1,

	(OR)
	AND

(3)	Minimize the impurity (i) by selecting the split parameters θ* = argmin_θ G1(Q′1, θ)
(4)	Repeat the process for the left and right subsets until the maximum depth is reached, or the number of samples at the node is < min samples or = 1.
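
The steps of the DT procedure can be illustrated with a minimal sketch in plain Python. It rests on several assumptions that the algorithm listing does not fix: Gini impurity is used for H1, the split quality G1 is taken to be the sample-weighted sum of the impurities of the two subsets, class labels are coded 0 (not diabetes) and 1 (diabetes), and each leaf reports the fraction of class-1 samples as its probability. The function names (gini, best_split, build_tree, predict_proba) and the four-row toy data set are hypothetical and serve only as illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed choice for H1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return (G, j, t) with the lowest weighted impurity G, or None."""
    n = len(y)
    best = None
    for j in range(len(X[0])):                  # candidate feature j
        for t in sorted({row[j] for row in X}):  # candidate threshold t
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            if not left or not right:
                continue  # a split must send samples to both sides
            # Assumed form of G1: sample-weighted sum of subset impurities.
            g = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or g < best[0]:
                best = (g, j, t)
    return best


def build_tree(X, y, depth=0, max_depth=3, min_samples=2):
    """Split recursively; stop at max depth, small nodes, or pure nodes."""
    split = None
    if depth < max_depth and len(y) >= min_samples and len(set(y)) > 1:
        split = best_split(X, y)
    if split is None:
        # Leaf: fraction of class-1 samples (labels assumed to be 0 or 1).
        return {"p": sum(y) / len(y)}
    _, j, t = split
    left = [i for i in range(len(y)) if X[i][j] <= t]
    right = [i for i in range(len(y)) if X[i][j] > t]
    return {
        "j": j,
        "t": t,
        "left": build_tree([X[i] for i in left], [y[i] for i in left],
                           depth + 1, max_depth, min_samples),
        "right": build_tree([X[i] for i in right], [y[i] for i in right],
                            depth + 1, max_depth, min_samples),
    }


def predict_proba(tree, x):
    """Follow the splits for sample x and return the leaf probability."""
    while "p" not in tree:
        tree = tree["left"] if x[tree["j"]] <= tree["t"] else tree["right"]
    return tree["p"]


# Hypothetical toy data: two numeric features, label 1 = diabetes.
X_toy = [[85, 22], [90, 30], [150, 35], [160, 28]]
y_toy = [0, 0, 1, 1]
tree = build_tree(X_toy, y_toy)
print(best_split(X_toy, y_toy))        # (0.0, 0, 90)
print(predict_proba(tree, [155, 30]))  # 1.0
```

On the toy data, the split on feature 0 at threshold 90 separates the two classes completely, so its weighted impurity is 0 and both children become pure leaves. The exhaustive search over every observed value of every feature keeps the sketch short; practical implementations use sorted scans and additional stopping rules.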