Advances in Human-Computer Interaction

Research Article

Diabetes Mellitus Disease Prediction Using Machine Learning Classifiers with Oversampling and Feature Augmentation

XGBoost (XB).

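	Input: data (n-dimensional), $X \in \mathbb{R}^{n}$, and output (target), $Y \in \mathbb{R}$
	Output: the predicted probability (pp), $P \in [0, 1]$, of unseen test data $x$, where the number of classes is $C = 2$ (diabetes ($C_1$) or not ($C_2$))
(1)	Initialize the model with a constant value:
	$F_0(x) = \arg\min_{\gamma} \sum_{i=1}^{N} L(y_i, \gamma)$,
	where $L(Y, F(x))$ is the loss function, $(x_i, y_i)$ is the $i$-th training sample, and $N$ denotes the number of samples
(2)	for m = 1 to M (number of iterations) do
(3)	Calculate the pseudo-residuals,
	$r_{im} = -\left[ \frac{\partial L(y_i, F(x_i))}{\partial F(x_i)} \right]_{F(x) = F_{m-1}(x)}$,
	where $i = 1, 2, \ldots, N$
(4)	Fit a base tree, $h_m(x)$, using the training set $\{(x_i, r_{im})\}$ for $i = 1, 2, \ldots, N$
(5)	Calculate the multiplier $\gamma_m$ by
	$\gamma_m = \arg\min_{\gamma} \sum_{i=1}^{N} L(y_i, F_{m-1}(x_i) + \gamma h_m(x_i))$
(6)	Update the model by
	$F_m(x) = F_{m-1}(x) + \gamma_m h_m(x)$
(7)	$F_M(x)$ is the desired pp, $P \in [0, 1]$.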
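
The steps above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation and not the XGBoost library, which additionally uses regularization and a second-order approximation of the loss. Several choices here are assumptions made only for illustration: the loss is taken to be the binary log loss, the base tree is a depth-1 regression stump, the multiplier is found by a bounded ternary search, and the toy data and all function names are hypothetical. Under the log-loss assumption the accumulated score is a log-odds value, so the sketch passes it through a sigmoid to obtain a probability in [0, 1].

```python
import math


def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_loss(y, f):
    # Assumed loss: L(y, F) = log(1 + e^F) - y * F, in a stable form
    return max(f, 0.0) + math.log1p(math.exp(-abs(f))) - y * f


def fit_stump(X, r):
    # Step 4: least-squares regression stump (depth-1 tree) fitted to residuals r
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no feature can be split: fall back to a constant
        mean = sum(r) / len(r)
        return (0, math.inf, mean, mean)
    return best[1:]


def stump_predict(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def line_search(y, F, h, g_lo=0.0, g_hi=10.0, iters=60):
    # Step 5: ternary search for gamma on the assumed interval [g_lo, g_hi]
    def total(g):
        return sum(log_loss(yi, fi + g * hi) for yi, fi, hi in zip(y, F, h))

    for _ in range(iters):
        a = g_lo + (g_hi - g_lo) / 3.0
        b = g_hi - (g_hi - g_lo) / 3.0
        if total(a) < total(b):
            g_hi = b
        else:
            g_lo = a
    return (g_lo + g_hi) / 2.0


def fit_gradient_boosting(X, y, n_iterations=10):
    n = len(y)
    p = min(max(sum(y) / n, 1e-6), 1.0 - 1e-6)
    f0 = math.log(p / (1.0 - p))  # Step 1: constant minimizing the log loss
    F = [f0] * n
    ensemble = []
    for _ in range(n_iterations):  # Step 2
        r = [yi - sigmoid(fi) for yi, fi in zip(y, F)]  # Step 3: pseudo-residuals
        stump = fit_stump(X, r)  # Step 4
        h = [stump_predict(stump, row) for row in X]
        gamma = line_search(y, F, h)  # Step 5
        F = [fi + gamma * hi for fi, hi in zip(F, h)]  # Step 6
        ensemble.append((gamma, stump))
    return f0, ensemble


def predict_proba(model, row):
    # Step 7: map the final score F_M(x) to a probability in [0, 1]
    f0, ensemble = model
    score = f0 + sum(g * stump_predict(s, row) for g, s in ensemble)
    return sigmoid(score)


# Hypothetical toy data: [glucose-like value, BMI-like value], label 1 = diabetes
X = [[85, 22.0], [90, 24.5], [100, 27.0], [150, 31.0], [165, 35.5], [180, 29.0]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gradient_boosting(X, y, n_iterations=10)
print(predict_proba(model, [170, 33.0]), predict_proba(model, [88, 23.0]))
```

For the log loss, the pseudo-residual in step (3) reduces to the difference between the label and the current predicted probability, which is why the loop computes `yi - sigmoid(fi)`. The upper bound on the multiplier is a practical guard: on perfectly separable toy data the unconstrained minimizer in step (5) would grow without limit.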