Advances in Human-Computer Interaction

Research Article

Diabetes Mellitus Disease Prediction Using Machine Learning Classifiers with Oversampling and Feature Augmentation

Feature selection using PCA.

	Input: $n_1$-dimensional data $X_1 \in \mathbb{R}^{n_1}$ consisting of $m$ samples, together with a variance threshold $T_{\text{variance}}$
	Output: reduced $k_1$-dimensional data $Y_1 \in \mathbb{R}^{k_1}$
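(1)	Given $X_1 \in \mathbb{R}^{n_1}$, obtain the mean,
	$\mu = \frac{1}{m} \sum_{i=1}^{m} x_i$,
	where $x_i \in \mathbb{R}^{n_1}$ is the $i$-th of the $m$ samples in $X_1$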
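(2)	Compute the $n_1 \times n_1$ covariance matrix,
	$C = \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu)(x_i - \mu)^{\top}$,
	where $\mu$ is the mean from step (1) and $x_i$ is the $i$-th of the $m$ samples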
(3)	Eigenvalue decomposition: $C = P_1 D P_1^{-1}$, where $C$ is the covariance matrix from step (2), $P_1 \in \mathbb{R}^{n_1 \times n_1}$ is the eigenvector matrix, and $D$ is the diagonal matrix of eigenvalues
(4)	The eigenvectors are then sorted in descending order of their eigenvalues, and the first $k_1$ eigenvectors are selected such that the retained variance satisfies
	$\text{variance} \ge T_{\text{variance}}$

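(5)	The data $X_1$ is projected into $k_1$ dimensions by $Y_1 = P_{k_1}^{\top} X_1$, where $P_{k_1}$ contains the first $k_1$ eigenvectors from step (4) and $Y_1 \in \mathbb{R}^{k_1}$.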
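
The five steps above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation. The function name `pca_reduce`, the 1/m covariance scaling, the use of the cumulative explained-variance ratio for the threshold test in step (4), and the centering of the data before projection are all assumptions made for this example. Samples are stored as rows, so the projection appears as a right-multiplication.

```python
import numpy as np


def pca_reduce(X, t_variance=0.95):
    """Reduce X (m samples x n1 features) to k1 principal components.

    Assumption: k1 is the smallest number of components whose cumulative
    explained-variance ratio reaches t_variance.
    """
    X = np.asarray(X, dtype=float)
    m = X.shape[0]

    # Step 1: mean of the m samples (one value per feature).
    mu = X.mean(axis=0)

    # Step 2: n1 x n1 covariance matrix of the centered data (1/m scaling assumed).
    Xc = X - mu
    C = (Xc.T @ Xc) / m

    # Step 3: eigendecomposition; eigh is intended for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(C)

    # Step 4: sort by descending eigenvalue, then pick k1 from the threshold.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    ratio = np.cumsum(eigvals) / eigvals.sum()
    k1 = min(int(np.searchsorted(ratio, t_variance)) + 1, len(eigvals))

    # Step 5: project the centered samples onto the first k1 eigenvectors.
    Y = Xc @ eigvecs[:, :k1]
    return Y, k1


# Hypothetical usage: three features, two of them almost perfectly correlated.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X_demo = np.column_stack([t, 2 * t + 0.01 * rng.normal(size=200),
                          0.01 * rng.normal(size=200)])
Y_demo, k1_demo = pca_reduce(X_demo, t_variance=0.95)
print(Y_demo.shape, k1_demo)
```

`np.linalg.eigh` is used here because a covariance matrix is symmetric, which keeps the eigenvalues real. Projecting the raw data instead of the centered data would only shift every projected sample by the same constant offset. The sketch does not handle the degenerate case where all eigenvalues are zero (constant data).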