Research Article

Diabetes Mellitus Disease Prediction Using Machine Learning Classifiers with Oversampling and Feature Augmentation

Algorithm 1

Feature selection using PCA.
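Input: $n_1$-dimensional data $X_1 \in \mathbb{R}^{n_1}$ consisting of $N$ samples, and a variance threshold $T_{\text{variance}}$
Output: reduced $k_1$-dimensional data $Y_1 \in \mathbb{R}^{k_1}$
(1) Given $X_1 \in \mathbb{R}^{n_1}$, obtain the mean $\bar{X}_1 = \frac{1}{N}\sum_{i=1}^{N} X_1^{(i)}$, where $\bar{X}_1 \in \mathbb{R}^{n_1}$ and $X_1^{(i)}$ is the $i$-th sample
(2) Compute the $n_1 \times n_1$ covariance matrix $C = \frac{1}{N}\sum_{i=1}^{N} (X_1^{(i)} - \bar{X}_1)(X_1^{(i)} - \bar{X}_1)^{\top}$
(3) Eigenvalue decomposition: $C = P_1 D P_1^{-1}$, where $P_1 \in \mathbb{R}^{n_1 \times n_1}$ is the eigenvector matrix and $D$ is the diagonal matrix of eigenvalues
(4) Sort the eigenvectors in descending order of eigenvalue and select the first $k_1$ eigenvectors, $P_{k_1}$, such that the retained variance $\geq T_{\text{variance}}$
(5) Project the data $X_1$ into $k_1$ dimensions by $Y_1 = P_{k_1}^{\top} X_1$, where $Y_1 \in \mathbb{R}^{k_1}$.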
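
The steps of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pca_reduce`, the default threshold of 0.95, the 1/N covariance normalisation, the use of cumulative explained variance as the retained-variance measure, and the centering of the data before projection are all choices made here for the example.

```python
import numpy as np


def pca_reduce(X, variance_threshold=0.95):
    """Project X (N samples x n features) onto the fewest principal
    components whose cumulative explained variance reaches the threshold.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]

    # Step 1: per-feature mean, then centre the data.
    mean = X.mean(axis=0)
    centered = X - mean

    # Step 2: n x n covariance matrix (1/N normalisation, an assumption).
    cov = centered.T @ centered / n_samples

    # Step 3: eigendecomposition. eigh is meant for symmetric matrices
    # and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Step 4: sort in descending order and keep the first k eigenvectors
    # whose cumulative explained variance reaches the threshold.
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard against round-off
    eigvecs = eigvecs[:, order]
    explained = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(explained, variance_threshold)) + 1
    k = min(k, X.shape[1])

    # Step 5: project the centred data onto the selected eigenvectors.
    components = eigvecs[:, :k]
    return centered @ components, components, explained[:k]


if __name__ == "__main__":
    # Synthetic data: two perfectly correlated features plus small noise.
    rng = np.random.default_rng(0)
    t = rng.normal(size=200)
    X = np.column_stack([t, 2 * t, 0.01 * rng.normal(size=200)])
    Y, components, explained = pca_reduce(X, variance_threshold=0.95)
    print(Y.shape, explained)
```

Rows are treated as samples here, so the projection is written as `centered @ components` rather than as a left-multiplication by the transposed eigenvector matrix; the two forms are equivalent up to transposition. A data set with zero total variance would make the explained-variance ratio undefined and is not handled in this sketch.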