Research Article

Integrating Correlation-Based Feature Selection and Clustering for Improved Cardiovascular Disease Diagnosis

Algorithm 1

Proposed feature selection algorithm using reversed correlations
Input: F = {f1, f2, f3, …, fn} /* set of all the features */;
   P /* statistical significance level */;
   R /* threshold for the correlation coefficient */;
   N /* maximum number of features in the subset */;
Output: Fs /* selected subset of features */;
   (1) Initialize Fs with the feature fj ∈ F that is the least correlated with the others;
   (2) do
   (3) Compute Cij(Fs, F \ Fs) as a vector of correlation coefficients between Fs and each fi ∈ {F \ Fs};
   (4) Choose fj ∈ {F \ Fs} with the lowest correlation coefficient in the vector Cij(Fs, F \ Fs);
   (5) Include fj in Fs;
   (6) while (|Fs| < N AND p > P AND Cij(Fs, F \ Fs) < R).
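
The greedy loop of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the listing leaves open: correlation is taken as the absolute Pearson coefficient, the correlation between Fs and a candidate is the mean of its coefficients with the features already selected, the threshold R is tested before a candidate is added (the printed do-while tests it afterwards), and the significance test on P is omitted. The names `pearson`, `select_features`, `r_threshold`, and `max_features` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant column counts as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, r_threshold, max_features):
    """Greedy selection of weakly correlated features.

    columns: dict mapping feature name -> list of numeric values.
    r_threshold plays the role of R, max_features the role of N.
    The p-value criterion (P) of Algorithm 1 is NOT implemented here.
    """
    names = list(columns)
    if len(names) <= 1:
        return names

    # Absolute pairwise correlations (assumption: the sign is ignored).
    corr = {
        a: {b: abs(pearson(columns[a], columns[b])) for b in names if b != a}
        for a in names
    }

    def mean_corr(name, others):
        # Assumption: set-to-feature correlation is the mean coefficient.
        return sum(corr[name][o] for o in others) / len(others)

    # Step 1: start from the feature least correlated with all the others.
    first = min(names, key=lambda f: mean_corr(f, [o for o in names if o != f]))
    selected = [first]
    remaining = [f for f in names if f != first]

    # Steps 2-6: keep adding the least correlated remaining feature.
    while remaining and len(selected) < max_features:
        scores = {f: mean_corr(f, selected) for f in remaining}
        best = min(remaining, key=scores.get)
        if scores[best] >= r_threshold:
            break  # every remaining feature is too correlated with Fs
        selected.append(best)
        remaining.remove(best)
    return selected
```

Testing the threshold before inclusion keeps a feature that already exceeds R out of the subset. A literal reading of the do-while would add that feature once before stopping.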