A Variable-Clustering-Based Feature Selection to Improve Positive and Negative Discrimination of P53 Protein in Colorectal Cancer Patients

<table class="algorithm-group"><tr><td><table class="algorithm" id="alg1"><tr><td colspan="2"><b>Input:</b> Dataset D, feature set F</td></tr><tr><td colspan="2"><b>Output:</b> K features</td></tr><tr><td colspan="2">1: Bin each variable in the feature set F using the optimal decision tree algorithm.</td></tr><tr><td colspan="2">2: Compute the weight-of-evidence (WOE) encoding for each bin of each variable.</td></tr><tr><td colspan="2">3: Map the dataset D to a new dataset D1 by replacing each variable value with the WOE of its bin.</td></tr><tr><td colspan="2">4: Calculate the information value (IV) of each variable using the formula (see <a href="https://static.hindawi.com/articles/cmmm/volume-2022/9261713/figures/#EEq1">2</a>).</td></tr><tr><td colspan="2">5: Cluster the variables of D1 into K clusters with the clustering algorithm, where K is the required number of features.</td></tr><tr><td colspan="2">6: From each of the K clusters, select the variable with the largest IV; the K selected variables form the output feature subset.</td></tr></table></td></tr></table>

<div>Pseudocode of the IV_Cluster methodology.</div>
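The six steps of the pseudocode can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. A depth-1 decision tree (one Gini-optimal threshold per variable) stands in for the "optimal decision tree algorithm". Average-linkage agglomerative clustering on the distance 1 − |Pearson correlation| stands in for the clustering algorithm, which the pseudocode does not name. WOE and IV use the common definitions WOE_i = ln(p_pos,i / p_neg,i) and IV = Σ_i (p_pos,i − p_neg,i) · WOE_i, with an additive smoothing of 0.5 that is also an assumption, and these definitions may differ in sign convention from the paper's formula. All function names (`stump_bins`, `woe_iv`, `abs_corr`, `iv_cluster`) and the toy data are hypothetical.

```python
import math

def gini(pos, n):
    """Gini impurity of a node holding `pos` positives among `n` samples."""
    p = pos / n if n else 0.0
    return 2.0 * p * (1.0 - p)

def stump_bins(xs, ys):
    """Step 1 (simplified): bin one variable with a depth-1 decision tree."""
    pairs = sorted(zip(xs, ys))
    n, total_pos = len(pairs), sum(ys)
    best_cost, best_thr, left_pos = float("inf"), None, 0
    for i in range(1, n):
        left_pos += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal values
        cost = (i * gini(left_pos, i)
                + (n - i) * gini(total_pos - left_pos, n - i)) / n
        if cost < best_cost:
            best_cost = cost
            best_thr = (pairs[i - 1][0] + pairs[i][0]) / 2.0
    if best_thr is None:  # constant variable: everything in one bin
        return [0] * len(xs)
    return [0 if x <= best_thr else 1 for x in xs]

def woe_iv(bins, ys, smooth=0.5):
    """Steps 2 and 4: WOE of each bin and the IV of the variable."""
    n_pos = sum(ys)
    n_neg = len(ys) - n_pos
    labels = sorted(set(bins))
    woe, iv = {}, 0.0
    for b in labels:
        pos = sum(1 for bi, y in zip(bins, ys) if bi == b and y == 1)
        neg = sum(1 for bi, y in zip(bins, ys) if bi == b and y == 0)
        # Smoothed class shares (assumption) keep the logarithm finite.
        p_pos = (pos + smooth) / (n_pos + smooth * len(labels))
        p_neg = (neg + smooth) / (n_neg + smooth * len(labels))
        woe[b] = math.log(p_pos / p_neg)
        iv += (p_pos - p_neg) * woe[b]
    return woe, iv

def abs_corr(a, b):
    """Absolute Pearson correlation between two equal-length columns."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return abs(cov) / math.sqrt(va * vb)

def iv_cluster(data, ys, k):
    """Return k variable names: the highest-IV variable of each cluster."""
    encoded, iv = {}, {}
    for name, xs in data.items():
        bins = stump_bins(xs, ys)                   # step 1
        woe, iv[name] = woe_iv(bins, ys)            # steps 2 and 4
        encoded[name] = [woe[b] for b in bins]      # step 3: D -> D1

    def dist(c1, c2):  # average linkage on 1 - |corr|
        total = sum(1.0 - abs_corr(encoded[a], encoded[b])
                    for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [[name] for name in data]            # step 5
    while len(clusters) > k:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return [max(c, key=iv.get) for c in clusters]   # step 6

# Toy data (hypothetical): "a" and "b" carry similar signal, "c" is weaker.
y = [0, 0, 0, 0, 1, 1, 1, 1]
toy = {
    "a": [1, 2, 3, 4, 5, 6, 7, 8],
    "b": [1, 2, 3, 6, 5, 4, 7, 8],
    "c": [1, 8, 2, 7, 3, 6, 4, 5],
}
selected = iv_cluster(toy, y, k=2)
print(selected)
```

On the toy data, `a` and `b` fall into the same cluster because their WOE-encoded columns are the most strongly correlated, and only the one with the larger IV is kept. That is the point of clustering before ranking: redundant variables compete with each other rather than crowding out a weaker but distinct variable such as `c`.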

Computational and Mathematical Methods in Medicine

Algorithm 1
