| Input: Feature set |
| Output: Optimal set of features |
| (1) For feature xi in x1, x2, …, xn |
| a. Read feature xi into the array named X |
| X = {x1, x2, x3,…xi} |
| b. Read the target variable into array named Y |
| c. Set the train–test split ratio |
| Train_r = 0.8 |
| Test_r = 0.2 |
| d. Fix the initial seed for random generator in train and test |
| random_state = n |
| e. Split the data set into x_train, x_test, y_train and y_test using the train–test split ratio and random_state |
| f. Train the classifier on feature xi and the target |
| g. Compute accuracy using |
| Accuracy = (TP + TN)/(TP + FP + TN + FN) |
| h. Compute Precision using |
| Precision = TP/(TP + FP) |
| i. Calculate Recall rate using |
| Recall = TP/(TP + FN) |
| j. Find F-score using |
| F-score = 2 ∗ (Recall ∗ Precision)/(Recall + Precision) |
| k. Repeat steps c through j with different values of the train–test split ratio |
| (2) Combine features into categories C1 (Linguistic features (L)), C2 (Sentiment features (S)) and C3 (Contradictory features (C)) |
| a. For feature category Ci in C1, C2 and C3 |
| b. Repeat steps (1)a through (1)k for Ci |
| (3) Combine categories of features (L + S), (S + C), (L + C) and (L + S + C) |
| (4) For each category combination Ci in (L + S), (S + C), (L + C) and (L + S + C) |
| a. Repeat steps (1)a through (1)k for Ci |
| (5) Repeat steps 1 through 4 for different types of classifier |
| (6) Select the feature, category, or category combination that gives the highest accuracy |
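The per-feature evaluation loop above (steps (1)a through (1)k and the selection in step (6)) can be sketched in Python with scikit-learn. The synthetic data set, the choice of logistic regression as the classifier, and the zero-division guards are illustrative assumptions, not part of the original algorithm; the metric formulas follow the equations given in the steps.

```python
# Sketch of the per-feature evaluation loop, assuming scikit-learn
# and a logistic-regression classifier (illustrative choices only).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

def evaluate_feature(X_col, y, train_r=0.8, random_state=42):
    """Steps c-j: split, train on one feature, compute the four metrics."""
    # Step e: split with the chosen ratio and fixed random seed.
    x_train, x_test, y_train, y_test = train_test_split(
        X_col, y, train_size=train_r, random_state=random_state)
    # Step f: train the classifier on this single feature.
    clf = LogisticRegression().fit(x_train, y_train)
    y_pred = clf.predict(x_test)
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
    # Steps g-j: metrics as defined in the algorithm; guards against
    # empty denominators are an added safety measure.
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (2 * recall * precision / (recall + precision)
               if (recall + precision) else 0.0)
    return accuracy, precision, recall, f_score

# Hypothetical data set standing in for the paper's feature set.
X, y = make_classification(n_samples=200, n_features=4, n_informative=4,
                           n_redundant=0, random_state=0)

# Step (1): evaluate each feature individually; step (6): keep the best.
scores = {i: evaluate_feature(X[:, [i]], y)[0] for i in range(X.shape[1])}
best = max(scores, key=scores.get)
```

The same `evaluate_feature` helper can be reused for steps (2) through (4) by passing multi-column slices of `X` for each category or category combination, and for step (5) by swapping in a different classifier.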