Research Article

Multi-Rule Based Ensemble Feature Selection Model for Sarcasm Type Detection in Twitter

Algorithm 3

Proposed ensemble feature selection.
Input: Feature set
Output: Optimal set of features
 (1) For feature xi in x1, x2, …, xn
      a. Read feature xi into the array named X
        X = {x1, x2, x3, …, xi}
      b. Read the target variable into array named Y
      c. Set the train–test split ratio
       Train_r = 0.8
       Test_r = 0.2
      d. Fix the initial seed for random generator in train and test
        random_state = n
      e. Split the data set into x_train, x_test, y_train and y_test using the train–test split ratio and random_state
      f. Train and test the classifier on feature xi and the target
      g. Compute accuracy using
       Accuracy = (TP + TN)/(TP + FP + TN + FN)
      h. Compute precision using
       Precision = TP/(TP + FP)
      i. Calculate Recall rate using
      Recall = TP/(TP + FN)
      j. Find F-score using
      F-score = 2 ∗ (Recall ∗ Precision)/(Recall + Precision)
     k. Repeat steps c through j with different values of the train–test split ratio
 (2) Combine features into categories C1 (Linguistic features (L)), C2 (Sentiment features (S)) and C3 (Contradictory features (C))
      a. For each feature category Ci in C1, C2 and C3
      b. Repeat steps 1a through 1k using Ci as the feature set
 (3) Combine categories of features (L + S), (S + C), (L + C) and (L + S + C)
 (4) For each category combination Ci in (L + S), (S + C), (L + C) and (L + S + C)
      a. Repeat steps 1a through 1k using the combined feature set
 (5) Repeat steps 1 through 4 for different types of classifier
 (6) Select the feature, category, or category combination that gives the highest accuracy
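The sketch below illustrates one possible implementation of the procedure in Algorithm 3, assuming a pandas DataFrame df whose columns hold the engineered features and whose label column holds the sarcasm class. The feature names, category groupings, classifiers, and split ratios are illustrative placeholders, not the exact configuration used in the study.

```python
# Minimal sketch of the ensemble feature-selection loop in Algorithm 3.
# Feature/column names and category contents below are assumptions.
from itertools import combinations

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical feature categories: linguistic (L), sentiment (S), contradictory (C).
CATEGORIES = {
    "L": ["exclamation_count", "intensifier_count", "interjection_count"],
    "S": ["positive_score", "negative_score", "sentiment_shift"],
    "C": ["pos_neg_contrast", "context_incongruity"],
}

# Step 5: repeat the whole procedure for different classifier types.
CLASSIFIERS = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(),
}

SPLIT_RATIOS = [0.2, 0.3]   # test-set proportions (steps c and k)
RANDOM_STATE = 42           # fixed seed for reproducible splits (step d)


def evaluate(df, feature_cols, clf, test_size):
    """Train/test one classifier on one feature subset and return its metrics (steps e-j)."""
    X, y = df[feature_cols], df["label"]
    x_train, x_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=RANDOM_STATE
    )
    clf.fit(x_train, y_train)
    y_pred = clf.predict(x_test)
    return {
        "accuracy": accuracy_score(y_test, y_pred),    # (TP+TN)/(TP+FP+TN+FN)
        "precision": precision_score(y_test, y_pred),  # TP/(TP+FP)
        "recall": recall_score(y_test, y_pred),        # TP/(TP+FN)
        "f_score": f1_score(y_test, y_pred),           # 2*(P*R)/(P+R)
    }


def ensemble_feature_selection(df):
    """Evaluate single features, categories, and category combinations; pick the best by accuracy."""
    # Build the candidate feature subsets: individual features (step 1),
    # whole categories (step 2), and category combinations (steps 3-4).
    candidates = {}
    for name, cols in CATEGORIES.items():
        for col in cols:
            candidates[col] = [col]
        candidates[name] = cols
    for r in (2, 3):
        for combo in combinations(CATEGORIES, r):
            candidates["+".join(combo)] = sum((CATEGORIES[c] for c in combo), [])

    results = []
    for cand_name, cols in candidates.items():
        for clf_name, clf in CLASSIFIERS.items():      # step 5
            for test_size in SPLIT_RATIOS:             # step k
                metrics = evaluate(df, cols, clf, test_size)
                results.append({"candidate": cand_name, "classifier": clf_name,
                                "test_size": test_size, **metrics})

    results = pd.DataFrame(results)
    best = results.loc[results["accuracy"].idxmax()]   # step 6
    return best, results
```

Calling ensemble_feature_selection(df) returns the best-scoring candidate row along with the full results table, so precision, recall, and F-score remain available for inspection even though accuracy drives the final selection, as in step 6.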