| Input: Multidimensional data discretization scheme |
| Output: Optimal discretization scheme |
| Initialize: global variable = 0, local variable = 0, crossover Q-Table = null, mutation Q-Table = null, t = 0; |
| begin |
| Get the initial breakpoints of the multidimensional data by sorting the values of each feature and removing duplicate values; |
| Binary encode the initial breakpoints of multidimensional data according to the method in Part B of Section 2; |
| Randomly generate initial population P(t); |
| Calculate the fitness of each individual in P(t) using equation (8); |
| Update global variable with the optimal individual fitness value in P(t); |
| Generate the state set based on the number of features of the multidimensional data according to the definition of state in Part C of Section 3; |
| Choose a state from the state set as the initial state S(t); |
| while t is less than the user-specified number of iterations do |
| Choose an action from the action set {G, H, I} by the ε-greedy strategy according to the definition of action in Part C of Section 3; |
| Execute the selected action on the current state S(t) to jump to the next state S(t + 1); |
| Perform crossover operation with global variable on the features contained in state S(t + 1); |
| Calculate the fitness of the multidimensional data discretization scheme after crossover operation using equation (5); |
| Measure the corresponding reward using equation (11) according to the definition of reward in Part C of Section 3; |
| Update crossover Q-Table using equation (6); |
| if the fitness of the multidimensional data discretization scheme > local variable then |
| Update local variable with the fitness of the multidimensional data discretization scheme; |
| end |
| Perform crossover operation in P(t); |
| Calculate the fitness of each individual in P(t) using equation (8); |
| Update global variable with the optimal individual fitness value in P(t); |
| S(t) = S(t + 1); |
| Choose an action from the action set {G, H, I} by the ε-greedy strategy according to the definition of action in Part C of Section 3; |
| Execute the selected action on the current state S(t) to jump to the next state S(t + 1); |
| Perform mutation operation on the features contained in state S(t + 1); |
| Calculate the fitness of the multidimensional data discretization scheme after mutation operation using equation (5); |
| Measure the corresponding reward using equation (11) according to the definition of reward in Part C of Section 3; |
| Update mutation Q-Table using equation (6); |
| if the fitness of the multidimensional data discretization scheme > local variable then |
| Update local variable with the fitness of the multidimensional data discretization scheme; |
| end |
| Perform mutation operation in P(t); |
| Calculate the fitness of each individual in P(t) using equation (8); |
| Update global variable with the optimal individual fitness value in P(t); |
| t = t + 1; |
| end |
| Return Max(global variable, local variable); |
| end |
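The control flow above can be sketched in Python. This is a minimal, illustrative skeleton only: the fitness function, the reward (a simple fitness-improvement signal standing in for equation (11)), the Q-update constants, and the state-transition semantics of the actions G, H, and I are all assumptions, since the paper's equations (5), (6), (8), and (11) and the exact action definitions from Part C of Section 3 are not reproduced here. Bitstrings stand in for the binary-encoded breakpoint schemes of Part B of Section 2.

```python
import random

ACTIONS = ["G", "H", "I"]  # action labels from the algorithm; their semantics are assumed below


def epsilon_greedy(q, state, eps=0.2):
    """Explore with probability eps, otherwise exploit the best Q-value for this state."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard Q-learning update; a stand-in for the paper's equation (6)."""
    best_next = max(q[(s_next, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def fitness(ind):
    """Toy fitness (count of ones); a stand-in for the paper's equations (5)/(8)."""
    return sum(ind)


def step(state, action, n_states):
    """Assumed transition: G/H/I move the state index left/stay/right (illustrative only)."""
    return (state + {"G": -1, "H": 0, "I": 1}[action]) % n_states


def crossover(pop):
    """One-point crossover on shuffled pairs of individuals."""
    random.shuffle(pop)
    for i in range(0, len(pop) - 1, 2):
        cut = random.randrange(1, len(pop[i]))
        pop[i][cut:], pop[i + 1][cut:] = pop[i + 1][cut:], pop[i][cut:]


def mutate(pop, rate=0.05):
    """Bit-flip mutation applied independently to each gene."""
    for ind in pop:
        for j in range(len(ind)):
            if random.random() < rate:
                ind[j] ^= 1


def optimize(n_features=16, pop_size=20, iterations=40, seed=1):
    random.seed(seed)
    states = range(n_features)  # one state per feature, per the definition of state
    q_cross = {(s, a): 0.0 for s in states for a in ACTIONS}
    q_mut = {(s, a): 0.0 for s in states for a in ACTIONS}
    pop = [[random.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    global_best = max(fitness(ind) for ind in pop)
    local_best = 0
    state = random.randrange(n_features)
    for _ in range(iterations):
        # one crossover phase, then one mutation phase, each guided by its own Q-Table
        for q in (q_cross, q_mut):
            action = epsilon_greedy(q, state)
            nxt = step(state, action, n_features)
            before = global_best
            crossover(pop) if q is q_cross else mutate(pop)
            best = max(fitness(ind) for ind in pop)
            q_update(q, state, action, best - before, nxt)  # reward = fitness improvement
            local_best = max(local_best, best)
            global_best = max(global_best, best)
            state = nxt
    return max(global_best, local_best)
```

The two Q-Tables are kept separate, mirroring the algorithm's distinct crossover and mutation tables, and the final line returns the larger of the global and local variables as on the last line of the listing.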