Research Article
A Knowledge-Fusion Ranking System with an Attention Network for Making Assignment Recommendations
Input: assignment records, learning rate, discount factor, and reward function
Output:
(1) Initialize with random values
(2) updated state in an episode (, ), Algorithm 2
(3)
(4) for do
(5)     sample ranking in an episode
(6)     for to do
(7)         Sample an action
(8)
(9)         Append at the end of
(10)        State transition
(11)    end for
(12)    for to do
(13)
(14)
(15)    end for
(16)
(17) end for
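The listing's inputs (a learning rate, a discount factor, a reward function) and its loop structure (sample a ranking one action at a time, then a second pass over the episode's steps) resemble an episodic policy-gradient update. Because the symbols and update expressions are missing from the listing, the following is only a minimal sketch of a generic REINFORCE-style ranking update and is not the authors' method. The linear softmax policy, the DCG-style `reward` function, and all names (`sample_ranking`, `discounted_returns`, `train`, `features`, `relevance`) are assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reward(rel, position):
    """Hypothetical DCG-style reward: relevance discounted by rank position."""
    return rel / math.log2(position + 2)


def sample_ranking(theta, features, rng):
    """Build a ranking by repeatedly sampling one remaining item (an action)."""
    remaining = list(range(len(features)))
    ranking, steps = [], []
    while remaining:
        scores = [sum(w * x for w, x in zip(theta, features[i])) for i in remaining]
        probs = softmax(scores)
        k = rng.choices(range(len(remaining)), weights=probs)[0]
        steps.append((list(remaining), probs, k))  # state before the transition
        ranking.append(remaining.pop(k))           # append item, then transition
    return ranking, steps


def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * G_{t+1} for every step t of an episode."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def train(features, relevance, lr=0.1, gamma=0.9, episodes=200, seed=0):
    """REINFORCE-style training loop (assumed form, one update per episode)."""
    rng = random.Random(seed)
    dim = len(features[0])
    theta = [rng.uniform(-0.1, 0.1) for _ in range(dim)]  # random initialization
    for _ in range(episodes):
        ranking, steps = sample_ranking(theta, features, rng)
        rewards = [reward(relevance[item], t) for t, item in enumerate(ranking)]
        returns = discounted_returns(rewards, gamma)
        grad = [0.0] * dim
        for t, (candidates, probs, k) in enumerate(steps):
            chosen = features[candidates[k]]
            for d in range(dim):
                # Gradient of log softmax: chosen feature minus expected feature.
                expected = sum(p * features[i][d] for p, i in zip(probs, candidates))
                grad[d] += (gamma ** t) * returns[t] * (chosen[d] - expected)
        theta = [w + lr * g for w, g in zip(theta, grad)]
    return theta
```

Calling `train(features, relevance)` with one feature vector and one relevance score per candidate returns the learned weight vector. Scoring candidates with that vector and sorting them in descending order gives a deterministic ranking at inference time.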