Nature-Inspired-Based Approach for Automated Cyberbullying Classification on Multimedia Social Networking
Algorithm 1
DRL algorithm.
Input: Eligibility trace decay term λ, learning rate α, number of objectives n, discounting term γ, a ⟵ action (r = reward or p = penalty), s ⟵ state, o ⟵ objective
Initialize Population
For all states s, actions a and objectives o do
Initialize Q (s, a, o)
Endfor
Evaluate each member of the Population
For each episode do
For all states s and actions a do
e (s, a) = 0
Endfor
Observe initial state s_t
Select action a_t based on an exploratory policy derived from Q (s_t)
For each step of the episode do
Execute action a_t, observe the next state s′, and record the resulting vector as reward r or penalty p
Select action a based on a greedy policy derived from Q (s′)
Select action a′ based on an exploratory policy derived from Q (s′)