Research Article

Novel Learning Algorithms for Efficient Mobile Sink Data Collection Using Reinforcement Learning in Wireless Sensor Network

Algorithm 3

Reinforcement learning based clustering algorithm (RLBCA).
Step 1. Initially, all sensor nodes send a hello packet reporting their residual energy and current position.
Step 2. The learning agent records the total number of neighbour nodes and their residual energy. Periodically, the residual
energy of each sensor node is set and the return value of the node is set to zero.
Step 3. Based on step 2, the cluster head formation probability is computed. The base station selects the optimal number of
cluster heads from among the desired cluster heads and creates the list.
Step 4. The base station announces the list of eligible cluster heads.
Step 5. The newly formed cluster heads send advertisement packets to their nearest neighbours for communication purposes.
Step 6. The state-action Q-values [10] are updated using the reward function (equation ()) and the Q-matrix (equation ()) to
achieve the optimal policy (equation ()):
Reward calculation
()
Q-matrix update
()
Optimal policy
()
Step 7. If the current node's residual energy is greater than that of its neighbouring nodes, the sensor node with the higher
residual energy is elected as the cluster head for the next round.
Step 8. Repeat steps 1 to 7.
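
Steps 6 and 7 can be illustrated with a minimal Python sketch. It is not the article's implementation: the reward, Q-matrix, and policy equations are not reproduced in this listing, so the sketch assumes the standard tabular Q-learning update, Q(s, a) <- Q(s, a) + alpha * (r + gamma * max Q(s', a') - Q(s, a)), and takes the reward as an input rather than computing it. The names Node, q_update, greedy_action, elect_cluster_heads, and radio_range, as well as the learning rate alpha and discount factor gamma, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    x: float
    y: float
    energy: float  # residual energy (arbitrary units, assumed)


def neighbours(node, nodes, radio_range):
    """Return all other nodes within radio_range of node (assumed neighbour rule)."""
    limit = radio_range ** 2
    return [n for n in nodes
            if n.node_id != node.node_id
            and (n.x - node.x) ** 2 + (n.y - node.y) ** 2 <= limit]


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update (assumed form, not the article's equation)."""
    best_next = max(q[next_state].values(), default=0.0)
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def greedy_action(q, state):
    """Greedy policy: pick the action with the largest Q-value in this state."""
    return max(q[state], key=q[state].get)


def elect_cluster_heads(nodes, radio_range):
    """Step 7: a node is elected if its residual energy exceeds every neighbour's.

    A node with no neighbours is elected by default (an assumption of this sketch).
    """
    return [node.node_id for node in nodes
            if all(node.energy > n.energy
                   for n in neighbours(node, nodes, radio_range))]


# Toy example with made-up values.
q = {"s0": {"a0": 0.0, "a1": 0.0}, "s1": {"a0": 1.0, "a1": 2.0}}
new_value = q_update(q, "s0", "a0", reward=1.0, next_state="s1")

nodes = [Node(0, 0.0, 0.0, 2.0), Node(1, 1.0, 0.0, 1.5), Node(2, 10.0, 10.0, 0.8)]
heads = elect_cluster_heads(nodes, radio_range=2.0)
```

In the toy example, node 0 has more residual energy than its only neighbour (node 1) and node 2 has no neighbours, so both are elected. Repeating the election every round, as in step 8, lets the cluster head role move to other nodes as energy is consumed.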