Optimal Wireless Information and Power Transfer Using Deep Q-Network
Algorithm 1
Deep Q-network algorithm training process.
(1)
Randomly generate the weight parameters θ for the evaluation network Q(s, a; θ). The target network Q̂ clones the weight parameters, θ⁻ = θ.
(2)
At the beginning of time slot t, randomly generate a probability p ∈ [0, 1].
If p ≥ ε (exploitation): choose the action as a_t = argmax_a Q(s_t, a; θ).
If p < ε (exploration): randomly choose the action a_t from the action set A.
The transmitter transmits with the selected beam pattern.
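The ε-greedy selection in step 2 can be sketched as follows. This is a minimal sketch: the list of Q-values stands in for the network's output over the beam-pattern action set, and the tie-breaking and threshold conventions are assumptions not stated in the algorithm.

```python
import random

def epsilon_greedy(q_values, epsilon):
    """With probability epsilon, explore: pick a random beam-pattern
    index. Otherwise exploit: pick the index with the largest
    Q-value estimate for the current state."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With epsilon = 0 the choice is purely greedy; with epsilon = 1 it is purely random, so annealing epsilon over time trades exploration for exploitation.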
(3)
Throughout the whole time slot, the RF energy is accumulated in each harvester's energy buffer. At the end of the time slot, each harvester feeds back its energy level to the transmitter, and the system state is updated to s_{t+1}.
(4)
Store the transition (s_t, a_t, r_t, s_{t+1}) in the experience pool. If the number of stored experiences reaches the maximum capacity of the pool, the pool size remains constant and the oldest experience is replaced by the new one; otherwise, the new experience is appended and the pool grows by one.
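A fixed-capacity experience pool with the replace-the-oldest behavior of step 4 might look like the sketch below; the capacity value and the (s, a, r, s') tuple layout are assumptions for illustration.

```python
import random
from collections import deque

class ExperiencePool:
    """Stores transitions (s, a, r, s_next). Once the pool is full,
    appending a new transition evicts the oldest one, so the pool
    size remains constant at its maximum capacity."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size):
        # uniform random minibatch for the training step
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

The deque with `maxlen` implements the "pool size remains constant" behavior for free: no explicit eviction logic is needed once capacity is reached.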
(5)
After the experience pool has accumulated enough data, randomly select a minibatch of experiences to train the neural network Q(s, a; θ). Backpropagation is applied to minimize the loss function L(θ) = E[(y_t − Q(s_t, a_t; θ))²], where y_t = r_t + γ max_{a'} Q̂(s_{t+1}, a'; θ⁻). Clone the weight parameters from Q to Q̂ (θ⁻ ← θ) after several time intervals.
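The loss minimized in step 5 is the standard DQN temporal-difference loss. A minimal sketch of the target computation follows; the discount factor γ and the array shapes are assumptions, since the paper's specific reward definition is not shown here.

```python
import numpy as np

def td_targets(rewards, q_next_target, gamma):
    """Compute y_j = r_j + gamma * max_a' Q_hat(s'_j, a'; theta^-).
    rewards: shape (batch,); q_next_target: shape (batch, num_actions),
    produced by the cloned target network, not the trained one."""
    return rewards + gamma * q_next_target.max(axis=1)

# Backpropagation then minimizes the squared error
# (y_j - Q(s_j, a_j; theta))^2 over the minibatch; every several
# intervals the target weights are cloned: theta_minus = theta (a copy).
```

Using the frozen target network Q̂ to compute y_j, rather than the network being trained, is what stabilizes the regression target between cloning intervals.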
(6)
Update the time slot index, t ← t + 1. If t reaches the maximum number of time slots, the algorithm terminates; otherwise, go back to step (2).