Journal of Advanced Transportation

Research Article

Rebalancing Docked Bicycle Sharing System with Approximate Dynamic Programming and Reinforcement Learning

Algorithm: Real-time dynamic programming (RTDP).

	Initialize the value function approximation for all states.

	while … do
	Choose an initial state and a sample path.
	for … do
	…
	…
	if … then
	…
	else