Journal of Advanced Transportation

Research Article

Rebalancing Docked Bicycle Sharing System with Approximate Dynamic Programming and Reinforcement Learning

Nomenclature used in this study.


Sets
	Set of stations (0: depot)
	Set of time steps
	Set of states
	Set of feasible actions
	Set of policies
	Sequence of decision points

Indices
	Decision point
	Decision state
	Point in time in state

Parameters
	Cargo vehicle capacity
	Travel time between two stations
	Service time for rebalancing per bicycle
	Station capacity
	Safety buffer
	z-score for the safety stock
	Station observed pickup demand at time
	Station observed return demand at time
	Station predicted pickup demand at time
	Station predicted return demand at time

Variables
	Cargo vehicle load at time
	Cargo vehicle location at time
	The number of delivered bikes from the cargo vehicle at time
	Station fill levels at time
	Station fill rate index at time
	Delivery decision
	Next station decision
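
As a rough illustration of how the variables and parameters listed above might fit together, the following minimal sketch groups the state variables into a container and enumerates delivery decisions that respect the cargo vehicle capacity and the station capacity. All names (`RebalancingState`, `feasible_deliveries`, the field names) are hypothetical and do not come from the study. The sign convention (positive values deliver bikes, negative values pick them up) and the feasibility bounds are assumptions made for illustration only; the safety buffer and depot-specific rules are not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RebalancingState:
    """Hypothetical container for the state variables in the table above."""
    time: int               # current time step
    vehicle_load: int       # cargo vehicle load
    vehicle_location: int   # index of the station the vehicle is at (0: depot)
    fill_levels: List[int]  # bikes currently docked at each station


def feasible_deliveries(state: RebalancingState,
                        vehicle_capacity: int,
                        station_capacity: List[int]) -> range:
    """Return an assumed range of delivery decisions at the vehicle's station.

    Positive values deliver bikes to the station; negative values pick up.
    The bounds are an illustrative assumption, not the study's action set.
    """
    s = state.vehicle_location
    # Cannot deliver more than the vehicle carries or the station has free docks for.
    max_deliver = min(state.vehicle_load,
                      station_capacity[s] - state.fill_levels[s])
    # Cannot pick up more than the station holds or the vehicle has room for.
    max_pickup = min(vehicle_capacity - state.vehicle_load,
                     state.fill_levels[s])
    return range(-max_pickup, max_deliver + 1)


# Example: vehicle with 5 bikes at station 1, which holds 8 of 10 bikes.
example_state = RebalancingState(time=0, vehicle_load=5,
                                 vehicle_location=1, fill_levels=[0, 8, 3])
example_actions = list(feasible_deliveries(example_state,
                                           vehicle_capacity=20,
                                           station_capacity=[0, 10, 10]))
```

In this example the vehicle can deliver at most 2 bikes (the station has 2 free docks) and pick up at most 8 (the station holds 8 and the vehicle has room for 15), so the enumerated decisions run from -8 to 2.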