Research Article
Rebalancing Docked Bicycle Sharing System with Approximate Dynamic Programming and Reinforcement Learning
Initialize the value function approximation for all states.
while … do
    Choose an initial state and a sample path.
    for … do
        …
        if … then
            …
        else
            …
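The loop structure of the algorithm above (initialize the approximation, draw an initial state and a sample path, step forward through the path, update the approximation) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the article: the single-station toy problem, the reward and move-cost terms, the smoothing step size `alpha`, the lookup-table representation `V`, and the function name `adp_forward_pass`. The branch in the algorithm is read here in one common way, namely that only the visited state is updated while all other states keep their previous estimate; the article's actual conditions and update rules are not recoverable from the text. As a further simplification, each decision is evaluated against the sampled demand rather than an expectation.

```python
import random


def adp_forward_pass(n_iters=200, horizon=5, capacity=5,
                     alpha=0.1, gamma=1.0, seed=0):
    """Toy forward-pass ADP with a lookup-table value function (illustrative only)."""
    rng = random.Random(seed)
    states = list(range(capacity + 1))  # state = bikes at one hypothetical station

    # Initialize the value function approximation for all states.
    V = [[0.0 for _ in states] for _ in range(horizon + 1)]

    for _ in range(n_iters):
        # Choose an initial state and a sample path (assumed: demand per period).
        s = rng.choice(states)
        path = [rng.randint(0, 3) for _ in range(horizon)]

        for t in range(horizon):
            demand = path[t]
            best_value, best_next = float("-inf"), s
            # Decision: rebalance the station to `target` bikes.
            for target in states:
                move_cost = 0.5 * abs(target - s)   # assumed cost per bike moved
                served = min(target, demand)        # assumed reward: trips served
                nxt = target - served
                value = served - move_cost + gamma * V[t + 1][nxt]
                if value > best_value:
                    best_value, best_next = value, nxt

            # Update only the visited state; all other states keep their
            # previous estimate (the assumed reading of the if/else branch).
            V[t][s] = (1 - alpha) * V[t][s] + alpha * best_value
            s = best_next

    return V


if __name__ == "__main__":
    table = adp_forward_pass()
    print([round(v, 2) for v in table[0]])
```

A lookup table is used only because the toy state space is tiny; for a multi-station system the table would be replaced by a parametric or learned approximation.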