Mathematical Problems in Engineering / 2015 / Article / Alg 1

Research Article

Two-Phase Iteration for Value Function Approximation and Hyperparameter Optimization in Gaussian-Kernel-Based Adaptive Critic Design

Algorithm 1

Gaussian-kernel-based Approximate Dynamic Programming.
Initialize:
    : hyperparameters of Gaussian kernel model
    : sample set
    : initial policy
    , : learning step size
Let t = 0;
Loop:
      k = 1     
    t = t + 1;
    
    Get the reward
    Observe next state
    Update according to (12)
    Update the policy according to optimum seeking
      
    Update according to (12)
Until the termination criterion is satisfied
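
The two-phase structure of Algorithm 1 can be illustrated with a minimal sketch. The symbols of the listing and the update rule (12) are not reproduced in this text, so the sketch swaps in stand-ins and should not be read as the authors' method: the kernel weights are updated by a plain semi-gradient temporal-difference step, and the kernel width is updated by a clipped gradient step on the squared temporal-difference error. The five-state chain environment, the epsilon-greedy exploration, and all names and step sizes (run_two_phase, eta, eta_sigma, K) are assumptions made for illustration only.

```python
import math
import random

def gaussian_kernel(x, c, sigma):
    """Return (kernel value, squared distance) for points x and c with width sigma."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-d2 / (2.0 * sigma ** 2)), d2

def run_two_phase(n_outer=50, K=20, eta=0.1, eta_sigma=0.01, gamma=0.9, seed=0):
    rng = random.Random(seed)
    n_states, actions = 5, (0.0, 1.0)      # hypothetical toy chain: 0 = left, 1 = right
    # Sample set: one kernel center per (state, action) pair.
    centers = [(float(s), a) for s in range(n_states) for a in actions]
    alpha = [0.0] * len(centers)           # kernel weights
    sigma = 0.5                            # kernel width (the hyperparameter)

    def features(s, a):
        return [gaussian_kernel((float(s), a), c, sigma) for c in centers]

    def q(s, a):
        return sum(w * k for w, (k, _) in zip(alpha, features(s, a)))

    def greedy(s):                         # policy improvement by greedy action choice
        return max(actions, key=lambda a: q(s, a))

    s, t = 0, 0
    for _ in range(n_outer):
        batch = []
        # Phase 1: K interaction steps with the kernel width held fixed.
        for _k in range(K):
            t += 1
            a = greedy(s) if rng.random() > 0.2 else rng.choice(actions)
            s_next = min(max(s + (1 if a == 1.0 else -1), 0), n_states - 1)
            done = s_next == n_states - 1
            r = 1.0 if done else 0.0       # reward only on reaching the last state
            target = r if done else r + gamma * q(s_next, greedy(s_next))
            feats = features(s, a)
            delta = target - sum(w * k for w, (k, _) in zip(alpha, feats))
            for i, (k, _) in enumerate(feats):
                alpha[i] += eta * delta * k   # stand-in weight update, not Eq. (12)
            batch.append((s, a, target))
            s = 0 if done else s_next
        # Phase 2: one gradient step on the kernel width using the recent batch.
        grad = 0.0
        for sb, ab, target in batch:
            feats = features(sb, ab)
            delta = target - sum(w * k for w, (k, _) in zip(alpha, feats))
            dq = sum(w * k * d2 for w, (k, d2) in zip(alpha, feats)) / sigma ** 3
            grad += delta * dq
        sigma = min(max(sigma + eta_sigma * grad / len(batch), 0.2), 1.0)
    return alpha, sigma, centers, t

if __name__ == "__main__":
    alpha, sigma, centers, t = run_two_phase()
    print("steps:", t, "final kernel width:", round(sigma, 3))
```

Holding the kernel width fixed during the inner steps keeps the weight update linear in the features, and the width is adjusted only between batches. The clipping range on the width is an arbitrary safeguard chosen for this toy problem.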
