Research Article

A Direct Reinforcement Learning Approach for Nonautonomous Thermoacoustic Generator

Algorithm 2

Data collection-based RL for time-varying systems.
Step 1: Initializing the starting control signal and the starting state , let and .
Step 2: Solving the optimization problem to find and then obtain the admissible control [4]:
With constraint: .
Step 3: Solving for a positive definite value function using the admissible control [4] obtained in Step 2:
Step 4: If then and go to Step 2. Else, go to Step 5.
Step 5: Obtaining the approximate optimal value function and optimal control .
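
The loop structure of Steps 1 to 5 (initialize with an admissible control, alternate between a value-function solve and a control update, stop on convergence) can be illustrated with a minimal sketch. The sketch is not the algorithm of this article: it assumes a scalar, time-invariant, discrete-time linear system x[k+1] = a*x[k] + b*u[k] with quadratic cost q*x^2 + r*u^2 and a known model, whereas Algorithm 2 targets time-varying systems and works from collected data. The function name `policy_iteration_scalar_lqr` and all parameters (`a`, `b`, `q`, `r`, `k0`, `tol`, `max_iter`) are hypothetical names introduced only for illustration, and the order of the two solves follows the textbook convention, which may differ from Steps 2 and 3.

```python
def policy_iteration_scalar_lqr(a, b, q, r, k0, tol=1e-10, max_iter=100):
    """Textbook policy iteration for a scalar discrete-time LQR problem.

    Assumed (illustrative) setting, not the article's:
        x[k+1] = a*x[k] + b*u[k],  cost = sum(q*x^2 + r*u^2),  u = -k*x.
    Returns (p, k, iterations), where p*x^2 approximates the optimal value.
    """
    k = k0
    # Analogue of Step 1: the starting control must be admissible (stabilizing).
    if abs(a - b * k) >= 1.0:
        raise ValueError("initial gain k0 is not stabilizing")

    p_prev = None
    for i in range(1, max_iter + 1):
        # Policy evaluation: solve p = q + r*k^2 + (a - b*k)^2 * p for p.
        closed_loop = a - b * k
        p = (q + r * k * k) / (1.0 - closed_loop * closed_loop)

        # Policy improvement: minimize r*u^2 + p*(a*x + b*u)^2 over u.
        k = a * b * p / (r + b * b * p)

        # Analogue of Step 4: stop when the value function has converged.
        if p_prev is not None and abs(p - p_prev) < tol:
            return p, k, i
        p_prev = p

    return p, k, max_iter


if __name__ == "__main__":
    p, k, iterations = policy_iteration_scalar_lqr(
        a=1.2, b=1.0, q=1.0, r=1.0, k0=1.0
    )
    print(f"p = {p:.6f}, k = {k:.6f}, iterations = {iterations}")
```

In this simplified setting the converged `p` satisfies the scalar discrete-time Riccati equation, which offers a quick consistency check; a data-based variant would replace the model-based evaluation step with a fit to measured trajectories.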