Mathematical Problems in Engineering
Volume 2015, Article ID 760459, 14 pages
http://dx.doi.org/10.1155/2015/760459
Research Article

Two-Phase Iteration for Value Function Approximation and Hyperparameter Optimization in Gaussian-Kernel-Based Adaptive Critic Design

1 School of Automation, China University of Geosciences, Wuhan, Hubei 430074, China
2 School of Information Science and Engineering, Central South University, Changsha, Hunan 410083, China

Received 7 January 2015; Accepted 26 May 2015

Academic Editor: Simon X. Yang

Copyright © 2015 Xin Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. R. Sutton and A. Barto, Reinforcement Learning: An Introduction, Adaptive Computation and Machine Learning, MIT Press, Cambridge, Mass, USA, 1998.
  2. F. L. Lewis and D. Vrabie, “Reinforcement learning and adaptive dynamic programming for feedback control,” IEEE Circuits and Systems Magazine, vol. 9, no. 3, pp. 32–50, 2009.
  3. F. L. Lewis, D. Vrabie, and K. Vamvoudakis, “Reinforcement learning and feedback control: using natural decision methods to design optimal adaptive controllers,” IEEE Control Systems Magazine, vol. 32, no. 6, pp. 76–105, 2012.
  4. H. Zhang, Y. Luo, and D. Liu, “Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints,” IEEE Transactions on Neural Networks, vol. 20, no. 9, pp. 1490–1503, 2009.
  5. Y. Huang and D. Liu, “Neural-network-based optimal tracking control scheme for a class of unknown discrete-time nonlinear systems using iterative ADP algorithm,” Neurocomputing, vol. 125, pp. 46–56, 2014.
  6. X. Xu, L. Zuo, and Z. Huang, “Reinforcement learning algorithms with function approximation: recent advances and applications,” Information Sciences, vol. 261, pp. 1–31, 2014.
  7. A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, “Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 38, no. 4, pp. 943–949, 2008.
  8. S. Ferrari, J. E. Steck, and R. Chandramohan, “Adaptive feedback control by constrained approximate dynamic programming,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 38, no. 4, pp. 982–987, 2008.
  9. F.-Y. Wang, H. Zhang, and D. Liu, “Adaptive dynamic programming: an introduction,” IEEE Computational Intelligence Magazine, vol. 4, no. 2, pp. 39–47, 2009.
  10. D. Wang, D. Liu, Q. Wei, D. Zhao, and N. Jin, “Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming,” Automatica, vol. 48, no. 8, pp. 1825–1832, 2012.
  11. D. Vrabie and F. Lewis, “Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems,” Neural Networks, vol. 22, no. 3, pp. 237–246, 2009.
  12. D. Liu, D. Wang, and X. Yang, “An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs,” Information Sciences, vol. 220, pp. 331–342, 2013.
  13. N. Jin, D. Liu, T. Huang, and Z. Pang, “Discrete-time adaptive dynamic programming using wavelet basis function neural networks,” in Proceedings of the IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp. 135–142, Honolulu, Hawaii, USA, April 2007.
  14. P. Koprinkova-Hristova, M. Oubbati, and G. Palm, “Adaptive critic design with echo state network,” in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '10), pp. 1010–1015, IEEE, Istanbul, Turkey, October 2010.
  15. X. Xu, Z. Hou, C. Lian, and H. He, “Online learning control using adaptive critic designs with sparse kernel machines,” IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 5, pp. 762–775, 2013.
  16. V. Vapnik, Statistical Learning Theory, Wiley, New York, NY, USA, 1998.
  17. C. E. Rasmussen and C. K. Williams, Gaussian Processes for Machine Learning, Adaptive Computation and Machine Learning, MIT Press, Cambridge, Mass, USA, 2006.
  18. Y. Engel, S. Mannor, and R. Meir, “The kernel recursive least-squares algorithm,” IEEE Transactions on Signal Processing, vol. 52, no. 8, pp. 2275–2285, 2004.
  19. T. G. Dietterich and X. Wang, “Batch value function approximation via support vectors,” in Advances in Neural Information Processing Systems, pp. 1491–1498, MIT Press, Cambridge, Mass, USA, 2002.
  20. X. Wang, X. Tian, Y. Cheng, and J. Yi, “Q-learning system based on cooperative least squares support vector machine,” Acta Automatica Sinica, vol. 35, no. 2, pp. 214–219, 2009.
  21. T. Hofmann, B. Schölkopf, and A. J. Smola, “Kernel methods in machine learning,” The Annals of Statistics, vol. 36, no. 3, pp. 1171–1220, 2008.
  22. M. F. Huber, “Recursive Gaussian process: on-line regression and learning,” Pattern Recognition Letters, vol. 45, no. 1, pp. 85–91, 2014.
  23. Y. Engel, S. Mannor, and R. Meir, “Bayes meets Bellman: the Gaussian process approach to temporal difference learning,” in Proceedings of the 20th International Conference on Machine Learning, pp. 154–161, August 2003.
  24. Y. Engel, S. Mannor, and R. Meir, “Reinforcement learning with Gaussian processes,” in Proceedings of the 22nd International Conference on Machine Learning, pp. 201–208, ACM, August 2005.
  25. C. E. Rasmussen and M. Kuss, “Gaussian processes in reinforcement learning,” in Advances in Neural Information Processing Systems, S. Thrun, L. K. Saul, and B. Schölkopf, Eds., pp. 751–759, MIT Press, Cambridge, Mass, USA, 2004.
  26. M. P. Deisenroth, C. E. Rasmussen, and J. Peters, “Gaussian process dynamic programming,” Neurocomputing, vol. 72, no. 7–9, pp. 1508–1524, 2009.
  27. X. Xu, C. Lian, L. Zuo, and H. He, “Kernel-based approximate dynamic programming for real-time online learning control: an experimental study,” IEEE Transactions on Control Systems Technology, vol. 22, no. 1, pp. 146–156, 2014.
  28. A. Girard, C. E. Rasmussen, and R. Murray-Smith, “Gaussian process priors with uncertain inputs-multiple-step-ahead prediction,” Tech. Rep., 2002.
  29. X. Xu, T. Xie, D. Hu, and X. Lu, “Kernel least-squares temporal difference learning,” International Journal of Information Technology, vol. 11, no. 9, pp. 54–63, 2005.
  30. H. J. Kushner and G. G. Yin, Stochastic Approximation and Recursive Algorithms and Applications, vol. 35 of Applications of Mathematics, Springer, Berlin, Germany, 2nd edition, 2003.