Neural-network-based discounted optimal control via an integrated value iteration with accuracy guarantee.

Journal: Neural Networks: The Official Journal of the International Neural Network Society
Published Date:

Abstract

A data-based value iteration algorithm with a bidirectional approximation feature is developed for discounted optimal control. The unknown nonlinear system dynamics are first identified by a model neural network. To improve identification precision, biases are introduced into the model network, which is trained by gradient descent with the weights and biases of all layers updated. Uniform ultimate boundedness stability under a proper learning rate is analyzed using the Lyapunov approach. Moreover, an integrated value iteration with the discounted cost is developed to guarantee the approximation accuracy of the optimal value function. The effectiveness of the proposed algorithm is demonstrated through two simulation examples with physical backgrounds.
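
To make the idea of value iteration with a discounted cost concrete, here is a minimal sketch on a stand-in problem. It is not the paper's neural-network algorithm. It assumes a known scalar linear system x_{k+1} = a*x_k + b*u_k with stage cost q*x^2 + r*u^2 and discount factor gamma, so the value function can be written exactly as V_i(x) = p_i * x^2 and no approximator is needed. The function names and all numeric values are illustrative assumptions.

```python
def discounted_value_iteration(a, b, q, r, gamma, tol=1e-12, max_iter=10_000):
    """Value iteration for V_i(x) = p_i * x^2 on a scalar linear system.

    Assumed setting (not from the paper): x_{k+1} = a*x + b*u,
    stage cost q*x^2 + r*u^2, discount factor gamma in (0, 1].
    """
    p = 0.0  # V_0(x) = 0, the usual zero initialization
    for i in range(max_iter):
        # V_{i+1}(x) = min_u [ q*x^2 + r*u^2 + gamma * p * (a*x + b*u)^2 ]
        # Minimizing over u in closed form gives the scalar update below.
        p_next = q + gamma * a * a * p * r / (r + gamma * b * b * p)
        if abs(p_next - p) < tol:
            return p_next, i + 1
        p = p_next
    return p, max_iter


def greedy_gain(p, a, b, r, gamma):
    """Feedback gain k of the greedy policy u = -k * x for V(x) = p * x^2."""
    return gamma * p * a * b / (r + gamma * p * b * b)


# Illustrative parameters only.
p_star, iters = discounted_value_iteration(a=0.9, b=1.0, q=1.0, r=1.0, gamma=0.95)
k_star = greedy_gain(p_star, a=0.9, b=1.0, r=1.0, gamma=0.95)
print(f"p* = {p_star:.6f} after {iters} iterations, gain k* = {k_star:.6f}")
```

In the setting the abstract describes, the dynamics are unknown and are replaced by the trained model network, and the value function is approximated rather than computed exactly, which is where the accuracy guarantee becomes relevant.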

Authors

  • Mingming Ha
    School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China. Electronic address: hamingming_0705@foxmail.com.
  • Ding Wang
  • Derong Liu
    State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China.