The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
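This paper studies the infinite-horizon optimal control problem for continuous-time nonlinear systems. A completely model-free approximate optimal control design method is proposed that uses only data measured in real time along trajectories, instead of a dynamical model of the system. The approach is based on an actor-critic structure in which the weights of the critic neural network and the actor neural network are updated sequentially by the method of weighted residuals. Notably, an external input is introduced to replace the input-to-state dynamics when improving the control policy. A rigorous proof of convergence to the optimal solution, together with stability of the closed-loop system, is given. Finally, a numerical example is given to demonstrate the efficiency of the method.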
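To make the critic step concrete, the following is a minimal sketch of fitting critic weights from trajectory data by least squares, which is one standard variant of the method of weighted residuals. It is not the authors' algorithm: the scalar plant, the fixed policy gain `k`, the single basis function `x**2`, the cost weights `q` and `r`, and the function names `simulate` and `fit_critic` are all assumptions chosen for illustration. The simulator only stands in for a real plant; the fitting routine never reads the plant parameters.

```python
import numpy as np

def simulate(x0, k, a=-1.0, b=1.0, dt=1e-3, steps=2000):
    """Hypothetical plant dx/dt = a*x + b*u under the policy u = -k*x.
    Returns sampled states and inputs; a and b stay hidden from the learner."""
    xs = np.empty(steps + 1)
    us = np.empty(steps + 1)
    x = x0
    for i in range(steps + 1):
        u = -k * x
        xs[i], us[i] = x, u
        x = x + dt * (a * x + b * u)  # forward Euler step
    return xs, us

def fit_critic(xs, us, dt, window, q=1.0, r=1.0):
    """Fit w in V(x) ~ w * x**2 by minimizing the squared residual
    w * (x(t)**2 - x(t+T)**2) - integral of (q*x**2 + r*u**2) over [t, t+T],
    using measured samples only."""
    cost = q * xs**2 + r * us**2
    # Cumulative trapezoidal integral of the running cost.
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (cost[1:] + cost[:-1]) * dt)))
    starts = np.arange(0, len(xs) - window, window)
    dphi = xs[starts] ** 2 - xs[starts + window] ** 2
    c = cum[starts + window] - cum[starts]
    w, *_ = np.linalg.lstsq(dphi[:, None], c, rcond=None)
    return float(w[0])

xs, us = simulate(x0=1.0, k=0.5)
w = fit_critic(xs, us, dt=1e-3, window=100)

# For this assumed linear example the exact value function is p * x**2
# with p = (q + r*k**2) / (2*(b*k - a)) = 1.25 / 3.
print(w, 1.25 / 3.0)
```

The sketch covers policy evaluation only. The policy-improvement step described in the abstract, where an external input replaces the input-to-state dynamics, is not reproduced here because the abstract does not give its details.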
Zhenhui XU
Sophia University
Tielong SHEN
Sophia University
Daizhan CHENG
Chinese Academy of Sciences
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Zhenhui XU, Tielong SHEN, Daizhan CHENG, "Neural Network-Based Model-Free Learning Approach for Approximate Optimal Control of Nonlinear Systems" in IEICE TRANSACTIONS on Fundamentals,
vol. E104-A, no. 2, pp. 532-541, February 2021, doi: 10.1587/transfun.2020EAP1022.
Abstract: This paper studies the infinite time horizon optimal control problem for continuous-time nonlinear systems. A completely model-free approximate optimal control design method is proposed, which only makes use of the real-time measured data from trajectories instead of a dynamical model of the system. This approach is based on the actor-critic structure, where the weights of the critic neural network and the actor neural network are updated sequentially by the method of weighted residuals. It should be noted that an external input is introduced to replace the input-to-state dynamics to improve the control policy. Moreover, strict proof of convergence to the optimal solution along with the stability of the closed-loop system is given. Finally, a numerical example is given to show the efficiency of the method.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.2020EAP1022/_p
@ARTICLE{e104-a_2_532,
author={Zhenhui XU and Tielong SHEN and Daizhan CHENG},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Neural Network-Based Model-Free Learning Approach for Approximate Optimal Control of Nonlinear Systems},
year={2021},
volume={E104-A},
number={2},
pages={532-541},
abstract={This paper studies the infinite time horizon optimal control problem for continuous-time nonlinear systems. A completely model-free approximate optimal control design method is proposed, which only makes use of the real-time measured data from trajectories instead of a dynamical model of the system. This approach is based on the actor-critic structure, where the weights of the critic neural network and the actor neural network are updated sequentially by the method of weighted residuals. It should be noted that an external input is introduced to replace the input-to-state dynamics to improve the control policy. Moreover, strict proof of convergence to the optimal solution along with the stability of the closed-loop system is given. Finally, a numerical example is given to show the efficiency of the method.},
keywords={},
doi={10.1587/transfun.2020EAP1022},
ISSN={1745-1337},
month={February},
}
TY  - JOUR
TI  - Neural Network-Based Model-Free Learning Approach for Approximate Optimal Control of Nonlinear Systems
T2  - IEICE TRANSACTIONS on Fundamentals
SP  - 532
EP  - 541
AU  - Zhenhui XU
AU  - Tielong SHEN
AU  - Daizhan CHENG
PY  - 2021
DO  - 10.1587/transfun.2020EAP1022
JO  - IEICE TRANSACTIONS on Fundamentals
SN  - 1745-1337
VL  - E104-A
IS  - 2
JA  - IEICE TRANSACTIONS on Fundamentals
Y1  - February 2021
AB  - This paper studies the infinite time horizon optimal control problem for continuous-time nonlinear systems. A completely model-free approximate optimal control design method is proposed, which only makes use of the real-time measured data from trajectories instead of a dynamical model of the system. This approach is based on the actor-critic structure, where the weights of the critic neural network and the actor neural network are updated sequentially by the method of weighted residuals. It should be noted that an external input is introduced to replace the input-to-state dynamics to improve the control policy. Moreover, strict proof of convergence to the optimal solution along with the stability of the closed-loop system is given. Finally, a numerical example is given to show the efficiency of the method.
ER  -