Ye-feng YANG, Kai DENG, Ying-qi ZUO, et al. Parameter design and optimization of a flight attitude simulator system based on PILCO framework[J]. Optics and Precision Engineering, 2019, 27(11): 2365-2373. DOI: 10.3788/OPE.20192711.2365.
Parameter design and optimization of a flight attitude simulator system based on PILCO framework
Proportional-integral-derivative (PID) controllers are widely used in flight control systems; however, tuning their parameters is often cumbersome. In this study, we use Probabilistic Inference for Learning Control (PILCO) to optimize the parameters of a PID controller. First, we build a probabilistic dynamics model of the flight control system from input and output data. Next, the existing PID controller is assessed by policy evaluation. Finally, the evaluated controller is improved by policy update. The sampling frequency of the system is 100 Hz, and the data acquisition time per round is 8 s. The optimized PID controller achieves stable control after 10 rounds of offline training. With PILCO optimization, the flight attitude simulator performed robustly in a fixed-point experiment, indicating that PILCO has considerable potential for solving nonlinear control and parameter optimization problems.
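The training loop the abstract describes (learn a probabilistic dynamics model from input/output data, evaluate the current PID controller on that model, then update the gains, repeating over rounds) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the toy second-order plant, the GP hyperparameters, the gain ranges, and the random-search gain update (standing in for PILCO's analytic moment-matching rollouts and gradient-based policy update) are all assumptions introduced here for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
DT, TARGET = 0.01, 1.0   # 100 Hz sampling, fixed-point setpoint

# Hypothetical second-order attitude plant; PILCO treats the true
# dynamics as unknown and only queries them through rollouts.
def plant_step(theta, omega, u):
    alpha = -0.5 * omega + 2.0 * u          # assumed damping + torque gain
    return theta + DT * omega, omega + DT * alpha

def pid_rollout(gains, steps=200):
    """Run the PID loop on the real plant; return (inputs X, increments Y, mean cost)."""
    kp, ki, kd = gains
    theta = omega = integ = 0.0
    prev_err = TARGET - theta                # avoid derivative kick at t=0
    X, Y, cost = [], [], 0.0
    for _ in range(steps):
        err = TARGET - theta
        integ += err * DT
        u = kp * err + ki * integ + kd * (err - prev_err) / DT
        prev_err = err
        nt, no = plant_step(theta, omega, u)
        X.append([theta, omega, u])
        Y.append([nt - theta, no - omega])   # model the state increment
        cost += err ** 2
        theta, omega = nt, no
    return np.array(X), np.array(Y), cost / steps

def fit_gp(X, Y, ell=1.0, sf2=1.0, sn2=1e-4):
    """RBF-kernel GP regression (fixed hyperparameters, mean prediction only)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = sf2 * np.exp(-0.5 * d2 / ell ** 2) + sn2 * np.eye(len(X))
    alpha = np.linalg.solve(K, Y)
    def predict(x):
        k = sf2 * np.exp(-0.5 * ((X - x) ** 2).sum(-1) / ell ** 2)
        return k @ alpha
    return predict

def model_cost(predict, gains, steps=200):
    """Policy evaluation: roll the PID controller out on the learned model."""
    kp, ki, kd = gains
    theta = omega = integ = 0.0
    prev_err = TARGET - theta
    cost = 0.0
    for _ in range(steps):
        err = TARGET - theta
        integ += err * DT
        u = kp * err + ki * integ + kd * (err - prev_err) / DT
        prev_err = err
        dth, dom = predict(np.array([theta, omega, u]))
        theta, omega = theta + dth, omega + dom
        cost += err ** 2
    return cost / steps

# Offline training: collect data, fit the model, improve the gains per round.
gains = np.array([0.3, 0.0, 0.05])            # deliberately weak initial PID
X, Y, init_cost = pid_rollout(gains)
for _ in range(5):                            # the paper reports 10 rounds
    predict = fit_gp(X[::4][-250:], Y[::4][-250:])   # subsample for O(n^3) solve
    best = model_cost(predict, gains)
    for _ in range(30):                       # random search stands in for
        cand = gains + rng.normal(0, 0.1, 3)  # PILCO's gradient-based update
        cand = np.clip(cand, [0.05, 0.0, 0.0], [3.0, 0.5, 1.0])
        c = model_cost(predict, cand)
        if c < best:
            best, gains = c, cand
    Xn, Yn, final_cost = pid_rollout(gains)   # collect fresh data on the plant
    X, Y = np.vstack([X, Xn]), np.vstack([Y, Yn])

print(init_cost, final_cost, gains)
```

The key design point the abstract emphasizes is data efficiency: all gain candidates are evaluated on the learned model, so the real plant is queried only once per round, matching the small number of 8 s acquisition rounds reported.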