ZHANG Fang, YAN Guo-zheng, LIN Liang-ming. Multi-robot path planning-oriented and fuzzy model-based reinforcement function structures[J]. Editorial Office of Optics and Precision Engineering, 2002,(2): 148-153
Multi-robot path planning-oriented and fuzzy model-based reinforcement function structures
Reinforcement learning is increasingly applied in learning systems whose environment model is unknown, because its learning mechanism is simple and it requires neither prior knowledge of the system nor sample data. However, its learning speed is often too low to meet real-time requirements. Researchers have tried to speed up learning by improving the learning algorithm, adopting intelligent exploration policies, or applying hierarchical reinforcement learning. How the reinforcement function should be described, and how it affects learning speed, has seldom been studied. Existing reinforcement learning systems usually use a manually defined, model-free reinforcement function, whose simple and coarse expression is one cause of low learning efficiency. This article presents a new fuzzy model-based reinforcement function structure, described in terms of its application to conflict-free path planning in a cooperative multiple mobile robot system. In this system, robot behaviors are divided into three basic kinds: moving to the goal, avoiding obstacles, and avoiding other robots. The sub-functions reflecting these basic behaviors are then modeled hierarchically and fuzzily, and the final reinforcement function is expressed as the sum of the fuzzy weighted sub-functions. The fuzzy model-based reinforcement function expresses the influence of each robot’s action on the environment more accurately. Simulation shows that using fuzzy model-based reinforcement functions in the reinforcement learning algorithm speeds up convergence further than using model-free reinforcement functions.
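As an illustration only, the idea of a reinforcement function built as a sum of fuzzy weighted sub-functions might be sketched as follows. The abstract does not give the membership functions, weights, or distance thresholds, so everything named here (`ramp_down`, `safe_dist`, `contact_dist`, the weighting scheme, and the contrasting `model_free_reinforcement`) is a hypothetical assumption and not the authors' formulation.

```python
def ramp_down(x, lo, hi):
    """Assumed membership for "near": 1 at or below lo, 0 at or above hi."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def model_free_reinforcement(reached_goal, collided):
    """Coarse hand-defined signal of the kind the abstract contrasts against."""
    if collided:
        return -1.0
    return 1.0 if reached_goal else 0.0


def fuzzy_reinforcement(goal_before, goal_after, obstacle_dist, robot_dist,
                        safe_dist=2.0, contact_dist=0.5):
    """Sum of fuzzy weighted sub-functions for the three basic behaviors.

    All thresholds and the weighting rule are illustrative assumptions.
    """
    # Sub-function 1 (moving to the goal): progress made by the last step,
    # clipped to [-1, 1].
    r_goal = max(-1.0, min(1.0, goal_before - goal_after))

    # Sub-functions 2 and 3 (avoiding obstacles, avoiding other robots):
    # the penalty grows with the degree to which the robot is "near".
    near_obstacle = ramp_down(obstacle_dist, contact_dist, safe_dist)
    near_robot = ramp_down(robot_dist, contact_dist, safe_dist)
    r_obstacle = -near_obstacle
    r_robot = -near_robot

    # Fuzzy weights: avoidance terms are weighted by their "near" degree,
    # and goal seeking keeps whatever degree of "safe" remains.
    w_goal = max(0.0, 1.0 - max(near_obstacle, near_robot))
    return w_goal * r_goal + near_obstacle * r_obstacle + near_robot * r_robot
```

Under these assumptions, a step that moves 0.5 closer to the goal with nothing nearby yields 0.5, while the same step taken at contact distance from an obstacle yields -1.0. The model-free version returns 0.0 for both until a collision or arrival actually happens, which is the coarseness the abstract identifies as a cause of slow learning.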