Hierarchical Reinforcement Learning Based on Planning Operators

Jing Zhang; Emmanuel Dean; Karinne Ramirez-Amaro

doi:10.1109/CASE59546.2024.10711595

Paper in proceedings, 2024

Learning long-horizon manipulation tasks, such as stacking, is a longstanding challenge in robotic manipulation, particularly when using Reinforcement Learning (RL) methods. RL algorithms focus on learning a policy for executing the entire task rather than learning the correct sequence of actions required to achieve complex goals. While RL aims to find a sequence of actions that maximises the total reward of the task, the main challenge arises when there are infinitely many ways of chaining these actions (e.g., reach, grasp) to achieve the same task (stacking). In these cases, RL methods may struggle to find the optimal policy. This paper introduces a novel framework that integrates operator concepts from the symbolic planning domain with hierarchical RL methods. We propose to change the way complex tasks are trained by learning independent policies for the actions defined by high-level operators, instead of learning a single policy for the complete task. Our contribution integrates planning operators (e.g., preconditions and effects) into a hierarchical RL algorithm based on the Scheduled Auxiliary Control (SAC-X) method. We developed a dual-purpose high-level operator, which can be used both in holistic planning and as an independent, reusable policy. Our approach offers a flexible solution for long-horizon tasks, e.g., stacking and inserting a cube. The experimental results show that our proposed method achieves an average success rate of 97.2% for learning and executing the whole stacking task. Furthermore, we obtain high success rates when learning independent policies, e.g., reach (98.9%), lift (99.7%), and move (97.4%). Training time is also reduced by 68% when using our proposed approach.
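To make the idea of planning operators with preconditions and effects concrete, the following is a minimal illustrative sketch and not the authors' implementation. The symbolic facts (`hand_empty`, `at_cube`, `holding`, `lifted`, `over_target`, `stacked`), the `release` operator, the `Operator` class, and the breadth-first `plan` function are all assumptions made for this example. In the framework described above, each operator would additionally be paired with a separately learned low-level policy; that learning step is omitted here.

```python
from collections import deque
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

State = FrozenSet[str]  # symbolic facts that currently hold (hypothetical encoding)


@dataclass(frozen=True)
class Operator:
    """A symbolic planning operator: preconditions plus add/delete effects."""
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    del_effects: FrozenSet[str] = frozenset()

    def applicable(self, state: State) -> bool:
        # The operator may run only if all its preconditions hold.
        return self.preconditions <= state

    def apply(self, state: State) -> State:
        # Remove deleted facts, then add the new ones.
        return (state - self.del_effects) | self.add_effects


# Hypothetical operators for a stacking task; the fact names are assumptions.
OPERATORS = [
    Operator("reach", frozenset({"hand_empty"}), frozenset({"at_cube"})),
    Operator("grasp", frozenset({"hand_empty", "at_cube"}),
             frozenset({"holding"}), frozenset({"hand_empty"})),
    Operator("lift", frozenset({"holding"}), frozenset({"lifted"})),
    Operator("move", frozenset({"lifted"}), frozenset({"over_target"})),
    Operator("release", frozenset({"holding", "over_target"}),
             frozenset({"stacked", "hand_empty"}),
             frozenset({"holding", "lifted"})),
]


def plan(state: State, goal: State, operators: List[Operator],
         max_depth: int = 10) -> Optional[List[str]]:
    """Breadth-first search for a shortest operator sequence reaching the goal."""
    frontier = deque([(state, [])])
    seen = {state}
    while frontier:
        current, path = frontier.popleft()
        if goal <= current:
            return path
        if len(path) >= max_depth:
            continue
        for op in operators:
            if op.applicable(current):
                nxt = op.apply(current)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [op.name]))
    return None  # no operator sequence reaches the goal within max_depth


if __name__ == "__main__":
    print(plan(frozenset({"hand_empty"}), frozenset({"stacked"}), OPERATORS))
```

In this sketch the operator names returned by `plan` would serve as indices into a set of independently trained policies, which is one way to read the "dual-purpose" role of operators described in the abstract.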

Authors

Jing Zhang

Chalmers, Electrical Engineering, Systems and Control

Emmanuel Dean

Chalmers, Electrical Engineering, Systems and Control

Karinne Ramirez-Amaro

Chalmers, Electrical Engineering, Systems and Control

IEEE International Conference on Automation Science and Engineering

21618070 (ISSN) 21618089 (eISSN)

2006-2012
9798350358513 (ISBN)

20th IEEE International Conference on Automation Science and Engineering, CASE 2024
Bari, Italy

Subject categories (SSIF 2011)

Robotics and Automation

Computer Science

DOI

10.1109/CASE59546.2024.10711595

More information

Last updated

2024-11-14