Safe Trajectory Sampling in Model-Based Reinforcement Learning
Paper in proceedings, 2023

Model-based reinforcement learning aims to learn a policy to solve a target task by leveraging a learned dynamics model. This approach, paired with principled handling of uncertainty, allows for data-efficient policy learning in robotics. However, the physical environment has feasibility and safety constraints that need to be incorporated into the policy before it is safe to execute on a real robot. In this work, we study how to enforce these constraints in the context of model-based reinforcement learning with probabilistic dynamics models. In particular, we investigate how trajectories sampled from the learned dynamics model can be used on a real robot while fulfilling user-specified safety requirements. We present a model-based reinforcement learning approach using Gaussian processes in which safety constraints are taken into account without simplifying Gaussian assumptions on the predictive state distributions. We evaluate the proposed approach on different continuous control tasks of varying complexity and demonstrate how our safe trajectory-sampling approach can be used directly on a real robot without violating safety constraints.

Authors

Sicelukwanda Zwane

University College London (UCL)

Denis Hadjivelichkov

University College London (UCL)

Yicheng Luo

University College London (UCL)

Yasemin Bekiroglu

Chalmers, Electrical Engineering, Systems and Control

University College London (UCL)

Dimitrios Kanoulas

University College London (UCL)

Marc Peter Deisenroth

University College London (UCL)

IEEE International Conference on Automation Science and Engineering

21618070 (ISSN) 21618089 (eISSN)

Vol. 2023-August
9798350320695 (ISBN)

19th IEEE International Conference on Automation Science and Engineering, CASE 2023
Auckland, New Zealand

Subject categories

Robotics and Automation

Probability Theory and Statistics

Control Engineering

Computer Science

Computer Vision and Robotics (Autonomous Systems)

DOI

10.1109/CASE56687.2023.10260496

More information

Last updated

2023-11-03