SENTINEL: Taming Uncertainty with Ensemble-based Distributional Reinforcement Learning
Paper in proceedings, 2022
Our contributions are two-fold. First, we introduce a novel and coherent quantification of risk, namely composite risk, which quantifies the joint effect of aleatory and epistemic risk during the learning process.
Existing works considered either aleatory or epistemic risk individually, or as an additive combination.
We prove that the additive formulation is a particular case of the composite risk when the epistemic risk measure is replaced with expectation.
Thus, the composite risk is more sensitive to both aleatory and epistemic uncertainty than the individual and additive formulations.
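The relationship between the composite and expectation-based formulations can be illustrated numerically. The sketch below is not the paper's definition, which this record does not reproduce. It assumes one plausible reading: a nested measure in which an aleatory risk is computed on each model's return samples and an epistemic risk is then applied across models. The choice of CVaR for both levels, the function names `cvar` and `composite_risk`, and the synthetic Gaussian returns are all assumptions made for illustration.

```python
import numpy as np

def cvar(samples, alpha):
    """Lower-tail CVaR: mean of the worst alpha-fraction of the samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    k = max(1, int(np.ceil(alpha * x.size)))
    return float(x[:k].mean())

def composite_risk(returns_per_model, alpha_aleatory, alpha_epistemic):
    """Hypothetical nested risk: aleatory CVaR per model, then CVaR across models."""
    aleatory = [cvar(r, alpha_aleatory) for r in returns_per_model]
    return cvar(aleatory, alpha_epistemic)

# Synthetic stand-in for an ensemble: each model implies its own return distribution.
rng = np.random.default_rng(0)
returns_per_model = [rng.normal(loc=m, scale=1.0, size=1000)
                     for m in (0.0, 0.5, 1.0, 1.5)]

# Risk-sensitive at both levels.
composite = composite_risk(returns_per_model, 0.25, 0.25)

# alpha_epistemic = 1.0 turns the outer CVaR into a plain expectation over models.
expectation_case = composite_risk(returns_per_model, 0.25, 1.0)
```

Under these assumptions, setting the outer level to 1.0 recovers the average of the per-model aleatory risks, and the fully risk-sensitive value can never exceed it, since a lower-tail mean is at most the full mean.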
We also propose an algorithm, SENTINEL-K, based on ensemble bootstrapping and distributional RL for representing epistemic and aleatory uncertainty respectively. The ensemble of K learners uses Follow The Regularised Leader (FTRL) to aggregate the return distributions and obtain the composite risk.
We experimentally verify that SENTINEL-K estimates the return distribution better and, when used with composite risk estimates, demonstrates higher risk-sensitive performance than state-of-the-art risk-sensitive and distributional RL algorithms.
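The aggregation step described in the abstract can be sketched as follows. This is a minimal illustration and not the SENTINEL-K implementation: it assumes FTRL over the probability simplex with an entropic regulariser, which reduces to exponential weights, and it assumes each of the K learners represents its return distribution by N equally weighted quantile atoms. The loss values, the learning rate `eta`, and the function names `ftrl_weights` and `mixture_atoms` are hypothetical.

```python
import numpy as np

def ftrl_weights(cumulative_losses, eta):
    """FTRL on the simplex with an entropic regulariser (exponential weights)."""
    z = -eta * np.asarray(cumulative_losses, dtype=float)
    z -= z.max()  # shift for numerical stability; does not change the result
    w = np.exp(z)
    return w / w.sum()

def mixture_atoms(quantiles, weights):
    """Pool the atoms of all learners into one weighted mixture distribution.

    quantiles has shape (K, N); learner k contributes N atoms of mass w_k / N.
    """
    q = np.asarray(quantiles, dtype=float)
    k, n = q.shape
    atoms = q.ravel()
    probs = np.repeat(np.asarray(weights, dtype=float) / n, n)
    order = np.argsort(atoms)
    return atoms[order], probs[order]

# Hypothetical cumulative losses for K = 3 learners (lower is better).
losses = np.array([1.0, 0.2, 0.6])
weights = ftrl_weights(losses, eta=2.0)

# Hypothetical quantile estimates of the return, N = 3 atoms per learner.
quantiles = np.array([[-1.0, 0.0, 1.0],
                      [0.0, 1.0, 2.0],
                      [-2.0, 0.5, 3.0]])
atoms, probs = mixture_atoms(quantiles, weights)
```

The pooled atoms and probabilities define a single return distribution on which a risk measure could then be evaluated. How the losses are defined and how the risk is computed from the aggregate are left open here, since the abstract does not specify them.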
Epistemic uncertainty
Reinforcement Learning
Ensemble methods
Author
Hannes Eriksson
Zenseact AB
Chalmers, Computer Science and Engineering (Chalmers), Data Science
Debabrota Basu
Cent Lille CRIStAL
Inria Lille Nord Europe
Mina Alibeigi
Zenseact AB
Christos Dimitrakakis
Chalmers, Computer Science and Engineering (Chalmers), Data Science
University of Oslo
University of Neuchatel
Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022
Vol. 180 631-640
9781713863298 (ISBN)
Eindhoven, Netherlands
Subject Categories
Other Computer and Information Science
Computer Vision and Robotics (Autonomous Systems)