SENTINEL: Taming Uncertainty with Ensemble-based Distributional Reinforcement Learning

Hannes Eriksson; Debabrota Basu; Mina Alibeigi; Christos Dimitrakakis

Paper in proceedings, 2022

In this paper, we consider risk-sensitive sequential decision-making in Reinforcement Learning (RL).
Our contributions are two-fold. First, we introduce a novel and coherent quantification of risk, namely composite risk, which quantifies the joint effect of aleatory and epistemic risk during the learning process.
Existing works consider aleatory and epistemic risk either individually or as an additive combination.
We prove that the additive formulation is a particular case of the composite risk when the epistemic risk measure is replaced with expectation.
Thus, the composite risk is more sensitive to both aleatory and epistemic uncertainty than the individual and additive formulations.
Second, we propose an algorithm, SENTINEL-K, which uses ensemble bootstrapping to represent epistemic uncertainty and distributional RL to represent aleatory uncertainty. The ensemble of K learners uses Follow The Regularised Leader (FTRL) to aggregate the return distributions and obtain the composite risk.
We experimentally verify that SENTINEL-K estimates the return distribution better and, when used with composite risk estimates, demonstrates higher risk-sensitive performance than state-of-the-art risk-sensitive and distributional RL algorithms.
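
To make the idea of a composite risk concrete, the following is a minimal sketch and not the paper's implementation. It assumes that the composite risk can be read as an epistemic risk measure applied across ensemble members to each member's aleatory risk, and it assumes lower-tail CVaR for both measures; the abstract fixes neither choice. The function names, the risk levels, and the synthetic Gaussian return samples are all hypothetical.

```python
import numpy as np


def cvar(samples, alpha):
    """Lower-tail CVaR: mean of the worst alpha-fraction of the samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    k = max(1, int(np.ceil(alpha * x.size)))  # number of worst samples kept
    return float(x[:k].mean())


def composite_risk(ensemble_returns, alpha_aleatory, alpha_epistemic):
    """Assumed nested form: epistemic CVaR over per-model aleatory CVaRs."""
    per_model = [cvar(r, alpha_aleatory) for r in ensemble_returns]
    return cvar(per_model, alpha_epistemic)


def expected_aleatory_risk(ensemble_returns, alpha_aleatory):
    """Special case: the epistemic risk measure is replaced with expectation."""
    per_model = [cvar(r, alpha_aleatory) for r in ensemble_returns]
    return float(np.mean(per_model))


# Hypothetical ensemble of K = 4 learners, each with its own return samples.
rng = np.random.default_rng(0)
ensemble = [rng.normal(loc=m, scale=1.0, size=1000) for m in (0.0, 0.5, 1.0, 1.5)]

composite = composite_risk(ensemble, alpha_aleatory=0.25, alpha_epistemic=0.5)
expected = expected_aleatory_risk(ensemble, alpha_aleatory=0.25)
neutral = composite_risk(ensemble, alpha_aleatory=0.25, alpha_epistemic=1.0)

print(f"composite (epistemic CVaR):       {composite:.3f}")
print(f"expectation over ensemble:        {expected:.3f}")
print(f"composite with alpha_epistemic=1: {neutral:.3f}")
```

Under these assumptions, setting the epistemic level to 1 averages over all ensemble members, so the nested quantity coincides with the expectation-based special case, and any smaller epistemic level gives a value that is no larger, since it averages only the most pessimistic members.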

Ensemble methods

Epistemic uncertainty

Reinforcement Learning

Authors

Hannes Eriksson

Chalmers, Computer Science and Engineering, Data Science

Zenseact AB

Debabrota Basu

Institut National de Recherche en Informatique et en Automatique (INRIA)

Centre national de la recherche scientifique (CNRS)

Mina Alibeigi

Zenseact AB

Christos Dimitrakakis

Chalmers, Computer Science and Engineering, Data Science

Université de Neuchâtel

Universitetet i Oslo

Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022

2640-3498 (eISSN)

Vol. 180, pp. 631-640
9781713863298 (ISBN)

38th Conference on Uncertainty in Artificial Intelligence
Eindhoven, Netherlands

Subject categories (SSIF 2011)

Other Computer and Information Science

Computer Vision and Robotics (Autonomous Systems)

More information

Last updated

2023-10-27
