Priors and uncertainty in reinforcement learning
Licentiate thesis, 2023

Handling uncertainty is an important part of decision-making. Leveraging uncertainty to guide exploration toward higher rewards has long been a standard approach, through both ad hoc and more principled methods. In recent decades, there has also been growing work that treats uncertainty as something to be avoided, creating risk-sensitive decision makers that steer clear of risky behaviour. In this licentiate thesis, we study different approaches to managing uncertainty through two papers. In the first paper, we look at how to model value function distributions in a way that captures the dependence between models and future values. Using the observation that the probability of a particular model depends on the value function, we construct a Monte Carlo algorithm that takes this dependence into account. In the second paper, we study how a zero-sum minimax game, between nature, which selects a task distribution, and an agent, which selects a policy, can be used to find minimax priors. We establish some properties of this game and propose methods for finding its solution. We also show experimentally that agents that optimize against this prior are robust to prior misspecification.
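As a rough illustration of the minimax game described in the second paper (a minimal sketch under simplifying assumptions, not the method proposed in the thesis): if the game is restricted to a finite set of tasks and a finite set of policies with a known payoff matrix, nature's minimax task distribution can be computed with a standard linear program. The payoff matrix U below is a hypothetical toy example.

import numpy as np
from scipy.optimize import linprog

# Hypothetical toy payoff matrix: U[i, j] = expected return of policy j on task i.
U = np.array([
    [1.0, 0.0, 0.5],
    [0.2, 0.9, 0.4],
])
n_tasks, n_policies = U.shape

# Decision variables: nature's prior beta over tasks, plus the game value v.
# Nature solves: minimize v  subject to  sum_i beta[i] * U[i, j] <= v  for every
# policy j, with beta a probability distribution over tasks.
c = np.zeros(n_tasks + 1)
c[-1] = 1.0  # objective: minimize v

A_ub = np.hstack([U.T, -np.ones((n_policies, 1))])  # (U^T beta)_j - v <= 0
b_ub = np.zeros(n_policies)

A_eq = np.hstack([np.ones((1, n_tasks)), np.zeros((1, 1))])  # sum_i beta[i] = 1
b_eq = np.array([1.0])

bounds = [(0, None)] * n_tasks + [(None, None)]  # beta >= 0, v unbounded
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
beta_star, game_value = res.x[:n_tasks], res.x[-1]
print("minimax prior:", beta_star, "game value:", game_value)

Constraining only the pure policies suffices here because the agent's best response to a fixed prior can always be taken to be pure; by the minimax theorem, the optimum of this linear program equals the value of the game.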

Bayesian reinforcement learning

reinforcement learning

minimax

Markov decision processes

Author

Emilio Jorge

Chalmers, Computer Science and Engineering, Data Science and AI

Inferential Induction: A Novel Framework for Bayesian Reinforcement Learning

Proceedings of Machine Learning Research, Vol. 137 (2020), pp. 43–52

Paper in proceedings

Infrastructure

C3SE (Chalmers Centre for Computational Science and Engineering)

Subject categories

Probability theory and statistics

Computer science

Computer vision and robotics (autonomous systems)

Publisher

Chalmers

Analysen, Rännvägen 6B

Online

Opponent: Brendan O'Donoghue, DeepMind

More information

Last updated

2023-01-24