Priors and uncertainty in reinforcement learning
Licentiate thesis, 2023

Handling uncertainty is an important part of decision-making. Leveraging uncertainty to guide exploration towards higher rewards has long been standard practice, through both ad hoc and more principled approaches. In recent decades, increasing attention has also been given to treating uncertainty as something to be avoided, producing risk-sensitive decision makers that steer away from risky behaviour. In this licentiate thesis, we study different approaches to managing uncertainty by presenting two papers. In the first paper, we look at how to model value function distributions in a way that captures the dependence between models and future values. Building on the observation that the posterior probability of a particular model depends on the value function, we construct a Monte Carlo algorithm that takes this dependence into account. In the second paper, we study how a zero-sum minimax game, in which nature selects a task distribution and an agent selects a policy, can be used to find minimax priors. We establish several properties of this game and propose methods for finding its solution. We also show experimentally that agents optimizing against this prior are robust to prior misspecification.
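To make the first idea concrete, here is a minimal sketch assuming a finite set of hypothetical candidate MDP models with stand-in posterior weights. It shows only the basic construction of a value-function distribution by Monte Carlo sampling over models; the thesis's contribution, accounting for the dependence between the model probability and the value function itself, is not reproduced here, and the helper `policy_evaluation` is illustrative rather than the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def policy_evaluation(P, R, pi, gamma=0.9):
    # Exact value of policy pi in the MDP (P, R): solve (I - gamma * P_pi) V = R_pi.
    n = P.shape[0]
    P_pi = np.einsum("sat,sa->st", P, pi)  # state-transition matrix under pi
    R_pi = np.einsum("sa,sa->s", R, pi)    # expected one-step reward under pi
    return np.linalg.solve(np.eye(n) - gamma * P_pi, R_pi)

# Stand-in posterior over a small set of candidate models (P_k, R_k).
n_states, n_actions, n_models = 3, 2, 50
models = [(rng.dirichlet(np.ones(n_states), size=(n_states, n_actions)),
           rng.uniform(0.0, 1.0, size=(n_states, n_actions)))
          for _ in range(n_models)]
posterior = rng.dirichlet(np.ones(n_models))  # stand-in for P(model | data)

pi = np.full((n_states, n_actions), 1.0 / n_actions)  # uniform policy

# Monte Carlo: sample models from the posterior and evaluate the policy in each;
# the samples approximate the induced distribution over value functions.
idx = rng.choice(n_models, size=200, p=posterior)
V_samples = np.array([policy_evaluation(*models[k], pi) for k in idx])
print("mean V:", V_samples.mean(axis=0))
print("std  V:", V_samples.std(axis=0))
```

The second paper's game can likewise be sketched as a finite zero-sum matrix game, where a payoff matrix U[i, j] holds the expected return of policy j on task i. The multiplicative-weights self-play below is a generic solver for such games, not necessarily the method proposed in the paper; the payoff matrix is random and purely illustrative.

```python
import numpy as np

def solve_zero_sum(U, iters=5000, eta=0.05):
    # Multiplicative-weights (Hedge) self-play for the matrix game
    #   min_p max_q  p^T U q,
    # where p is nature's distribution over tasks (rows) and q is the
    # agent's mixture over policies (columns).
    n_tasks, n_policies = U.shape
    p = np.ones(n_tasks) / n_tasks
    q = np.ones(n_policies) / n_policies
    p_avg = np.zeros(n_tasks)
    q_avg = np.zeros(n_policies)
    for _ in range(iters):
        p *= np.exp(-eta * (U @ q))   # nature moves weight to low-return tasks
        p /= p.sum()
        q *= np.exp(eta * (U.T @ p))  # agent moves weight to high-return policies
        q /= q.sum()
        p_avg += p
        q_avg += q
    return p_avg / iters, q_avg / iters

# Purely illustrative payoff matrix: U[i, j] = return of policy j on task i.
rng = np.random.default_rng(1)
U = rng.uniform(0.0, 1.0, size=(4, 3))
prior, policy_mix = solve_zero_sum(U)
print("minimax prior over tasks:", prior.round(3))
print("game value:", float(prior @ U @ policy_mix))
```

The average strategies of this self-play converge to a saddle point of the game; nature's average strategy is then the minimax prior over tasks.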

Bayesian reinforcement learning

reinforcement learning

Minimax

Markov decision processes

Author

Emilio Jorge

Chalmers, Computer Science and Engineering, Data Science and AI

Inferential Induction: A Novel Framework for Bayesian Reinforcement Learning

Proceedings of Machine Learning Research, Vol. 137 (2020), pp. 43-52

Paper in proceedings

Infrastructure

C3SE (Chalmers Centre for Computational Science and Engineering)

Subject categories

Probability theory and statistics

Computer science

Computer vision and robotics (autonomous systems)

Publisher

Chalmers

Analysen, Rännvägen 6B

Online

Opponent: Brendan O'Donoghue, DeepMind

More information

Last updated

2023-01-24