Priors and uncertainty in reinforcement learning
Licentiate thesis, 2023

Handling uncertainty is an important part of decision-making. Using uncertainty to guide exploration towards higher rewards has long been a standard approach, implemented with both ad hoc and more principled methods. In recent decades, more work has also treated uncertainty as something to be avoided, creating risk-sensitive decision makers that avoid risky behaviour. In this licentiate thesis, we study different approaches to managing uncertainty by presenting two papers. In the first paper, we look at how to model value function distributions in a way that captures the dependence between models and future values. We use the observation that the probability of a particular model depends on the value function to create a Monte Carlo algorithm that takes this dependence into account. In the second paper, we study how a zero-sum minimax game between nature, which selects a task distribution, and an agent, which selects a policy, can be used to find minimax priors. We show some properties of this game and propose methods for finding its solution. Additionally, we show experimentally that agents that optimize for this prior are robust to prior misspecification.
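To make the zero-sum game in the abstract concrete, the following is a minimal toy sketch and not the method from the included papers. It assumes only two tasks and three fixed policies whose expected utilities are already known; the utility matrix, the function names and the grid search are all illustrative assumptions. Nature picks the prior probability of task 0, the agent best-responds, and the minimax prior is the one that minimises the agent's best-response utility.

```python
# Hypothetical expected utilities (illustrative numbers, not from the thesis):
# UTILITIES[k] = (utility of policy k on task 0, utility of policy k on task 1).
UTILITIES = [
    (1.0, 0.0),  # policy specialised for task 0
    (0.0, 1.0),  # policy specialised for task 1
    (0.4, 0.4),  # cautious policy, mediocre on both tasks
]


def bayes_value(utilities, p):
    """Best expected utility the agent can reach when task 0 has prior probability p."""
    return max(p * u0 + (1.0 - p) * u1 for u0, u1 in utilities)


def minimax_prior(utilities, grid_size=101):
    """Grid search over priors for the one that minimises the agent's best response."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    p = min(grid, key=lambda q: bayes_value(utilities, q))
    return p, bayes_value(utilities, p)


prior, value = minimax_prior(UTILITIES)
print(f"minimax prior on task 0: {prior:.2f}, game value: {value:.2f}")
```

For this symmetric toy matrix the search returns the uniform prior. A grid search only works here because the prior is a single number; with more tasks or an infinite policy class, a different solver would be needed.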

Bayesian reinforcement learning

Reinforcement learning

Minimax

Markov decision processes

Author

Emilio Jorge

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Included papers

Inferential Induction: A Novel Framework for Bayesian Reinforcement Learning

Proceedings of Machine Learning Research, Vol. 137 (2020), p. 43-52

Paper in proceeding

Categorizing

Infrastructure

C3SE (Chalmers Centre for Computational Science and Engineering)

Subject Categories

Probability Theory and Statistics

Computer Science

Computer Vision and Robotics (Autonomous Systems)

Other

Publisher

Chalmers

Public defence

2023-01-26 15:00 -- 17:00

Analysen, Rännvägen 6B

Online

Opponent: Brendan O'Donoghue, DeepMind

More information

Latest update

2023-01-24