Minimax-Bayes Reinforcement Learning
Conference poster, 2023

While the Bayesian decision-theoretic framework offers an elegant
solution to the problem of decision making under uncertainty, one
question is how to appropriately select the prior distribution. One
idea is to employ a worst-case prior. However, such a prior is harder
to specify in sequential decision making than in simple statistical
estimation problems. This paper studies (sometimes approximate)
minimax-Bayes solutions for various reinforcement learning problems
to gain insights into the properties of the corresponding priors and
policies. We find that while the worst-case prior depends on the
setting, the corresponding minimax policies are more robust than
those that assume a standard (i.e. uniform) prior.
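As a hypothetical illustration (not taken from the poster), the idea of a worst-case prior can be sketched in the simplest possible setting: a one-step, two-armed bandit with two candidate environments. The least favourable prior is the belief under which even the Bayes-optimal decision earns the least. All environment values below are invented for the example.

```python
# Toy illustration (hypothetical numbers): a one-step two-armed Bernoulli
# bandit with two possible environments. Environment A has arm means
# (0.9, 0.1); environment B has (0.1, 0.9). A prior is a single number
# beta = P(environment A).

ENVS = [(0.9, 0.1), (0.1, 0.9)]

def bayes_value(beta):
    """Expected reward of the Bayes-optimal single pull under prior beta."""
    expected = [beta * ENVS[0][arm] + (1 - beta) * ENVS[1][arm]
                for arm in (0, 1)]
    return max(expected)  # the Bayes-optimal policy pulls the better arm

# The worst-case (least favourable) prior minimises the achievable
# Bayes value; here a coarse grid search suffices.
grid = [i / 100 for i in range(101)]
beta_star = min(grid, key=bayes_value)
print(beta_star, bayes_value(beta_star))
```

In this symmetric toy problem the grid search recovers the uniform prior (beta = 0.5) as least favourable. The abstract's point is that in sequential settings the worst-case prior need not be this easy to characterise, and can depend on the setting.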

Keywords

Minimax

Reinforcement learning

Markov decision processes

Authors

Thomas Kleine Buening

University of Oslo

Christos Dimitrakakis

University of Neuchâtel

University of Oslo

Chalmers University of Technology, Computer Science and Engineering, Data Science and AI

Hannes Eriksson

University of Gothenburg

Zenseact AB

Divya Grover

Chalmers University of Technology, Computer Science and Engineering, Data Science and AI

Emilio Jorge

Chalmers University of Technology, Computer Science and Engineering, Data Science and AI

Artificial Intelligence and Statistics (AISTATS) 2023, Valencia, Spain

Subject Categories

Computational Mathematics

Computer Science

Computer Vision and Robotics (Autonomous Systems)

Latest update

10/26/2023