Minimax-Bayes Reinforcement Learning
Paper in proceedings, 2023

While the Bayesian decision-theoretic framework offers an elegant solution to the problem of decision making under uncertainty, one question is how to appropriately select the prior distribution. One idea is to employ a worst-case prior. However, this is not as easy to specify in sequential decision making as in simple statistical estimation problems. This paper studies (sometimes approximate) minimax-Bayes solutions for various reinforcement learning problems to gain insights into the properties of the corresponding priors and policies. We find that while the worst-case prior depends on the setting, the corresponding minimax policies are more robust than those that assume a standard (i.e. uniform) prior.

Authors

Thomas Kleine Buening

University of Oslo

Christos Dimitrakakis

University of Neuchatel

Hannes Eriksson

Zenseact AB

Divya Grover

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Emilio Jorge

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Proceedings of Machine Learning Research

2640-3498 (eISSN)

Vol. 206, pp. 7511-7527

26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023
Valencia, Spain

Subject Categories (SSIF 2011)

Computational Mathematics

Probability Theory and Statistics

Computer Science

Computer Vision and Robotics (Autonomous Systems)

More information

Latest update

8/8/2023