Minimax-Bayes Reinforcement Learning
Paper in proceedings, 2023

While the Bayesian decision-theoretic framework offers an elegant solution to the problem of decision making under uncertainty, one question is how to appropriately select the prior distribution. One idea is to employ a worst-case prior. However, this is not as easy to specify in sequential decision making as in simple statistical estimation problems. This paper studies (sometimes approximate) minimax-Bayes solutions for various reinforcement learning problems to gain insights into the properties of the corresponding priors and policies. We find that while the worst-case prior depends on the setting, the corresponding minimax policies are more robust than those that assume a standard (i.e. uniform) prior.

Authors

Thomas Kleine Buening

Universitetet i Oslo

Christos Dimitrakakis

Université de Neuchâtel

Hannes Eriksson

Zenseact AB

Divya Grover

Chalmers, Computer Science and Engineering, Data Science and AI

Emilio Jorge

Chalmers, Computer Science and Engineering, Data Science and AI

Proceedings of Machine Learning Research

2640-3498 (eISSN)

Vol. 206, pp. 7511-7527

26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023
Valencia, Spain

Subject categories

Computational Mathematics

Probability Theory and Statistics

Computer Science

Computer Vision and Robotics (Autonomous Systems)

More information

Last updated

2023-08-08