Minimax-Bayes Reinforcement Learning
Paper in proceedings, 2023

While the Bayesian decision-theoretic framework offers an elegant solution to the problem of decision making under uncertainty, one question is how to appropriately select the prior distribution. One idea is to employ a worst-case prior. However, such a prior is not as easy to specify in sequential decision making as in simple statistical estimation problems. This paper studies (sometimes approximate) minimax-Bayes solutions for various reinforcement learning problems to gain insights into the properties of the corresponding priors and policies. We find that while the worst-case prior depends on the setting, the corresponding minimax policies are more robust than those that assume a standard (i.e. uniform) prior.
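As a minimal illustration of the minimax-Bayes idea described above (a sketch only, not the paper's algorithm or experiments), the Python snippet below computes the least favourable ("worst-case") prior and the corresponding minimax policy for a toy two-action, two-environment decision problem. The payoff matrix U and the grid resolution are purely illustrative assumptions.

    import numpy as np

    # Toy decision problem: 2 actions, 2 candidate environments.
    # U[a, m] = expected return of action a in environment m
    # (hypothetical numbers, chosen for illustration only).
    U = np.array([[1.0, 0.0],   # action 0: good in env 0, useless in env 1
                  [0.5, 0.7]])  # action 1: moderate in both environments

    grid = np.linspace(0.0, 1.0, 1001)

    def bayes_value(p):
        # Value of the Bayes-optimal action under the prior (p, 1 - p).
        return np.max(U @ np.array([p, 1.0 - p]))

    def worst_case(q):
        # Worst-case (over environments) return of the mixed policy (q, 1 - q).
        return np.min(np.array([q, 1.0 - q]) @ U)

    # Least favourable prior: the prior that minimises the best achievable
    # Bayes value, found here by simple grid search.
    worst_prior = grid[np.argmin([bayes_value(p) for p in grid])]

    # Minimax (maximin) policy: the mixed policy maximising the worst-case
    # expected return, also found by grid search.
    minimax_q = grid[np.argmax([worst_case(q) for q in grid])]

    # Compare with the deterministic Bayes-optimal action for a uniform prior.
    uniform_action = int(np.argmax(U @ np.array([0.5, 0.5])))

    print(f"least favourable prior P(env 0)   ~ {worst_prior:.3f}")
    print(f"minimax policy P(action 0)        ~ {minimax_q:.3f}, "
          f"worst-case value {worst_case(minimax_q):.3f}")
    print(f"uniform-prior Bayes action        = {uniform_action}, "
          f"worst-case value {np.min(U[uniform_action]):.3f}")

In this toy problem the Bayes-optimal value under the least favourable prior coincides with the minimax value, and the minimax policy achieves a higher worst-case return than the deterministic Bayes action for the uniform prior, mirroring the robustness claim in the abstract. Grid search is used only to keep the sketch dependency-free; a linear program would solve the same matrix game exactly.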

Authors

Thomas Kleine Buening

Universitetet i Oslo

Christos Dimitrakakis

Université de Neuchâtel

Hannes Eriksson

Zenseact AB

Divya Grover

Chalmers, Computer Science and Engineering, Data Science and AI

Emilio Jorge

Chalmers, Computer Science and Engineering, Data Science and AI

Proceedings of Machine Learning Research

2640-3498 (eISSN)

Vol. 206, pp. 7511-7527

26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023),
Valencia, Spain

Subject categories (SSIF 2011)

Computational Mathematics

Probability Theory and Statistics

Computer Science

Computer Vision and Robotics (Autonomous Systems)

More information

Last updated

2023-08-08