Minimax-Bayes Reinforcement Learning
Poster (conference), 2023
While the Bayesian decision-theoretic framework offers an elegant
solution to the problem of decision making under uncertainty, one
question is how to appropriately select the prior distribution. One
idea is to employ a worst-case prior. However, this is not as easy to
specify in sequential decision making as in simple statistical
estimation problems. This paper studies (sometimes approximate)
minimax-Bayes solutions for various reinforcement learning problems
to gain insights into the properties of the corresponding priors and
policies. We find that while the worst-case prior depends on the
setting, the corresponding minimax policies are more robust than
those that assume a standard (i.e. uniform) prior.
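The contrast between a worst-case prior and a uniform prior can be illustrated with a minimal sketch. It is not the method of the paper: it assumes a one-shot decision problem with two candidate models and two actions, and the utility matrix `U` is a hypothetical choice made only for illustration. The sequential reinforcement learning setting studied in the paper is considerably harder. The sketch finds the worst-case prior by grid search, then compares the worst-case utility of the minimax policy with that of the Bayes policy under a uniform prior.

```python
import numpy as np

# Hypothetical utilities: U[m, a] is the expected utility of action a under model m.
U = np.array([[1.0, 0.0],
              [0.0, 0.5]])

def bayes_value(p):
    """Bayes-optimal utility when the prior puts mass p on model 0."""
    prior = np.array([p, 1.0 - p])
    return np.max(prior @ U)  # best action against this prior

def worst_case_utility(q):
    """Lowest utility over models for the policy that plays action 0 w.p. q."""
    policy = np.array([q, 1.0 - q])
    return np.min(U @ policy)

grid = np.linspace(0.0, 1.0, 301)

# Worst-case prior: the prior that minimises the Bayes-optimal utility.
values = np.array([bayes_value(p) for p in grid])
p_star = grid[np.argmin(values)]
minimax_value = values.min()

# Minimax policy: the (randomised) policy that maximises worst-case utility.
q_star = grid[np.argmax([worst_case_utility(q) for q in grid])]

# Bayes policy under a uniform prior: a deterministic best response.
uniform_action = int(np.argmax(np.array([0.5, 0.5]) @ U))
uniform_worst = worst_case_utility(1.0 if uniform_action == 0 else 0.0)

print(f"worst-case prior mass on model 0: {p_star:.3f}")
print(f"minimax policy worst-case utility: {worst_case_utility(q_star):.3f}")
print(f"uniform-prior policy worst-case utility: {uniform_worst:.3f}")
```

In this toy problem the worst-case prior is not uniform, and the minimax policy guarantees a strictly higher utility in the least favourable model than the policy that is optimal for the uniform prior.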
Minimax
reinforcement learning
Markov decision processes
Authors
Thomas Kleine Buening
University of Oslo
Christos Dimitrakakis
Université de Neuchâtel
University of Oslo
Chalmers, Computer Science and Engineering, Data Science and AI
Hannes Eriksson
University of Gothenburg
Zenseact AB
Divya Grover
Chalmers, Computer Science and Engineering, Data Science and AI
Emilio Jorge
Chalmers, Computer Science and Engineering, Data Science and AI
Valencia, Spain,
Subject categories
Computational Mathematics
Computer Science
Computer Vision and Robotics (Autonomous Systems)