Generalised entropy MDPs and Minimax Regret
Paper in proceedings, 2014

Bayesian methods suffer from the problem of how to specify prior beliefs. One interesting idea is to consider worst-case priors. This requires solving a stochastic zero-sum game. In this paper, we extend well-known results from bandit theory in order to discover minimax-Bayes policies and discuss when they are practical.

Authors

Emmanouil Androulakis

Chalmers, Mathematical Sciences

University of Gothenburg

Christos Dimitrakakis

Chalmers, Computer Science and Engineering (Chalmers), Computing Science (Chalmers)

NIPS 2014, From bad models to good policies workshop.

Areas of Advance

Information and Communication Technology

Subject Categories

Probability Theory and Statistics

More information

Created

10/7/2017