Generalised entropy MDPs and Minimax Regret
Paper in proceedings, 2014

Bayesian methods suffer from the problem of how to specify prior beliefs. One interesting idea is to consider worst-case priors, which requires solving a stochastic zero-sum game. In this paper, we extend well-known results from bandit theory to derive minimax-Bayes policies and discuss when they are practical.
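The worst-case-prior idea can be illustrated on a toy problem (this is an illustrative sketch, not the paper's algorithm): nature chooses a prior over two candidate models to maximise the Bayes regret of a decision-maker who best-responds to that prior, i.e. a zero-sum game whose value is the minimax regret. The regret matrix below is an assumed example.

```python
import numpy as np

# Illustrative two-model, two-action problem (assumed example data):
# regret[theta, a] is the regret of action a when theta is the true model.
regret = np.array([[0.0, 1.0],
                   [1.0, 0.0]])

# Nature picks a prior p over models to maximise the Bayes regret of the
# Bayes-optimal (best-responding) decision rule -- a zero-sum game.
grid = np.linspace(0.0, 1.0, 1001)
best_p, worst_case_value = 0.0, -1.0
for p in grid:
    prior = np.array([p, 1.0 - p])
    bayes_regret = (prior @ regret).min()  # decision-maker best responds
    if bayes_regret > worst_case_value:
        worst_case_value, best_p = bayes_regret, p

# In this symmetric game the worst-case prior is uniform (p = 0.5).
print(best_p, worst_case_value)
```

A grid search suffices here only because there are two models; in general the inner minimisation runs over policies and the game must be solved with linear programming or iterative-play methods.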

Authors

Emmanouil Androulakis

Chalmers, Mathematical Sciences

University of Gothenburg

Christos Dimitrakakis

Chalmers, Computer Science and Engineering, Computer Science

NIPS 2014, From bad models to good policies workshop.

Areas of Advance

Information and Communication Technology

Subject Categories

Probability Theory and Statistics

More information

Created

2017-10-07