Preference elicitation and inverse reinforcement learning

C.A. Rothkopf; Christos Dimitrakakis

doi:10.1007/978-3-642-23808-6_3

Paper in proceeding, 2011

We state the problem of inverse reinforcement learning in terms of preference elicitation, resulting in a principled (Bayesian) statistical formulation. This generalises previous work on Bayesian inverse reinforcement learning and allows us to obtain, from observations, a posterior distribution on the agent's preferences, its policy and, optionally, the obtained reward sequence. We examine the relation of the resulting approach to other statistical methods for inverse reinforcement learning via analysis and experimental results. We show that preferences can be determined accurately even if the observed agent's policy is sub-optimal with respect to its own preferences. In that case, the policies obtained are significantly better with respect to the agent's preferences than both those of other methods and the demonstrated policy itself. © 2011 Springer-Verlag.
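
The idea of a posterior over an agent's preferences given observed behaviour can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes a hypothetical 3-state chain MDP, a finite set of two candidate reward vectors with a uniform prior, and an agent that follows a softmax (hence possibly sub-optimal) policy with an assumed inverse temperature `beta`. All names (`P`, `GAMMA`, `q_values`, `softmax_policy`, `posterior`) are illustrative.

```python
import numpy as np

# Hypothetical 3-state, 2-action chain MDP (an assumption, not from the paper).
# P[a, s, t] is the probability of moving from state s to state t under action a.
P = np.array([
    [[1, 0, 0], [1, 0, 0], [0, 1, 0]],  # action 0: step left
    [[0, 1, 0], [0, 0, 1], [0, 0, 1]],  # action 1: step right
], dtype=float)
GAMMA = 0.9  # assumed discount factor


def q_values(reward, n_iter=200):
    """Value iteration for a state-based reward vector; returns Q[s, a]."""
    v = np.zeros(len(reward))
    for _ in range(n_iter):
        q = reward[:, None] + GAMMA * np.einsum("ast,t->sa", P, v)
        v = q.max(axis=1)
    return q


def softmax_policy(q, beta):
    """Stochastic policy pi[s, a] proportional to exp(beta * Q[s, a])."""
    z = beta * (q - q.max(axis=1, keepdims=True))  # shift for numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)


def posterior(candidates, demos, beta=2.0):
    """Posterior over candidate rewards under a uniform prior.

    demos is a list of observed (state, action) pairs.
    """
    log_post = np.zeros(len(candidates))
    for i, reward in enumerate(candidates):
        pi = softmax_policy(q_values(reward), beta)
        log_post[i] = sum(np.log(pi[s, a]) for s, a in demos)
    w = np.exp(log_post - log_post.max())
    return w / w.sum()


# Two hypotheses: the agent values the left end, or the right end, of the chain.
candidates = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
# Demonstrations move mostly right, with one "mistaken" left step in state 1.
demos = [(0, 1), (1, 1), (2, 1), (1, 0), (1, 1)]
post = posterior(candidates, demos)
print(post)
```

In this toy setting the posterior mass concentrates on the second candidate even though one demonstrated action is sub-optimal for it, which is the qualitative behaviour the abstract describes. A fuller treatment would also place a prior over the policy parameters rather than fixing `beta`.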

preference elicitation

inverse reinforcement learning

decision theory

Bayesian inference

Author

C.A. Rothkopf

Christos Dimitrakakis

Chalmers, Computer Science and Engineering (Chalmers), Computing Science (Chalmers)


Machine Learning and Knowledge Discovery in Databases, ECML 2011

9783642238079 (ISBN)

Areas of Advance

Information and Communication Technology

Life Science Engineering (2010-2018)

Subject Categories (SSIF 2011)

Computer and Information Science


More information

Created

10/6/2017
