Rollout sampling approximate policy iteration
Journal article, 2008
Approximate policy iteration
Bandit problems
Rollouts
Reinforcement learning
Classification
Sample complexity
Authors
Christos Dimitrakakis
Chalmers, Computer Science and Engineering (Chalmers), Computing Science (Chalmers)
M.G. Lagoudakis
Machine Learning
0885-6125 (ISSN) 1573-0565 (eISSN)
Vol. 72, Issue 3, pp. 157-171
Areas of Advance
Information and Communication Technology
Subject Categories
Computer and Information Science
DOI
10.1007/s10994-008-5069-3