Bayesian Inference for Least Squares Temporal Difference Regularization
Paper in proceeding, 2017
inference of value functions that avoids the overfitting commonly experienced with classical LSTD when the number of features is larger than the number of samples. Sparse Bayesian learning provides an elegant
solution through the introduction of a prior over value function parameters. This gives us the advantages of probabilistic predictions, a sparse model, and good generalisation capabilities, as irrelevant parameters are marginalised out. The algorithm efficiently approximates the posterior distribution through variational inference. We demonstrate the ability of
the algorithm in avoiding overfitting experimentally.
Author
Nikolaos Tziortztiois
École polytechnique
Christos Dimitrakakis
Harvard University
University of Lille
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
03029743 (ISSN) 16113349 (eISSN)
Vol. Volume 10535 LNAI 126-141978-331971245-1 (ISBN)
Skopje, Macedonia,
Areas of Advance
Information and Communication Technology
Subject Categories
Probability Theory and Statistics
DOI
10.1007/978-3-319-71246-8_8