Bayesian Inference for Least Squares Temporal Difference Regularization
Paper in proceeding, 2017

This paper proposes a fully Bayesian approach for Least Squares Temporal Differences (LSTD), resulting in fully probabilistic inference of value functions that avoids the overfitting commonly experienced with classical LSTD when the number of features is larger than the number of samples. Sparse Bayesian learning provides an elegant solution through the introduction of a prior over value function parameters. This gives us the advantages of probabilistic predictions, a sparse model, and good generalisation capabilities, as irrelevant parameters are marginalised out. The algorithm efficiently approximates the posterior distribution through variational inference. We experimentally demonstrate the algorithm's ability to avoid overfitting.

Author

Nikolaos Tziortziotis

École polytechnique

Christos Dimitrakakis

Harvard University

University of Lille

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

03029743 (ISSN) 16113349 (eISSN)

Vol. 10535 LNAI 126-141
978-331971245-1 (ISBN)

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2017
Skopje, Macedonia

Areas of Advance

Information and Communication Technology

Subject Categories (SSIF 2011)

Probability Theory and Statistics

DOI

10.1007/978-3-319-71246-8_8

More information

Latest update

8/30/2023