Bayesian Inference for Least Squares Temporal Difference Regularization
Paper i proceeding, 2017

This paper proposes a fully Bayesian approach for LeastSquares Temporal Differences (LSTD), resulting in fully probabilistic
inference of value functions that avoids the overfitting commonly experienced with classical LSTD when the number of features is larger than the number of samples. Sparse Bayesian learning provides an elegant
solution through the introduction of a prior over value function parameters. This gives us the advantages of probabilistic predictions, a sparse model, and good generalisation capabilities, as irrelevant parameters are marginalised out. The algorithm efficiently approximates the posterior distribution through variational inference. We demonstrate the ability of
the algorithm in avoiding overfitting experimentally.


Nikolaos Tziortztiois

École polytechnique

Christos Dimitrakakis

Harvard University

Université de Lille

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

03029743 (ISSN) 16113349 (eISSN)

Vol. Volume 10535 LNAI 126-141
978-331971245-1 (ISBN)

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2017
Skopje, Macedonia,


Informations- och kommunikationsteknik


Sannolikhetsteori och statistik



Mer information

Senast uppdaterat