Bayesian Inference for Least Squares Temporal Difference Regularization
Paper i proceeding, 2017
inference of value functions that avoids the overfitting commonly experienced with classical LSTD when the number of features is larger than the number of samples. Sparse Bayesian learning provides an elegant
solution through the introduction of a prior over value function parameters. This gives us the advantages of probabilistic predictions, a sparse model, and good generalisation capabilities, as irrelevant parameters are marginalised out. The algorithm efficiently approximates the posterior distribution through variational inference. We demonstrate the ability of
the algorithm in avoiding overfitting experimentally.
Författare
Nikolaos Tziortztiois
École polytechnique
Christos Dimitrakakis
Harvard University
Université de Lille
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
03029743 (ISSN) 16113349 (eISSN)
Vol. Volume 10535 LNAI 126-141978-331971245-1 (ISBN)
Skopje, Macedonia,
Styrkeområden
Informations- och kommunikationsteknik
Ämneskategorier
Sannolikhetsteori och statistik
DOI
10.1007/978-3-319-71246-8_8