Tree Ensembles for Contextual Bandits
Journal article, 2024

We propose a new framework for contextual multi-armed bandits based on tree ensembles. Our framework adapts two widely used bandit methods, Upper Confidence Bound and Thompson Sampling, for both standard and combinatorial settings. As part of this framework, we propose a novel method of estimating the uncertainty in tree ensemble predictions. We further demonstrate the effectiveness of our framework via several experimental studies, employing XGBoost and random forests, two popular tree ensemble methods. Compared to state-of-the-art methods based on decision trees and neural networks, our methods exhibit superior performance in terms of both regret minimization and computational runtime, when applied to benchmark datasets and the real-world application of navigation over road networks.
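As a rough illustration of how Upper Confidence Bound and Thompson Sampling can sit on top of a tree ensemble, the sketch below scores each arm from the reward predictions of the individual trees. It is not the authors' method: the abstract does not describe their uncertainty estimate, so the spread across trees is swapped in as a simple stand-in. The function names, the exploration weight `nu`, and the toy per-tree predictions are all hypothetical.

```python
import random
import statistics


def arm_stats(tree_preds):
    """Mean and spread of the individual trees' reward predictions for one arm.

    Assumption: disagreement between trees (population standard deviation)
    is used as the uncertainty proxy. The paper proposes its own estimator.
    """
    mean = statistics.fmean(tree_preds)
    std = statistics.pstdev(tree_preds)
    return mean, std


def ucb_choice(per_arm_tree_preds, nu=1.0):
    """Pick the arm with the largest mean + nu * std (optimistic score)."""
    scores = []
    for preds in per_arm_tree_preds:
        mean, std = arm_stats(preds)
        scores.append(mean + nu * std)
    return max(range(len(scores)), key=scores.__getitem__)


def thompson_choice(per_arm_tree_preds, rng):
    """Pick the arm with the largest draw from Normal(mean, std^2)."""
    samples = []
    for preds in per_arm_tree_preds:
        mean, std = arm_stats(preds)
        samples.append(rng.gauss(mean, std))
    return max(range(len(samples)), key=samples.__getitem__)


# Hypothetical per-tree predictions for two arms in one context.
toy_preds = [
    [0.50, 0.52, 0.48, 0.50],  # arm 0: trees agree, slightly higher mean
    [0.40, 0.80, 0.10, 0.50],  # arm 1: trees disagree, lower mean
]

greedy_arm = ucb_choice(toy_preds, nu=0.0)      # no exploration bonus
optimistic_arm = ucb_choice(toy_preds, nu=1.0)  # uncertainty rewarded
sampled_arm = thompson_choice(toy_preds, random.Random(0))
```

With `nu=0.0` the rule is purely greedy and prefers the arm with the higher mean; raising `nu` shifts the choice toward the arm the trees disagree on, which is the exploration behavior both bandit methods rely on.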

Author

Hannes Nilsson

Chalmers, Computer Science and Engineering (Chalmers)

University of Gothenburg

Rikard Johansson

Chalmers, Computer Science and Engineering (Chalmers)

University of Gothenburg

Niklas Åkerblom

Chalmers, Computer Science and Engineering (Chalmers)

Volvo Cars

University of Gothenburg

Morteza Haghir Chehreghani

University of Gothenburg

Chalmers, Computer Science and Engineering (Chalmers)

Transactions on Machine Learning Research

2835-8856 (eISSN)

Vol. 2024

Subject Categories (SSIF 2025)

Probability Theory and Statistics

Computer Sciences

DOI

10.48550/arXiv.2402.06963

More information

Latest update

3/21/2025