Tree Ensembles for Contextual Bandits
Preprint, 2024

We propose a novel framework for contextual multi-armed bandits based on tree ensembles. Our framework integrates two widely used bandit methods, Upper Confidence Bound and Thompson Sampling, for both standard and combinatorial settings. We demonstrate the effectiveness of our framework via several experimental studies, employing XGBoost, a popular tree ensemble method. Compared to state-of-the-art methods based on neural networks, our methods exhibit superior performance in terms of both regret minimization and computational runtime, when applied to benchmark datasets and the real-world application of navigation over road networks.

Combinatorial semi-bandits

Contextual multi-armed bandits

Tree ensemble methods

Online learning

Authors

Hannes Nilsson

Chalmers, Computer Science and Engineering, Data Science and AI

Rikard Johansson

Chalmers, Computer Science and Engineering

Niklas Åkerblom

Chalmers, Computer Science and Engineering, Data Science and AI

Morteza Haghir Chehreghani

Chalmers, Computer Science and Engineering, Data Science and AI

EENE: Energieffektiv Navigering för Elfordon (Energy-Efficient Navigation for Electric Vehicles)

FFI - Strategic Vehicle Research and Innovation (2018-01937), 2019-01-01 -- 2022-12-31.

Subject categories

Probability Theory and Statistics

Computer Science

DOI

10.48550/arXiv.2402.06963

More information

Created

2024-02-13