Bayesian Analysis of Combinatorial Gaussian Process Bandits
Paper in proceedings, 2025

We consider the combinatorial volatile Gaussian process (GP) semi-bandit problem. Each round, an agent is provided a set of available base arms and must select a subset of them to maximize the long-term cumulative reward. We study the Bayesian setting and provide novel Bayesian cumulative regret bounds for three GP-based algorithms: GP-UCB, GP-BayesUCB and GP-TS. Our bounds extend previous results for GP-UCB and GP-TS to the infinite, volatile and combinatorial setting, and to the best of our knowledge, we provide the first regret bound for GP-BayesUCB. Volatile arms encompass other widely considered bandit problems such as contextual bandits. Furthermore, we employ our framework to address the challenging real-world problem of online energy-efficient navigation, where we demonstrate its effectiveness compared to the alternatives.
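The round structure described in the abstract can be sketched in a few lines. The following is only an illustrative sketch, not the authors' implementation: it assumes a zero-mean GP with a squared-exponential kernel, a super arm made of exactly k base arms whose rewards add up, and a fixed exploration weight `beta`. The reward function `true_f`, the values of `k`, `beta` and `noise_var`, and all function names are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    # Squared-exponential kernel between the rows of a and b (assumed kernel).
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)

def gp_posterior(x_obs, y_obs, x_new, noise_var=0.01):
    # Posterior mean and standard deviation of a zero-mean GP at x_new.
    if len(x_obs) == 0:
        return np.zeros(len(x_new)), np.ones(len(x_new))
    k_oo = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    k_no = rbf_kernel(x_new, x_obs)
    mean = k_no @ np.linalg.solve(k_oo, y_obs)
    var = 1.0 - np.einsum("ij,ji->i", k_no, np.linalg.solve(k_oo, k_no.T))
    return mean, np.sqrt(np.maximum(var, 0.0))

def select_super_arm(x_obs, y_obs, available, k, beta=2.0):
    # UCB-style rule: pick the k available base arms with the highest
    # upper confidence bound (assumes an additive reward over base arms).
    mean, std = gp_posterior(x_obs, y_obs, available)
    ucb = mean + np.sqrt(beta) * std
    return np.argsort(-ucb)[:k]

rng = np.random.default_rng(0)
true_f = lambda x: np.sin(3 * x[:, 0])  # hypothetical expected base-arm reward
x_obs, y_obs = np.empty((0, 1)), np.empty(0)
for t in range(20):
    # Volatile arms: a fresh set of available base arms arrives every round.
    available = rng.uniform(-1, 1, size=(10, 1))
    chosen = select_super_arm(x_obs, y_obs, available, k=3)
    # Semi-bandit feedback: a noisy reward is observed for each selected base arm.
    rewards = true_f(available[chosen]) + 0.1 * rng.standard_normal(len(chosen))
    x_obs = np.vstack([x_obs, available[chosen]])
    y_obs = np.concatenate([y_obs, rewards])
```

Replacing the upper confidence bound with a draw from the GP posterior would give a Thompson-sampling-style variant of the same loop; the sketch makes no claim about the constants or arm-selection oracle analysed in the paper.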

Authors

Jack Sandberg

University of Gothenburg

Chalmers, Computer Science and Engineering

Niklas Åkerblom

Volvo Group

Chalmers, Computer Science and Engineering

University of Gothenburg

Morteza Haghir Chehreghani

University of Gothenburg

Chalmers, Computer Science and Engineering

13th International Conference on Learning Representations, ICLR 2025

8895-8928
9798331320850 (ISBN)

13th International Conference on Learning Representations, ICLR 2025
Singapore, Singapore

Energieffektiv Navigering för Elfordon (EENE)

VINNOVA (2018-01937), 2019-01-01 -- 2022-12-31.

EENE: Energieffektiv Navigering för Elfordon

FFI - Fordonsstrategisk forskning och innovation (2018-01937), 2019-01-01 -- 2022-12-31.

Subject categories (SSIF 2025)

Computer Sciences

Other Computer and Information Science

More information

Last updated

2025-07-21