Leveraging Structural Priors and Historical Data for Practical Treatment Personalization with Multi-Armed Bandits
Doctoral thesis, 2025

Personalizing treatments for patients often requires sequentially trying different options from a set of available therapies until the most effective one is identified for the patient’s characteristics. In chronic diseases such as Alzheimer’s Disease, where interventions mainly have short-term effects, this search process can be formulated as a multi-armed bandit (MAB) problem. Reducing the length of the search is essential to limit patient burden and other associated costs, while practical constraints, such as limiting switches between therapies, introduce additional complexity to exploration. This thesis advances the foundational understanding and applications of MAB algorithms in the context of treatment personalization, focusing on improving sample efficiency by leveraging latent structure revealed by historical data, and on accommodating practical treatment-switching constraints. Key contributions include: (i) latent bandit algorithms for fixed-confidence pure exploration, providing new insights into exploration dynamics; (ii) the Identifiable Latent Bandit framework, which learns reward models from observational data under identifiability assumptions; and (iii) Latent Preference Bandits, which relax structural requirements by modeling preference orderings instead of full reward vectors. The work also addresses the challenge of switching constraints through batched exploration approaches. Furthermore, the Alzheimer’s Disease Causal estimation Benchmark (ADCB), a semi-synthetic simulator integrating real-world Alzheimer’s data with domain expertise, is designed and employed as a causally sound evaluation platform for bandit algorithms in personalized medicine. Together, these contributions connect theoretical MAB developments with clinically motivated constraints, offering methodologies for more efficient and practical treatment personalization.

policy learning with historical data

treatment personalization

healthcare bandit simulators

fixed-confidence pure exploration

latent bandits

exploration with switching constraints

structural priors

multi-armed bandits

Lecture Room EC, EDIT Building Elektrogården 1, Chalmers Campus Johanneberg
Opponent: Professor Sandeep Juneja, Ashoka University, India

Author

Newton Mwai Kinyanjui

Data Science and AI 3

Fast Treatment Personalization with Latent Bandits in Fixed-Confidence Pure Exploration

Transactions on Machine Learning Research, Vol. 2023 (2023)

Journal article

ADCB: An Alzheimer’s disease simulator for benchmarking observational estimators of causal effects

Proceedings of Machine Learning Research, Vol. 174 (2022), p. 103-118

Paper in proceedings

Ahmet Zahid Balcioglu, Newton Mwai, Emil Carlsson, Fredrik D. Johansson. Identifiable Latent Bandits: Leveraging observational data for personalized decision-making.

Newton Mwai, Milad Malekipirbazari, Fredrik D. Johansson. Understanding exploration in bandits with switching constraints: A batched approach in fixed-confidence pure exploration.

Newton Mwai, Emil Carlsson, Fredrik D. Johansson. Latent Preference Bandits.

Efficient and Practical Treatment Decisions with AI

Finding the right treatment for a patient often involves testing several options before settling on the most effective one. In chronic diseases such as Alzheimer’s, where treatment benefits are short-lived and regular adjustments are needed, this trial-and-error process can be long and costly for both patients and healthcare systems. The aim of this research is to develop AI-based methods that shorten this search while respecting practical constraints, such as limiting how often treatments are switched in order to avoid discomfort and side effects.

The work adapts a family of algorithms known as multi-armed bandits, designed for learning through trial and error, to medical decision-making. It introduces new strategies that use patterns hidden in past patient treatment records to guide quicker, more effective decisions for new patients, explores flexible ways to model differences between patients, and develops methods that limit treatment changes without harming effectiveness. To test these algorithms safely, a realistic simulation environment for Alzheimer’s treatment was built using real clinical data and expert medical knowledge.

The thesis paves the way for more efficient and patient-friendly decision-making with AI in the personalized medicine of the future.

Subject categories (SSIF 2025)

Other Engineering and Technologies

Computer Sciences

Algorithms

Artificial Intelligence

Areas of Advance

Information and Communication Technology

Health Engineering

DOI

10.63959/chalmers.dt/5753

ISBN

978-91-8103-295-6

Doctoral theses at Chalmers University of Technology. New series: 5753

Publisher

Chalmers

Online

More information

Last updated

2025-09-12