Learning to Make Decisions for Autonomous Drug Design
Doctoral thesis, 2025

Drug design is an iterative process aimed at identifying suitable molecules for specific biological targets. Modern computer-aided drug design increasingly leverages machine learning to inform decision-making throughout this process. However, a key challenge remains: the interactive acquisition of new knowledge to improve machine learning models using relevant data. This thesis examines sequential decision-making problems in machine learning for optimizing data collection strategies in computer-aided drug design.

To experimentally test a molecule's properties, it must first be synthesized through a sequence of chemical reactions to obtain the desired product. Machine learning can identify and validate suitable chemical reactions by predicting reaction outcomes, but this approach requires sufficient data for each reaction type of interest. This thesis presents work that combinatorially investigates different aspects of active learning to improve predictive capabilities for determining whether a reaction will produce a sufficient amount of product. In practice, only a limited number of molecules can be synthesized per design cycle due to cost and time constraints, whereas current generative models can produce numerous molecular candidates. Therefore, another work in this thesis investigates how to optimally select which generated molecules to test, given a constrained experimental budget. We formulate this challenge as a multi-armed bandit problem and propose a novel algorithm to address it.
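The budgeted selection problem described above can be illustrated with a standard bandit strategy. The sketch below uses Thompson sampling with Beta posteriors; it is not the novel algorithm proposed in the thesis, which this abstract does not specify. The arms (for example, groups of generated molecules), the Bernoulli "hit" outcome, and the names `thompson_select` and `run_campaign` are assumptions made only for illustration.

```python
import random


def thompson_select(successes, failures, rng):
    """Return the index of the arm whose Beta-posterior sample is largest."""
    samples = [
        rng.betavariate(s + 1, f + 1)  # Beta(1, 1) prior on each arm's hit rate
        for s, f in zip(successes, failures)
    ]
    return max(range(len(samples)), key=samples.__getitem__)


def run_campaign(true_hit_rates, budget, seed=0):
    """Simulate a design campaign with a fixed number of experiments.

    true_hit_rates: hypothetical probability that a molecule from each arm
                    passes the experimental test (unknown to the learner).
    budget:         total number of molecules that can be tested.
    """
    rng = random.Random(seed)
    n_arms = len(true_hit_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(budget):
        arm = thompson_select(successes, failures, rng)
        hit = rng.random() < true_hit_rates[arm]  # simulated experiment
        if hit:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    wins, losses = run_campaign([0.05, 0.10, 0.40], budget=100)
    for arm, (w, l) in enumerate(zip(wins, losses)):
        print(f"arm {arm}: tested {w + l}, hits {w}")
```

Sampling from the posterior, rather than always choosing the arm with the best observed hit rate, keeps some of the budget for arms that have been tested only a few times, which is the exploration-exploitation trade-off at the core of the bandit formulation.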

To generate novel molecules with desired predicted properties, previous research has successfully employed reinforcement learning to align generative model outputs to a specific biological target. This thesis examines additional perspectives on applying reinforcement learning to sequentially utilize and collect target-specific data. We present a systematic comparison of various reinforcement learning algorithms for generating drug molecules and investigate methods for effectively learning from generated samples. Moreover, designing a diverse set of promising molecules is crucial for a successful drug discovery pipeline. Therefore, we propose new methods to enhance chemical exploration by adaptively modifying the reward signal. We also introduce a mini-batch diversification framework for on-policy reinforcement learning and apply it to molecular generation, thereby improving chemical exploration during the generative process. Together, these contributions advance sequential decision-making in drug design by optimizing the acquisition of new data.
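One simple way to picture "adaptively modifying the reward signal" is a count-based penalty that shrinks the reward for molecules similar to ones already generated. The sketch below is a hedged illustration of that general idea, not the methods proposed in the thesis. The class `DiversityPenalty`, the `bucket_of` grouping function, and the linear decay are all hypothetical; in practice the grouping might be a molecular scaffold computed by a cheminformatics toolkit.

```python
from collections import Counter


class DiversityPenalty:
    """Scale rewards down as more molecules fall into the same bucket."""

    def __init__(self, bucket_of, bucket_size=3):
        self.bucket_of = bucket_of      # hypothetical: maps a molecule to a group key
        self.bucket_size = bucket_size  # visits after which the reward reaches zero
        self.counts = Counter()         # number of earlier molecules seen per bucket

    def __call__(self, molecule, reward):
        key = self.bucket_of(molecule)
        seen = self.counts[key]
        self.counts[key] += 1
        # Linear decay: full reward on the first visit, zero once the bucket is full.
        factor = max(0.0, 1.0 - seen / self.bucket_size)
        return reward * factor


if __name__ == "__main__":
    # Toy grouping by the first two characters of a SMILES string (illustration only).
    penalty = DiversityPenalty(bucket_of=lambda smiles: smiles[:2], bucket_size=2)
    for smiles in ["CCO", "CCN", "CCC", "c1ccccc1"]:
        print(smiles, penalty(smiles, 1.0))
```

Because the penalty depends on the history of generated molecules, the same molecule can receive a different reward at different times, which pushes an on-policy learner away from regions of chemical space it has already sampled heavily.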

active learning

de novo drug design

reinforcement learning

reaction yield prediction

chemical exploration

multi-armed bandits

EA, Hörsalsvägen 11
Opponent: Prof. Jan Halborg Jensen, University of Copenhagen, Denmark

Author

Hampus Gummesson Svensson

Chalmers, Computer Science and Engineering, Data Science and AI

Using Active Learning to Develop Machine Learning Models for Reaction Yield Prediction

Molecular Informatics, Vol. 41 (2022)

Journal article

Autonomous Drug Design with Multi-Armed Bandits

Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022, (2022), p. 5584-5592

Paper in proceedings

Utilizing reinforcement learning for de novo drug design

Machine Learning, Vol. 113 (2024), p. 4811-4843

Journal article

Diversity-Aware Reinforcement Learning for de novo Drug Design

IJCAI International Joint Conference on Artificial Intelligence, (2025), p. 9194-9204

Paper in proceedings

Shorter Time to New Medicines with the Help of AI

Creating a new medicine is like looking for a needle in a haystack: it often takes 10-15 years and costs billions of Swedish kronor. The harsh truth is also that 9 out of 10 drug candidates fail along the way, despite large investments. It is therefore crucial to make the right decisions early in the process. Artificial intelligence (AI) has become an important tool for creating new medicines. AI can analyze large amounts of existing data and suggest which chemical structures have the best chance of becoming successful medicines. But AI is no better than the data it has access to, and collecting new, high-quality data is both expensive and time-consuming. Researchers must therefore be strategic and carefully choose the data needed to improve the reliability of AI. This thesis is about developing smart systems that can determine which data needs to be collected next. The goal is to create autonomous systems that can design medicines more efficiently, which could drastically shorten development time. Imagine a future where AI systems work around the clock to find cures for cancer, Alzheimer's, or other diseases, much faster than is possible today.

Shorten the Hunt for New Medicines Using AI 

Creating a new medicine is one of the most challenging puzzles in science. It typically takes 10-15 years and costs a billion dollars, and 9 out of 10 potential medicines fail somewhere along the way despite massive investments. This makes well-grounded decision-making critical from the very start. Artificial intelligence (AI) has emerged as a crucial tool in the hunt for new medicines. AI can sift through vast amounts of existing data and predict which chemical compounds are most likely to become successful treatments. However, AI is only as good as the data it learns from, and gathering high-quality experimental data is both expensive and time-consuming. Researchers must therefore choose carefully which experiments to run next in order to improve the accuracy and reliability of their AI systems. This research focuses on developing intelligent systems that can determine which data to collect next. The goal is autonomous discovery platforms that can design medicines more efficiently, potentially cutting development time dramatically. Imagine a future where AI systems work around the clock to discover treatments for cancer, Alzheimer's, or other diseases, delivering them to patients faster.

Areas of Advance

Information and Communication Technology

Subject Categories (SSIF 2025)

Bioinformatics (Computational Biology)

Computer Sciences

Infrastructure

Chalmers e-Commons (incl. C3SE, 2020-)

DOI

10.63959/chalmers.dt/5792

ISBN

978-91-8103-335-9

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5792

Publisher

Chalmers

Online

More information

Last updated

2025-12-08