Sequential Decision-Making for Drug Design: Towards closed-loop drug design

Hampus Gummesson Svensson

Licentiate thesis, 2023

Drug design is a process of trial and error aimed at designing molecules with a desired response toward a biological target, with the ultimate goal of finding a new medication. An estimated 10^60 molecules are of potential interest as drugs, which makes finding suitable molecules a difficult problem. A crucial part of drug design is to design molecules and decide which of them should be experimentally tested to determine their activity toward the biological target. To experimentally test the properties of a molecule, it first has to be successfully made, which often requires a sequence of reactions to obtain the desired product. Machine learning can be used to predict the outcome of a reaction, helping to find successful reactions, but it requires data for the reaction type of interest. This thesis presents a work that combinatorially investigates the use of active learning to acquire the training data needed to reach a certain level of predictive ability in predicting whether a reaction is successful or not. However, often only a limited number of molecules can be synthesized at a time. Therefore, another line of work in this thesis investigates which designed molecules should be experimentally tested, given a budget of experiments, to sequentially acquire new knowledge. This is formulated as a multi-armed bandit problem, and we propose an algorithm to solve it. Recent advances in machine learning have also enabled the use of generative models to design novel molecules with certain predicted properties, providing potential drug molecules to choose from. Previous work has formulated this as a reinforcement learning problem, with success in designing and optimizing molecules with drug-like properties. This thesis presents a systematic comparison of different reinforcement learning algorithms for string-based generation of drug molecules, including a study of different ways of learning from previous and current batches of samples during the iterative generation.

de novo drug design

multi-armed bandits

active learning

reinforcement learning

reaction yield prediction

EE, EDIT Building Hörsalsvägen 11

Opponent: Prof. Alexandre Varnek, University of Strasbourg, France

Online defence

Author

Hampus Gummesson Svensson

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI


Using Active Learning to Develop Machine Learning Models for Reaction Yield Prediction

Molecular Informatics, Vol. 41 (2022)

Journal article

Autonomous Drug Design with Multi-Armed Bandits

Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022 (2022), p. 5584-5592

Paper in proceeding

H. Gummesson Svensson, C. Tyrchan, O. Engkvist, M. Haghir Chehreghani. Utilizing Reinforcement Learning for Drug Design

Areas of Advance

Information and Communication Technology

Health Engineering

Subject Categories (SSIF 2011)

Probability Theory and Statistics

Computer Science

Publisher

Chalmers