Intelligent data acquisition for drug design through combinatorial library design
Licentiate thesis, 2023
need for standardized data. Methods and interest exist for producing new data
but due to material and budget constraints it is desirable that each iteration of
producing data is as efficient as possible. In this thesis, we present two papers
detailing methods for different problems of selecting which data to produce. We
investigate Active Learning for models that use the margin in model decisiveness
as a measure of model uncertainty to guide data acquisition. We demonstrate that
the models perform better with Active Learning than with random acquisition
of data, regardless of the machine learning model and starting knowledge. We
also study the multi-objective optimization problem of combinatorial library
design. Here we present a framework that could process the output of generative
models for molecular design and give an optimized library design. The
results show that the framework successfully optimizes a library based on
molecule availability, which the framework also attempts to identify using
retrosynthesis prediction. We conclude that the next step in intelligent data
acquisition is to combine the two methods and create a library design model
that uses the information from previous libraries to guide subsequent designs.
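As an illustration of the margin-based acquisition described in the abstract, the following is a minimal sketch, not the thesis implementation. It assumes the model outputs class probabilities for each candidate experiment and that the margin is the difference between the two most probable classes; the names `margin_scores` and `select_batch` and the example probabilities are hypothetical.

```python
import numpy as np

def margin_scores(probs):
    """Margin between the two most probable classes for each sample.

    A small margin means the model is indecisive about that sample.
    (Assumed definition; the abstract does not spell out the formula.)
    """
    ordered = np.sort(probs, axis=1)
    return ordered[:, -1] - ordered[:, -2]

def select_batch(probs, batch_size):
    """Return indices of the batch_size samples with the smallest margin."""
    margins = margin_scores(probs)
    return np.argsort(margins, kind="stable")[:batch_size]

# Hypothetical predicted probabilities for four unlabeled candidates.
probs = np.array([
    [0.90, 0.10],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
])

picked = select_batch(probs, batch_size=2)
print(picked)
```

In an Active Learning loop, the selected candidates would be labeled (for example, by running the experiments), added to the training set, and the model retrained before the next selection round.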
determinantal point processes
generative models
machine learning
drug discovery
active learning
Cheminformatics
Author
Simon Johansson
Chalmers, Computer Science and Engineering, Data Science and AI
Using Active Learning to Develop Machine Learning Models for Reaction Yield Prediction
Molecular Informatics, Vol. In Press (2022)
Journal article
Johansson, S.V., Chehreghani, M.H., Engkvist, O., Schliep, A., de novo generated combinatorial library design
Areas of Advance
Information and Communication Technology
Health Engineering
Subject Categories
Design
Other Chemistry Topics
Computer Science
Publisher
Chalmers