QSARtuna: An Automated QSAR Modeling Platform for Molecular Property Prediction in Drug Design

Lewis H. Mervin; Alexey Voronov; Mikhail Kabeshov; Ola Engkvist

doi:10.1021/acs.jcim.4c00457

Journal article, 2024

Machine-learning (ML) and deep-learning (DL) approaches are increasingly deployed within the design-make-test-analyze (DMTA) drug design cycle to predict molecular properties of small molecules. Despite this uptake, only a few automated packages aid their development and deployment while also supporting uncertainty estimation, model explainability, and other key aspects of model usage. This represents a key unmet need within the field, and the large number of molecular representations and algorithms (and associated parameters) means it is nontrivial to robustly optimize, evaluate, reproduce, and deploy models. Here, we present QSARtuna, a molecular property prediction modeling pipeline, written in Python and utilizing the Optuna, Scikit-learn, RDKit, and ChemProp packages, which enables efficient and automated comparison between molecular representations and machine learning models. The platform was designed from the outset with model uncertainty quantification and explainability in mind. We describe the framework and provide illustrative examples that demonstrate the capability of the software when applied to simple molecular property prediction, reaction/reactivity prediction, and DNA-encoded library enrichment classification. We hope that the release of QSARtuna will further spur innovation in automatic ML modeling and provide a platform for education in best practices for molecular property modeling. The code for the QSARtuna framework is made freely available via GitHub.

Authors

Lewis H. Mervin

AstraZeneca AB

Alexey Voronov

AstraZeneca AB

Mikhail Kabeshov

AstraZeneca AB

Ola Engkvist

Chalmers, Computer Science and Engineering

AstraZeneca AB

University of Gothenburg

Journal of Chemical Information and Modeling

1549-9596 (ISSN), 1549-960X (eISSN)

Vol. 64, Issue 14, pp. 5365-5374

Subject categories (SSIF 2011)

Other Computer and Information Science

Pharmaceutical Sciences

Medicinal Chemistry

DOI

10.1021/acs.jcim.4c00457

More information

Last updated

2024-08-03
