Practically Significant Method Comparison Protocols for Machine Learning in Small Molecule Drug Discovery
Review article, 2025

Machine Learning (ML) methods that relate molecular structure to properties are frequently proposed as in silico surrogates for expensive or time-consuming experiments. In small molecule drug discovery, such methods inform high-stakes decisions like compound synthesis and in vivo studies. This application lies at the intersection of multiple scientific disciplines. When comparing new ML methods to baseline or state-of-the-art approaches, statistically rigorous method comparison protocols and domain-appropriate performance metrics are essential to ensure replicability and ultimately the adoption of ML in small molecule drug discovery. This paper proposes a set of guidelines to incentivize rigorous and domain-appropriate techniques for method comparison tailored to small molecule property modeling. These guidelines, accompanied by annotated examples using open-source software tools, lay a foundation for robust ML benchmarking and thus the development of more impactful methods.

Authors

Jeremy R. Ash

Johnson & Johnson Innovative Medicine

Cas Wognum

Recursion Pharmaceuticals

Valence Discovery

Raquel Rodríguez-Pérez

Novartis International AG

Matteo Aldeghi

Bayer AG

Alan C. Cheng

Merck & Co., Inc.

Djork Arné Clevert

Pfizer

Ola Engkvist

Chalmers, Computer Science and Engineering, Data Science and AI

AstraZeneca AB

Cheng Fang

Blueprint Medicines Corporation

Daniel J. Price

Nimbus Therapeutics

Jacqueline M. Hughes-Oliver

North Carolina State University

W. Patrick Walters

Relay Therapeutics

Journal of Chemical Information and Modeling

1549-9596 (ISSN), 1549-960X (eISSN)

Vol. 65, No. 18, pp. 9398-9411

Subject categories (SSIF 2025)

Bioinformatics (Computational Biology)

Computer Sciences

Signal Processing

DOI

10.1021/acs.jcim.5c01609

PubMed

40932128

More information

Last updated

2025-10-01