Evaluation guidelines for machine learning tools in the chemical sciences
Journal article, 2022

Machine learning (ML) promises to tackle the grand challenges in chemistry and speed up the generation, improvement and/or ordering of research hypotheses. Despite the overarching applicability of ML workflows, one usually finds diverse evaluation study designs. The current heterogeneity in evaluation techniques and metrics leads to difficulty in (or the impossibility of) comparing and assessing the relevance of new algorithms. Ultimately, this may delay the digitalization of chemistry at scale and confuse method developers, experimentalists, reviewers and journal editors. In this Perspective, we critically discuss a set of method development and evaluation guidelines for different types of ML-based publications, emphasizing supervised learning. We provide a diverse collection of examples from various authors and disciplines in chemistry. While taking into account varying accessibility across research groups, our recommendations focus on reporting completeness and standardizing comparisons between tools. We aim to further contribute to improved ML transparency and credibility by suggesting a checklist of retro-/prospective tests and dissecting their importance. We envisage that the wide adoption and continuous update of best practices will encourage an informed use of ML on real-world problems related to the chemical sciences.

Authors

Andreas Bender

University of Cambridge

Nadine Schneider

Novartis International AG

Marwin Segler

Microsoft Research

W. Patrick Walters

Relay Therapeutics

Ola Engkvist

AstraZeneca AB

Chalmers, Computer Science and Engineering (Chalmers)

Tiago Rodrigues

University of Lisbon

Nature Reviews Chemistry

2397-3358 (eISSN)

Vol. 6, Issue 6, pp. 428-442

Subject Categories

Media and Communication Technology

Information Studies

Information Science

DOI

10.1038/s41570-022-00391-9

More information

Latest update

3/7/2024