Registries in Machine Learning-Based Drug Discovery: A Shortcut to Code Reuse
Paper in proceeding, 2025

Computer-aided drug discovery gradually builds on previous work and requires reusable code to advance research. Currently, research code is mainly used to provide further insights into the original research, whilst code reuse has a lower priority. Modularity, the segmentation of code into independent modules, promotes good coding practices and code reuse. The registry pattern has been proposed as a way to call functionalities dynamically, but it is currently overlooked as a shortcut to promote code reuse. In this work, we expand the registry pattern to better suit computer-aided drug discovery and achieve a unified, reusable, and interchangeable interface with optional meta information. Our reformulated pattern is particularly suitable for collaborative research with standardized frameworks, where multiple internal and external modules are used interchangeably and coding prioritizes fast iteration over code with low technical debt, such as in machine learning-based research for drug discovery. We exemplify the usage of the design patterns in a workflow. Additionally, we provide two case studies in which we 1) showcase the effectiveness of registration in a larger collaborative research group, and 2) give an overview of the potential of registration in currently available open-source tools. Finally, we empirically evaluate the registry pattern through previous implementations and indicate where additional functionality can improve its use.
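
To make the idea of calling functionality dynamically through a registry more concrete, the following is a minimal Python sketch of a generic registry with optional meta information. It is not the implementation from the paper: the `Registry` class, its method names, the `featurizers` instance, and the two toy functions operating on SMILES strings are all hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, List


class Registry:
    """Hypothetical registry: maps string keys to callables plus optional metadata."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._entries: Dict[str, Callable[..., Any]] = {}
        self._meta: Dict[str, Dict[str, Any]] = {}

    def register(self, key: str, **meta: Any) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Return a decorator that stores the decorated callable under `key`."""

        def decorator(obj: Callable[..., Any]) -> Callable[..., Any]:
            if key in self._entries:
                raise KeyError(f"'{key}' is already registered in '{self.name}'")
            self._entries[key] = obj
            self._meta[key] = dict(meta)  # metadata is optional and may be empty
            return obj  # the callable itself is left unchanged

        return decorator

    def get(self, key: str) -> Callable[..., Any]:
        """Look up a registered callable by its key."""
        if key not in self._entries:
            raise KeyError(f"'{key}' not found in '{self.name}'; available: {self.keys()}")
        return self._entries[key]

    def meta(self, key: str) -> Dict[str, Any]:
        """Return a copy of the metadata stored for `key`."""
        self.get(key)  # raises a descriptive KeyError for unknown keys
        return dict(self._meta[key])

    def keys(self) -> List[str]:
        """List all registered keys in sorted order."""
        return sorted(self._entries)


# Assumed example registry; the entries are toy placeholders, not real featurizers.
featurizers = Registry("featurizers")


@featurizers.register("char_count", description="Number of characters in a SMILES string")
def char_count(smiles: str) -> int:
    return len(smiles)


@featurizers.register("ring_closure_digits")
def ring_closure_digits(smiles: str) -> int:
    return sum(ch.isdigit() for ch in smiles)


# The module to run is selected by a string, e.g. read from a configuration file.
selected = "char_count"
featurize = featurizers.get(selected)
print(featurize("c1ccccc1"))  # 8
```

Because every entry is reached through the same `get` call, swapping one module for another under these assumptions only requires changing the key string, which is the kind of interchangeability the abstract describes.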

modularity

machine learning

design pattern

code reuse

drug discovery

registration

Author

Peter B.R. Hartog

AstraZeneca AB

Helmholtz Munich-Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH)

Emma Svensson

Johannes Kepler University of Linz (JKU)

AstraZeneca AB

Lewis H. Mervin

AstraZeneca AB

Samuel Genheden

AstraZeneca AB

Ola Engkvist

AstraZeneca AB

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Igor V. Tetko

Helmholtz Munich-Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

0302-9743 (ISSN) 1611-3349 (eISSN)

Vol. 14894 LNCS 98-115
9783031723803 (ISBN)

1st International Workshop on AI in Drug Discovery, AIDD 2024, held as a part of the 33rd International Conference on Artificial Neural Networks, ICANN 2024
Lugano, Switzerland

Subject Categories

Computer Science

DOI

10.1007/978-3-031-72381-0_9

More information

Latest update

10/7/2024