Achieving well-informed decision-making in drug discovery: a comprehensive calibration study using neural network-based structure-activity models
Journal article, 2025

In drug discovery, where experiments can be costly and time-consuming, computational models that predict drug-target interactions are valuable tools for accelerating the development of new therapeutic agents. Estimating the uncertainty inherent in these neural network predictions provides information that supports decision-making when risk assessment is crucial. However, such models can be poorly calibrated, which results in unreliable uncertainty estimates that do not reflect the true predictive uncertainty. In this study, we compare different metrics, including accuracy and calibration scores, used for model hyperparameter tuning to investigate which model selection strategy yields well-calibrated models. Furthermore, we propose a computationally efficient Bayesian uncertainty estimation method named HMC Bayesian Last Layer (HBLL), which generates Hamiltonian Monte Carlo (HMC) trajectories to obtain samples for the parameters of a Bayesian logistic regression fitted to the hidden layer of the baseline neural network. We report that this approach improves model calibration and matches the performance of common uncertainty quantification methods by combining the benefits of uncertainty estimation and probability calibration methods. Finally, we show that combining a post hoc calibration method with well-performing uncertainty quantification approaches can boost model accuracy and calibration.
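
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes random features standing in for the hidden-layer activations of a trained network, an isotropic Gaussian prior, a plain leapfrog HMC sampler, and a binary expected calibration error as the calibration score. All function names, the step size, the prior variance, and the synthetic data are hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def log_posterior_and_grad(w, X, y, prior_var=1.0):
    # Unnormalized log posterior of Bayesian logistic regression with an
    # isotropic Gaussian prior (assumed here), and its gradient w.r.t. w.
    logits = X @ w
    log_lik = np.sum(y * logits - np.logaddexp(0.0, logits))
    log_prior = -0.5 * np.dot(w, w) / prior_var
    grad = X.T @ (y - sigmoid(logits)) - w / prior_var
    return log_lik + log_prior, grad

def hmc_sample(X, y, n_samples=200, n_leapfrog=20, step_size=0.02, seed=0):
    # Plain HMC: leapfrog trajectories followed by a Metropolis correction.
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    logp, grad = log_posterior_and_grad(w, X, y)
    samples = []
    for _ in range(n_samples):
        p0 = rng.standard_normal(w.shape)
        w_new = w.copy()
        p = p0 + 0.5 * step_size * grad          # initial half step (momentum)
        for step in range(n_leapfrog):
            w_new = w_new + step_size * p        # full step (position)
            logp_new, grad_new = log_posterior_and_grad(w_new, X, y)
            if step < n_leapfrog - 1:
                p = p + step_size * grad_new     # full step (momentum)
        p = p + 0.5 * step_size * grad_new       # final half step (momentum)
        h_old = -logp + 0.5 * np.dot(p0, p0)
        h_new = -logp_new + 0.5 * np.dot(p, p)
        if np.log(rng.uniform()) < h_old - h_new:
            w, logp, grad = w_new, logp_new, grad_new
        samples.append(w.copy())
    return np.array(samples)

def predictive_prob(samples, X_new):
    # Average the sigmoid outputs over the sampled last-layer weights.
    return sigmoid(X_new @ samples.T).mean(axis=1)

def expected_calibration_error(y_true, y_prob, n_bins=10):
    # Binary ECE: weighted gap between the observed positive rate and the
    # mean predicted probability within equal-width probability bins.
    bins = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            ece += mask.mean() * abs(y_true[mask].mean() - y_prob[mask].mean())
    return ece

# Synthetic stand-in for hidden-layer features and binary activity labels.
rng = np.random.default_rng(1)
X = rng.standard_normal((300, 5))
w_true = np.array([2.0, -2.0, 0.5, 0.0, 0.0])
y = (rng.uniform(size=300) < sigmoid(X @ w_true)).astype(float)

samples = hmc_sample(X, y)
probs = predictive_prob(samples[50:], X)   # discard the first 50 draws as burn-in
ece = expected_calibration_error(y, probs)
print(f"ECE on the synthetic data: {ece:.3f}")
```

In this sketch only the last-layer weights are sampled, so the cost scales with the hidden-layer width rather than with the size of the full network, which is the kind of efficiency argument the abstract makes for a last-layer approach.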

Hamiltonian Monte Carlo sampling

Uncertainty estimation

Drug discovery

Probability calibration

Deep learning

Bayesian neural network

QSAR

Authors

Hannah Rosa Friesacher

KU Leuven

AstraZeneca AB

Ola Engkvist

Chalmers, Computer Science and Engineering, Data Science and AI

AstraZeneca AB

Lewis H. Mervin

AstraZeneca AB

Yves Moreau

KU Leuven

Ádám Arany

KU Leuven

Journal of Cheminformatics

1758-2946 (ISSN) 17582946 (eISSN)

Vol. 17, Issue 1, Article 29

Subject categories (SSIF 2025)

Probability Theory and Statistics

Computer Graphics and Computer Vision

Control Engineering

DOI

10.1186/s13321-025-00964-y

More information

Last updated

2025-03-21