A methodology to correctly assess the applicability domain of cell membrane permeability predictors for cyclic peptides
Journal article, 2024

Being able to predict the cell permeability of cyclic peptides is essential for unlocking their potential as a drug modality for intracellular targets. With a wide range of studies of cell permeability but a limited number of data points, the reliability of machine learning (ML) models in predicting previously unexplored regions of chemical space becomes a challenge. In this work, we systematically investigate the predictive capability of ML models from the perspective of their extrapolation to never-before-seen applicability domains, with a particular focus on the permeability task. Four predictive algorithms, namely Support-Vector Machine, Random Forest, LightGBM and XGBoost, were employed jointly with a conformal prediction framework to characterize and evaluate applicability through uncertainty quantification. The efficiency and validity of the models' predictions under multiple calibration strategies were assessed on several external datasets from different parts of the chemical space through a set of experiments. The experiments showed that predictors that generalize well to the applicability domain defined by the training data can fail to achieve similar performance on other parts of the chemical space. Our study proposes an approach to overcome such limitations by improving the efficiency of models without sacrificing validity. The trade-off between reliability and informativeness was balanced when the models were calibrated with a subset of the data from the new targeted domain. This study outlines an approach to enable the extrapolation of predictive power and restore the models' reliability via a recalibration strategy without the need for retraining the underlying model.
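
The recalibration idea described in the abstract can be illustrated with a minimal sketch of an inductive conformal classifier. This is not the authors' implementation: the nonconformity score (one minus the predicted probability of the label), the class-conditional calibration, and all function names are assumptions made for illustration. The underlying model is represented only by its predicted probability of the "permeable" class (label 1), so it is never retrained here.

```python
from typing import Dict, List, Sequence, Set

# Calibration scores kept per class (class-conditional calibration is an assumption).
Calibration = Dict[int, List[float]]


def nonconformity(p_permeable: float, label: int) -> float:
    """Assumed score: 1 minus the probability the model assigns to `label`."""
    p_label = p_permeable if label == 1 else 1.0 - p_permeable
    return 1.0 - p_label


def calibrate(probs: Sequence[float], labels: Sequence[int]) -> Calibration:
    """Collect nonconformity scores of labelled calibration samples, per class."""
    scores: Calibration = {0: [], 1: []}
    for p, y in zip(probs, labels):
        scores[y].append(nonconformity(p, y))
    return scores


def p_value(cal_scores: Sequence[float], score: float) -> float:
    """Fraction of calibration scores at least as large as `score` (with +1 smoothing)."""
    at_least = sum(1 for s in cal_scores if s >= score)
    return (at_least + 1) / (len(cal_scores) + 1)


def predict_set(cal: Calibration, p_permeable: float, significance: float) -> Set[int]:
    """Return every label whose p-value exceeds the significance level."""
    return {
        label
        for label in (0, 1)
        if p_value(cal[label], nonconformity(p_permeable, label)) > significance
    }


def validity(sets: Sequence[Set[int]], labels: Sequence[int]) -> float:
    """Share of prediction sets that contain the true label."""
    return sum(1 for s, y in zip(sets, labels) if y in s) / len(labels)


def efficiency(sets: Sequence[Set[int]]) -> float:
    """Share of prediction sets that contain exactly one label."""
    return sum(1 for s in sets if len(s) == 1) / len(sets)
```

In this sketch, recalibrating for a new domain amounts to calling `calibrate` again on the model's predicted probabilities for a labelled subset of the new domain and passing the resulting scores to `predict_set`. Validity and efficiency can then be compared between the two calibration sets on the same external data.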

Authors

Gökçe Geylan

AstraZeneca AB

Chalmers, Life Sciences, Systems and Synthetic Biology

Leonardo De Maria

AstraZeneca AB

Ola Engkvist

Chalmers, Computer Science and Engineering (Chalmers)

AstraZeneca AB

Florian David

Chalmers, Life Sciences, Systems and Synthetic Biology

Ulf Norinder

Örebro University

Uppsala University

Stockholm University

Digital Discovery

2635098X (eISSN)

Vol. 3, Issue 9, p. 1761-1775

Subject Categories

Biological Sciences

Chemical Sciences

DOI

10.1039/d4dd00056k

More information

Latest update

10/7/2024