A methodology to correctly assess the applicability domain of cell membrane permeability predictors for cyclic peptides

Gökçe Geylan; Leonardo De Maria; Ola Engkvist; Florian David; Ulf Norinder

doi:10.1039/d4dd00056k

Journal article, 2024

Being able to predict the cell permeability of cyclic peptides is essential for unlocking their potential as a drug modality for intracellular targets. With a wide range of studies of cell permeability but a limited number of data points, the reliability of machine learning (ML) models when predicting in previously unexplored regions of chemical space becomes a challenge. In this work, we systematically investigate the predictive capability of ML models from the perspective of their extrapolation to never-before-seen applicability domains, with a particular focus on the permeability task. Four predictive algorithms, namely Support-Vector Machine, Random Forest, LightGBM and XGBoost, were employed jointly with a conformal prediction framework to characterize and evaluate the applicability through uncertainty quantification. The efficiency and validity of the models' predictions under multiple calibration strategies were assessed in a set of experiments on several external datasets from different parts of chemical space. The experiments showed that predictors that generalize well to the applicability domain defined by the training data can fail to achieve similar performance in other parts of chemical space. Our study proposes an approach to overcome such limitations by improving the efficiency of the models without sacrificing validity. The trade-off between reliability and informativeness was balanced when the models were calibrated with a subset of the data from the new targeted domain. This study outlines an approach to enable the extrapolation of predictive power and restore the models' reliability via a recalibration strategy, without the need to retrain the underlying model.
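
The recalibration idea in the abstract can be illustrated with a minimal sketch of class-conditional (Mondrian) inductive conformal prediction. This is not the authors' implementation: the function names, the nonconformity score (one minus the predicted class probability), and the definition of efficiency as the fraction of single-label prediction sets are assumptions chosen for illustration. The underlying classifier is represented only by its predicted class probabilities, so any of the four algorithms named above could supply them.

```python
import numpy as np

def conformal_p_values(cal_probs, cal_labels, test_probs):
    """Class-conditional inductive conformal p-values (illustrative sketch).

    cal_probs:  (n_cal, n_classes) predicted probabilities on the calibration set
    cal_labels: (n_cal,) integer true labels of the calibration set
    test_probs: (n_test, n_classes) predicted probabilities on the test set
    Returns an (n_test, n_classes) array of p-values.
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    test_probs = np.asarray(test_probs, dtype=float)
    p_values = np.zeros_like(test_probs)
    for c in range(cal_probs.shape[1]):
        # Assumed nonconformity score: 1 - probability given to class c.
        cal_scores = 1.0 - cal_probs[cal_labels == c, c]
        test_scores = 1.0 - test_probs[:, c]
        # Count calibration examples at least as nonconforming as each test example.
        n_ge = (cal_scores[None, :] >= test_scores[:, None]).sum(axis=1)
        p_values[:, c] = (n_ge + 1) / (len(cal_scores) + 1)
    return p_values

def validity_and_efficiency(p_values, labels, significance=0.2):
    """Validity: fraction of prediction sets containing the true label.
    Efficiency (assumed definition): fraction of single-label prediction sets."""
    labels = np.asarray(labels)
    pred_sets = np.asarray(p_values) > significance
    validity = pred_sets[np.arange(len(labels)), labels].mean()
    efficiency = (pred_sets.sum(axis=1) == 1).mean()
    return validity, efficiency
```

Under this sketch, recalibrating for a new domain means calling `conformal_p_values` again with `cal_probs` and `cal_labels` taken from a subset of the new domain's data, while the probabilities still come from the unchanged, already trained model.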

Authors

Gökçe Geylan

AstraZeneca AB

Chalmers, Life Sciences, Systems Biology

Leonardo De Maria

AstraZeneca AB

Ola Engkvist

Chalmers, Computer Science and Engineering

AstraZeneca AB

Florian David

Chalmers, Life Sciences, Systems Biology

Ulf Norinder

Örebro University

Uppsala University

Stockholm University

Digital Discovery

2635-098X (eISSN)

Vol. 3, Issue 9, pp. 1761-1775

Subject categories (SSIF 2011)

Biological Sciences

Chemical Sciences

DOI

10.1039/d4dd00056k

More information

Last updated

2024-10-07
