LAGOM: A transformer-based chemical language model for drug metabolite prediction
Journal article, 2025

Metabolite identification studies are an essential but costly and time-consuming component of drug development. Computational methods have the potential to accelerate early-stage drug discovery, and recent advances in deep learning offer new opportunities for metabolite prediction in particular. We present LAGOM (Language-model Assisted Generation Of Metabolites), a Transformer-based approach built upon the Chemformer architecture and designed to predict likely metabolic transformations of drug candidates. Our results show that LAGOM performs competitively with, and in some cases surpasses, existing state-of-the-art metabolite prediction tools, demonstrating the potential of language-model-based architectures in chemoinformatics. By integrating diverse data sources and employing data augmentation strategies, we further improve the model's generalisation and predictive accuracy. The implementation of LAGOM is publicly available at github.com/tsofiac/LAGOM.

Language models

Artificial intelligence

Transformers

Deep learning

Drug metabolism

Drug discovery

Authors

Sofia Larsson

AstraZeneca AB

University of Gothenburg

Chalmers, Computer Science and Engineering, Data Science and AI

Miranda Carlsson

Student at Chalmers

AstraZeneca AB

Richard Beckmann

Chalmers, Computer Science and Engineering, Data Science and AI

University of Gothenburg

Filip Miljković

AstraZeneca AB

Rocio Mercado

Chalmers, Computer Science and Engineering, Data Science and AI

University of Gothenburg

Artificial Intelligence in the Life Sciences

2667-3185 (eISSN)

Vol. 8 100142

Subject Categories (SSIF 2025)

Bioinformatics (Computational Biology)

DOI

10.1016/j.ailsci.2025.100142

Related datasets

LAGOM: Language-model-Assisted Generation Of Metabolites [dataset]

URI: https://github.com/tsofiac/LAGOM

More information

Latest update

2025-09-25