DTA-GNN: a toolkit for constructing target-specific drug–target affinity datasets and training graph neural networks
Journal article, 2026

Drug–target affinity (DTA) prediction is a key task in computational drug discovery, yet current research is often compromised by data leakage and non-reproducible preprocessing. We present DTA-GNN, an end-to-end Python toolkit that automates the rigorous construction of target-specific datasets and streamlines the training of graph neural network (GNN)-based DTA predictors. To address data validity, the toolkit’s dataset construction pipeline handles ChEMBL data ingestion and unit standardization, and implements scaffold- and temporal-splitting strategies to prevent overestimation of performance. Integrated leakage audits quantify split integrity prior to modeling. Following dataset construction, DTA-GNN provides a modular trainer that supports ten state-of-the-art GNN architectures and includes built-in hyperparameter optimization. In addition, DTA-GNN supports latent-space analysis, either by extracting learned molecular embeddings or by leveraging molecular fingerprints, and provides interactive visualizations to explore chemical space and interpret model behavior. By unifying robust dataset construction with accessible model training and latent-space analysis via a Python library, CLI, and web UI, DTA-GNN enables researchers to produce standardized, reproducible, and leakage-free DTA benchmarks.
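
The scaffold-splitting and leakage-audit ideas mentioned in the abstract can be illustrated with a minimal sketch in plain Python. This is not DTA-GNN's API: the function names `group_split` and `leakage_audit`, the record fields, and the toy data are all hypothetical, and scaffold labels are assumed to be precomputed (in practice they would typically be derived from the molecule, for example as Murcko scaffolds). The sketch only shows the principle that whole scaffold groups are assigned to one side of the split, and that an audit measures how many test records share a scaffold with the training set.

```python
from collections import defaultdict


def group_split(records, key, test_fraction=0.2):
    """Assign whole groups (e.g. scaffolds) to train or test, never splitting a group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)
    # Visit the largest groups first so that rarer groups tend to land in test.
    ordered = sorted(groups.values(), key=len, reverse=True)
    train_cap = (1.0 - test_fraction) * len(records)
    train, test = [], []
    for members in ordered:
        if len(train) + len(members) <= train_cap:
            train.extend(members)
        else:
            test.extend(members)
    return train, test


def leakage_audit(train, test, key):
    """Return the fraction of test records whose key value also occurs in train."""
    if not test:
        return 0.0
    train_keys = {rec[key] for rec in train}
    leaked = sum(1 for rec in test if rec[key] in train_keys)
    return leaked / len(test)


# Hypothetical toy data: ten compounds with precomputed scaffold labels.
scaffold_labels = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "D"]
records = [{"id": i, "scaffold": s} for i, s in enumerate(scaffold_labels)]

# Scaffold-aware split: no scaffold appears on both sides.
train, test = group_split(records, key="scaffold", test_fraction=0.2)
scaffold_leak = leakage_audit(train, test, key="scaffold")

# Naive positional split for comparison: scaffold "C" ends up on both sides.
naive_train, naive_test = records[:8], records[8:]
naive_leak = leakage_audit(naive_train, naive_test, key="scaffold")

print(f"scaffold split leakage: {scaffold_leak:.2f}")
print(f"naive split leakage:    {naive_leak:.2f}")
```

The same `leakage_audit` helper could be applied with a different key, such as a compound identifier, to check for exact duplicates across the split. A temporal split would instead partition the records by a publication-year cutoff.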

Data leakage

Graph neural networks

Cheminformatics

Reproducible research

Drug–target binding affinity prediction

Authors

Gökhan Özsari

Orta Doğu Teknik Üniversitesi

Chalmers, Fysik, E-commons

Ahmet Süreyya Rifaioğlu

Universitätsklinikum Heidelberg

Aybar Can Acar

Orta Doğu Teknik Üniversitesi

Tunca Doğan

Hacettepe Üniversitesi

M. Volkan Atalay

Loyola University of Chicago

SoftwareX

2352-7110 (eISSN)

Vol. 34 102671

Subject categories (SSIF 2025)

Bioinformatics (Computational Biology)

Computer Sciences

DOI

10.1016/j.softx.2026.102671

More information

Last updated

2026-04-30