DTA-GNN: a toolkit for constructing target-specific drug–target affinity datasets and training graph neural networks
Journal article, 2026

Drug–target affinity (DTA) prediction is a key task in computational drug discovery, yet current research is often compromised by data leakage and non-reproducible preprocessing. We present DTA-GNN, an end-to-end Python toolkit that automates the rigorous construction of target-specific datasets and streamlines the training of Graph Neural Network (GNN) based DTA predictors. To address data validity, the toolkit’s dataset construction pipeline handles ChEMBL data ingestion and unit standardization, and implements scaffold- and temporal-splitting strategies to prevent overestimation of performance. Integrated leakage audits quantify split integrity prior to modeling. Following dataset construction, DTA-GNN provides a modular trainer that supports ten state-of-the-art GNN architectures and includes built-in hyperparameter optimization. In addition, DTA-GNN supports latent-space analysis, either by extracting learned molecular embeddings or by leveraging molecular fingerprints, and provides interactive visualizations to explore chemical space and interpret model behavior. By unifying robust dataset construction with accessible model training and latent-space analysis via a Python library, a CLI, and a Web UI, DTA-GNN enables researchers to produce standardized, reproducible, and leakage-free DTA benchmarks.
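The dataset-construction steps named in the abstract (unit standardization, temporal splitting, and a leakage audit) can be illustrated with a minimal, standard-library sketch. This is not DTA-GNN's API: every function name, the record layout, and the toy measurements below are hypothetical, chosen only to show the general idea. Scaffold splitting is left out because it would need a cheminformatics library to compute scaffolds.

```python
import math

# Hypothetical unit table: multipliers that convert a concentration to molar.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def to_p_affinity(value, unit):
    """Convert a concentration (e.g. IC50, Kd, Ki) to -log10(molar)."""
    if value <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(value * UNIT_TO_MOLAR[unit])


def temporal_split(records, cutoff_year):
    """Train on records dated before cutoff_year, test on the rest."""
    train = [r for r in records if r["year"] < cutoff_year]
    test = [r for r in records if r["year"] >= cutoff_year]
    return train, test


def compound_overlap(train, test):
    """Fraction of distinct test compounds that also occur in train."""
    train_ids = {r["compound"] for r in train}
    test_ids = {r["compound"] for r in test}
    if not test_ids:
        return 0.0
    return len(train_ids & test_ids) / len(test_ids)


# Toy records (invented for illustration, not real ChEMBL data).
records = [
    {"compound": "C1", "value": 10.0, "unit": "nM", "year": 2015},
    {"compound": "C2", "value": 1.0, "unit": "uM", "year": 2018},
    {"compound": "C1", "value": 12.0, "unit": "nM", "year": 2021},
    {"compound": "C3", "value": 500.0, "unit": "pM", "year": 2022},
]

# Standardize every measurement onto a common negative-log molar scale.
for r in records:
    r["p_affinity"] = to_p_affinity(r["value"], r["unit"])

train, test = temporal_split(records, cutoff_year=2020)
leak = compound_overlap(train, test)
print(f"train={len(train)} test={len(test)} compound overlap={leak:.2f}")
```

In this toy example the temporal split alone does not keep compound C1 out of the test set, which is the kind of issue a leakage audit is meant to surface before any model is trained.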

Data leakage

Graph neural networks

Cheminformatics

Reproducible research

Drug–target binding affinity prediction

Authors

Gökhan Özsari

Middle East Technical University (METU)

Chalmers, Physics, E-commons

Ahmet Süreyya Rifaioğlu

University Hospital Heidelberg

Aybar Can Acar

Middle East Technical University (METU)

Tunca Doğan

Hacettepe University

M. Volkan Atalay

Loyola University of Chicago

SoftwareX

2352-7110 (eISSN)

Vol. 34 102671

Subject Categories (SSIF 2025)

Bioinformatics (Computational Biology)

Computer Sciences

DOI

10.1016/j.softx.2026.102671

More information

Latest update

4/30/2026