Diverse Mini-Batch Selection in Reinforcement Learning for Efficient Chemical Exploration in de novo Drug Design
Preprint, 2025

In many real-world applications, evaluating the quality of an instance, e.g., through human feedback or physics simulations, is costly and time-consuming compared with proposing new instances. This cost is especially critical in reinforcement learning, which relies on interactions with the environment (i.e., new instances) that must be evaluated to provide a reward signal for learning. At the same time, sufficient exploration is crucial in reinforcement learning for finding high-reward solutions: the agent should observe and learn from a diverse set of experiences to find different solutions. We therefore argue that learning from a diverse mini-batch of experiences can have a large impact on exploration and help mitigate mode collapse. In this paper, we introduce mini-batch diversification for reinforcement learning and study this framework in the context of a real-world problem, namely drug discovery. We extensively evaluate how the proposed framework can enhance the effectiveness of chemical exploration in de novo drug design, where finding diverse and high-quality solutions is crucial. Our experiments demonstrate that diverse mini-batch selection can substantially increase the diversity of solutions while maintaining their quality. In drug discovery, such an outcome can potentially help fulfill unmet medical needs faster.

Diversity

de novo Drug Design

Reinforcement Learning

Authors

Hampus Gummesson Svensson

Chalmers, Computer Science and Engineering, Data Science and AI

AstraZeneca AB

Ola Engkvist

Chalmers, Computer Science and Engineering, Data Science and AI

AstraZeneca AB

Jon Paul Janet

AstraZeneca AB

Christian Tyrchan

AstraZeneca AB

Morteza Haghir Chehreghani

Data Science and AI 2

Areas of Advance

Information and Communication Technology

Health Engineering

Subject Categories (SSIF 2025)

Computer Sciences

Pharmaceutical and Medical Biotechnology

Artificial Intelligence

Medical Biotechnology

Infrastructure

Chalmers e-Commons (incl. C3SE, 2020-)

DOI

10.48550/arXiv.2506.21158

More information

Last updated

2025-11-11