Optimal protocols for continual learning via statistical physics and control theory
Journal article, 2025

Artificial neural networks often struggle with catastrophic forgetting when learning multiple tasks sequentially, as training on new tasks degrades performance on previously learned ones. Recent theoretical work has addressed this issue by analysing learning curves in synthetic frameworks under predefined training protocols. However, these protocols rely on heuristics and lack a solid theoretical foundation for assessing their optimality. In this paper, we fill this gap by combining exact equations for training dynamics, derived using statistical physics techniques, with optimal control methods. We apply this approach to teacher-student models for continual learning and multi-task problems, obtaining a theory for task-selection protocols that maximise performance while minimising forgetting. Our theoretical analysis offers nontrivial yet interpretable strategies for mitigating catastrophic forgetting, shedding light on how optimal learning protocols modulate established effects, such as the influence of task similarity on forgetting. Finally, we validate our theoretical findings with experiments on real-world data.
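To illustrate the forgetting effect the abstract describes, below is a minimal sketch of a teacher-student continual learning setup: a linear student trained with online SGD on two correlated linear teachers in sequence. The dimension, learning rate, step counts, and the `overlap` parameter controlling task similarity are illustrative assumptions, not the paper's exact model or its optimal control protocol.

```python
# Sketch of catastrophic forgetting in a linear teacher-student setup.
# All parameters (d, overlap, lr, steps) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d = 100        # input dimension
overlap = 0.5  # similarity between the two teacher tasks

# Two correlated linear teachers: w2 = overlap*w1 + sqrt(1-overlap^2)*noise
w1 = rng.standard_normal(d) / np.sqrt(d)
noise = rng.standard_normal(d) / np.sqrt(d)
w2 = overlap * w1 + np.sqrt(1 - overlap**2) * noise

def online_sgd(w_student, teacher, steps=3000, lr=0.01):
    """Online SGD on fresh Gaussian inputs, labels given by `teacher`."""
    for _ in range(steps):
        x = rng.standard_normal(d)
        err = w_student @ x - teacher @ x
        w_student -= lr * err * x
    return w_student

def gen_error(w_student, teacher):
    """Generalisation error of a linear student on Gaussian inputs."""
    return 0.5 * np.sum((w_student - teacher) ** 2)

w = np.zeros(d)
w = online_sgd(w, w1)        # train on task 1
e1_before = gen_error(w, w1)
w = online_sgd(w, w2)        # then train on task 2
e1_after = gen_error(w, w1)  # forgetting: task-1 error has grown
print(e1_before, e1_after)
```

Training on task 2 pulls the student away from teacher 1, so `e1_after` exceeds `e1_before`; the gap shrinks as `overlap` approaches 1, echoing the task-similarity effect mentioned in the abstract. The paper's contribution is to choose the task-ordering protocol optimally rather than, as here, training on each task in a fixed block.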

online dynamics

machine learning

learning theory

Authors

Francesco Mori

University of Oxford

Stefano Sarao Mannelli

University of the Witwatersrand

University of Gothenburg

Data Science and AI 3

Francesca Mignacco

Princeton University

City University of New York (CUNY)

Journal of Statistical Mechanics: Theory and Experiment

1742-5468 (eISSN)

Vol. 2025, Issue 8, 084004

Subject categories (SSIF 2025)

Computer science

Other physics

DOI

10.1088/1742-5468/adf296

More information

Last updated

2025-08-27