Optimal protocols for continual learning via statistical physics and control theory
Journal article, 2025

Artificial neural networks often struggle with catastrophic forgetting when learning multiple tasks sequentially, as training on new tasks degrades performance on previously learned tasks. Recent theoretical work has addressed this issue by analysing learning curves in synthetic frameworks under predefined training protocols. However, these protocols rely on heuristics and lack a solid theoretical foundation for assessing their optimality. In this paper, we fill this gap by combining exact equations for training dynamics, derived using statistical physics techniques, with optimal control methods. We apply this approach to teacher-student models for continual learning and multi-task problems, obtaining a theory for task-selection protocols that maximise performance while minimising forgetting. Our theoretical analysis offers nontrivial yet interpretable strategies for mitigating catastrophic forgetting, shedding light on how optimal learning protocols modulate established effects, such as the influence of task similarity on forgetting. Finally, we validate our theoretical findings with experiments on real-world data.
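To make the setting concrete, the following toy sketch simulates online SGD in a linear teacher-student setup with two correlated teachers, comparing a sequential protocol (which exhibits catastrophic forgetting) against a simple greedy task-selection heuristic. This is an illustrative assumption, not the optimal-control protocol derived in the paper; all parameter values (dimension, learning rate, task overlap) are chosen for demonstration only.

```python
import numpy as np

# Toy sketch (assumed setup, not the paper's derived protocol):
# a student vector w learns two linear "teacher" tasks from streaming
# Gaussian inputs, and a protocol decides which task to train on each step.

rng = np.random.default_rng(0)
d = 100          # input dimension (assumed)
lr = 1.0 / d     # SGD learning rate, scaled so updates contract
steps = 2000     # total number of online samples

# Two unit-norm teacher vectors with tunable similarity (overlap).
overlap = 0.5                                  # task similarity (assumed)
t1 = rng.standard_normal(d)
t1 /= np.linalg.norm(t1)
u = rng.standard_normal(d)
u -= (u @ t1) * t1                             # orthogonalise against t1
u /= np.linalg.norm(u)
t2 = overlap * t1 + np.sqrt(1.0 - overlap**2) * u

def run(protocol):
    """Online SGD on squared loss; `protocol(step, errors)` picks the task."""
    w = np.zeros(d)
    for step in range(steps):
        errs = [np.sum((w - t) ** 2) for t in (t1, t2)]  # expected per-task errors
        k = protocol(step, errs)
        t = t1 if k == 0 else t2
        x = rng.standard_normal(d)                       # fresh Gaussian input
        w -= lr * (w @ x - t @ x) * x                    # one online SGD step
    return [np.sum((w - t) ** 2) for t in (t1, t2)]

# Greedy heuristic: always train on the currently worse task.
greedy_err = run(lambda step, errs: int(errs[1] > errs[0]))
# Sequential baseline: task 1 for the first half, then task 2 only.
seq_err = run(lambda step, errs: int(step >= steps // 2))

print("greedy     errors (task 1, task 2):", greedy_err)
print("sequential errors (task 1, task 2):", seq_err)
```

Under the sequential protocol the student converges to the second teacher and forgets the first, whereas interleaving keeps both errors moderate, illustrating the kind of trade-off that the task-selection protocols in the paper optimise exactly.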

Keywords: online dynamics, machine learning, learning theory

Authors

Francesco Mori (University of Oxford)

Stefano Sarao Mannelli (University of the Witwatersrand; University of Gothenburg, Data Science and AI)

Francesca Mignacco (Princeton University; City University of New York)

Journal of Statistical Mechanics: Theory and Experiment

1742-5468 (eISSN)

Vol. 2025, Issue 8, 084004

Subject Categories (SSIF 2025)

Computer Sciences

Other Physics Topics

DOI

10.1088/1742-5468/adf296

Latest update

8/27/2025