Exploration and generalization in deep learning with SwitchPath activations
Journal article, 2025

This work provides a comprehensive theoretical and empirical analysis of SwitchPath, a stochastic activation function that improves learning dynamics by probabilistically toggling between a neuron's standard activation and its negation. We develop theoretical foundations and demonstrate its impact across multiple scenarios. By maintaining gradient flow and injecting controlled stochasticity, the method improves generalization, uncertainty estimation, and training efficiency. Classification experiments show consistent gains over ReLU and Leaky ReLU across CNNs and Vision Transformers, with reduced overfitting and better test accuracy. In generative modeling, a novel two-phase training scheme significantly mitigates mode collapse and accelerates convergence. Our theoretical analysis reveals that SwitchPath introduces a form of multiplicative noise that acts as a structural regularizer. Additional empirical investigations show improved information propagation and reduced model complexity. These results establish this activation mechanism as a simple yet effective way to enhance exploration, regularization, and reliability in modern neural networks.
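
To make the described mechanism concrete, below is a minimal PyTorch-style sketch based only on the abstract's wording: each unit emits its negated activation -f(x) with probability p and f(x) otherwise, which is one way to realize the multiplicative (random-sign) noise the abstract mentions. The module name SwitchPathLike, the probability parameter p, and the deterministic evaluation-time behavior are illustrative assumptions, not the authors' implementation.

# Illustrative sketch only; names and eval-time behavior are assumptions.
import torch
import torch.nn as nn

class SwitchPathLike(nn.Module):
    """Multiplies a base activation f(x) by a random sign:
    +1 with probability 1 - p, -1 with probability p (training only)."""

    def __init__(self, base=None, p=0.1):
        super().__init__()
        self.base = base if base is not None else nn.ReLU()
        self.p = p  # probability of taking the negated path

    def forward(self, x):
        y = self.base(x)
        if not self.training:
            # Assumption: deterministic standard activation at eval time.
            return y
        # Per-element Bernoulli sign flip: the multiplicative noise.
        flip = torch.bernoulli(torch.full_like(y, self.p))
        return (1.0 - 2.0 * flip) * y

Used as a drop-in replacement for a plain activation, e.g. SwitchPathLike(nn.LeakyReLU(0.01), p=0.1); because the sampled sign simply scales the activation, gradients keep flowing along whichever path is chosen, consistent with the multiplicative-noise view in the abstract.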

Generative networks

Neural network algorithms

Deep learning

Authors

Antonio Di Cecco

G. d'Annunzio University of Chieti-Pescara

Andrea Papini

Chalmers, Mathematical Sciences, Applied Mathematics and Statistics

University of Gothenburg

Carlo Metta

National Research Council of Italy (CNR)

Marco Fantozzi

University of Parma

Silvia Giulia Galfrè

University of Pisa

Francesco Morandin

University of Parma

Maurizio Parton

G. d'Annunzio University of Chieti-Pescara

Machine Learning

0885-6125 (ISSN) 1573-0565 (eISSN)

Vol. 114, Issue 9, Article 200

Subject Categories (SSIF 2025)

Computer Vision and Learning Systems

Computational Mathematics

DOI

10.1007/s10994-025-06840-y

More information

Latest update

8/15/2025