Achieving Machine Learning Dependability Through Model Switching and Compression
Journal article, 2025

Machine learning (ML) often needs to be distributed, owing to the need to harness more resources and/or to preserve privacy. Accordingly, distributed learning has received significant attention in the literature; however, most works focus on the expected learning quality (e.g., loss) attained and do not consider the distribution thereof. It follows that ML models are not dependable, and may fall short of the required performance in many real-world cases. In this work, we tackle this challenge and propose DepL, a framework for dependable learning orchestration. DepL efficiently makes joint, near-optimal decisions concerning (i) which data to use for learning, (ii) the ML models to use, chosen from a set of full-size models and compressed versions thereof, and when to switch from one model to another, and (iii) the clusters of physical nodes to use for the learning. DepL improves over previous works by guaranteeing that the learning quality target (e.g., a minimum loss) is achieved with a target probability, while minimizing the learning (e.g., energy) cost. DepL has provably low polynomial computational complexity and a constant competitive ratio. Further, experimental results using the CIFAR-10 and GTSRB datasets show that it consistently matches the optimum and outperforms state-of-the-art approaches (30% faster learning and 40–80% lower cost).
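The abstract describes DepL's joint choice of model variant (full-size or compressed) and node cluster under a probabilistic quality guarantee. The Python sketch below only illustrates that decision structure; every name and number in it (ModelVariant, Cluster, cheapest_dependable_config, the probabilities and costs) is invented for illustration and does not come from the paper. DepL makes these decisions jointly and near-optimally in polynomial time; the brute-force enumeration here merely makes the objective concrete: among configurations that meet the loss target with at least the target probability, pick the cheapest.

```python
import itertools
from dataclasses import dataclass

# Hypothetical sketch of the decision problem described in the abstract.
# All classes, values, and the brute-force search are assumptions made
# for illustration; they are not DepL's actual algorithm.

@dataclass
class ModelVariant:
    name: str               # e.g., a full-size model or a compressed version
    p_meets_target: float   # Pr[learning quality target is achieved]
    cost: float             # learning cost (e.g., energy) of using this variant

@dataclass
class Cluster:
    name: str
    speedup: float          # relative training speed of this node cluster
    cost_factor: float      # relative energy cost of this node cluster

def cheapest_dependable_config(models, clusters, target_prob):
    """Return the min-cost (model, cluster) pair whose probability of
    reaching the quality target is at least target_prob, or None if no
    configuration is dependable enough."""
    feasible = [
        (m.cost * c.cost_factor / c.speedup, m, c)
        for m, c in itertools.product(models, clusters)
        if m.p_meets_target >= target_prob
    ]
    return min(feasible, key=lambda t: t[0]) if feasible else None

# Toy instance: two model variants, two clusters (all numbers invented).
models = [
    ModelVariant("full", p_meets_target=0.95, cost=10.0),
    ModelVariant("compressed", p_meets_target=0.85, cost=4.0),
]
clusters = [
    Cluster("edge", speedup=1.0, cost_factor=0.6),
    Cluster("cloud", speedup=2.0, cost_factor=1.0),
]

best = cheapest_dependable_config(models, clusters, target_prob=0.9)
if best:
    cost, m, c = best
    print(f"chosen: {m.name} on {c.name}, expected cost {cost:.2f}")
```

With a target probability of 0.9, the compressed variant is excluded despite its lower cost, and the search returns the cheapest dependable configuration of the full model; this is the trade-off, between cost minimization and a probabilistic quality guarantee, that the abstract attributes to DepL.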

learning guarantees

network support to machine learning

dependable learning

distributed learning

Authors

Francesco Malandrino

Consiglio Nazionale delle Ricerche

Giuseppe Di Giacomo

Politecnico di Torino

Marco Levorato

University of California, Irvine

Carla Fabiana Chiasserini

Chalmers, Computer Science and Engineering, Computer and Network Systems

Consiglio Nazionale delle Ricerche

Politecnico di Torino

University of Gothenburg

IEEE Transactions on Mobile Computing

1536-1233 (ISSN) 1558-0660 (eISSN)

Vol. In Press

Subject categories (SSIF 2025)

Computer Sciences

Computer Systems

DOI

10.1109/TMC.2025.3619560

More information

Last updated

2025-11-03