ASAP.SGD: Instance-based Adaptiveness to Staleness in Asynchronous SGD

Karl Bäckström; Marina Papatriantafilou; Philippas Tsigas

Paper in proceedings, 2022

Concurrent algorithmic implementations of Stochastic Gradient Descent (SGD) give rise to critical questions for compute-intensive Machine Learning (ML). Asynchrony implies speedup in some contexts, and challenges in others, as stale updates may lead to slower, or non-converging executions. While previous works showed asynchrony-adaptiveness can improve stability and speedup by reducing the step size for stale updates according to static rules, there is no one-size-fits-all adaptation rule, since the optimal strategy depends on several factors. We introduce (i) ASAP.SGD, an analytical framework capturing necessary and desired properties of staleness-adaptive step size functions and (ii) TAIL-T, a method for utilizing key properties of the execution instance, generating a tailored strategy that not only dampens the impact of stale updates, but also leverages fresh ones. We recover convergence bounds for adaptiveness functions satisfying the ASAP.SGD conditions, for general, convex and non-convex problems, and establish novel bounds for ones satisfying the Polyak-Lojasiewicz property. We evaluate TAIL-T with representative AsyncSGD concurrent algorithms, for Deep Learning problems, showing TAIL-T is a vital complement to AsyncSGD, with (i) persistent speedup in wall-clock convergence time in the parallelism spectrum, (ii) considerably lower risk of non-convergence, as well as (iii) precision levels for which original SGD implementations fail.
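To make the idea of a staleness-adaptive step size concrete, the sketch below contrasts a static rule with a rule that adapts to the staleness values observed in the running execution. It is only an illustration under stated assumptions: the names `static_step`, `InstanceAdaptiveStep` and `amplitude`, and the scaling formula itself, are hypothetical and are not the TAIL-T definition from the paper.

```python
from collections import Counter


def static_step(base_step: float, staleness: int) -> float:
    """Static rule (illustrative): shrink the step inversely with staleness."""
    return base_step / (1 + staleness)


class InstanceAdaptiveStep:
    """Hypothetical instance-based rule; NOT the paper's TAIL-T function.

    It records the staleness values seen so far in this execution and
    scales the base step by a factor in (1 - amplitude, 1 + amplitude]:
    updates fresher than most observed ones get a larger step, while
    updates staler than most observed ones get a smaller step.
    """

    def __init__(self, base_step: float, amplitude: float = 0.5) -> None:
        if not 0.0 <= amplitude < 1.0:
            raise ValueError("amplitude must be in [0, 1) to keep steps positive")
        self.base_step = base_step
        self.amplitude = amplitude
        self.counts: Counter = Counter()  # staleness value -> occurrences
        self.total = 0

    def observe(self, staleness: int) -> None:
        """Record the staleness of an applied update."""
        self.counts[staleness] += 1
        self.total += 1

    def step(self, staleness: int) -> float:
        """Return the step size for an update with the given staleness."""
        if self.total == 0:
            return self.base_step  # no statistics yet: fall back to the base step
        # Fraction of observed updates that were strictly fresher than this one.
        below = sum(c for t, c in self.counts.items() if t < staleness)
        frac_below = below / self.total
        return self.base_step * (1.0 + self.amplitude * (1.0 - 2.0 * frac_below))


if __name__ == "__main__":
    rule = InstanceAdaptiveStep(base_step=0.1, amplitude=0.5)
    for tau in [0, 1, 1, 2, 5]:  # made-up staleness observations
        rule.observe(tau)
    for tau in [0, 1, 5]:
        print(tau, static_step(0.1, tau), rule.step(tau))
```

Unlike the static rule, which can only reduce the step, the instance-based variant in this sketch raises the step above the base value for the freshest updates, which is the qualitative behaviour the abstract attributes to a tailored strategy.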

Authors

Karl Bäckström

Networks and Systems

Marina Papatriantafilou

Networks and Systems

Philippas Tsigas

Networks and Systems

Proceedings of Machine Learning Research

26403498 (eISSN)

Vol. 162, pp. 1261-1271

39th International Conference on Machine Learning (ICML)
Baltimore, MD, USA

WASP SAS

Wallenberg AI, Autonomous Systems and Software Program, 2018-01-01 -- 2023-01-01.

VR EPITOME - Summarization and structuring of continuous data in concurrent processing pipelines

Swedish Research Council (Vetenskapsrådet, VR) (2021-05424), 2022-01-01 -- 2025-12-31.

Areas of Advance

Information and Communication Technology

Driving Forces

Sustainable development

Innovation and entrepreneurship

Subject Categories (SSIF 2011)

Computational Mathematics

Control Engineering

Mathematical Analysis

More information

Latest update

2024-10-07
