ODIN: Overcoming Dynamic Interference in iNference Pipelines
Paper in proceeding, 2023

As an increasing number of businesses become powered by machine learning, inference becomes a core operation and is increasingly offered as a service. In this context, the inference task must meet certain service-level objectives (SLOs), such as high throughput and low latency. However, these targets can be compromised by interference caused by long- or short-lived co-located tasks. Prior works focus on the generic problem of co-scheduling to mitigate the effect of interference on the performance-critical task. In this work, we focus on inference pipelines and propose ODIN, a technique that mitigates the effect of interference on the performance of the inference task through online scheduling of the pipeline stages. Our technique detects interference online and automatically re-balances the pipeline stages to mitigate the performance degradation of the inference task. We demonstrate that ODIN successfully mitigates the effect of interference, sustaining the latency and throughput of CNN inference, and outperforms least-loaded scheduling (LLS), a common technique for interference mitigation. Additionally, it is effective in maintaining service-level objectives for inference, and it scales to large network models executing on multiple processing elements.

Inference serving

CNN parallel pipelines

Online tuning

Design space exploration

Interference mitigation

Author

Pirah Noor Soomro

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Nikela Papadopoulou

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Miquel Pericas

Chalmers, Computer Science and Engineering (Chalmers), Computer Engineering (Chalmers)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

0302-9743 (ISSN) 1611-3349 (eISSN)

Vol. 14100 LNCS 169-183
9783031396977 (ISBN)

29th International European Conference on Parallel and Distributed Computing, Euro-Par 2023
Limassol, Cyprus

Subject Categories (SSIF 2011)

Computer Engineering

Computer Systems

DOI

10.1007/978-3-031-39698-4_12

More information

Latest update

9/29/2023