RAVAS: Interference-Aware Model Selection and Resource Allocation for Live Edge Video Analytics
Paper in proceedings, 2023

Numerous edge applications that rely on video analytics demand precise, low-latency processing of multiple video streams from cameras. When these cameras are mobile, such as when mounted on a car or a robot, the processing load on the shared edge GPU can vary considerably. Provisioning the edge with GPUs for the worst-case load can be expensive and, for many applications, infeasible. In this paper, we introduce RAVAS, a Real-time Adaptive stream Video Analytics System that enables efficient edge GPU sharing for processing streams from various mobile cameras. RAVAS uses Q-Learning to choose between a set of Deep Neural Network (DNN) models with varying accuracy and processing requirements, based on the current GPU utilization and workload. RAVAS employs an innovative resource allocation strategy to mitigate interference during concurrent GPU execution. Compared to state-of-the-art approaches, our results show that RAVAS incurs 57% less compute overhead, achieves a 41% improvement in latency, and yields 43% savings in total GPU usage for a single video stream. Processing multiple concurrent video streams results in up to 99% and 40% reductions in latency and overall GPU usage, respectively, while meeting the accuracy constraints.
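The abstract describes Q-Learning-driven selection among DNN variants based on the current GPU utilization and workload. The sketch below is a rough illustration of that idea, not the authors' implementation: the model list, state discretization, reward shape, and all numeric parameters are assumptions made only for this example.

import random
from collections import defaultdict

# Hypothetical model variants: (name, relative accuracy, relative GPU cost).
MODELS = [("detector_small", 0.70, 0.2), ("detector_medium", 0.80, 0.5), ("detector_large", 0.88, 1.0)]

N_UTIL_BINS = 4   # discretized GPU utilization levels
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = defaultdict(float)  # Q[(state, action)] -> value

def discretize(gpu_util, workload_fps):
    """Map continuous GPU utilization [0,1] and workload (frames/s) to a discrete state."""
    u = min(int(gpu_util * N_UTIL_BINS), N_UTIL_BINS - 1)
    w = min(int(workload_fps / 10), 2)  # e.g. <10, 10-20, >=20 fps
    return (u, w)

def select_model(state):
    """Epsilon-greedy choice over the available DNN variants."""
    if random.random() < EPSILON:
        return random.randrange(len(MODELS))
    return max(range(len(MODELS)), key=lambda a: Q[(state, a)])

def reward(accuracy, latency_ms, deadline_ms=50.0):
    """Assumed reward: favor accuracy, penalize exceeding the latency deadline."""
    return accuracy - max(0.0, (latency_ms - deadline_ms) / deadline_ms)

def update(state, action, r, next_state):
    """Standard tabular Q-learning update."""
    best_next = max(Q[(next_state, a)] for a in range(len(MODELS)))
    Q[(state, action)] += ALPHA * (r + GAMMA * best_next - Q[(state, action)])

# Toy control loop with simulated measurements (placeholders for real telemetry).
state = discretize(gpu_util=0.3, workload_fps=15)
for step in range(1000):
    action = select_model(state)
    _, acc, cost = MODELS[action]
    # Simulated observation: heavier models raise latency more when the GPU is busy.
    gpu_util = min(1.0, 0.2 + 0.6 * cost + random.uniform(-0.1, 0.1))
    latency_ms = 20 + 80 * cost * gpu_util
    next_state = discretize(gpu_util, workload_fps=random.choice([5, 15, 25]))
    update(state, action, reward(acc, latency_ms), next_state)
    state = next_state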

Model Selection

Edge Video Analytics

Resource Allocation

Interference-aware GPU Multiplexing

Authors

Ali Rahmanian, Umeå University

Ahmed Ali-Eldin Hassan, Network and Systems

Selome Kostentinos Tesfatsion, Ericsson

Björn Skubic, Ericsson

Harald Gustafsson, Ericsson

Prashant Shenoy, University of Massachusetts

Erik Elmroth, Umeå University

Proceedings - 2023 IEEE/ACM Symposium on Edge Computing, SEC 2023

Pages 27-39
ISBN: 9798400701238

8th Annual IEEE/ACM Symposium on Edge Computing, SEC 2023
Wilmington, USA

Subject Categories

Communication Systems

Computer Science

Computer Systems

DOI

10.1145/3583740.3628443

Latest update

3/13/2024