Accelerating Stream Processing Queries with Congestion-aware Scheduling and Real-time Linux Threads
Paper in proceedings, 2023

Stream Processing Engines (SPEs) are used by companies and industries to develop queries that extract insights from data streams. The Edge/IoT context poses additional challenges, since streaming queries need to run closer to data producers to reduce latency, i.e., on resource-constrained devices. Lachesis is a middleware that helps Linux schedule the threads of the SPE more efficiently, which has proved especially useful on devices with limited CPU resources. Lachesis does not require any architectural change to the SPE implementation. It collects metrics from the SPE and computes high-level priorities, which are converted into hints that influence how the Operating System actually schedules threads. This paper extends the initial contribution of Lachesis in two main directions: i) we optimize the policy that assigns each thread a priority proportional to its actual load, based on a careful study of the implementations of Storm and Flink, two popular SPEs; ii) instead of restricting OS scheduling to traditional SCHED_OTHER threads, as Lachesis did previously, we leverage the real-time capabilities of the modern Linux kernel. Our experimental evaluation shows that both enhancements provide important benefits compared with the previous version of Lachesis: throughput improves by 9.75% on average (19% at peak), and latency drops by 27% on average (40% at peak).
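The general idea of turning per-thread load into real-time scheduling hints can be illustrated with a minimal sketch. This is not the Lachesis implementation: the function names, the min-max scaling, and the choice of SCHED_FIFO as the default policy are assumptions made for illustration only. The sketch uses Python's standard `os.sched_setscheduler` call, which is available on Linux and typically requires CAP_SYS_NICE or root privileges to select a real-time policy.

```python
import os

RT_MIN, RT_MAX = 1, 99  # valid SCHED_FIFO / SCHED_RR priority range on Linux


def loads_to_rt_priorities(loads, lo=RT_MIN, hi=RT_MAX):
    """Map per-thread load figures to real-time priorities (hypothetical helper).

    `loads` is a dict {thread_id: load}; a higher load yields a higher priority.
    The min-max scaling used here is an assumption, not the paper's policy.
    """
    if not loads:
        return {}
    min_load, max_load = min(loads.values()), max(loads.values())
    span = max_load - min_load
    priorities = {}
    for tid, load in loads.items():
        # With identical loads, every thread gets the lowest priority.
        frac = 0.0 if span == 0 else (load - min_load) / span
        priorities[tid] = lo + round(frac * (hi - lo))
    return priorities


def apply_rt_priorities(priorities, policy=None):
    """Ask the kernel to schedule each thread under a real-time policy.

    Returns the list of thread ids that could not be updated, for example
    because the caller lacks the required privileges.
    """
    if policy is None:
        policy = os.SCHED_FIFO  # Linux-only constant, resolved lazily
    failed = []
    for tid, prio in priorities.items():
        try:
            os.sched_setscheduler(tid, policy, os.sched_param(prio))
        except (PermissionError, ProcessLookupError):
            failed.append(tid)
    return failed
```

In such a sketch, a monitoring loop would periodically refresh the `loads` dictionary from the SPE's metrics and call `apply_rt_priorities(loads_to_rt_priorities(loads))`; how the thread ids of SPE operators are discovered is left out here.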

Apache Flink

Real-time Threads

Data Stream Processing

Linux Scheduler

Apache Storm

Authors

Fausto Frasca

Universita di Pisa

Vincenzo Massimiliano Gulisano

Networks and Systems

Gabriele Mencagli

Universita di Pisa

Dimitrios Palyvos-Giannas

Networks and Systems

Massimo Torquati

Universita di Pisa

Proceedings of the 20th ACM International Conference on Computing Frontiers 2023, CF 2023

144-153
9798400701405 (ISBN)

20th ACM International Conference on Computing Frontiers, CF 2023
Bologna, Italy

Subject Categories

Computer Engineering

Computer Science

Computer Systems

DOI

10.1145/3587135.3592202

More information

Latest update

2024-01-03