On the Semantic Overlap of Operators in Stream Processing Engines
Paper in proceedings, 2024

Stream Processing Engines (SPEs) extract value from data streams in the Edge-to-Cloud continuum through graphs of operators that progressively transform data. State-of-the-art SPEs are bridged into shared models based on their overlapping APIs. The overlap in their semantic expressiveness, though, goes beyond their APIs and can be formally assessed by distilling the semantics they support into minimal sets of operators, and by checking whether such sets overlap. As we show, stream Aggregates suffice to enforce the semantics of other common operators. Moreover, compositions of Aggregates can match the performance of other operators in state-of-the-art SPEs, and micro-SPEs building on a single Aggregate operator can even surpass other SPEs’ performance while holding the same semantic expressiveness with a minimal code footprint. Our approach lays down new analytical findings with practical implications in minimizing the operational effort to use SPEs, especially at the edge, while seamlessly benefiting from existing distribution/parallelization techniques.
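To give an intuition for the claim that Aggregates can enforce the semantics of other operators, the following is a minimal Python sketch, not the construction used in the paper. It assumes integer timestamps, a timestamp-ordered input, and a tumbling window; the names `Event`, `aggregate`, `map_via_aggregate`, and `filter_via_aggregate` are hypothetical and chosen only for illustration. With a window of one time unit, an Aggregate whose function emits zero or more outputs per window can behave like a Map or a Filter.

```python
from collections import defaultdict
from typing import Any, Callable, Iterable, List, NamedTuple


class Event(NamedTuple):
    ts: int      # integer event timestamp (assumption of this sketch)
    value: Any   # payload


def aggregate(stream: Iterable[Event], window_size: int,
              key_fn: Callable[[Event], Any],
              agg_fn: Callable[[int, Any, List[Event]], List[Event]]) -> List[Event]:
    """Tumbling-window Aggregate over a timestamp-ordered stream.

    agg_fn receives (window_start, key, events_in_window) and returns
    zero or more output events for that window and key.
    """
    out: List[Event] = []
    windows = defaultdict(list)   # key -> events of the currently open window
    current_start = None
    for e in stream:
        start = (e.ts // window_size) * window_size
        if current_start is not None and start != current_start:
            # The open window has expired: emit its results and reset state.
            for key, events in windows.items():
                out.extend(agg_fn(current_start, key, events))
            windows.clear()
        current_start = start
        windows[key_fn(e)].append(e)
    if current_start is not None:
        # Flush the last open window at end of stream.
        for key, events in windows.items():
            out.extend(agg_fn(current_start, key, events))
    return out


def map_via_aggregate(stream, f):
    # Window of one time unit, single key: apply f to every event in the window.
    return aggregate(stream, 1, lambda e: 0,
                     lambda ws, k, evs: [Event(x.ts, f(x.value)) for x in evs])


def filter_via_aggregate(stream, pred):
    # Same window; forward only the events that satisfy the predicate.
    return aggregate(stream, 1, lambda e: 0,
                     lambda ws, k, evs: [x for x in evs if pred(x.value)])


events = [Event(0, 1), Event(0, 2), Event(1, 3), Event(3, 4)]
mapped = map_via_aggregate(events, lambda v: v * 10)
filtered = filter_via_aggregate(events, lambda v: v % 2 == 0)
summed = aggregate(events, 2, lambda e: 0,
                   lambda ws, k, evs: [Event(ws, sum(x.value for x in evs))])
```

In this sketch the same `aggregate` function serves both as a conventional windowed sum (`summed`) and as the only building block behind the Map-like and Filter-like helpers, which is the flavor of overlap the abstract describes.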

Stream processing

Stream Aggregates

Semantic Equivalence

Authors

Vincenzo Massimiliano Gulisano

Chalmers, Computer Science and Engineering, Computer and Network Systems

Alessandro Margara

Politecnico di Milano

Marina Papatriantafilou

Chalmers, Computer Science and Engineering, Computer and Network Systems

Middleware 2024 - Proceedings of the 25th ACM International Middleware Conference

8-21
9798400706233 (ISBN)

25th ACM International Middleware Conference, Middleware 2024
Hong Kong, Hong Kong

WASP-WISE STRATIFIER

Wallenberg AI, Autonomous Systems and Software Program, 2024-01-01 -- 2025-01-01.

Wallenberg Initiative Materials Science for Sustainability, 2024-01-01 -- 2025-01-01.

Skalbarhet och kvalitetskontroll i AM -- Big Data och ML i tillverkningsprocesser (Scalability and quality control in AM -- Big Data and ML in manufacturing processes)

Chalmers, 2020-01-01 -- .

INDEED: Information and Data-processing in Focus for Energy Efficiency

Chalmers, 2020-01-01 -- .

Subject categories (SSIF 2025)

Software Engineering

Computer Science

Computer Systems

Areas of Advance

Information and Communication Technology

Production

Energy

DOI

10.1145/3652892.3654790

More information

Last updated

2025-12-17