On the Semantic Overlap of Operators in Stream Processing Engines
Paper in proceedings, 2024

Stream Processing Engines (SPEs) extract value from data streams in the Edge-to-Cloud continuum through graphs of operators that progressively transform data. State-of-the-art SPEs are bridged into shared models based on their overlapping APIs. The overlap in their semantic expressiveness, though, goes beyond their APIs and can be formally assessed by distilling the semantics they support into minimal sets of operators, and by checking whether such sets overlap. As we show, stream Aggregates suffice to enforce the semantics of other common operators. Moreover, compositions of Aggregates can match the performance of other operators in state-of-the-art SPEs, and micro-SPEs building on a single Aggregate operator can even surpass other SPEs’ performance while holding the same semantic expressiveness with a minimal code footprint. Our approach lays down new analytical findings with practical implications in minimizing the operational effort to use SPEs, especially at the edge, while seamlessly benefiting from existing distribution/parallelization techniques.

Stream Aggregates

Stream processing

Semantic Equivalence

Authors

Vincenzo Massimiliano Gulisano

Network and Systems

Alessandro Margara

Polytechnic University of Milan

Marina Papatriantafilou

Network and Systems

Middleware 2024 - Proceedings of the 25th ACM International Middleware Conference

8-21
9798400706233 (ISBN)

25th ACM International Middleware Conference, Middleware 2024
Hong Kong, Hong Kong

Subject Categories (SSIF 2025)

Software Engineering

Computer Sciences

Computer Systems

DOI

10.1145/3652892.3654790

More information

Latest update

1/28/2025