On the Semantic Overlap of Operators in Stream Processing Engines
Paper in proceedings, 2024

Stream Processing Engines (SPEs) extract value from data streams in the Edge-to-Cloud continuum through graphs of operators that progressively transform data. State-of-the-art SPEs are bridged into shared models based on their overlapping APIs. The overlap in their semantic expressiveness, though, goes beyond their APIs and can be formally assessed by distilling the semantics they support into minimal sets of operators, and by checking whether such sets overlap. As we show, stream Aggregates suffice to enforce the semantics of other common operators. Moreover, compositions of Aggregates can match the performance of other operators in state-of-the-art SPEs, and micro-SPEs building on a single Aggregate operator can even surpass other SPEs’ performance while holding the same semantic expressiveness with a minimal code footprint. Our approach lays down new analytical findings with practical implications in minimizing the operational effort to use SPEs, especially at the edge, while seamlessly benefiting from existing distribution/parallelization techniques.

Stream processing

Stream Aggregates

Semantic Equivalence

Authors

Vincenzo Massimiliano Gulisano

Chalmers, Computer Science and Engineering (Chalmers), Computer and Network Systems

Alessandro Margara

Polytechnic University of Milan

Marina Papatriantafilou

Chalmers, Computer Science and Engineering (Chalmers), Computer and Network Systems

Middleware 2024 - Proceedings of the 25th ACM International Middleware Conference

8-21
9798400706233 (ISBN)

25th ACM International Middleware Conference, Middleware 2024
Hong Kong, Hong Kong

WASP-WISE STRATIFIER

Wallenberg AI, Autonomous Systems and Software Program, 2024-01-01 -- 2025-01-01.

Wallenberg Initiative Materials Science for Sustainability, 2024-01-01 -- 2025-01-01.

Scalability and quality control in AM - Big Data and ML in Production

Chalmers, 2020-01-01 -- .

INDEED: Information and Data-processing in Focus for Energy Efficiency

Chalmers, 2020-01-01 -- .

Subject Categories (SSIF 2025)

Software Engineering

Computer Sciences

Computer Systems

Areas of Advance

Information and Communication Technology

Production

Energy

DOI

10.1145/3652892.3654790

More information

Latest update

12/17/2025