Nona: A Framework for Elastic Stream Provenance
Paper in proceedings, 2024

Forward provenance for streaming queries run by distributed and parallel Stream Processing Engines (SPEs) gives fine-grained insight into input-output data dependencies, enabling, e.g., precise debugging and smart data selection. State-of-the-art provenance frameworks, though, build on an assumption that is unrealistic for distributed systems such as Vehicular Networks and Smart Grids, namely, that the whole set of queries in need of provenance is known in advance and static. In real-world use cases, queries are continuously added, removed, and modified over time by both data analysts and the SPEs themselves. Motivated by the lack of solutions for the forward provenance of dynamic sets of queries, we introduce Nona, a novel framework for forward provenance of parallel and distributed streaming queries. We formalize the notion of forward provenance for evolving query sets and prove that the guarantees the state of the art offers for static query sets can be extended to them. Our evaluation shows that Nona adapts to changes in query sets with sub-second responsiveness; moreover, it incurs negligible overhead compared to the state of the art during periods in which a query set does not change.

Stream Processing

Provenance

Elasticity

Authors

Bastian Havers

Network and Systems

Marina Papatriantafilou

Network and Systems

Vincenzo Massimiliano Gulisano

Network and Systems

Proceedings - International Conference on Distributed Computing Systems

1063-6927 (ISSN) 2575-8411 (eISSN)

703-714
9798350386059 (ISBN)

44th IEEE International Conference on Distributed Computing Systems, ICDCS 2024
Jersey City, USA

AUTOSPADA (Automotive Stream Processing and Distributed Analytics) OODIDA Phase 2

VINNOVA (2019-05884), 2020-03-12 -- 2022-12-31.

Relaxed Semantics Across the Data Analytics Stack (RELAX)

European Commission (EC) (EC/H2020/101072456), 2023-03-01 -- 2027-02-28.

BADA - On-board Off-board Distributed Data Analytics

VINNOVA (2016-04260), 2016-12-01 -- 2019-12-31.

Subject Categories

Computer Engineering

Computer Science

Computer Systems

DOI

10.1109/ICDCS60910.2024.00071

More information

Latest update

9/18/2024