Erebus: Explaining the Outputs of Data Streaming Queries
Paper in proceeding, 2022

In data streaming, why-provenance can explain why a given outcome is observed but offers no help in understanding why an expected outcome is missing. Explaining missing answers has been addressed in DBMSs, but these solutions are not directly applicable to the streaming setting because of the extra challenges posed by limited storage and by the unbounded nature of data streams. With our framework, Erebus, we tackle the unaddressed challenges behind explaining missing answers in streaming applications. Erebus allows users to define expectations about the results of a query, verifies at runtime whether these expectations hold, and provides explanations when expected and observed outcomes diverge (missing answers). To the best of our knowledge, Erebus is the first such solution in data streaming. Our thorough evaluation on real data shows that Erebus can explain the (missing) answers with small overheads, on both low- and higher-end devices, even when large portions of the processed data are part of such explanations.

Authors

Dimitrios Palyvos-Giannas

Network and Systems

Katerina Tzompanaki

University of Cergy-Pontoise

Marina Papatriantafilou

Network and Systems

Vincenzo Massimiliano Gulisano

Network and Systems

Proceedings of the VLDB Endowment

2150-8097 (eISSN)

Vol. 16, No. 2, pp. 230-242

49th International Conference on Very Large Data Bases, VLDB 2023
Vancouver, Canada

Subject Categories (SSIF 2011)

Other Computer and Information Science

Computer Science

Computer Systems

DOI

10.14778/3565816.3565825

More information

Latest update

12/19/2022