Erebus: Explaining the Outputs of Data Streaming Queries
Paper in proceedings, 2022

In data streaming, why-provenance can explain why a given outcome is observed but offers no help in understanding why an expected outcome is missing. Explaining missing answers has been addressed in DBMSs, but these solutions are not directly applicable to the streaming setting because of the extra challenges posed by limited storage and by the unbounded nature of data streams. With our framework, Erebus, we tackle the unaddressed challenges behind explaining missing answers in streaming applications. Erebus allows users to define expectations about the results of a query, verifies at runtime whether such expectations hold, and provides explanations when expected and observed outcomes diverge (missing answers). To the best of our knowledge, Erebus is the first such solution in data streaming. Our thorough evaluation on real data shows that Erebus can explain the (missing) answers with small overheads, on both low- and higher-end devices, even when large portions of the processed data are part of such explanations.

Authors

Dimitrios Palyvos-Giannas

Network and Systems

Katerina Tzompanaki

Université de Cergy-Pontoise

Marina Papatriantafilou

Network and Systems

Vincenzo Massimiliano Gulisano

Network and Systems

Proceedings of the VLDB Endowment

2150-8097 (eISSN)

Vol. 16, Issue 2, p. 230-242

49th International Conference on Very Large Data Bases, VLDB 2023
Vancouver, Canada

Subject Categories

Other Computer and Information Science

Computer Science

Computer Systems

DOI

10.14778/3565816.3565825

More information

Latest update

2022-12-19