Mimir - Streaming operators classification with artificial neural networks
Paper in proceedings, 2019
Streaming applications are used for analysing large volumes of continuous data. Achieving efficiency and effectiveness in data streaming implies challenges that become all the more important when different parties (i) define applications' semantics, (ii) choose the Stream Processing Engine (SPE) to use, and (iii) provide the processing infrastructure (e.g., cloud or fog), and when one party's decisions (e.g., how to deploy applications or when to trigger adaptive reconfigurations) depend on information held by a different party (and possibly hard to retrieve). In this context, machine learning can bridge the involved parties (e.g., SPEs and cloud providers) by offering tools that learn from the behavior of streaming applications and help make decisions. One such tool, the focus of our ongoing work, can be used to learn which operators are run by a streaming application running in a certain SPE, without relying on the SPE itself to provide such information. More concretely, the tool classifies the type of each operator at a desired level of granularity (from a coarse-grained characterization into stateless/stateful to a fine-grained operator classification) using general application-related metrics. As an example application, this tool could help a cloud provider decide which infrastructure to assign to a certain streaming application (run by a certain SPE), based on the type (and thus cost) of its operators.
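
To make the coarse-grained case (stateless vs. stateful) concrete, the following is a minimal sketch and not the paper's implementation. It trains a single sigmoid neuron, the smallest possible neural network, on synthetic data. The two features (`selectivity` and `state_growth`), their value ranges, and all function names are assumptions made up for illustration; the abstract does not say which metrics or network architecture Mimir uses.

```python
import math
import random

# Hypothetical per-operator metrics (assumptions, NOT taken from the paper):
#   selectivity  = output tuples / input tuples
#   state_growth = normalized memory growth while processing
# Label: 0 = stateless, 1 = stateful.


def make_synthetic_samples(n, rng):
    """Generate toy samples; the distributions are invented for illustration."""
    samples = []
    for _ in range(n):
        if rng.random() < 0.5:
            # Toy stateless operator (e.g. map/filter): little memory growth.
            features = [rng.uniform(0.5, 1.0), rng.uniform(0.0, 0.2)]
            label = 0
        else:
            # Toy stateful operator (e.g. aggregate/join): noticeable memory growth.
            features = [rng.uniform(0.0, 0.5), rng.uniform(0.6, 1.0)]
            label = 1
        samples.append((features, label))
    return samples


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability that the operator is stateful."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(samples, epochs=200, lr=0.5):
    """Fit the neuron with plain stochastic gradient descent on log loss."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            error = predict_proba(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def classify(weights, bias, features):
    return "stateful" if predict_proba(weights, bias, features) >= 0.5 else "stateless"


rng = random.Random(42)  # fixed seed so the toy run is repeatable
train_set = make_synthetic_samples(200, rng)
test_set = make_synthetic_samples(50, rng)
weights, bias = train(train_set)

# Fraction of held-out toy samples whose predicted class matches the label.
accuracy = sum(
    classify(weights, bias, f) == ("stateful" if y else "stateless")
    for f, y in test_set
) / len(test_set)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

A fine-grained classifier along the lines the abstract describes would presumably replace the single output with one output per operator type and use metrics observed from outside the SPE; the synthetic data here is deliberately easy to separate and says nothing about accuracy on real workloads.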