Concurrent Data Structures for Efficient Streaming Aggregation
Report, 2013

In many data-gathering applications, information arrives as continuous streams rather than finite data sets, and efficient one-pass algorithms are required to cope with high input loads. Stream processing engines support continuous queries that process data in real time, and they have evolved rapidly from centralized to distributed, parallel and elastic solutions. While much effort has gone into leveraging the processing capacity of clusters of machines, less work has focused on the parallelism enabled by multi-core architectures, where concurrent and lock-free data structures can support the processing pipeline. This paper explores that aspect, focusing on multiway aggregation, in which large data volumes are received from multiple input streams. Multiway aggregation is crucial in contexts such as sensor networks, social media and clickstream analysis. We provide three enhanced aggregate operators that rely on two new concurrent data structures and their lock-free implementations, supporting both order-sensitive and order-insensitive aggregation functions. We study the properties of the proposed aggregate operators and the new data structures in detail. We also present an extensive experimental evaluation of the proposed methods, giving empirical evidence of their superiority. In this evaluation we run a variety of aggregation queries on two large datasets, one with data extracted from SoundCloud, a music social network, and one with data from a smart grid metering network. In all the experiments, the new data structures improved aggregation performance significantly, by up to one order of magnitude, in terms of both processing throughput and latency.
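To make the terms in the abstract concrete, the following is a minimal sequential sketch of multiway aggregation. It is not the report's method: it uses no concurrent or lock-free data structures, and the tuple layout `(timestamp, key, value)`, the tumbling-window scheme and the names `merge_streams` and `tumbling_aggregate` are assumptions made purely for illustration. It only shows why an order-sensitive function (here, "last value") needs a timestamp-ordered merge of the inputs, while an order-insensitive one (here, `sum`) does not.

```python
import heapq
from collections import defaultdict


def merge_streams(*streams):
    """Merge timestamp-sorted streams of (timestamp, key, value) tuples
    into one timestamp-sorted stream (hypothetical helper)."""
    return heapq.merge(*streams, key=lambda t: t[0])


def tumbling_aggregate(tuples, window_size, fn):
    """Group values by (window start, key) in arrival order and apply fn
    to each group (hypothetical helper)."""
    windows = defaultdict(list)
    for ts, key, value in tuples:
        window_start = (ts // window_size) * window_size
        windows[(window_start, key)].append(value)
    return {group: fn(values) for group, values in windows.items()}


# Two illustrative input streams, each already sorted by timestamp.
stream_a = [(1, "meter1", 10), (4, "meter1", 30), (11, "meter1", 5)]
stream_b = [(2, "meter1", 20), (12, "meter1", 7)]

# Order-insensitive aggregation: the result does not depend on merge order.
total = tumbling_aggregate(merge_streams(stream_a, stream_b), 10, sum)

# Order-sensitive aggregation: "last value" relies on the sorted merge.
last = tumbling_aggregate(
    merge_streams(stream_a, stream_b), 10, lambda values: values[-1]
)

print(total)  # {(0, 'meter1'): 60, (10, 'meter1'): 12}
print(last)   # {(0, 'meter1'): 30, (10, 'meter1'): 7}
```

In this sketch the merge and the aggregation run in a single thread; the report's contribution concerns performing these steps concurrently on multi-core hardware, which the sketch does not attempt.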

Authors

Daniel Cederman

Chalmers, Computer Science and Engineering (Chalmers), Networks and Systems (Chalmers)

Vincenzo Massimiliano Gulisano

Chalmers, Computer Science and Engineering (Chalmers), Networks and Systems (Chalmers)

Ioannis Nikolakopoulos

Chalmers, Computer Science and Engineering (Chalmers), Networks and Systems (Chalmers)

Marina Papatriantafilou

Chalmers, Computer Science and Engineering (Chalmers), Networks and Systems (Chalmers)

Philippas Tsigas

Chalmers, Computer Science and Engineering (Chalmers), Networks and Systems (Chalmers)

Areas of Advance

Information and Communication Technology

Subject Categories

Computer Science

Computer Systems

More information

Created

10/7/2017