DRIVEN: a Framework for Efficient Data Retrieval and Clustering in Vehicular Networks
Paper in proceedings, 2019

Applications for adaptive (sometimes also called smart) Cyber-Physical Systems are blossoming thanks to the large volumes of data, sensed in a continuous fashion, in large distributed systems. The benefits of these applications come nonetheless with a price: the need for jointly addressing challenges in efficient data communication and analysis (among others). The goal of the DRIVEN framework, presented here, is to address these challenges for a data gathering and distance-based clustering tool in the context of vehicular networks. Because of the limited communication bandwidth (compared to the volume of sensed data) of vehicular networks and the monetary costs of data transmission, the intuition behind DRIVEN is to avoid gathering the data to be clustered in a raw format from each vehicle, but rather to allow for a streaming-based error-bounded approximation, through Piecewise Linear Approximation, to compress the volumes of data to be gathered. At the same time, rather than relying on a batch-based clustering algorithm that requires all the data to be first gathered (and then clustered), DRIVEN relies on and extends a streaming-based clustering algorithm that leverages the inherent ordering of the spatial and temporal data being collected, to perform the clustering in an online fashion, while data is being retrieved. As we show, based on our prototype implementation using Apache Flink and our evaluation with real-world data such as GPS and LiDAR, the accuracy loss for the clustering performed on the reconstructed data can be small, even when the raw data is compressed to 10-35% of its original size, and the transferring of data itself can be completed in up to one-tenth of the duration observed when gathering raw data.
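To make the idea of a streaming, error-bounded Piecewise Linear Approximation concrete, the following is a minimal sketch of one generic greedy variant. It is not taken from the paper: the abstract does not say which PLA method DRIVEN uses, and the names `pla_compress`, `pla_reconstruct`, and the parameter `eps` are hypothetical. The sketch assumes one-dimensional values with strictly increasing timestamps and guarantees that every input point is reconstructed within `eps` of its original value.

```python
from typing import Iterable, Iterator, List, Tuple

# A segment is (t_start, y_start, slope, t_end). Hypothetical layout for illustration.
Segment = Tuple[float, float, float, float]


def _pick_slope(lo: float, hi: float) -> float:
    """Choose a slope inside the feasible interval (0.0 for a single-point segment)."""
    if lo == float("-inf"):
        return 0.0
    return (lo + hi) / 2.0


def pla_compress(points: Iterable[Tuple[float, float]], eps: float) -> Iterator[Segment]:
    """Greedy one-pass PLA: emit a segment whenever no single line through the
    segment's first point can stay within eps of all points seen so far."""
    it = iter(points)
    try:
        t0, y0 = next(it)
    except StopIteration:
        return
    lo, hi = float("-inf"), float("inf")  # feasible slope interval
    t_end = t0
    for t, y in it:
        dt = t - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Slopes that keep the line within eps of the new point.
        new_lo = max(lo, (y - eps - y0) / dt)
        new_hi = min(hi, (y + eps - y0) / dt)
        if new_lo <= new_hi:
            lo, hi, t_end = new_lo, new_hi, t
        else:
            # Close the current segment and start a new one at this point.
            yield (t0, y0, _pick_slope(lo, hi), t_end)
            t0, y0, t_end = t, y, t
            lo, hi = float("-inf"), float("inf")
    yield (t0, y0, _pick_slope(lo, hi), t_end)


def pla_reconstruct(segments: List[Segment], t: float) -> float:
    """Evaluate the approximation at timestamp t."""
    for t0, y0, slope, t_end in segments:
        if t0 <= t <= t_end:
            return y0 + slope * (t - t0)
    raise ValueError("timestamp not covered by any segment")
```

In a setting like the one the abstract describes, only the segments would be transmitted instead of the raw points, and the receiver would evaluate them to obtain approximate values for clustering. How much data this saves depends on the signal and on the chosen `eps`.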

clustering

streaming data

compression

edge computing

fog computing

Authors

Bastian Havers

Chalmers, Computer Science and Engineering, Networks and Systems

Romaric Duvignau

Chalmers, Computer Science and Engineering, Networks and Systems

Hannaneh Najdataei

Chalmers, Computer Science and Engineering, Networks and Systems

Vincenzo Massimiliano Gulisano

Chalmers, Computer Science and Engineering, Networks and Systems

Marina Papatriantafilou

Chalmers, Computer Science and Engineering, Networks and Systems

Ashok Krishna Chaitanya Koppisetty

Proceedings - International Conference on Data Engineering

10844627 (ISSN)

1850-1861
978-1-5386-7474-1 (ISBN)

International Conference on Data Engineering 2019
Macau, Macau

Cloud-based products and production (FiC)

Swedish Foundation for Strategic Research (SSF) (GMT14-0032), 2016-01-01 -- 2020-12-31.

HAREN: Self-deploying and adaptive data streaming analytics in the fog

Swedish Research Council (VR) (2016-03800), 2017-01-01 -- 2020-12-31.

Subject Categories

Other Computer and Information Science

Media Engineering

Computer Science

Areas of Advance

Information and Communication Technology

DOI

10.1109/ICDE.2019.00201

More information

Last updated

2023-03-21