DRIVEN: A framework for efficient Data Retrieval and clustering in Vehicular Networks
Journal article, 2020

The growing interest in data analysis applications for Cyber–Physical Systems stems from the large amounts of data such large distributed systems sense in a continuous fashion. A key research question in this context is how to jointly address the efficiency and effectiveness challenges of such data analysis applications. DRIVEN proposes a way to jointly address these challenges for a data gathering and distance-based clustering tool in the context of vehicular networks. To cope with the limited communication bandwidth (compared to the sensed data volume) of vehicular networks and data transmission's monetary costs, DRIVEN avoids gathering raw data from vehicles, but rather relies on a streaming-based and error-bounded approximation, through Piecewise Linear Approximation (PLA), to compress the volumes of gathered data. Moreover, a streaming-based approach is also used to cluster the collected data (once the latter is reconstructed from its PLA-approximated form). DRIVEN's clustering algorithm leverages the inherent ordering of the spatial and temporal data being collected to perform clustering in an online fashion, while data is being retrieved. As we show, based on our prototype implementation using Apache Flink and thorough evaluation with real-world data such as GPS, LiDAR and other vehicular signals, the accuracy loss for the clustering performed on the gathered approximated data can be small (below 10%), even when the raw data is compressed to 5-35% of its original size, and the transferring of historical data itself can be completed in up to one-tenth of the duration observed when gathering raw data.
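The error-bounded, streaming compression step described in the abstract can be illustrated with a minimal sketch. This is not DRIVEN's implementation: it assumes a simple greedy PLA variant that anchors each segment at its first point and keeps the interval of slopes that stay within a maximum error `eps` of every point seen so far. The names `pla_compress`, `pla_reconstruct`, `eps` and the `Segment` tuple layout are hypothetical, and timestamps are assumed to be strictly increasing.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical segment layout: (t_start, y_start, slope, t_end)
Segment = Tuple[float, float, float, float]


def _pick_slope(lo: float, hi: float) -> float:
    """Choose a slope from the feasible interval (0.0 for a single-point segment)."""
    if lo == float("-inf") or hi == float("inf"):
        return 0.0
    return (lo + hi) / 2.0


def pla_compress(points: Iterable[Tuple[float, float]], eps: float) -> Iterator[Segment]:
    """Greedy one-pass PLA: every input point ends up within eps of its segment."""
    it = iter(points)
    try:
        t0, y0 = next(it)
    except StopIteration:
        return
    lo, hi = float("-inf"), float("inf")  # feasible slope interval
    t_last = t0
    for t, y in it:
        dt = t - t0  # assumed > 0 (strictly increasing timestamps)
        new_lo = max(lo, (y - eps - y0) / dt)
        new_hi = min(hi, (y + eps - y0) / dt)
        if new_lo <= new_hi:
            # The current segment can still cover this point.
            lo, hi, t_last = new_lo, new_hi, t
        else:
            # Close the segment and start a new one at the current point.
            yield (t0, y0, _pick_slope(lo, hi), t_last)
            t0, y0, t_last = t, y, t
            lo, hi = float("-inf"), float("inf")
    yield (t0, y0, _pick_slope(lo, hi), t_last)


def pla_reconstruct(segments: Iterable[Segment], timestamps: Iterable[float]) -> List[float]:
    """Rebuild approximate values at the (sorted) original timestamps."""
    segs = iter(segments)
    seg = next(segs, None)
    out: List[float] = []
    for t in timestamps:
        while seg is not None and t > seg[3]:
            seg = next(segs, None)
        if seg is None:
            raise ValueError("timestamp lies beyond the last segment")
        t_start, y_start, slope, _ = seg
        out.append(y_start + slope * (t - t_start))
    return out
```

In this sketch only the segment tuples would need to be transmitted, and the receiver would call `pla_reconstruct` before clustering. A larger `eps` yields fewer segments at the cost of a larger reconstruction error.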

Streaming data

Clustering

Compression

Edge computing

Fog computing

Authors

Bastian Havers

Chalmers, Computer Science and Engineering, Networks and Systems

Volvo Cars

Romaric Duvignau

Chalmers, Computer Science and Engineering, Networks and Systems

Hannaneh Najdataei

Chalmers, Computer Science and Engineering, Networks and Systems

Vincenzo Massimiliano Gulisano

Chalmers, Computer Science and Engineering, Networks and Systems

Marina Papatriantafilou

Chalmers, Computer Science and Engineering, Networks and Systems

Ashok Chaitanya Koppisetty

Volvo Cars

Future Generation Computer Systems

0167-739X (ISSN)

Vol. 107, pp. 1-17

AutoSPADA (Automotive Stream Processing and Distributed Analytics) OODIDA Phase 2

VINNOVA (2019-05884), 2020-03-12 -- 2022-12-31.

Cloud-based products and production (FiC)

Swedish Foundation for Strategic Research (SSF) (GMT14-0032), 2016-01-01 -- 2020-12-31.

HAREN: Self-deploying and adaptive data streaming analytics in the fog

Swedish Research Council (VR) (2016-03800), 2017-01-01 -- 2020-12-31.

Subject categories

Other Computer and Information Science

Media Engineering

Signal Processing

DOI

10.1016/j.future.2020.01.050

More information

Last updated

2022-07-27