Enhancing Localization, Selection, and Processing of Data in Vehicular Cyber-Physical Systems
Doctoral thesis, 2024
As transmitting all raw data to the cloud for analysis incurs increasing costs and processing latencies, and the edge devices lack the capability to perform all required data analyses, the questions of "Where" and "How" to process "Which" data become paramount and form the foundation of this thesis.
The first part of this thesis gives an outline of my work by introducing relevant background topics, motivating the research questions, and describing the contributions of this thesis. These contributions are then presented in the five chapters that make up the second part. In Chapter A, I present the DRIVEN framework, which consists of a novel lossy online time-series compression algorithm with tunable bounded error for the edge, as part of a pipeline from edge to cloud that includes online data clustering, and I evaluate the tradeoffs between data savings and reduced analysis accuracy caused by lossy compression. In Chapter B, I show how our work on Data Localization helps to quickly and efficiently discover those vehicles in a connected fleet that have data relevant to a user-defined analysis task. Chapter C proposes Ananke, the first forward provenance framework for Stream Processing, which opens a route for selecting relevant data inside the streaming sources that are ubiquitous in Vehicular Cyber-Physical Systems (VCPSs). In Chapter D, I present the Nona framework, which solves the problem of forward provenance for evolving sets of Stream Processing queries and thus allows data selection for modern analysis flows in which queries are constantly altered and redeployed. Finally, in Chapter E, I introduce a comprehensive list of requirements for, and an implementation of, a VCPS learning simulator that enables the efficient evaluation of distributed data analysis algorithms for connected vehicular networks.
This thesis takes significant steps toward utilizing edge resources more efficiently, while also laying the basis for further development of novel distributed data analysis algorithms in VCPSs.
Vehicular Cyber-Physical Systems
Stream Processing
Edge-to-Cloud Continuum
Provenance
Distributed Data Analysis
Author
Bastian Havers
Network and Systems
DRIVEN: A framework for efficient Data Retrieval and clustering in Vehicular Networks
Future Generation Computer Systems, Vol. 107 (2020), p. 1-17
Journal article
Time- and Computation-Efficient Data Localization at Vehicular Networks' Edge
IEEE Access, Vol. 9 (2021), p. 137714-137732
Journal article
Ananke: A Streaming Framework for Live Forward Provenance
Proceedings of the VLDB Endowment, Vol. 14 (2020), p. 391-403
Journal article
Havers, B., Papatriantafilou, M., Gulisano, V.: "Nona: A Framework for Elastic Stream Provenance"
Proposing a framework for evaluating learning strategies in vehicular CPSs
Middleware 2022 Industrial Track - Proceedings of the 23rd International Middleware Conference Industrial Track, Part of Middleware 2022 (2022), p. 22-28
Paper in proceedings
The development of new cars, and especially of new functions inside cars, is increasingly data-driven. While the development of autonomous driving, for example, needs large amounts of data, other applications such as slippery road detection may need data very quickly to issue warnings to other drivers.
All this data originates on modern cars with their hundreds of sensors, including radar, cameras, and soon LiDAR. Through communication with the cloud, these cars form networks with thousands of members that hold valuable data.
Today, this data is usually intermittently stored on the vehicles before it is uploaded to the cloud. There, the data from many cars is pooled, preprocessed, and analyzed simultaneously.
As the amounts of data needed for development and produced by cars grow, more data must be sent from the cars to the cloud, leading to increasing latencies and higher costs for gathering the data, and a higher strain on the cloud for analyzing it.
This approach overlooks a valuable resource in these networks of cars: their on-board computers. In this thesis, I present novel solutions for leveraging this resource to support the cloud in car data analysis, to detect relevant data, summarize it, or perform parts of the analysis even before the data reaches the cloud. This allows sending less data and generating results more quickly, in turn enabling the analysis of even more data with reduced resource use.
AUTOSPADA (Automotive Stream Processing and Distributed Analytics) OODIDA Phase 2
VINNOVA (2019-05884), 2020-03-12 -- 2022-12-31.
BADA - On-board Off-board Distributed Data Analytics
VINNOVA (2016-04260), 2016-12-01 -- 2019-12-31.
Areas of Advance
Information and Communication Technology
Subject Categories
Computer Systems
ISBN
978-91-8103-002-0
Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5460
Publisher
Chalmers
HA2, Hörsalsvägen 4
Opponent: Prof. Nalini Venkatasubramanian, University of California, Irvine, United States of America