DegradeFX - Explicating and Measuring Data Degradation Effects on ML
A prototypical self-driving car today can easily generate more than 0.75 GB/s of data from its high-resolution cameras, laser scanners, and radars. Video streams in datasets used for training ML are therefore typically compressed with lossy encoders to reduce the data size. However, recent research has shown that contemporary neural network (NN) architectures are significantly affected by reduced image quality; similarly, it has been shown that even professionally trained NNs such as Google's Cloud Vision API are easily misled when perturbations are added to video feeds.
Hence, using datasets with degraded data quality to train or evaluate ML-based algorithms increases the risk of potentially unexplainable behavior once they are integrated with the real system. Systematically researching this risk is becoming more urgent, as new legislation at the European level is in progress that would bring event (accident) data recorders (EDRs) to all vehicles to record data before, during, and after a collision. Such recorded data might be used, possibly even in court, to clarify liability for collisions in which ML components are involved. Because the enormous amount of data cannot be stored in a lossless format to begin with, a verifiably correct model of the impact of lossy data encoders on ML components is fundamental for system analysis.
The main objective of this research project is to tackle one major problem in the development and deployment of systems containing AI components: systematically quantifying and modeling the impact of degraded data quality, as imposed for example by lossy video streams, on ML-based algorithms. This is intended to enable a systematic performance analysis of safety-critical autonomous systems.
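As a rough illustration of what quantifying such an impact could look like, the sketch below sweeps a degradation level, measures signal quality (PSNR), and records how often a downstream component changes its output. This is not the project's method: `quantize` is a crude stand-in for a lossy encoder, `toy_detector` is a hypothetical placeholder for an ML component, and the synthetic frame and all parameter values are assumptions chosen only for the example.

```python
import math


def quantize(pixels, step):
    """Crude stand-in for a lossy encoder (assumption, not a real codec):
    snap each 8-bit value to the midpoint of its quantization bin.
    With step=1 the data is returned unchanged."""
    return [min(255, (p // step) * step + step // 2) for p in pixels]


def psnr(reference, degraded, peak=255):
    """Peak signal-to-noise ratio in dB; higher means less degradation."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def toy_detector(pixels, threshold=100):
    """Hypothetical placeholder for an ML component: flags 'bright' pixels."""
    return [p >= threshold for p in pixels]


def agreement(reference, degraded):
    """Fraction of outputs that stay the same after degradation."""
    ref, deg = toy_detector(reference), toy_detector(degraded)
    return sum(a == b for a, b in zip(ref, deg)) / len(ref)


def sweep(pixels, steps=(1, 8, 32, 64)):
    """Return (step, PSNR in dB, output agreement) for each degradation level."""
    rows = []
    for step in steps:
        degraded = quantize(pixels, step)
        rows.append((step, psnr(pixels, degraded), agreement(pixels, degraded)))
    return rows


if __name__ == "__main__":
    # Synthetic 8-bit "frame" used only for illustration.
    frame = [(37 * i) % 256 for i in range(1024)]
    for step, quality, agree in sweep(frame):
        print(f"step={step:3d}  PSNR={quality:6.2f} dB  agreement={agree:.3f}")
```

In a real study, the quantizer would be replaced by an actual video encoder at different settings and the toy detector by the ML component under test; the shape of the sweep (degradation level against quality metric and task metric) is the part this sketch is meant to convey.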
Christian Berger (contact)
Associate Professor at Chalmers, Computer Science and Engineering, Software Engineering, Software Engineering for Cyber Physical Systems
Chalmers AI Research Centre
Funding Chalmers participation during 2020–
Related Areas of Advance and infrastructure
Information and Communication Technology