D-LeDe: A Data Leakage Detection Method for Automotive Perception Systems
Paper in proceedings, 2025

Data leakage is a common problem that is often overlooked when data is split into training and test sets before training an ML/DL model. Leakage artificially inflates model performance during evaluation, which often leads to erroneous predictions once the model is deployed in real time. However, detecting such leakage is challenging, particularly in object detection for perception systems, where the model is trained on image data. In this study, we conduct a computational experiment to develop a method for detecting data leakage. As a first step, we then conduct an initial evaluation of the method on a public dataset, KITTI, a popular and widely accepted benchmark in the automotive domain. The evaluation results show that our proposed D-LeDe method is able to detect potential data leakage caused by image similarity. To further validate the evaluation outcome, we perform a pairwise image similarity analysis using perceptual hash (pHash) distance.
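The pairwise pHash check mentioned in the abstract can be illustrated with a minimal sketch. This is not the D-LeDe method or the authors' implementation; it only shows the general idea of hashing every image with a DCT-based perceptual hash and flagging train/test pairs whose Hamming distance is small. The function names (`phash`, `hamming`, `find_near_duplicates`), the threshold of 10 bits, the file names, and the assumption that images are already grayscale and resized to 32x32 are all illustrative choices, not taken from the paper.

```python
import math
import random
from statistics import median

SIZE = 32  # assumed input side: images already grayscale and resized to 32x32
KEEP = 8   # side of the low-frequency DCT block that forms the 64-bit hash


def dct_1d(values, keep):
    """First `keep` coefficients of the unnormalised DCT-II of `values`."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(values))
        for k in range(keep)
    ]


def phash(pixels):
    """64-bit perceptual hash of a square grayscale matrix (list of rows)."""
    by_row = [dct_1d(row, KEEP) for row in pixels]        # DCT along each row
    block = [dct_1d(col, KEEP) for col in zip(*by_row)]   # then along columns
    coeffs = [c for line in block for c in line]          # 8x8 low frequencies
    cutoff = median(coeffs)
    bits = 0
    for c in coeffs:
        bits = (bits << 1) | int(c > cutoff)              # 1 if above the median
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def find_near_duplicates(train, test, threshold=10):
    """Cross-split pairs whose hash distance is at or below `threshold` (assumed value)."""
    train_hashes = {name: phash(img) for name, img in train.items()}
    test_hashes = {name: phash(img) for name, img in test.items()}
    return [
        (tr, te, hamming(h_tr, h_te))
        for tr, h_tr in train_hashes.items()
        for te, h_te in test_hashes.items()
        if hamming(h_tr, h_te) <= threshold
    ]


def random_image(seed):
    """Synthetic stand-in for a real frame (hypothetical test data)."""
    rng = random.Random(seed)
    return [[rng.randrange(256) for _ in range(SIZE)] for _ in range(SIZE)]


frame_a, frame_b = random_image(1), random_image(2)
train = {"train_000.png": frame_a}
test = {"test_000.png": [row[:] for row in frame_a],  # exact copy leaked into test
        "test_001.png": frame_b}                      # unrelated frame
print(find_near_duplicates(train, test))
```

In practice the pixel matrices would come from decoding and downscaling real frames, and the distance threshold would need tuning per dataset, since consecutive frames from the same driving sequence can differ by only a few bits.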

KITTI

Automotive Perception Systems

Data Leakage Detection

Cirrus

YOLOv7

Object Detection

Authors

Md Abu Ahammed Babu

University of Gothenburg

Chalmers, Computer Science and Engineering, Interaction Design and Software Engineering

Volvo Group

Sushant Kumar Pandey

Rijksuniversiteit Groningen

Darko Durisic

Volvo Group

Ashok Chaitanya Koppisetty

Volvo Group

Miroslaw Staron

University of Gothenburg

Chalmers, Computer Science and Engineering, Software Engineering

International Conference on Vehicle Technology and Intelligent Transport Systems, VEHITS - Proceedings

2184495X (eISSN)

210-221
9789897587450 (ISBN)

11th International Conference on Vehicle Technology and Intelligent Transport Systems, VEHITS 2025
Porto, Portugal

Subject categories (SSIF 2025)

Computer Graphics and Computer Vision

Computer Sciences

Computer Systems

DOI

10.5220/0013476700003941

More information

Last updated

2025-05-09