Survey: Time-Series Data Preprocessing: A Survey and an Empirical Analysis

Amal Tawalkuli; Bastian Havers; Vincenzo Massimiliano Gulisano; Daniel Kaiser; Thomas Engel

doi:10.1016/j.jer.2024.02.018

Journal article, 2025

Data are collected in a raw state and must undergo a series of preprocessing steps before they can serve as input to Artificial Intelligence (AI) and other applications. The data preprocessing phase is not only necessary to meet input requirements but also effective in improving AI training efficiency and output accuracy. Data preprocessing is a time-consuming and complex phase that lacks a unified and structured approach. We survey data preprocessing techniques under different categories to provide an extended and structured view of the preprocessing relevant to numerical time-series data. We also provide an empirical analysis of the impact of preprocessing techniques on the quality of the data and on the performance of AI algorithms. In addition, we discuss the feasibility of distributing some of the surveyed techniques to the edge. Leveraging edge computing to distribute data preprocessing reduces the workload on central systems, creates more manageable data lakes, reduces the consumption of resources (e.g., energy), and enables EdgeAI.

Data Preprocessing

Data Quality

Authors

Amal Tawalkuli

Université du Luxembourg

Bastian Havers

Networks and Systems

Vincenzo Massimiliano Gulisano

Networks and Systems

Daniel Kaiser

Université du Luxembourg

Thomas Engel

Université du Luxembourg

Journal of Engineering Research

2307-1877 (ISSN) 2307-1885 (eISSN)

Vol. 13, No. 2, pp. 674-711

AutoSPADA (Automotive Stream Processing and Distributed Analytics) OODIDA Phase 2

VINNOVA (2019-05884), 2020-03-12 -- 2022-12-31.

Areas of Advance

Information and Communication Technology

Subject categories (SSIF 2011)

Computer Sciences

DOI

10.1016/j.jer.2024.02.018

More information

Last updated

2025-07-21
