On the Trade-off Between Robustness and Complexity in Data Pipelines
Paper in proceedings, 2021

Data pipelines play an important role throughout the data management process, whether they are used for data analytics or machine learning. Data-driven organizations can use data pipelines to produce good-quality data applications. Moreover, data pipelines ensure end-to-end velocity by automating the processes involved in extracting, transforming, combining, validating, and loading data for further analysis and visualization. However, the robustness of data pipelines is equally important, since unhealthy data pipelines can add more noise to the input data. This paper identifies the essential elements for a robust data pipeline and analyses the trade-off between data pipeline robustness and complexity.

Data quality

Robustness

Complexity

Trade-off

Composite nodes

Authors

Aiswarya Raj Munappy

Testing, Requirements, Innovation and Psychology

Jan Bosch

Testing, Requirements, Innovation and Psychology

Helena Holmström Olsson

Chalmers, Computer Science and Engineering, Software Engineering

International Conference on the Quality of Information and Communications Technology

1865-0929 (ISSN) 1865-0937 (eISSN)

Vol. 1439 401-415

Quality of Information and Communications Technology
Faro, Portugal

Software Engineering for AI/ML/DL

Chalmers AI Research Centre (CHAIR), 2019-11-01 -- 2022-11-01.

Subject categories

Other Computer and Information Science

Bioinformatics (Computational Biology)

Media Engineering

DOI

10.1007/978-3-030-85347-1_29

ISBN

9783030853464

More information

Last updated

2021-10-06