Data management for production quality deep learning models: Challenges and solutions
Journal article, 2022

Deep learning (DL) based software systems are difficult to develop and maintain in industrial settings due to several challenges. Data management is one of the most prominent of these challenges and complicates DL in industrial deployments. DL models are data-hungry and require high-quality data. Therefore, the volume, variety, velocity, and quality of data cannot be compromised. This study aims to explore the data management challenges encountered by practitioners developing systems with DL components, identify potential solutions from the literature, and validate those solutions through a multiple case study. We identified 20 data management challenges experienced by DL practitioners through a multiple interpretive case study. Further, through a systematic literature review, we identified 48 articles that discuss solutions for these data management challenges. With a second round of the multiple case study, we show that many of these solutions have limitations and are not used in practice due to a combination of four factors: high cost, lack of skill set and infrastructure, inability to solve the problem completely, and incompatibility with certain DL use cases. Thus, data management for data-intensive DL models in production is complicated. Although DL technology has achieved very promising results, there is still a significant need for further research in the field of data management to build high-quality datasets and streams that can be used for building production-ready DL systems. Furthermore, we have classified the data management challenges into four categories based on the availability of solutions.

© 2022 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Deep learning

Production quality DL models

Challenges

Solutions

Data management

Validation

Authors

Aiswarya Raj Munappy

Chalmers, Computer Science and Engineering, Interaction Design and Software Engineering

Jan Bosch

Chalmers, Computer Science and Engineering, Interaction Design and Software Engineering

Helena Holmstrom Olsson

Malmö University

Anders Arpteg

Peltarion AB

Bjoern Brinne

Peltarion AB

Journal of Systems and Software

0164-1212 (ISSN)

Vol. 191, 111359

Subject Categories

Other Computer and Information Science

Software Engineering

Computer Systems

DOI

10.1016/j.jss.2022.111359

More information

Latest update

2022-11-21