From Ad-Hoc Data Analytics to DataOps
Paper in proceedings, 2020

The collection of high-quality data provides a key competitive advantage to companies in their decision-making process. It helps to understand customer behavior and enables the usage and deployment of new technologies based on machine learning. However, the process from collecting the data to cleaning and processing it for use by data scientists and applications is often manual, non-optimized and error-prone. This increases the time that the data takes to deliver value for the business. To reduce this time, companies are looking into automation and validation of the data processes. Data processes are the operational side of the data analytic workflow.

DataOps, a term recently coined by data scientists, data analysts and data engineers, refers to a general process aimed at shortening the end-to-end data analytic life-cycle time by introducing automation in the data collection, validation, and verification process. Despite its increasing popularity among practitioners, research on this topic has been limited and does not provide a clear definition of the term or of how a data analytic process evolves from ad-hoc data collection to fully automated data analytics as envisioned by DataOps.

This research provides three main contributions. First, utilizing multi-vocal literature, we provide a definition and a scope for the general process referred to as DataOps. Second, based on a case study with a large mobile telecommunication organization, we analyze how multiple data analytic teams evolve their infrastructure and processes towards DataOps. Third, we provide a stairway showing the different stages of the evolution process. With this evolution model, companies can identify the stage to which they belong and can try to move to the next stage by overcoming the challenges they encounter in their current stage.

Agile Methodology

Continuous Monitoring

DevOps

DataOps

Data Pipelines

Data technologies

Authors

Aiswarya Raj Munappy

Chalmers, Computer Science and Engineering, Software Engineering

David Issa Mattos

Chalmers, Computer Science and Engineering, Software Engineering

Jan Bosch

Chalmers, Computer Science and Engineering, Software Engineering

Helena Holmström Olsson

Malmö University

Anas Dakkak

Ericsson AB

Proceedings - 2020 IEEE/ACM International Conference on Software and System Processes, ICSSP 2020

165-174
9781450375122 (ISBN)

2020 IEEE/ACM International Conference on Software and System Processes, ICSSP 2020
Seoul, South Korea

HoliDev - Holistic DevOps Framework

VINNOVA (2017-05218), 2018-01-01 -- 2019-12-31.

Subject Categories

Production Engineering, Human Work Science and Ergonomics

Other Computer and Information Science

Reliability and Quality Engineering

Areas of Advance

Information and Communication Technology

Infrastructure

C3SE (Chalmers Centre for Computational Science and Engineering)

DOI

10.1145/3379177.3388909

More information

Last updated

2023-04-21