A Method for Modeling Data Anomalies in Practice
Paper in proceedings, 2021

As technology has allowed us to collect large amounts of industrial data, it has become critical to analyze and understand that data, in particular to find data anomalies. Anomaly analysis allows a company to detect, analyze, and understand anomalous or unusual data patterns. This activity is important for understanding, for example, deviations in service that may indicate potential problems, or differing customer behavior that may reveal new business opportunities. Much previous work has focused on anomaly detection, in particular using machine learning. Such approaches allow clustering of data patterns by common attributes; although useful, the resulting clusters often do not correspond to the root causes of anomalies, meaning that more manual analysis is needed. In this paper we report on a design science study, conducted with two different teams in a partner company, that focuses on modeling and understanding the attributes and root causes of data anomalies. After iteration, we created general and anomaly-specific UML class diagrams and goal models for each team to capture anomaly details. We use our experiences to create an example taxonomy, classifying anomalies by their root causes, and to create a general method for modeling and understanding data anomalies. This work paves the way for a better understanding of anomalies and their root causes, leading towards creating a training set which may be used for machine learning approaches.

Authors

Jennifer Horkoff

Software Engineering 1

Göteborgs universitet

Miroslaw Staron

Chalmers, Data- och informationsteknik, Software Engineering

Göteborgs universitet

Wilhelm Meding

Ericsson AB

Proceedings - 2021 47th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2021

119-128
9781665427050 (ISBN)

47th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2021
Palermo, Italy

Subject categories (SSIF 2025)

Computer Sciences

DOI

10.1109/SEAA53835.2021.00024

More information

Last updated

2025-06-27