Software Logs for Machine Learning in a DevOps Environment
Paper in proceedings, 2020

System logs perform a critical function in software-intensive systems because they record the state of the system and significant events at important points in time. Unfortunately, log entries are typically created in an ad-hoc, unstructured and uncoordinated fashion, which limits their usefulness for analytics and machine learning. In a DevOps environment especially, unmanaged evolution of the log data structure frequently disrupts automated data pipelines, dashboards and analytics. In this paper, we first present the main challenges of contemporary approaches to generating, storing and managing the evolution of system log data for large, complex, software-intensive systems, based on an in-depth case study at a world-leading telecommunications company. Second, we present an approach for generating and managing the evolution of log data in a DevOps environment that does not suffer from these challenges and is optimized for use in machine learning. Third, we validate the approach through expert interviews, which confirm that it addresses the identified challenges and problems.

Keywords

data generation

system logs

DevOps

data preprocessing

machine learning

Authors

Nathan Bosch

Ericsson

Jan Bosch

Chalmers, Computer Science and Engineering (Chalmers), Software Engineering (Chalmers), Software Engineering for Testing, Requirements, Innovation and Psychology

Proceedings - 46th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2020

p. 29-33, art. no. 9226340

46th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2020
Kranj, Slovenia

Subject Categories

Other Computer and Information Science

Computer Science

Computer Systems

DOI

10.1109/SEAA51224.2020.00016

More information

Latest update

12/16/2020