Machine learning experiment management tools: a mixed-methods empirical study
Journal article, 2024

Machine Learning (ML) experiment management tools support ML practitioners and software engineers when building intelligent software systems. By managing large numbers of ML experiments comprising many different ML assets, they facilitate not only the engineering of ML models and ML-enabled systems, but also the management of their evolution, for instance, by tracing system behavior back to concrete experiments when model performance drifts. However, while ML experiment management tools have become increasingly popular, little is known about their effectiveness in practice or about their actual benefits and challenges. We present a mixed-methods empirical study of experiment management tools and the support they provide to users. First, a survey of 81 ML practitioners sought to determine the benefits and challenges of ML experiment management and of the existing tool landscape. Second, a controlled experiment with 15 student developers investigated the effectiveness of ML experiment management tools. We learned that 70% of our survey respondents perform ML experiments using specialized tools, while 52% of those who do not use such tools are unaware of experiment management tools or of their benefits. The controlled experiment showed that experiment management tools offer valuable support to users in systematically tracking and retrieving ML assets: using them reduced error rates and increased completion rates. By presenting a user’s perspective on experiment management tools, and the first controlled experiment in this area, we hope that our results foster the adoption of these tools in practice and direct tool builders and researchers toward improving the tool landscape overall.

Experiment management

Tools

Machine learning

Artifacts

Asset management

ML lifecycle

Author

Samuel Idowu

Software Engineering 2

Osman Osman

University of Gothenburg

Daniel Struber

Chalmers, Computer Science and Engineering (Chalmers), Software Engineering (Chalmers)

Radboud University

Thorsten Berger

Ruhr-Universität Bochum

Software Engineering 2

Empirical Software Engineering

1382-3256 (ISSN) 1573-7616 (eISSN)

Vol. 29, Issue 4, Article 74

Subject Categories

Production Engineering, Human Work Science and Ergonomics

Business Administration

Software Engineering

DOI

10.1007/s10664-024-10444-w

More information

Latest update

6/19/2024