Machine learning experiment management tools: a mixed-methods empirical study
Article in scientific journal, 2024

Machine Learning (ML) experiment management tools support ML practitioners and software engineers when building intelligent software systems. By managing large numbers of ML experiments comprising many different ML assets, they facilitate not only engineering ML models and ML-enabled systems, but also managing their evolution, for instance, tracing system behavior to concrete experiments when the model performance drifts. However, while ML experiment management tools have become increasingly popular, little is known about their effectiveness in practice or about their actual benefits and challenges. We present a mixed-methods empirical study of experiment management tools and the support they provide to users. First, our survey of 81 ML practitioners sought to determine the benefits and challenges of ML experiment management and of the existing tool landscape. Second, a controlled experiment with 15 student developers investigated the effectiveness of ML experiment management tools. We learned that 70% of our survey respondents perform ML experiments using specialized tools, while out of those who do not use such tools, 52% are unaware of experiment management tools or of their benefits. The controlled experiment showed that experiment management tools offer valuable support to users to systematically track and retrieve ML assets. Using ML experiment management tools reduced error rates and increased completion rates. By presenting a user’s perspective on experiment management tools, and the first controlled experiment in this area, we hope that our results foster the adoption of these tools in practice and direct tool builders and researchers to improve the tool landscape overall.

Experiment management


Machine learning


Asset management

ML lifecycle


Samuel Idowu

Software Engineering 2

Osman Osman

University of Gothenburg

Daniel Strüber

Chalmers, Computer Science and Engineering, Software Engineering

Radboud Universiteit

Thorsten Berger

Ruhr-Universität Bochum

Software Engineering 2

Empirical Software Engineering

1382-3256 (ISSN) 1573-7616 (eISSN)

Vol. 29, Issue 4, Article 74


Production Engineering, Human Work Science and Ergonomics




