Predicting build outcomes in continuous integration using textual analysis of source code commits
Paper in proceedings, 2022

Machine learning is increasingly used to solve software engineering tasks. One example is predicting the outcome of builds in continuous integration, where a classifier is trained to predict whether new code commits will compile successfully. The aim of this study is to investigate the effectiveness of fifteen software metrics in building a classifier for build outcome prediction. Specifically, we conducted an experiment comparing the effectiveness of one line-level metric and fourteen traditional software metrics on 49,040 build records from 117 Java projects. We achieved an average precision of 91% and recall of 80% when using the line-level metric for training, compared to 90% precision and 76% recall for the next best traditional software metric. In contrast, file-level metrics yielded a higher predictive quality for the failed builds (average MCC for the best software metric = 68%) than the line-level metric (average MCC = 16%). We conclude that file-level metrics are better predictors of build outcomes for the failed builds, whereas the line-level metric is a slightly better predictor of passed builds.
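The abstract reports precision, recall, and the Matthews correlation coefficient (MCC) side by side. As a minimal sketch of how these three measures relate, the Python snippet below computes them from a single binary confusion matrix. The function name and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the paper's data, and how the paper averages its scores across projects is not stated here.

```python
import math


def precision_recall_mcc(tp, fp, fn, tn):
    """Return (precision, recall, mcc) for one binary confusion matrix.

    tp/fp/fn/tn are counts with respect to the chosen positive class.
    Undefined ratios (zero denominators) are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc


# Hypothetical, imbalanced example: 900 passed builds and 100 failed builds,
# with "passed" treated as the positive class.
tp, fn = 720, 180  # passed builds predicted as passed / as failed
fp, tn = 70, 30    # failed builds predicted as passed / as failed

precision, recall, mcc = precision_recall_mcc(tp, fp, fn, tn)
print(f"precision={precision:.2f} recall={recall:.2f} mcc={mcc:.2f}")
```

With these made-up counts, precision and recall for the majority class are high while MCC stays close to zero, because MCC also accounts for how well the minority class is separated. This is one reason the three measures can rank metrics differently on imbalanced build data.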

Build Prediction

Continuous Integration

Textual Analysis

Authors

Khaled Al Sabbagh

University of Gothenburg

Miroslaw Staron

University of Gothenburg

Regina Hebig

University of Gothenburg

PROMISE 2022 - Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering, co-located with ESEC/FSE 2022

42-51
9781450398602 (ISBN)

18th ACM International Conference on Predictive Models and Data Analytics in Software Engineering, PROMISE 2022, co-located with the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2022
Singapore, Singapore

Subject Categories (SSIF 2011)

Software Engineering

Building Technologies

Computer Systems

DOI

10.1145/3558489.3559070

More information

Latest update

10/26/2023