Bayesian Data Analysis in Empirical Software Engineering Research

Carlo A Furia; Robert Feldt; Richard Torkar

doi:10.1109/TSE.2019.2935974

Journal article, 2021

Statistics comes in two main flavors: frequentist and Bayesian. For historical and technical reasons, frequentist statistics have traditionally dominated empirical data analysis, and certainly remain prevalent in empirical software engineering. This situation is unfortunate because frequentist statistics suffer from a number of shortcomings (such as lack of flexibility and results that are unintuitive and hard to interpret) that curtail their effectiveness when dealing with the heterogeneous data that is increasingly available for empirical analysis of software engineering practice. In this paper, we pinpoint these shortcomings and present Bayesian data analysis techniques that provide tangible benefits, as they can provide clearer results that are simultaneously robust and nuanced. After a short, high-level introduction to the basic tools of Bayesian statistics, we present the reanalysis of two empirical studies on the effectiveness of automatically generated tests and the performance of programming languages, respectively. By contrasting the original frequentist analyses with our new Bayesian analyses, we demonstrate the concrete advantages of the latter. To conclude, we advocate a more prominent role for Bayesian statistical techniques in empirical software engineering research and practice.

statistical hypothesis testing

statistical analysis

Bayesian data analysis

empirical software engineering

Authors

Carlo A Furia

Università della Svizzera italiana

Robert Feldt

Chalmers, Computer Science and Engineering, Software Engineering

Richard Torkar

University of Gothenburg

IEEE Transactions on Software Engineering

0098-5589 (ISSN), 1939-3520 (eISSN)

Vol. 47, no. 9, pp. 1786-1810

Subject categories (SSIF 2011)

Other Computer and Information Science

Software Engineering

Computer Science

DOI

10.1109/TSE.2019.2935974

More information

Last updated

2022-04-05
