Ground truth deficiencies in software engineering: when codifying the past can be counterproductive
Journal article, 2022

Many software engineering tools build and evaluate their models based on historical data to support development and process decisions. These models help us answer numerous interesting questions, but have their own caveats. In a real-life setting, the objective function of human decision-makers for a given task might be influenced by a whole host of factors that stem from their cognitive biases, subverting the ideal objective function required for an optimally functioning system. Relying on this data as ground truth may give rise to systems that end up automating software engineering decisions by mimicking past sub-optimal behaviour. We illustrate this phenomenon and suggest mitigation strategies to raise awareness.


Software engineering

Task analysis

Machine learning


Data models

Computer bugs


Eray Tuzun

Bilkent University

Hakan Erdogmus

Carnegie Mellon University (CMU)

Maria Teresa Baldassarre

University of Bari Aldo Moro

Michael Felderer

University of Innsbruck

Robert Feldt

Chalmers, Computer Science and Engineering, Software Engineering

Burak Turhan

Monash University

IEEE Software

0740-7459 (ISSN)

Vol. 39, Issue 3, p. 85-95

Subject Categories

Software Engineering

Information Science

Computer Science


