Rethinking Code Review Workflows with LLM Assistance: An Empirical Study
Paper in proceeding, 2025
Aims: We examine how large language models (LLMs) can support code reviews by addressing common inefficiencies and contextual gaps.
Method: At WirelessCar Sweden AB, we conducted an exploratory field study to identify current challenges, followed by a field experiment with two LLM-assisted review prototypes: one providing upfront, AI-generated reviews and another enabling on-demand interaction. Both used a retrieval-augmented generation pipeline to assemble relevant contextual information.
Results: The field study revealed frequent context switching, insufficient contextual information, and concerns around false positives. In practice, developers generally preferred the AI-led approach, especially for large or unfamiliar pull requests, though preferences varied with codebase familiarity and review risk.
Conclusions: LLM-assisted reviews can reduce cognitive load and improve comprehension, with hybrid proactive/on-demand designs best balancing efficiency, trust, and reviewer control.
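As a rough illustration of the two interaction modes described in the abstract, the sketch below assembles context for a pull request with a toy retrieval step and then builds either an upfront review prompt or an on-demand question prompt. It is not the authors' pipeline: the function names (`retrieve`, `build_prompt`), the bag-of-words similarity used in place of a real embedding model, and the prompt wording are all assumptions made for this example, and no LLM is actually called.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a stand-in for a real embedding model (assumption)."""
    return re.findall(r"[a-z_]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return the names of up to k documents with non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(text))), name)
        for name, text in documents.items()
    ]
    # Highest score first; ties broken by name so the order is deterministic.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [name for score, name in scored[:k] if score > 0]


def build_prompt(diff, documents, question=None, k=2):
    """Build a prompt for an upfront review (no question) or an on-demand query."""
    query = diff if question is None else question + "\n" + diff
    names = retrieve(query, documents, k)
    context = "\n\n".join(f"[{n}]\n{documents[n]}" for n in names)
    if question is None:
        task = "Review this pull request."  # hypothetical prompt wording
    else:
        task = f"Answer the reviewer's question: {question}"
    return f"{task}\n\nContext:\n{context}\n\nDiff:\n{diff}"
```

Calling `build_prompt(diff, documents)` corresponds to the upfront mode, while passing `question=...` corresponds to the on-demand mode; a hybrid design of the kind the conclusions favour would expose both entry points to the reviewer.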
Empirical Software Engineering
Large Language Models
Code Review
Author
Fannar Steinn Aðalsteinsson
Student at Chalmers
WirelessCar Sweden AB
Bjorn Borgar Magnusson
WirelessCar Sweden AB
The Carl von Ossietzky University of Oldenburg
Mislav Milicevic
WirelessCar Sweden AB
Adam Nirving Davidsson
WirelessCar Sweden AB
Chih-Hong Cheng
Chalmers, Computer Science and Engineering (Chalmers), Interaction Design and Software Engineering
The Carl von Ossietzky University of Oldenburg
International Symposium on Empirical Software Engineering and Measurement
19493770 (ISSN) 19493789 (eISSN)
488-497
9798331591472 (ISBN)
Honolulu, USA
Subject Categories (SSIF 2025)
Software Engineering
Embedded Systems
Artificial Intelligence
DOI
10.1109/ESEM64174.2025.00013