ChatGPT as a Tool for User Story Quality Evaluation: Trustworthy Out of the Box?
Paper in proceeding, 2024

In Agile software development, user stories play a vital role in capturing and conveying end-user needs, prioritizing features, and facilitating communication and collaboration within development teams. However, automated methods for evaluating user stories require training in NLP tools and can be time-consuming to develop and integrate. This study explores using ChatGPT for user story quality evaluation and compares its performance with an existing benchmark. Our study shows that ChatGPT’s evaluation aligns well with human evaluation, and we propose a “best of three” strategy to improve its output stability. We also discuss the concept of trustworthiness in AI and its implications for non-experts using ChatGPT’s unprocessed outputs. Our research contributes to understanding the reliability and applicability of Generative AI in user story evaluation and offers recommendations for future research.

Authors

Krishna Ronanki

Software Engineering 2

University of Gothenburg

Beatriz Cabrero-Daniel

Software Engineering 2

University of Gothenburg

Christian Berger

Software Engineering 2

University of Gothenburg

Lecture Notes in Business Information Processing

1865-1348 (ISSN), 1865-1356 (eISSN)

Vol. 489 LNBIP, pp. 173-181
9783031485497 (ISBN)

Workshops presented at the 23rd International Conference on Agile Software Development, XP 2022, and the 24th International Conference on Agile Software Development, XP 2023
Amsterdam, Netherlands

Subject Categories (SSIF 2025)

Software Engineering

DOI

10.1007/978-3-031-48550-3_17

More information

Latest update

11/13/2025