Understanding and Evaluating Chatbot Interactions in Software Engineering
Licentiate thesis, 2025

Chatbots have long been used in software engineering. Initially, they were based on basic commands. Artificial Intelligence then introduced components such as Natural Language Understanding (NLU), which made the chatbot architecture slightly more complex but enabled chatbots to automate simple tasks, such as closing issues on GitHub, and to retrieve information and documentation. The emergence of Large Language Models (LLMs) unlocked many further possibilities: chatbots gained extensive knowledge and context awareness, and became able to perform complex tasks and make decisions during the software development process. Consequently, chatbots could assist in requirements elicitation, code generation, and even the analysis of software monitoring logs. This enabled software engineers to explore more possibilities, in particular automating complex tasks with LLM chatbots while interacting with them as they would with traditional chatbots. However, this created new challenges that need to be addressed, for example, hallucinated requirements or vulnerable code. Consequently, human factors such as trust began to erode slowly. In this thesis, I argue that to use chatbots for the right use cases, we need to understand the interactions with them, including their usage and conversational flow. In addition, the evaluation of chatbots (both NLU-based and LLM-based) should go beyond their performance and focus on the value that they bring to software engineers through their interactions. Using empirical methods in four observational and experimental studies, I present an analysis of the characteristics of interactions with NLU and LLM chatbots in comparison with those with human developers. NLU chatbots are used as tools, for which reliability is an evaluation criterion that complements performance. In contrast, interactions with LLM chatbots are more complex and are affected by many factors, which I introduce in a personal experience framework.
In addition, I show how different dimensions of productivity are affected depending on whether the chatbot is used to provide guidance, manipulate artifacts, or learn new concepts. Moreover, since prompt programming is commonly used to enhance the outcome of the interactions, I show that certain prompt techniques improve code generation, but that their overall impact remains limited. This thesis therefore guides chatbot designers in enhancing chatbots’ ability to communicate in order to improve the user’s personal experience. It also urges practitioners to adapt their use of chatbots, collaborating with them rather than using them as automation tools. Finally, it encourages researchers to investigate effective ways to implement collaboration with chatbots at different stages of the software development lifecycle.

Software Engineering

Human-AI Collaboration

Chatbots

Natural Language Understanding

Large Language Models

Author

Ranim Khojah

Chalmers, Computer Science and Engineering, Interaction Design and Software Engineering

Beyond Code Generation: An Observational Study of ChatGPT Usage in Software Engineering Practice

Proceedings of the ACM on Software Engineering, Vol. 1 (2024), p. 1819-1840

Journal article

From Human-to-Human to Human-to-Bot Conversations in Software Engineering

AIware 2024 - Proceedings of the 1st ACM International Conference on AI-Powered Software, Co-located with: ESEC/FSE 2024 (2024), p. 38-44

Paper in proceedings

Khojah, R., de Oliveira Neto, F. G., Mohamad, M., Leitner, P., The Impact of Prompt Programming on Function-Level Code Generation.

Evaluating N-best Calibration of Natural Language Understanding for Dialogue Systems

Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2022 (2025), p. 582-594

Paper in proceedings

Infrastructure

C3SE (-2020, Chalmers Centre for Computational Science and Engineering)

Subject categories (SSIF 2025)

Other Engineering and Technologies

Computer and Information Sciences (Computer Engineering)

Publisher

Chalmers

More information

Last updated

2025-03-06