Intent-Driven Mobile GUI Testing with Autonomous Large Language Model Agents
Paper in proceedings, 2024

GUI testing checks whether a software system behaves as expected when users interact with its graphical interface, e.g., by testing specific functionality or validating relevant use case scenarios. Currently, deciding what to test at this high level is a manual task, since automated GUI testing tools target lower-level adequacy metrics such as structural code coverage or activity coverage. We propose DroidAgent, an autonomous GUI testing agent for Android, for semantic, intent-driven automation of GUI testing. It is based on Large Language Models and support mechanisms such as long- and short-term memory. Given an Android app, DroidAgent sets relevant task goals and subsequently tries to achieve them by interacting with the app. Our empirical evaluation of DroidAgent using 15 apps from the Themis benchmark shows that it can set up and perform realistic tasks, with a higher level of autonomy. For example, when testing a messaging app, DroidAgent created a second account and added a first account as a friend, testing a realistic use case, without human intervention. On average, DroidAgent achieved 61% activity coverage, compared to 51% for current state-of-the-art GUI testing techniques. Further, manual analysis shows that 317 out of the 547 autonomously created tasks are realistic and relevant to app functionalities, and also that DroidAgent interacts deeply with the apps and covers more features.

large language model

GUI testing

artificial intelligence

software testing

test automation

Authors

Juyeon Yoon

Korea Advanced Institute of Science and Technology (KAIST)

Robert Feldt

Chalmers, Computer Science and Engineering, Software Engineering

Shin Yoo

Korea Advanced Institute of Science and Technology (KAIST)

Proceedings - 2024 IEEE Conference on Software Testing, Verification and Validation, ICST 2024

129-139
9798350308181 (ISBN)

17th IEEE Conference on Software Testing, Verification and Validation, ICST 2024
Toronto, Canada

Automated Testing of Boundaries for Quality of AI/ML Models (AQUAS)

Swedish Research Council (VR) (2020-05272), 2021-01-01 -- 2024-12-31.

Subject Categories

Software Engineering

Computer Vision and Robotics (Autonomous Systems)

DOI

10.1109/ICST60714.2024.00020

More information

Last updated

2024-09-23