Improving the Readability of Generated Tests Using GPT-4 and ChatGPT Code Interpreter
Paper in proceeding, 2024

A major challenge in automated test generation is the readability of generated tests. Emerging large language models (LLMs) excel at language analysis and transformation tasks. We propose that improving test readability is such a task and explore the capabilities of the GPT-4 LLM in improving the readability of tests generated by the Pynguin search-based generation framework. Our initial results are promising. However, research and technical challenges remain.

Search-Based Test Generation


Large Language Models

Automated Test Generation

Generative AI


Gregory Gay

Chalmers, Computer Science and Engineering (Chalmers), Interaction Design and Software Engineering

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

0302-9743 (ISSN) 1611-3349 (eISSN)

Vol. 14415 LNCS 140-146
9783031487958 (ISBN)

15th International Symposium on Search-Based Software Engineering, SSBSE 2023
San Francisco, USA

Context-Infused Automated Software Test Generation

Swedish Research Council (VR) (2019-05275), 2020-01-01 -- 2023-12-31.

Subject Categories

Language Technology (Computational Linguistics)

Software Engineering

Computer Vision and Robotics (Autonomous Systems)


