Automated Generation and Evaluation of JMH Microbenchmark Suites From Unit Tests

Mostafa Jangali; Yiming Tang; Niclas Alexandersson; Philipp Leitner; Jinqiu Yang; Weiyi Shang

doi:10.1109/TSE.2022.3188005

Journal article, 2023

Performance is a crucial non-functional requirement of many software systems. Despite the widespread use of performance testing, developers still struggle to construct performance tests and to evaluate their quality. To address these two challenges, we implement a framework, dubbed ju2jmh, that automatically generates performance microbenchmarks from JUnit tests, and we use mutation testing to study the quality of the generated microbenchmarks. Specifically, we compare ju2jmh-generated benchmarks with manually written JMH benchmarks, with JMH benchmarks generated automatically by the AutoJMH framework, and with measuring system performance directly through JUnit tests. For this purpose, we conducted a study on three subjects (Rxjava, Eclipse-collections, and Zipkin) with ~454K source lines of code (SLOC), 2,417 JMH benchmarks (including manually written and AutoJMH-generated benchmarks), and 35,084 JUnit tests. Our results show that ju2jmh-generated JMH benchmarks consistently outperform both the use of JUnit test execution time and throughput as a proxy for performance and the JMH benchmarks generated by AutoJMH, while being comparable to JMH benchmarks manually written by developers in terms of stability and ability to detect performance bugs. At the same time, ju2jmh benchmarks cover more of the software under test during microbenchmark execution than manually written JMH benchmarks do. Furthermore, ju2jmh benchmarks are generated automatically, whereas manually written JMH benchmarks require many hours of hard work and attention; our approach can therefore reduce the effort developers spend constructing microbenchmarks. In addition, we identify three factors (too low test workload, unstable tests, and limited mutant coverage) that affect a benchmark's ability to detect performance bugs. To the best of our knowledge, this is the first study aimed at assisting developers with fully automated microbenchmark creation and at assessing microbenchmark quality for performance testing.
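
The idea of reusing a unit-test body as a timed workload, and of judging how stable the resulting measurements are, can be illustrated with a minimal plain-Java sketch. It is not the ju2jmh implementation and does not use JMH. The class `TestAsBenchmarkSketch`, the `sortWorkload` method standing in for a test body, and the choice of relative standard deviation as a stability measure are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.function.IntSupplier;

public class TestAsBenchmarkSketch {

    /** Hypothetical stand-in for the body of a unit test: the workload to time. */
    static int sortWorkload() {
        int[] data = new int[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (i * 31 + 7) % 1_000;
        }
        Arrays.sort(data);
        return data[data.length / 2]; // return a value so the work is not dead code
    }

    /**
     * Runs warmup calls, then returns the average time per call (in ns)
     * for each measured iteration.
     */
    static double[] measure(IntSupplier workload, int warmups, int iterations, int callsPerIteration) {
        long sink = 0; // accumulate results so the calls cannot be optimized away
        for (int i = 0; i < warmups * callsPerIteration; i++) {
            sink += workload.getAsInt();
        }
        double[] samples = new double[iterations];
        for (int it = 0; it < iterations; it++) {
            long start = System.nanoTime();
            for (int c = 0; c < callsPerIteration; c++) {
                sink += workload.getAsInt();
            }
            samples[it] = (System.nanoTime() - start) / (double) callsPerIteration;
        }
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically unreachable; keeps sink observable
        }
        return samples;
    }

    /** Population standard deviation divided by the mean (assumed stability measure). */
    static double relativeStdDev(double[] samples) {
        double mean = 0;
        for (double s : samples) {
            mean += s;
        }
        mean /= samples.length;
        double sumSq = 0;
        for (double s : samples) {
            sumSq += (s - mean) * (s - mean);
        }
        return Math.sqrt(sumSq / samples.length) / mean;
    }

    public static void main(String[] args) {
        double[] samples = measure(TestAsBenchmarkSketch::sortWorkload, 5, 10, 50);
        System.out.printf("mean ns/call per iteration: %s%n", Arrays.toString(samples));
        System.out.printf("relative std dev: %.3f%n", relativeStdDev(samples));
    }
}
```

A hand-rolled loop like this is exposed to JIT and garbage-collection effects that a harness such as JMH is designed to control, which is one reason to generate JMH benchmarks rather than time JUnit tests directly. A very short workload also yields noisy samples, which relates to the "too low test workload" factor named in the abstract.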

Java

Codes

JMH

Time measurement

Computer bugs

performance

Benchmark testing

performance testing

performance mutation testing

Throughput

performance microbenchmarking

Manuals

Authors

Mostafa Jangali

Université Concordia

Yiming Tang

Université Concordia

Niclas Alexandersson

Student at Chalmers

Philipp Leitner

Chalmers, Computer Science and Engineering, Interaction Design and Software Engineering


Jinqiu Yang

Université Concordia

Weiyi Shang

Université Concordia

IEEE Transactions on Software Engineering

0098-5589 (ISSN) 1939-3520 (eISSN)

Vol. 49, Issue 4, pp. 1704-1725

Subject Categories (SSIF 2011)

Computer Science

DOI

10.1109/TSE.2022.3188005


More information

Latest update

2023-07-05
