Evaluating the layout quality of UML class diagrams using machine learning
Journal article, 2022

UML is the de facto standard notation for graphically representing software. UML diagrams are used in the analysis, construction, and maintenance of software systems. UML diagrams mostly capture an abstract view of (part of) a software system. A key purpose of UML diagrams is to share knowledge about the system among developers. The quality of the layout of UML diagrams plays a crucial role in their comprehension. In this paper, we present an automated method for evaluating the layout quality of UML class diagrams. We use machine learning based on features extracted from the class diagram images using image processing.

Such an automated evaluator has several uses: (1) From an industrial perspective, this tool could be used for automated quality assurance of class diagrams (e.g., as part of a quality monitor integrated into a DevOps toolchain). For example, automated feedback can be generated once a UML diagram is checked into the project repository. (2) In an educational setting, the evaluator can grade the layout aspect of student assignments in courses on software modeling, analysis, and design. (3) In the field of algorithm design for graph layouts, our evaluator can assess the layouts generated by such algorithms. In this way, the evaluator paves the way for using machine learning to learn good layout algorithms.

Approach: We use machine learning techniques to build (linear) regression models based on features extracted from the class diagram images using image processing. As ground truth, we use a dataset of more than 600 UML class diagrams whose layout quality was manually labeled by experts.

Contributions: This paper makes the following contributions: (1) We show the feasibility of the automatic evaluation of the layout quality of UML class diagrams. (2) We analyze which features of UML class diagrams are most strongly related to the quality of their layout. (3) We evaluate the performance of our layout evaluator. (4) We offer a dataset of labeled UML class diagrams. In this dataset, we supply the following information for every diagram: (a) a manually established ground truth of the quality of the layout, (b) an automatically established value for the layout quality of the diagram (produced by our classifier), and (c) the values of key features of the layout of the diagram (obtained by image processing). This dataset can be used to replicate our study and by others to build on and improve this work.

Editor's note: Open Science material was validated by the Journal of Systems and Software Open Science Board.
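
As a rough illustration of the regression step described in the abstract, the sketch below fits an ordinary least-squares linear model that maps layout features to a quality score. It is not the authors' implementation: the feature names (`line_crossings`, `overlap_ratio`), the toy numbers, and the helper functions are all hypothetical, and the scores are synthetic values generated from a known linear rule rather than data from the paper's dataset.

```python
from typing import List, Sequence


def fit_linear_regression(X: Sequence[Sequence[float]],
                          y: Sequence[float]) -> List[float]:
    """Ordinary least squares via the normal equations.

    Returns [intercept, w1, ..., wk].
    """
    rows = [[1.0, *map(float, r)] for r in X]  # prepend a bias column
    n = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-12:
            raise ValueError("features are linearly dependent")
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]


def predict(weights: Sequence[float], features: Sequence[float]) -> float:
    """Apply the fitted model to one feature vector."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


# Hypothetical features per diagram: (line_crossings, overlap_ratio).
X = [[0, 0.0], [2, 0.1], [4, 0.0], [1, 0.5], [6, 0.2], [3, 0.4]]
# Synthetic "expert" scores, generated as 5 - 0.5*crossings - 2*overlap.
y = [5.0, 3.8, 3.0, 3.5, 1.6, 2.7]

weights = fit_linear_regression(X, y)
print("weights:", weights)
print("predicted score:", predict(weights, [5, 0.3]))
```

In practice the feature vectors would come from an image-processing step and the targets from the expert labels; a library such as scikit-learn would normally replace the hand-written solver.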

Quality of layout

Machine learning

Quality of UML class diagrams

Authors

Gustav Bergström

University of Gothenburg

Fadhl Mohammad Omar Hujainah

Volvo Cars

Chalmers, Computer Science and Engineering, Interaction Design and Software Engineering

Truong Ho-Quang

Volvo Cars

Chalmers, Computer Science and Engineering, Software Engineering

Rodi Jolak

University of Gothenburg

Volvo Cars

Satrio Adi Rukmono

Institut Teknologi Bandung

Technische Universiteit Eindhoven

Arif Nurwidyantoro

Monash University

Michel Chaudron

Technische Universiteit Eindhoven

University of Gothenburg

Journal of Systems and Software

0164-1212 (ISSN)

Vol. 192 111413

Subject categories

Production Engineering, Human Work Science and Ergonomics

Other Computer and Information Science

Reliability and Quality Engineering

Software Engineering

Systems Science

DOI

10.1016/j.jss.2022.111413

Related datasets

Replication Package for "Evaluating the layout quality of UML class diagrams using machine learning" [dataset]

DOI: 10.5281/zenodo.6645684

More information

Last updated

2023-09-21