Comparing Word-Based and AST-Based Models for Design Pattern Recognition
Paper in proceedings, 2023

Design patterns (DPs) provide reusable, general solutions to frequently encountered problems. Patterns are important for maintaining the structure and quality of software products, particularly in large, distributed systems such as automotive software. Modern language models (such as Code2Vec or Word2Vec) exhibit a deep understanding of programs, which has been shown to help in tasks such as program repair and program comprehension; they therefore show promise for design pattern recognition (DPR) in industrial contexts. The models are trained in a self-supervised manner on a large unlabelled code base, which allows them to quantify abstract concepts such as programming styles, coding guidelines, and, to some extent, the semantics of programs. This study demonstrates how two language models, Code2Vec and Word2Vec, trained on two public automotive repositories, can separate programs containing specific DPs. The results show that Code2Vec and Word2Vec achieve average F1-scores of 0.781 and 0.690, respectively, on open-source Java programs, showing promise for DPR in practice.
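The abstract reports average F1-scores but does not state how the average is taken. As a minimal sketch, assuming an unweighted (macro) average over design pattern classes, the metric could be computed as follows. The function names and the pattern labels are hypothetical and serve only as illustration; they are not taken from the paper.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def average_f1(y_true, y_pred):
    """Unweighted (macro) mean of per-class F1 (an assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels: one design pattern name per program.
y_true = ["singleton", "singleton", "observer", "observer"]
y_pred = ["singleton", "observer", "observer", "observer"]
macro_f1 = average_f1(y_true, y_pred)
```

A macro average weights every pattern equally regardless of how many programs contain it; a support-weighted average would be an equally plausible reading of "average F1-score".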

Design Patterns

NLP

Programming Language Models

Author

Sivajeet Chand

Student at Chalmers

Sushant Kumar Pandey

Chalmers, Computer Science and Engineering (Chalmers), Interaction Design and Software Engineering

University of Gothenburg

Jennifer Horkoff

Chalmers, Computer Science and Engineering (Chalmers), Interaction Design and Software Engineering

University of Gothenburg

Miroslaw Staron

University of Gothenburg

Chalmers, Computer Science and Engineering (Chalmers), Software Engineering (Chalmers)

M. Ochodek

Adam Mickiewicz University in Poznań

Darko Durisic

Volvo

ACM International Conference Proceeding Series

44-48
9798400703751 (ISBN)

19th International Conference on Predictive Models and Data Analytics in Software Engineering, Co-located with: ESEC/FSE 2023
San Francisco, USA

Subject Categories

Software Engineering

Computer Science

DOI

10.1145/3617555.3617873

More information

Latest update

12/11/2024