Batch Mode Deep Active Learning for Regression on Graph Data
Paper in proceeding, 2023

Acquiring labelled data for machine learning tasks, such as software performance prediction, remains resource-intensive. This study extends our previous work by introducing a batch-mode deep active learning approach tailored to regression on graph-structured data. Our framework converts source code into Flow-Augmented AST (FA-AST) graphs and then uses both supervised and unsupervised graph embeddings. In contrast to single-instance querying, the batch-mode paradigm adaptively selects clusters of unlabelled data for labelling. We deploy an array of base kernels, kernel transformations, and selection methods, informed by both Bayesian and non-Bayesian strategies, to improve the sample efficiency of neural network regression. Our experimental evaluation on multiple real-world software performance datasets shows that the batch-mode deep active learning approach achieves robust performance with a reduced labelling budget. The methodology scales effectively to larger datasets and requires minimal changes to existing neural network architectures.
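To make the batch-mode idea concrete, the following is a minimal sketch of one common kernel-based selection heuristic: greedy maximum-distance selection in the feature space induced by a kernel. It is an illustration only, not the paper's implementation. The function names `linear_kernel` and `select_batch_max_dist`, the use of a plain linear base kernel, and the random stand-in embeddings are all assumptions made for this example.

```python
import numpy as np


def linear_kernel(features):
    """Linear base kernel k(x, y) = <x, y> on (hypothetical) graph embeddings."""
    return features @ features.T


def select_batch_max_dist(kernel, labelled_idx, batch_size):
    """Greedily pick `batch_size` pool points, each one maximising its
    kernel-induced distance to everything labelled or selected so far.

    kernel       : (n, n) symmetric kernel matrix over labelled + pool points
    labelled_idx : indices of points that already have labels
    """
    labelled_idx = np.asarray(labelled_idx, dtype=int)
    n = kernel.shape[0]
    diag = np.diag(kernel)
    # Squared distance in kernel feature space: k(x,x) + k(y,y) - 2 k(x,y).
    sq_dist = diag[:, None] + diag[None, :] - 2.0 * kernel

    if labelled_idx.size > 0:
        min_dist = sq_dist[:, labelled_idx].min(axis=1)
    else:
        min_dist = np.full(n, np.inf)
    min_dist = min_dist.copy()
    min_dist[labelled_idx] = -np.inf  # never re-select labelled points

    selected = []
    for _ in range(batch_size):
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        # Update each point's distance to the nearest chosen/labelled point.
        min_dist = np.minimum(min_dist, sq_dist[:, nxt])
        min_dist[nxt] = -np.inf
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    embeddings = rng.normal(size=(50, 8))  # stand-in for graph embeddings
    batch = select_batch_max_dist(linear_kernel(embeddings), [0, 1], 5)
    print("indices to label next:", batch)
```

In an active learning loop, the selected indices would be sent for labelling, the regression network retrained, and the embeddings (and hence the kernel) recomputed before the next batch is chosen. Other selection rules, for example ones based on Bayesian posterior variance, would replace only the scoring step in the loop.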

Kernels

Deep Learning

Active Learning

Graph Neural Network

Authors

Peter Samoaa

Chalmers, Computer Science and Engineering, Data Science and AI

Linus Aronsson

Chalmers, Computer Science and Engineering, Data Science and AI

Philipp Leitner

Software Engineering 2

Morteza Haghir Chehreghani

Chalmers, Computer Science and Engineering, Data Science and AI

Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023

5904-5913
9798350324457 (ISBN)

2023 IEEE International Conference on Big Data, BigData 2023
Sorrento, Italy

Developer-focused performance improvement for software engineers

Swedish Research Council (VR) (2018-04127), 2019-01-01 -- 2023-12-31.

Subject Categories

Computer Science

DOI

10.1109/BigData59044.2023.10386685

More information

Last updated

2024-09-19