Batch Mode Deep Active Learning for Regression on Graph Data
Paper in proceeding, 2023

Acquiring labelled data for machine learning tasks, such as software performance prediction, remains resource-intensive. This study extends our previous work by introducing a batch-mode deep active learning approach tailored for regression on graph-structured data. Our framework converts source code into flow-augmented abstract syntax tree (FA-AST) graphs and then uses both supervised and unsupervised graph embeddings. In contrast to single-instance querying, the batch-mode paradigm adaptively selects batches of unlabelled data for labelling. We deploy an array of base kernels, kernel transformations, and selection methods, informed by both Bayesian and non-Bayesian strategies, to enhance the sample efficiency of neural network regression. Our experimental evaluation, conducted on multiple real-world software performance datasets, demonstrates that the batch-mode deep active learning approach achieves robust performance with a reduced labelling budget. The methodology scales effectively to larger datasets and requires minimal alterations to existing neural network architectures.
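The abstract does not spell out a specific selection procedure, so the following is only a rough sketch of what one kernel-based batch selection step might look like. It assumes graph embeddings are already available as plain numeric vectors, uses an RBF base kernel, and picks a batch by greedy max-min distance in the kernel-induced feature space. The names `rbf_kernel`, `kernel_distance_sq`, and `select_batch`, the choice of kernel, and the `gamma` parameter are all illustrative assumptions, not the authors' implementation.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """RBF base kernel on embedding vectors (an assumed choice for illustration)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_distance_sq(x, y, kernel):
    """Squared distance in the feature space induced by `kernel`."""
    return kernel(x, x) + kernel(y, y) - 2.0 * kernel(x, y)


def select_batch(pool, labelled, batch_size, kernel=rbf_kernel):
    """Greedily pick `batch_size` pool indices, each time taking the point
    farthest (in kernel distance) from everything labelled or already chosen."""
    if batch_size > len(pool):
        raise ValueError("batch_size exceeds pool size")

    # Distance from each pool point to its nearest labelled point
    # (infinity when no labelled data exists yet).
    min_dist = [
        min((kernel_distance_sq(p, q, kernel) for q in labelled), default=math.inf)
        for p in pool
    ]

    chosen = []
    for _ in range(batch_size):
        candidates = [i for i in range(len(pool)) if i not in chosen]
        best = max(candidates, key=lambda i: min_dist[i])
        chosen.append(best)
        # The newly chosen point now counts as covered; update nearest distances.
        for i, p in enumerate(pool):
            d = kernel_distance_sq(p, pool[best], kernel)
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen
```

In an active learning loop, the returned indices would be sent for labelling (for example, by running the corresponding performance measurements), moved from the pool to the labelled set, and the regression network retrained before the next batch is selected. A purely distance-based criterion like this ignores model uncertainty; it is shown only because it is compact.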

Kernels

Deep Learning

Active Learning

Graph Neural Network

Authors

Peter Samoaa

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Linus Aronsson

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Philipp Leitner

Software Engineering 2

Morteza Haghir Chehreghani

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023

5904-5913
9798350324457 (ISBN)

2023 IEEE International Conference on Big Data, BigData 2023
Sorrento, Italy

ImmeRSEd - Developer-Targeted Performance Engineering for Immersed Release and Software Engineers

Swedish Research Council (VR) (2018-04127), 2019-01-01 -- 2023-12-31.

Subject Categories (SSIF 2011)

Computer Science

DOI

10.1109/BigData59044.2023.10386685

More information

Latest update

9/19/2024