Batch Mode Deep Active Learning for Regression on Graph Data

Peter Samoaa; Linus Aronsson; Philipp Leitner; Morteza Haghir Chehreghani

doi:10.1109/BigData59044.2023.10386685

Paper in proceedings, 2023

Acquiring labelled data for machine learning tasks, such as software performance prediction, remains resource-intensive. This study extends our previous work by introducing a batch-mode deep active learning approach tailored for regression on graph-structured data. Our framework converts source code into Flow Augmented-AST (FA-AST) graphs and then uses both supervised and unsupervised graph embeddings. In contrast to single-instance querying, the batch-mode paradigm adaptively selects clusters of unlabelled data for labelling. We deploy an array of base kernels, kernel transformations, and selection methods, informed by both Bayesian and non-Bayesian strategies, to enhance the sample efficiency of neural network regression. Our experimental evaluation, conducted on multiple real-world software performance datasets, demonstrates the efficacy of the batch-mode deep active learning approach in achieving robust performance with a reduced labelling budget. The methodology scales effectively to larger datasets and requires minimal alterations to existing neural network architectures.
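To make the batch-mode idea concrete, the sketch below shows one generic way a batch of unlabelled items can be chosen from their embeddings: greedy farthest-point selection, where each pick is the candidate least covered by what is already labelled or selected. This is an illustration only and is not taken from the paper; the function names, the Euclidean distance, and the toy two-dimensional embeddings are all assumptions, and the paper's own kernels and selection methods may differ.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_batch(pool, labelled, batch_size):
    """Greedy farthest-point batch selection (hypothetical helper).

    pool:      list of embedding vectors for unlabelled graphs
    labelled:  list of embedding vectors for already-labelled graphs
    Returns indices into `pool`, chosen one at a time so that each pick is
    the pool point farthest from everything labelled or picked so far.
    """
    batch_size = min(batch_size, len(pool))
    # Distance from each pool point to its nearest labelled point.
    if labelled:
        min_dist = [min(euclidean(p, q) for q in labelled) for p in pool]
    else:
        min_dist = [math.inf] * len(pool)

    chosen = []
    for _ in range(batch_size):
        # Pick the candidate that is least covered by the current selection.
        idx = max((i for i in range(len(pool)) if i not in chosen),
                  key=lambda i: min_dist[i])
        chosen.append(idx)
        # The new pick now covers its neighbours, so shrink their distances.
        for i, p in enumerate(pool):
            min_dist[i] = min(min_dist[i], euclidean(p, pool[idx]))
    return chosen


# Toy embeddings (assumed values, for illustration only).
pool = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [10.0, 0.0]]
labelled = [[0.0, 0.0]]
batch = select_batch(pool, labelled, 2)
```

In a full active learning loop, the selected items would be labelled, moved from the pool to the labelled set, and the regression model retrained before the next batch is chosen. Selecting for spread in this way avoids spending the labelling budget on near-duplicate items within one batch.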

Kernels

Deep Learning

Active Learning

Graph Neural Network

Authors

Peter Samoaa

Chalmers, Computer Science and Engineering, Data Science and AI

Linus Aronsson

Chalmers, Computer Science and Engineering, Data Science and AI


Philipp Leitner

Software Engineering 2


Morteza Haghir Chehreghani

Chalmers, Computer Science and Engineering, Data Science and AI


Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023

5904-5913
9798350324457 (ISBN)

2023 IEEE International Conference on Big Data, BigData 2023
Sorrento, Italy

Developer-focused performance improvement for software engineers (Utvecklarfokuserad prestandaförbättring för programvaruingenjörer)

Swedish Research Council (VR) (2018-04127), 2019-01-01 -- 2023-12-31.


Subject categories (SSIF 2011)

Computer Science

DOI

10.1109/BigData59044.2023.10386685


More information

Last updated

2024-09-19
