Automated Boundary Identification for Machine Learning Classifiers

Felix Dobslaw; Robert Feldt

doi:10.1145/3643659.3643927

Paper in proceeding, 2024

AI and Machine Learning (ML) models are increasingly used as (critical) components in software systems, even safety-critical ones. This puts new demands on the degree to which we need to test them and requires new and expanded testing methods. Recent boundary-value identification methods have been developed and shown to automatically find boundary candidates for traditional, non-ML software: pairs of nearby inputs that result in (highly) differing outputs. These can be shown to developers and testers, who can judge whether the boundary is where it is supposed to be. Here, we explore how this method can identify decision boundaries of ML classification models. The resulting ML Boundary Spanning Algorithm (ML-BSA) is a search-based method extending previous work in two main ways. We empirically evaluate ML-BSA on seven ML datasets and show that it better spans, and thus better identifies, the entire classification boundary(ies). The diversity objective helps spread out the boundary pairs more broadly and evenly. This, we argue, can help testers and developers better judge where a classification boundary actually is, compare it to expectations, and then focus further testing, validation, and even further training and model refinement on parts of the boundary where behaviour is not ideal.
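
To make the notion of a boundary candidate concrete, the following is a minimal sketch of how one such pair (two nearby inputs that a classifier labels differently) could be located by plain bisection. It is not ML-BSA and does not include a diversity objective; the names `toy_classifier` and `bisect_boundary_pair`, the tolerance, and the unit-circle model are all hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]


def toy_classifier(x: Sequence[float]) -> int:
    """Hypothetical stand-in model: class 1 inside the unit circle, else 0."""
    return 1 if x[0] ** 2 + x[1] ** 2 < 1.0 else 0


def bisect_boundary_pair(
    classify: Callable[[Sequence[float]], int],
    a: Sequence[float],
    b: Sequence[float],
    tol: float = 1e-6,
) -> Tuple[Point, Point]:
    """Shrink the segment a-b until both ends are within `tol` of each other
    while still receiving different labels (an assumed, simplified search)."""
    a, b = tuple(a), tuple(b)
    label_a = classify(a)
    if label_a == classify(b):
        raise ValueError("endpoints must receive different labels")
    while math.dist(a, b) > tol:
        mid = tuple((ai + bi) / 2 for ai, bi in zip(a, b))
        # Keep the half of the segment whose ends are still labelled differently.
        if classify(mid) == label_a:
            a = mid
        else:
            b = mid
    return a, b


# Usage: start from one point inside and one point outside the circle.
inside, outside = bisect_boundary_pair(toy_classifier, (0.0, 0.0), (2.0, 0.0))
print(inside, outside)
```

The returned pair straddles the decision boundary near (1, 0). Repeating the search from many different start pairs would give a crude picture of where the boundary lies, which is the kind of information the abstract suggests presenting to testers and developers.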

Authors

Felix Dobslaw

Mittuniversitetet

Robert Feldt

Chalmers, Computer Science and Engineering, Software Engineering


2024 IEEE/ACM INTERNATIONAL WORKSHOP ON SEARCH-BASED AND FUZZ TESTING, SBFT 2024

1-8
979-8-4007-0562-5 (ISBN)

17th IEEE/ACM International Workshop on Search-Based and Fuzz Testing (SBFT)
Lisbon, Portugal

Subject Categories (SSIF 2011)

Software Engineering

Computer Science

DOI

10.1145/3643659.3643927


More information

Last updated

2024-11-08
