An empirical investigation of challenges of specifying training data and runtime monitors for critical software with machine learning and their relation to architectural decisions
Journal article, 2024

Developing and operating critical software that contains machine learning (ML) models requires diligence and established processes. In particular, the training data used during the development of ML models has a major influence on the later behaviour of the system. Runtime monitors are used to provide guarantees for that behaviour; for example, they check that the data at runtime is compatible with the data used to train the model. As a first step towards identifying challenges in specifying requirements for training data and runtime monitors, we conducted and thematically analysed ten interviews with practitioners who develop ML models for critical applications in the automotive industry. We identified 17 themes describing the challenges and classified them into six challenge groups. In a second step, we found interconnections between the challenge themes through an additional semantic analysis of the interviews. We then explored how the identified challenge themes and their interconnections can be mapped to different architecture views. This step involved identifying relevant architecture views, such as data, context, hardware, AI model, and functional safety views, that can address the identified challenges. The article presents a list of the identified underlying challenges, the relations identified between them, and a mapping to architecture views. The intention of this work is to highlight once more that requirement specifications and system architecture are interlinked, even for AI-specific specification challenges such as specifying requirements for training data and runtime monitoring.

Authors

Hans-Martin Heyn

Software Engineering 1

Eric Knauss

Chalmers, Computer Science and Engineering (Chalmers), Interaction Design and Software Engineering

Iswarya Malleswaran

Shruthi Dinakaran

Requirements Engineering

0947-3602 (ISSN) 1432-010X (eISSN)

Vol. 29, Issue 1, pp. 97-117

Very Efficient Deep Learning in IOT (VEDLIoT)

European Commission (EC) (EC/H2020/957197), 2020-11-01 -- 2023-10-31.

Areas of Advance

Information and Communication Technology

Subject Categories

Computer and Information Science

DOI

10.1007/s00766-024-00415-4

Related datasets

Replication Data for: An investigation of challenges encountered when specifying training data and runtime monitors for safety critical ML applications [dataset]

URI: https://doi.org/10.7910/DVN/WJ8TKY

DOI: 10.7910/DVN/WJ8TKY

More information

Created

11/15/2024