Improving Annotation Quality and Overcoming Data Scarcity in ML-Based Medical Image Analysis
Licentiate thesis, 2026

Medical images are a crucial part of healthcare, but analyzing them requires the time and effort of trained experts. Machine-learning-based methods have the potential to decrease this workload, but their practical adoption remains challenging. In particular, practitioners often have limited access to training data. Furthermore, labeling the data can be difficult when the assessment is subjective in nature, leading to disagreements among experts. In this thesis, we address these challenges in several ways.


First, we construct a comparison-based image annotation system and evaluate it against standard rating-based annotation in a study with six clinicians, finding that it significantly increases inter-annotator agreement. In follow-up work, we mitigate the increased annotation cost of comparisons by leveraging per-item features such as image content. We introduce GURO, a novel criterion for selecting informative comparisons, and show that incorporating item attributes significantly improves sample efficiency, which makes comparison-based annotation more scalable.
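To illustrate how pairwise comparisons can be turned into an ordering, the following is a minimal sketch that fits a Bradley-Terry model to (winner, loser) pairs. It is not the annotation system or the GURO criterion from the thesis: the function name `bradley_terry`, the toy data, and the choice of the Bradley-Terry model are all assumptions made for illustration only.

```python
def bradley_terry(comparisons, n_items, n_iters=200):
    """Estimate one latent score per item from (winner, loser) pairs.

    Hypothetical helper for illustration. Uses the classic iterative
    (minorization-maximization) updates for the Bradley-Terry model.
    """
    wins = [0.0] * n_items
    games = [[0.0] * n_items for _ in range(n_items)]
    for winner, loser in comparisons:
        wins[winner] += 1.0
        games[winner][loser] += 1.0
        games[loser][winner] += 1.0

    scores = [1.0] * n_items
    for _ in range(n_iters):
        new_scores = []
        for i in range(n_items):
            # Sum only over items that were actually compared with item i.
            denom = sum(
                games[i][j] / (scores[i] + scores[j])
                for j in range(n_items)
                if j != i and games[i][j] > 0
            )
            new_scores.append(wins[i] / denom if denom > 0 else scores[i])
        # Rescale so the scores sum to n_items (the model is scale-invariant).
        total = sum(new_scores)
        scores = [s * n_items / total for s in new_scores]
    return scores


# Toy data (made up): in each pair, the first image was judged "more severe".
toy_comparisons = [
    (2, 1), (2, 1), (1, 2),
    (1, 0), (1, 0), (0, 1),
    (2, 0), (2, 0), (0, 2),
]
toy_scores = bradley_terry(toy_comparisons, n_items=3)
toy_ranking = sorted(range(3), key=lambda i: toy_scores[i], reverse=True)
```

In this sketch, annotators only answer "which of the two is more severe?", and the ordering is recovered from the fitted scores afterwards. Choosing which pairs to ask about is the problem that a selection criterion such as GURO addresses.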


Finally, we compare methods for leveraging radiology reports to train image-only classifiers more efficiently. We find that existing methods are overwhelmingly evaluated on diagnostic labels, overlooking tasks such as prognosis, where the label is less directly correlated with the report. This distinction is important, as we observe that text-supervised models do not show the same benefits over self-supervised models in the non-diagnostic setting. Additionally, we explore the potential of using reports when fine-tuning, a previously neglected aspect, through generalized distillation. We find that this can lead to significant improvements in the data-scarce setting, depending on the task.
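As a rough illustration of generalized distillation, the sketch below computes a student loss that mixes the true label with softened predictions from a teacher that had access to privileged information (here, hypothetically, the radiology report). It follows the commonly cited formulation with an imitation weight and a temperature. The function names, parameter values, and logits are assumptions, not the configuration used in the thesis.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(target_probs, predicted_probs))


def generalized_distillation_loss(student_logits, teacher_logits, label,
                                  imitation=0.5, temperature=2.0):
    """Student loss for one example (hypothetical helper).

    student_logits: output of the image-only model.
    teacher_logits: output of a model that also saw privileged information.
    imitation: weight on matching the teacher (0 means labels only).
    """
    one_hot = [1.0 if k == label else 0.0
               for k in range(len(student_logits))]
    student_probs = softmax(student_logits)
    soft_targets = softmax(teacher_logits, temperature)
    hard_loss = cross_entropy(one_hot, student_probs)
    soft_loss = cross_entropy(soft_targets, student_probs)
    return (1.0 - imitation) * hard_loss + imitation * soft_loss


# Made-up logits for a two-class problem where the true label is class 0.
example_loss = generalized_distillation_loss([2.0, 0.0], [3.0, 0.0], label=0)
```

The student never sees the report at test time. The report only shapes the soft targets used during training, which is why this setup fits an image-only classifier.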


This thesis offers practical guidance for developing medical image models and introduces annotation methods that reduce label disagreement while maintaining low annotation effort.

Text-Supervised Learning

Pairwise Comparisons

Medical Imaging

Computer Vision

Machine Learning

Privileged Information

Label Quality

Self-Supervised Learning

Ordering

Subjective Annotation

EDIT Room Analysen
Opponent: Filip Malmberg

Author

Herman Bergström

Chalmers, Computer Science and Engineering, Data Science and AI

Akram Abawi, Herman Bergström, Hanna Tärnåsen, Ida Häggström, Mats Lidén - A relative scoring annotation system provides higher quality labels for medical image machine learning.

Active preference learning for ordering items in- and out-of-sample

Advances in Neural Information Processing Systems, Vol. 37 (2024)

Paper in proceedings

Herman Bergström, Zhonqi Yue, Fredrik D. Johansson - When are radiology reports useful for training medical image classifiers?

Subject categories (SSIF 2025)

Computer graphics and computer vision

Computer science

Infrastructure

C3SE (-2020, Chalmers Centre for Computational Science and Engineering)

Publisher

Chalmers

More information

Created

2026-01-12