Improving Annotation Quality and Overcoming Data Scarcity in ML-Based Medical Image Analysis
Licentiate thesis, 2026
First, we construct a comparison-based image annotation system and evaluate it against standard rating-based annotation in a study with six clinicians, finding that it significantly increases inter-annotator agreement. In follow-up work, we mitigate the increased annotation cost of comparisons by leveraging per-item features such as image content. We introduce GURO, a novel criterion for selecting informative comparisons, and show that incorporating item attributes significantly improves sample efficiency, making comparison-based annotation more practical at scale.
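To illustrate how pairwise comparisons can be turned into an ordering of items, the following is a minimal sketch that fits a Bradley-Terry model by gradient ascent. It is an assumption made for illustration only: it is not the annotation system or the GURO criterion from the thesis, and the function names, learning rate, and regularization strength are hypothetical.

```python
import math


def fit_bradley_terry(n_items, comparisons, lr=0.1, epochs=500, l2=1e-3):
    """Estimate one latent score per item from (winner, loser) index pairs.

    Illustrative sketch only; hyperparameters are arbitrary assumptions.
    """
    scores = [0.0] * n_items
    for _ in range(epochs):
        # L2 penalty keeps scores finite when an item never loses.
        grad = [-l2 * s for s in scores]
        for winner, loser in comparisons:
            # Modelled probability that `winner` is ranked above `loser`.
            p_win = 1.0 / (1.0 + math.exp(-(scores[winner] - scores[loser])))
            grad[winner] += 1.0 - p_win
            grad[loser] -= 1.0 - p_win
        scores = [s + lr * g for s, g in zip(scores, grad)]
    return scores


def ranking(scores):
    """Return item indices sorted from highest to lowest estimated score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Hypothetical annotations: each tuple is (preferred item, other item).
    toy_comparisons = [(0, 1), (0, 2), (1, 2), (0, 1), (1, 2)]
    print(ranking(fit_bradley_terry(3, toy_comparisons)))
```

In this sketch every comparison is treated as equally informative; an active selection criterion would instead choose which pair to show the annotator next.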
Finally, we compare methods for leveraging radiology reports to train image-only classifiers more efficiently. We find that existing methods are overwhelmingly evaluated on diagnostic labels, overlooking tasks such as prognosis, where the label is less directly correlated with the report. This distinction is important, as we observe that text-supervised models do not show the same benefits over self-supervised models in the non-diagnostic setting. Additionally, we explore the potential of using reports when fine-tuning, a previously neglected aspect, through generalized distillation. We find that this can lead to significant improvements in the data-scarce setting, depending on the task.
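As a rough illustration of generalized distillation, the sketch below computes a per-example student loss that mixes the ground-truth label with softened predictions from a teacher that had access to privileged information (here, hypothetically, the report). The formulation follows the commonly used one, a convex combination of two cross-entropy terms; the function names, the imitation weight, and the temperature are assumptions and need not match the setup used in the thesis.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with optional temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))


def generalized_distillation_loss(student_logits, teacher_logits, label,
                                  imitation=0.5, temperature=2.0):
    """Mix the hard-label loss with a loss against softened teacher outputs.

    `imitation` (assumed name) weights the teacher term; 0 ignores the teacher.
    """
    student_probs = softmax(student_logits)
    hard_target = [1.0 if k == label else 0.0 for k in range(len(student_logits))]
    soft_target = softmax(teacher_logits, temperature)
    return ((1.0 - imitation) * cross_entropy(hard_target, student_probs)
            + imitation * cross_entropy(soft_target, student_probs))


if __name__ == "__main__":
    # Hypothetical logits for a two-class task; the teacher saw the report.
    print(generalized_distillation_loss([2.0, 0.0], [3.0, 0.0], label=0))
```

Setting `imitation` to 0 recovers ordinary supervised training of the image-only student, which makes the contribution of the teacher term easy to isolate in an ablation.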
This thesis offers practical guidance for developing medical image models and introduces annotation methods that reduce label disagreement while maintaining low annotation effort.
Text-Supervised Learning
Pairwise Comparisons
Medical Imaging
Computer Vision
Machine Learning
Privileged Information
Label Quality
Self-Supervised Learning
Ordering
Subjective Annotation
Author
Herman Bergström
Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI
Akram Abawi, Herman Bergström, Hanna Tärnåsen, Ida Häggström, Mats Lidén - A relative scoring annotation system provides higher quality labels for medical image machine learning.
Active preference learning for ordering items in- and out-of-sample
Advances in Neural Information Processing Systems, Vol. 37 (2024)
Paper in proceeding
Herman Bergström, Zhonqi Yue, Fredrik D. Johansson - When are radiology reports useful for training medical image classifiers?
Subject Categories (SSIF 2025)
Computer graphics and computer vision
Computer Sciences
Infrastructure
C3SE (-2020, Chalmers Centre for Computational Science and Engineering)
Publisher
Chalmers
EDIT Room Analysen
Opponent: Filip Malmberg