Geometric Supervision and Deep Structured Models for Image Segmentation
Doctoral thesis, 2020
This thesis summarizes the content of five papers addressing the two aforementioned drawbacks of CNNs. The first two papers present methods that use geometric 3D models to improve segmentation models. The 3D models can be created with little human labour and can be used as a supervisory signal to improve the robustness of semantic segmentation and long-term visual localization methods.
The last three papers focus on models combining CNNs and CRFs for semantic segmentation. The models consist of a CNN capable of learning complex image features coupled with a CRF capable of learning dependencies between output variables. Emphasis has been on creating models that can be trained end-to-end, giving the CNN and the CRF a chance to learn how to interact and exploit complementary information to achieve better performance.
conditional random fields
convolutional neural networks
deep structured models
Semantic segmentation
self-supervised learning
supervised learning
semi-supervised learning
Author
Måns Larsson
Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering
A cross-season correspondence dataset for robust semantic segmentation
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2019-June (2019), p. 9524-9534
Paper in proceeding
Fine-Grained Segmentation Networks: Self-Supervised Segmentation for Improved Long-Term Visual Localization
Proceedings of the IEEE International Conference on Computer Vision (2019), p. 31-41
Paper in proceeding
Revisiting Deep Structured Models for Pixel-Level Labeling with Gradient-Based Inference
SIAM Journal on Imaging Sciences, Vol. 11 (2018), p. 2610-2628
Journal article
Max-margin learning of deep structured models for semantic segmentation
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 10270 LNCS (2017), p. 28-40
Paper in proceeding
Robust abdominal organ segmentation using regional convolutional neural networks
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 10270 LNCS (2017), p. 41-52
Paper in proceeding
In Computer Vision the goal is to automatically extract meaningful information from an image, in some way automating tasks that the human visual system can perform. This is applicable in many areas. For example, a self-driving car can use cameras and computer vision to perceive and understand its surroundings. For medical applications, automatic interpretation of images can be helpful for diagnosis or surgery planning.
The topic of this thesis is image segmentation, which aims at understanding an image at the pixel level. The goal is to assign a label to each pixel, describing the object it depicts. In recent years, the dominant approaches to segmentation have been based on deep learning. In deep learning, large models with many parameters, called neural networks, are used to produce a segmentation for an input image. For a model to produce useful results, its parameters need to be learnt from a large dataset of images paired with manually created segmentations.
In this thesis, neural networks are combined with a type of statistical model called a Conditional Random Field (CRF). CRFs are good at modeling dependencies between the output labels of a model, so they can learn dependencies such as "hat pixels are likely to be above face pixels". In addition, methods have been developed that use 3D models to train segmentation methods to be more robust to seasonal changes. Robust segmentation methods are crucial for applications such as self-driving cars, where the system needs to be able to interpret its surroundings reliably during all seasons of the year.
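To make the "hat above face" idea concrete, the following is a minimal sketch of how a CRF scores a labeling, assuming a toy setting of two vertically stacked pixels. The function name `crf_energy`, the label set, and all cost values are hypothetical and chosen only for illustration; they are not taken from the thesis. The unary costs stand in for per-pixel CNN outputs, and the pairwise costs encode which label is plausible directly above which.

```python
# Toy CRF energy on one image column (index 0 is the topmost pixel).
# All names and numbers are illustrative assumptions, not the thesis model.

def crf_energy(labels, unary, pairwise):
    """Return unary costs plus pairwise costs of vertically adjacent pixels.

    labels:   list of label names, top to bottom
    unary:    unary[i][label] = cost of assigning `label` to pixel i
              (a stand-in for what a CNN would provide)
    pairwise: pairwise[(upper, lower)] = cost of `upper` directly above `lower`
    """
    energy = sum(unary[i][label] for i, label in enumerate(labels))
    energy += sum(pairwise[(up, low)] for up, low in zip(labels, labels[1:]))
    return energy

# Uninformative per-pixel costs: on their own they cannot separate the labels.
unary = [{"hat": 1.0, "face": 1.0}, {"hat": 1.0, "face": 1.0}]

# Pairwise costs: "hat above face" is cheap, "face above hat" is expensive.
pairwise = {
    ("hat", "hat"): 0.0,
    ("face", "face"): 0.0,
    ("hat", "face"): 0.5,
    ("face", "hat"): 2.0,
}

hat_over_face = crf_energy(["hat", "face"], unary, pairwise)
face_over_hat = crf_energy(["face", "hat"], unary, pairwise)
print(hat_over_face, face_over_hat)
```

Lower energy means a more plausible labeling, so in this toy setting the pairwise term alone makes "hat above face" preferable to "face above hat", even though the per-pixel costs are identical.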
Perceptron
VINNOVA (2017-01942), 2017-06-01 -- 2019-11-30.
Areas of Advance
Information and Communication Technology
Subject Categories
Computer Vision and Robotics (Autonomous Systems)
ISBN
978-91-7905-294-2
Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 4761
Publisher
Chalmers
online participation
Opponent: Professor M. Pawan Kumar, Department of Engineering Science, University of Oxford