On uncertainty estimation in machine learning
Doctoral thesis, 2024

This thesis explores probabilistic machine learning models, focusing on representing and quantifying uncertainty. The included publications consider a range of topics, united by the common theme of finding methods for representing probability distributions and uncertainty with machine learning models.

The thesis can be divided into three main parts. First, we introduce regularisation methods that improve the convergence of iterated Gaussian smoothers. By interpreting smoothing as an optimisation problem, we develop Levenberg-Marquardt regularisation and line-search methods. These extensions, which require minimal additional computation, provide accurate Gaussian approximations for a richer set of functions and input data compared to existing methods.
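
To make the optimisation view concrete, here is a minimal sketch of a generic Levenberg-Marquardt loop for a nonlinear least-squares problem, written in Python/NumPy. The residual, Jacobian, and damping schedule are illustrative placeholders, not the thesis's actual smoother update, which operates on Gaussian state-space models.

    import numpy as np

    def lm_step(residual, jacobian, x, lam):
        # One Levenberg-Marquardt step for min_x ||residual(x)||^2.
        # lam is the damping factor; lam = 0 recovers Gauss-Newton.
        r = residual(x)
        J = jacobian(x)
        # Damped normal equations: (J^T J + lam I) dx = -J^T r.
        dx = np.linalg.solve(J.T @ J + lam * np.eye(x.size), -J.T @ r)
        return x + dx

    def lm_optimise(residual, jacobian, x0, lam=1.0, n_iter=50):
        # Adapt the damping: shrink it after a successful step,
        # grow it when the cost increases and the step is rejected.
        x = x0
        cost = np.sum(residual(x) ** 2)
        for _ in range(n_iter):
            x_new = lm_step(residual, jacobian, x, lam)
            cost_new = np.sum(residual(x_new) ** 2)
            if cost_new < cost:
                x, cost, lam = x_new, cost_new, lam / 10.0
            else:
                lam *= 10.0
        return x

The damping term plays the same stabilising role as the regularisation in the iterated smoother: it keeps each update inside a region where the local linearisation can be trusted.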

The second part addresses uncertainty estimation directly. It presents a general model-distillation method for ensembles that efficiently condenses the ensembles' knowledge while retaining their ability to decompose prediction uncertainties into epistemic and aleatoric components. The same uncertainty decomposition underpins our proposed generalised active learning formulation, in which unlabelled data points are selected based on the mutual information between model parameters and noisy labels. The resulting active learning method can exploit the trade-off between dataset size and label quality within a fixed annotation budget.
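
As an illustration of the epistemic/aleatoric split, the following Python/NumPy sketch implements the standard entropy-based decomposition for a classification ensemble. The function name is hypothetical, and the mutual-information form (total entropy minus mean member entropy) is the common textbook decomposition; the thesis's generalised formulation with noisy labels builds on the same quantities.

    import numpy as np

    def decompose_uncertainty(member_probs, eps=1e-12):
        # member_probs: (M, K) array of class probabilities from
        # M ensemble members. Returns (total, aleatoric, epistemic)
        # in nats.
        mean_p = member_probs.mean(axis=0)
        # Total uncertainty: entropy of the averaged prediction.
        total = -np.sum(mean_p * np.log(mean_p + eps))
        # Aleatoric: mean entropy of the individual members.
        aleatoric = -np.mean(
            np.sum(member_probs * np.log(member_probs + eps), axis=1)
        )
        # Epistemic: mutual information between parameters and label.
        return total, aleatoric, total - aleatoric

Ranking an unlabelled pool by the epistemic term recovers a BALD-style acquisition rule, which is the standard starting point that the generalised formulation above extends.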

Finally, the third part considers methods for representing and estimating more complex distributions using generative models. We study the theoretical properties of several important parameter estimation methods for unnormalised models, e.g., energy-based models. We prove connections between importance sampling, contrastive divergence, and noise contrastive estimation, thereby establishing a more coherent framework. We also use a technique previously limited to energy-based models to propose an improved sampling method for composed diffusion models. By approximating the energy difference between two samples as a line integral of the score function, we obtain a Metropolis-Hastings-like correction step for score-parameterised diffusion models. This enables us to construct improved MCMC sampling methods for standard diffusion models, where such corrections previously required an energy parameterisation.
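
A rough sketch of the line-integral idea, assuming a straight path between the two samples and a symmetric proposal: since the score is s(x) = grad log p(x), the log-density (negative energy) difference equals the integral of s along any path from x to x', which a trapezoidal rule can approximate from score evaluations alone. All names below are illustrative, and the schedule of integration points is a free choice.

    import numpy as np

    def log_density_diff(score, x0, x1, n_pts=8):
        # Approximate log p(x1) - log p(x0) as a line integral of the
        # score along the straight path x(t) = x0 + t (x1 - x0).
        ts = np.linspace(0.0, 1.0, n_pts)
        dx = x1 - x0
        integrand = np.array([score(x0 + t * dx) @ dx for t in ts])
        # Trapezoidal rule over the unit interval.
        return np.sum((integrand[:-1] + integrand[1:]) / 2) * (ts[1] - ts[0])

    def mh_correct(score, x, proposal, rng):
        # Metropolis-Hastings-like correction using only the score;
        # a symmetric proposal makes the acceptance ratio reduce to
        # the (score-integrated) log-density difference.
        x_prop = proposal(x, rng)
        log_alpha = log_density_diff(score, x, x_prop)
        if np.log(rng.uniform()) < log_alpha:
            return x_prop  # accept
        return x  # reject and keep the current sample

With an energy parameterisation this difference is available exactly; the point of the approximation is that a score-parameterised diffusion model can use the same correction step.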

Probabilistic machine learning, uncertainty estimation

Opponent: Assistant Professor Arno Solin, Aalto University

Author

Jakob Lindqvist

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Machine learning has become deeply integrated into our daily lives, influencing activities from
pedestrian detection in cars to providing answers in online support systems and even creating art.
However, for a machine learning algorithm to be truly effective, accuracy alone is not enough; we
also need to understand the confidence with which it makes predictions.
Yet modern deep learning models are typically large, black-box functions whose outputs are difficult to interpret. This thesis studies an important aspect of this problem, namely how to estimate uncertainty in probabilistic machine learning models.
Fortunately, many machine learning problems naturally lend themselves to a probabilistic approach,
reflecting the inherent uncertainties of real-world scenarios.
Instead of making a single, definitive prediction, a model should ideally forecast a range of possible
outcomes, each with an associated probability. This probabilistic framework not only helps in
modelling uncertainty more effectively but also offers the possibility of improving deep learning
models by combining them with well-known probabilistic methods.
A key challenge is constructing a probabilistic machine learning model that is complex enough to capture the intricacies of the task while remaining mathematically tractable.
This thesis explores various dimensions of this challenge, from techniques for articulating different types of uncertainty to methods for learning these probabilistic models.

Subject categories

Probability Theory and Statistics

ISBN

978-91-8103-044-0

Doctoral theses at Chalmers University of Technology, New series: 5502

Publisher

Chalmers


More information

Last updated

2024-05-29