Deep Learning For Model-Based Multi-Object Tracking
Doctoral thesis, 2023

Multi-object tracking (MOT) is the task of estimating the states of multiple objects from noisy sensor measurements. MOT is essential in applications such as pedestrian monitoring, vehicle tracking, and animal behavior analysis. It can be broadly divided into two categories, model-free and model-based, depending on whether accurate and tractable models of the measurement sensor and the objects' dynamics are available.

In model-based MOT, closed-form, Bayes-optimal solutions can be derived for certain model families. These solutions achieve the best possible performance in expectation, but become intractable as the time-horizon increases due to an exponential growth in the number of terms. Approximations are necessary to make these methods feasible, but they result in performance degradation for challenging tracking tasks.
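One source of this growth is data association: measurements do not say which object produced them, so every consistent assignment has to be considered. The sketch below counts assignments under assumptions that are not stated in this abstract: a known, fixed number of objects, each object producing at most one measurement per scan, each measurement coming from at most one object (the rest being clutter), and no pruning or merging of hypotheses. The function names are illustrative only.

```python
from math import comb, factorial


def single_scan_hypotheses(n_objects: int, n_measurements: int) -> int:
    """Count the ways to associate objects and measurements in one scan.

    For k detected objects: choose which k objects were detected, choose
    which k measurements they produced, and match them in k! ways.
    All remaining measurements are treated as clutter.
    """
    max_detected = min(n_objects, n_measurements)
    return sum(
        comb(n_objects, k) * comb(n_measurements, k) * factorial(k)
        for k in range(max_detected + 1)
    )


def sequence_hypotheses(n_objects: int, n_measurements: int, n_steps: int) -> int:
    """Count association histories over n_steps scans (no pruning assumed).

    Every scan multiplies the number of histories by the single-scan
    count, which gives exponential growth in the time horizon.
    """
    return single_scan_hypotheses(n_objects, n_measurements) ** n_steps


if __name__ == "__main__":
    print(single_scan_hypotheses(2, 2))  # 7
    for steps in (1, 5, 10):
        print(steps, sequence_hypotheses(3, 3, steps))
```

Even with three objects and three measurements per scan there are 34 single-scan hypotheses, so the number of association histories is 34 raised to the number of time steps. Practical trackers therefore prune or merge hypotheses, which is the kind of approximation referred to above.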

The main objective of this thesis is to use deep learning (DL) to address this limitation. The approach taken is to treat MOT as a sequence-to-sequence learning task, devising methods that learn to map measurement sequences to state estimates directly. This perspective frees methods from the need to explicitly consider all possible associations between objects and measurements, thereby side-stepping the intractability of traditional approaches. Furthermore, the available models of the environment are leveraged to generate unlimited synthetic data. This is used to train modern DL architectures that excel in the regime of big data, unlocking their power to reason about complicated and long-term temporal interactions in their inputs.
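To make the synthetic-data idea concrete, here is a minimal sketch of a generator that draws (measurement sequence, ground-truth state sequence) training pairs from a motion model and a measurement model. Everything specific in it is an assumption made for illustration and not taken from the thesis: a fixed number of objects with no births or deaths, a nearly constant-velocity motion model in 2D, position-only measurements with Gaussian noise, a constant detection probability, and Poisson-distributed uniform clutter. The names `simulate_scene` and `sample_poisson` are hypothetical.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson-distributed count (Knuth's method, fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_scene(n_objects=3, n_steps=20, dt=1.0, p_detect=0.9,
                   clutter_rate=2.0, meas_std=0.1, accel_std=0.05,
                   area=10.0, seed=None):
    """Return (measurements, truth) for one synthetic scene.

    measurements[t] is an unordered list of (x, y) detections at step t;
    truth[t] is a list of (x, y, vx, vy) object states at step t.
    """
    rng = random.Random(seed)
    # Assumed state layout: [x, y, vx, vy].
    states = [[rng.uniform(0, area), rng.uniform(0, area),
               rng.gauss(0, 0.5), rng.gauss(0, 0.5)]
              for _ in range(n_objects)]
    measurements, truth = [], []
    for _ in range(n_steps):
        scan = []
        for s in states:
            # Motion model: move with current velocity, then perturb velocity.
            s[0] += s[2] * dt
            s[1] += s[3] * dt
            s[2] += rng.gauss(0, accel_std) * dt
            s[3] += rng.gauss(0, accel_std) * dt
            # Measurement model: noisy position, detected with prob. p_detect.
            if rng.random() < p_detect:
                scan.append((s[0] + rng.gauss(0, meas_std),
                             s[1] + rng.gauss(0, meas_std)))
        # Clutter: false detections spread uniformly over the surveillance area.
        for _ in range(sample_poisson(clutter_rate, rng)):
            scan.append((rng.uniform(0, area), rng.uniform(0, area)))
        # Shuffle so the order of measurements reveals nothing about their origin.
        rng.shuffle(scan)
        measurements.append(scan)
        truth.append([tuple(s) for s in states])
    return measurements, truth


if __name__ == "__main__":
    meas, truth = simulate_scene(seed=0)
    print(len(meas), "scans;", len(truth[0]), "objects per step")
```

Because each call draws a fresh scene, a training loop can request new pairs indefinitely instead of iterating over a fixed dataset. A sequence-to-sequence model would then be trained to map `measurements` to estimates of `truth`.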

When developing these methods, it became necessary to compare their predictions and estimated uncertainties to those of the state-of-the-art trackers for the model-based setting. To allow for this, a further contribution of this thesis is the paper "An Uncertainty-Aware Performance Measure for Multi-Object Tracking", which proposes the first uncertainty-aware, hyperparameter-free, mathematically principled performance measure for MOT.

multi-object smoothing

multi-object tracking

deep learning

multi-object tracking performance measures

HC4
Opponent: Simon Maskell

Author

Juliano Pinto

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Next Generation Multitarget Trackers: Random Finite Set Methods vs Transformer-based Deep Learning

Proceedings of 2021 IEEE 24th International Conference on Information Fusion, FUSION 2021 (2021), p. 1059-1066

Paper in proceeding

An Uncertainty-Aware Performance Measure for Multi-Object Tracking

IEEE Signal Processing Letters, Vol. 28 (2021), p. 1689-1693

Journal article

Deep Learning for Model-Based Multi-Object Tracking

IEEE Transactions on Aerospace and Electronic Systems, in press (2023), p. 1-17

Journal article

J. Pinto, G. Hess, W. Ljungbergh, Y. Xia, L. Svensson, and H. Wymeersch - Transformer-based Multi-object Smoothing with Decoupled Data Association and Smoothing

Multiple Object Tracking (MOT) is the task of tracking the states of an unknown number of objects using data from a measurement sensor such as a camera or a radar. What an object's state contains depends on the task, but it usually includes the object’s location and velocity. Model-based MOT refers to the setting where it is possible to derive mathematical models for how objects move and interact, and how measurements are generated. In this setting, trackers can use these models to derive optimal estimates of the object states in a mathematically principled way. However, because measurements do not contain information about which object generated them, computing these estimates exactly in realistic scenarios requires an immense amount of computation. Trackers therefore rely on approximations, which degrade their performance.

The main objective of this thesis is to use deep learning to address this obstacle in model-based MOT. Rather than attempting to reason about all of the possible associations between measurements and objects (the main reason for the intractability of traditional methods), this thesis develops deep learning methods that learn to directly estimate the object states from a sequence of measurements. To train these methods, the available models of the environment are used to generate unlimited synthetic training data.

Numerous experiments provide evidence that deep learning trackers trained in this way are capable of matching the performance of traditional approaches in simple tasks (where traditional approaches are considered optimal), while outperforming them in more complicated tracking scenarios.

6G Artificial Intelligence Radar

Chalmers AI Research Centre, 2021-05-01 -- 2023-04-30.

Infrastructure

C3SE (Chalmers Centre for Computational Science and Engineering)

Subject Categories

Signal Processing

ISBN

978-91-7905-924-8

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5390

Publisher

Chalmers

Online

More information

Latest update

9/12/2023