Inference of Effective Pairwise Relations for Data Processing
Licentiate thesis, 2020

In many areas of data science and artificial intelligence, representation learning is a performance-critical step. Although different representation learning methods detect different descriptive and latent features, many of them build on pairwise relations. This thesis consists of two parts that study pairwise relations from two points of view: (i) pairwise relations between the states of a Markov chain, and (ii) pairwise relations between objects in a dataset, based on a desired (dis)similarity measure.
In the first part of the thesis, we consider Markov chains, noting that pairwise relations between their states are naturally modeled by the state-transition matrix. We propose a method for modeling the performance of a synchronization method on a multi-processor architecture. Our model introduces and builds upon a cache line bouncing process that models the interaction of threads accessing shared cache lines.
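As a minimal illustration of how a state-transition matrix encodes pairwise relations between states, the sketch below propagates a probability distribution through one step of a toy two-state chain. The matrix `P`, the state meanings, and the helper `step` are hypothetical and chosen only for illustration; they are not the cache line bouncing model of the thesis.

```python
def step(distribution, P):
    """Advance a distribution over states by one transition of the chain.

    P[i][j] is the probability of moving from state i to state j,
    so each row of P must sum to 1.
    """
    n = len(P)
    return [sum(distribution[i] * P[i][j] for i in range(n)) for j in range(n)]


# Hypothetical two-state chain (illustrative numbers only).
P = [
    [0.9, 0.1],  # from state 0: stay with 0.9, move to state 1 with 0.1
    [0.5, 0.5],  # from state 1: move to state 0 with 0.5, stay with 0.5
]

start = [1.0, 0.0]          # the chain starts in state 0 with certainty
after_one = step(start, P)  # distribution after a single transition
```

Entry `P[i][j]` is exactly the pairwise relation between states `i` and `j`: repeated calls to `step` show how these relations shape the long-run behavior of the chain.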
In the second part of the thesis, we consider representation learning using the transitive-aware Minimax distance, which enables the extraction of elongated manifolds and structures in the data. While recent work has made Minimax distances computationally feasible, little attention has been paid to their memory footprint, which is naturally O(N^2), the cost of storing all pairwise distances. We instead compute a novel hierarchical representation of the data that requires O(N) memory and from which pairwise Minimax distances can be efficiently inferred, at the price of a higher computational cost.
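To make the memory argument concrete, the following sketch relies on a standard fact: the Minimax distance between two points equals the largest edge on the path connecting them in a minimum spanning tree. It stores only that tree (O(N) memory) and answers each pairwise query by walking the tree. This is an assumption-laden illustration using Prim's algorithm with Euclidean base distances; it is not necessarily the hierarchical representation developed in the thesis, and the names `build_mst` and `minimax_distance` are hypothetical.

```python
import math


def build_mst(points):
    """Prim's algorithm on the complete Euclidean graph.

    Only O(N) arrays are kept; base distances are recomputed on demand,
    so the full N x N distance matrix is never stored.
    Returns (parent, weight, depth) for a tree rooted at node 0.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each node to the tree
    parent = [-1] * n
    weight = [0.0] * n      # weight of the edge (node, parent[node])
    depth = [0] * n
    best[0] = 0.0
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            weight[u] = best[u]
            depth[u] = depth[parent[u]] + 1
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return parent, weight, depth


def minimax_distance(i, j, parent, weight, depth):
    """Largest edge weight on the tree path between nodes i and j."""
    largest = 0.0
    while i != j:
        if depth[i] < depth[j]:
            i, j = j, i              # always move the deeper node upward
        largest = max(largest, weight[i])
        i = parent[i]
    return largest


# Three nearby points and one distant point on a line.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
parent, weight, depth = build_mst(points)
d_near = minimax_distance(0, 2, parent, weight, depth)  # chain of short hops
d_far = minimax_distance(0, 3, parent, weight, depth)   # must cross the large gap
```

In this sketch each query costs time proportional to the tree depth, which reflects the trade-off stated above: less memory in exchange for more computation per inferred distance.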
An alternative sampling-based approach is also derived. It computes approximate Minimax distances, also in O(N) memory but at a significantly reduced computational cost, while still yielding a good approximation, as verified by results on clustering benchmarks.
Finally, we develop an unsupervised learning framework for clustering vehicle trajectories based on Minimax distances. The framework is validated on real-world datasets collected from real driving scenarios, on which it demonstrates satisfactory performance.

Motion trajectory clustering

Concurrent programming

Representation Learning

Pairwise Relations

Memory Efficiency

Minimax Distance

Performance Modeling

CSE EDIT 8103
Opponent: Prof. Niklas Lavesson, Department of Computer Science and Informatics, Jönköping University of Technology

Author

Fazeleh Sadat Hoseini

Chalmers, Computer Science and Engineering, Networks and Systems

Modeling the performance of atomic primitives on modern architectures

ACM International Conference Proceeding Series (2019)

Paper in proceedings

Hoseini, Fazeleh Sadat, and Morteza Haghir Chehreghani. "Memory-Efficient Sampling for Minimax Distance Measures." arXiv preprint arXiv:2005.12627 (2020).

Hoseini, Fazeleh S., Sadegh Rahrovani, and Morteza Haghir Chehreghani. "A Generic Framework for Clustering Vehicle Motion Trajectories." arXiv preprint arXiv:2009.12443 (2020).

Subject categories

Other Computer and Information Science

Areas of Advance

Information and Communication Technology

Transport

Energy

Infrastructure

C3SE (Chalmers Centre for Computational Science and Engineering)

Publisher

Chalmers

Online

More information

Last updated

2021-01-15