Causality and side information for efficient machine learning
Research project, 2023–2026
Supervised machine learning is now an essential tool for scientists and engineers. It is used to predict outcomes from inputs, with models trained to minimize prediction error on training data. As adoption spreads beyond academia, the seams have started to show in this attractively simple idea. Models that achieve state-of-the-art accuracy on benchmark data sets fail to generalize to new samples and to seemingly identical domains. In this project, we propose incorporating knowledge of causal structure and privileged information (data beyond inputs and outputs) to improve the efficiency and domain robustness of learning algorithms. We will pursue three aims toward this goal. For Aim 1, we will develop theory and methodology to identify prediction tasks in which learning with these tools is provably preferable to classical learning, first limiting our study to data from nonlinear time series. For Aim 2, we will extend these results to arbitrary causal structures, moving beyond dynamical systems. For Aim 3, we will develop theory to verify the usefulness of causal structure and auxiliary information in unsupervised domain adaptation, a notoriously difficult generalization task. We foresee that this project can have a meaningful impact on practical supervised learning within five years. The project will be carried out by the applicant and a new PhD student. The aims are independent of application but inspired by existing collaborations in medicine.
Participants
Fredrik Johansson (contact)
Chalmers, Computer Science and Engineering, Data Science and AI
Funding
Swedish Research Council (Vetenskapsrådet, VR)
Project ID: 2022-04748
Funds Chalmers' participation during 2023–2026