Asymptotic Analysis of Machine Learning Models: Comparison Theorems and Universality
Licentiate thesis, 2023

The study of machine learning (ML) models in asymptotic regimes has provided insight into many of their properties, but the results seemingly contradict classical statistical wisdom. To shed light on this apparent contradiction, this thesis analyzes models such as the LASSO and random features regression in the regime where the number of data points and the number of model parameters grow to infinity at a constant ratio. It characterizes the asymptotic behavior of these problems, including their learning curves: the predicted training and generalization error as a function of the degree of overparameterization.

The papers in this thesis focus in particular on Gaussian comparison theorems as a methodological tool for analyzing these problems. Specifically, the convex Gaussian min-max theorem (CGMT) allows us to study complex ML optimization problems by considering alternative models that are simpler to analyze but have similar properties asymptotically.
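
For orientation, the CGMT can be sketched as follows. This is the commonly cited form of the theorem and not necessarily the exact formulation used in the thesis; the symbols G, g, h, S_w, S_u and \psi are notation assumed here for illustration, and sign conventions in the auxiliary problem vary between papers.

```latex
% Primary optimization (PO): G is an m x n matrix with i.i.d. N(0,1) entries.
\Phi(G) = \min_{w \in S_w} \max_{u \in S_u} \; u^\top G w + \psi(w, u)

% Auxiliary optimization (AO): g in R^m and h in R^n have i.i.d. N(0,1) entries.
\phi(g, h) = \min_{w \in S_w} \max_{u \in S_u} \; \|w\|_2 \, g^\top u - \|u\|_2 \, h^\top w + \psi(w, u)

% If S_w and S_u are convex and compact, and \psi is continuous and
% convex-concave on S_w x S_u, then for every \mu \in \mathbb{R} and t > 0:
\mathbb{P}\bigl( |\Phi(G) - \mu| > t \bigr) \le 2 \, \mathbb{P}\bigl( |\phi(g, h) - \mu| > t \bigr)
```

In words, if the optimal value of the simpler auxiliary problem concentrates around \mu, so does the optimal value of the primary problem, which is what lets the simpler model stand in for the original one.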

Secondarily, this thesis considers universality, which in the asymptotic context shows that many statistics of ML models are fully determined by lower-order statistical moments. This allows us to study surrogate Gaussian models that match these moments. These surrogate Gaussian models can subsequently be analyzed by means of the Gaussian comparison theorems.

CGMT

Machine Learning

convex Gaussian min-max theorem

Asymptotic

universality

CSE EDIT 3128
Opponent: Samet Oymak, Assistant Professor Electrical and Computer Engineering, UC Riverside, USA

Author

David Bosch

Chalmers, Computer Science and Engineering, Data Science and AI

Double Descent in Feature Selection: Revisiting LASSO and Basis Pursuit

Thirty-eighth International Conference on Machine Learning, ICML 2021 (2021)

Paper in proceedings

Random Features Model with General Convex Regularization: A Fine Grained Analysis with Precise Asymptotic Learning Curves

Proceedings of Machine Learning Research, Vol. 206 (2023), p. 11371-11414

Paper in proceedings

Subject categories

Computer and Information Science

Probability Theory and Statistics

Publisher

Chalmers

Online

More information

Last updated

2023-05-22