Asymptotic Analysis of Machine Learning Models: Comparison Theorems and Universality
Doctoral thesis, 2025
This thesis investigates the asymptotic regime of machine learning models: a regime in which both the number of trainable parameters (model size) and the number of data points grow without bound at a fixed ratio. Understanding model behavior in this limit yields theoretical insight into model statistics such as the training error and the generalization error, particularly in the high-dimensional settings relevant to contemporary machine learning practice.
The core methodological tools used throughout this work are Gaussian comparison theorems, with special emphasis on the Convex Gaussian Min-Max Theorem (CGMT). These theorems enable the rigorous analysis of complex learning algorithms by comparing them to surrogate problems that are simpler to analyze. By constructing such asymptotically equivalent optimization problems, we derive precise characterizations of the models of interest by proxy.
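For reference, the CGMT relates a primary min-max problem involving an i.i.d. Gaussian matrix G to an auxiliary problem involving only two Gaussian vectors g and h. The following is a standard statement of the theorem in its concentration form (as commonly given in the literature), not a quotation from the thesis:

```latex
% Primary optimization (PO), with G \in \mathbb{R}^{m \times n} having i.i.d. N(0,1) entries:
\Phi(G) \;=\; \min_{w \in S_w} \; \max_{u \in S_u} \; u^\top G w + \psi(w, u)

% Auxiliary optimization (AO), with independent g \sim N(0, I_m), h \sim N(0, I_n):
\phi(g, h) \;=\; \min_{w \in S_w} \; \max_{u \in S_u} \; \|w\|_2 \, g^\top u + \|u\|_2 \, h^\top w + \psi(w, u)

% If S_w, S_u are compact and convex and \psi is convex-concave, then for all \mu \in \mathbb{R}, t > 0:
\mathbb{P}\bigl( |\Phi(G) - \mu| > t \bigr) \;\le\; 2\, \mathbb{P}\bigl( |\phi(g, h) - \mu| \ge t \bigr)
```

In words: if the auxiliary problem concentrates around some value, the primary problem concentrates around the same value, so the (simpler) auxiliary problem can be analyzed in its place.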
A secondary but significant theme in this thesis is the concept of universality in the asymptotic regime. Universality results demonstrate that many statistical properties of machine learning models are asymptotically governed only by low-order moments (e.g., means and variances) of the data distribution, rather than its full structure. This insight justifies the use of Gaussian surrogate models that match these moments, making them amenable to analysis via Gaussian comparison tools.
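As an illustrative sketch of this moment-matching idea (not code from the thesis; all names and parameter choices are hypothetical), the following compares the generalization error of ridge regression under a Gaussian design and a Rademacher design with matching first two moments. Universality predicts the two errors should be close in high dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam, sigma = 2000, 400, 0.1, 0.5  # samples, features, ridge strength, noise level

# Fixed ground-truth parameter with ||w_star|| ~ 1
w_star = rng.standard_normal(d) / np.sqrt(d)

def ridge_gen_error(X):
    """Closed-form ridge fit on (X, y); return the excess risk ||w_hat - w_star||^2,
    which equals E[(x^T w_hat - x^T w_star)^2] for isotropic unit-variance features."""
    y = X @ w_star + sigma * rng.standard_normal(n)
    w_hat = np.linalg.solve(X.T @ X + lam * n * np.eye(d), X.T @ y)
    return float(np.sum((w_hat - w_star) ** 2))

X_gauss = rng.standard_normal((n, d))           # Gaussian design, mean 0, variance 1
X_rade = rng.choice([-1.0, 1.0], size=(n, d))   # Rademacher design, same two moments

e_gauss = ridge_gen_error(X_gauss)
e_rade = ridge_gen_error(X_rade)
```

Although the two designs have very different distributions, their first two moments agree, and the resulting generalization errors are close; this is the behavior that licenses replacing the true data by a Gaussian surrogate before applying comparison theorems.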
CGMT
universality
Convex Gaussian Min-Max Theorem
asymptotics
Author
David Bosch
Data Science and AI
A Novel Gaussian Min-Max Theorem and its Applications
IEEE Transactions on Information Theory, Vol. In Press (2025)
Article in scientific journal
A Novel Convex Gaussian Min Max Theorem for Repeated Features
Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022, Vol. 258 (2025), p. 3673-3681
Paper in proceedings
Random Features Model with General Convex Regularization: A Fine Grained Analysis with Precise Asymptotic Learning Curves
Proceedings of Machine Learning Research, Vol. 206 (2023), p. 11371-11414
Paper in proceedings
Precise Asymptotic Analysis of Deep Random Feature Models
Proceedings of Machine Learning Research, Vol. 195 (2023), p. 4132-4179
Paper in proceedings
Double Descent in Feature Selection: Revisiting LASSO and Basis Pursuit
Thirty-eighth International Conference on Machine Learning, ICML 2021 (2021)
Paper in proceedings
Subject categories (SSIF 2025)
Probability theory and statistics
Artificial intelligence
ISBN
978-91-8103-287-1
Doctoral theses at Chalmers University of Technology. New series: 5745
Publisher
Chalmers