Debloating Machine Learning Systems
Doctoral thesis, 2025

In recent decades, Machine Learning (ML) has rapidly evolved from an academic pursuit into a cornerstone of modern industries, with applications spanning manufacturing, healthcare, finance, and transportation. The emergence of large language models (LLMs) has further accelerated this growth, driving unprecedented demand for ML systems capable of supporting models of widely varying scales and data modalities. Despite the growing importance of ML systems, the problem of software bloat within them remains underexplored. Software bloat refers to unnecessary code, files, or dependencies; it degrades performance, increases resource consumption, and introduces security risks. The rise of containerization as a common deployment method for ML further exacerbates the problem: containers must package not only applications but also all associated libraries and dependencies, leading to significant overhead. This thesis investigates bloat in ML systems across multiple layers, from ML containers to ML shared libraries, and introduces novel techniques to measure, analyze, and mitigate it. The main contributions are: (A) MMLB, a framework for measuring ML bloat that quantifies container-level bloat and identifies its causes, showing that ML containers are substantially more bloated than general-purpose containers and that ML shared libraries are the major source of bloat. (B) BLAFS, a bloat-aware file system that efficiently and effectively reduces file bloat in ML containers by detecting and removing unused files. (C) RTrace, a tracer that accurately identifies the functions executed in shared libraries, improving the visibility of shared-library execution and enabling precise detection of host-code bloat. (D) MERGESHUFFLE, a tool that removes unused code from shared libraries while preserving functionality and improving performance and security. (E) Negative-ML, a holistic debloating approach that targets both host and device code, representing the first systematic investigation of device-code bloat. This work reveals that ML shared libraries differ from generic libraries: while the latter contain only host code, ML shared libraries also include device code, which targets GPUs and contributes significantly to their size. This thesis thus offers both a systematic understanding of software bloat in ML systems and practical techniques to mitigate it, contributing to more efficient, secure, and sustainable deployment of ML in real-world environments.

Software Bloat

Software Engineering

Machine Learning Systems

Software Debloating

Performance Optimization

Chalmers Johanneberg campus, EDIT 6217

Author

Huaifeng Zhang

Chalmers, Computer Science and Engineering, Computer and Network Systems

Machine learning systems are bloated and vulnerable

SIGMETRICS/PERFORMANCE 2024 - Abstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems (2024), pp. 37-38

Paper in proceedings

RTrace: Towards Better Visibility of Shared Library Execution

33rd Network and Distributed System Security (NDSS) Symposium 2026, NDSS 2026 (2025)

Paper in proceedings

BLAFS: A Bloat-Aware Container File System

Proceedings of the 2025 ACM Symposium on Cloud Computing (2025)

Paper in proceedings

The Hidden Bloat in Machine Learning Systems

Proceedings of the 8th Conference on Machine Learning and Systems (MLSys, Best Paper Award) (2025)

Paper in proceedings

MERGESHUFFLE: Debloating Shared Libraries for Improved Performance and Security

How was the famous statue David created? When asked, Michelangelo replied, “It’s simple. I just remove what is not David.” This thesis follows a similar philosophy, but instead of marble, it works with software. 

Rather than adding new features, this thesis focuses on removing what is unnecessary. In software, this excess is called software bloat: unnecessary code and features that waste resources, increase energy consumption, and degrade performance. The process of removing this bloat is called debloating.

This thesis applies debloating to machine learning systems, the software at the heart of modern Artificial Intelligence (AI). By analyzing which parts of these systems are actually used under real workloads, it introduces methods to identify and remove unused components, ranging from large software modules down to individual code instructions. The result is leaner, faster, and more energy-efficient systems.

Subject categories (SSIF 2025)

Software Engineering

Areas of Advance

Information and Communication Technology

Driving Forces

Sustainable development

Infrastructure

C3SE (-2020, Chalmers Centre for Computational Science and Engineering)

DOI

10.63959/chalmers.dt/5769

ISBN

978-91-8103-312-0

Doctoral theses at Chalmers University of Technology. New series: 5769

Publisher

Chalmers


Online

More information

Last updated

2025-10-20