Towards Reliable Deep Foundation Models in OOD detection, model calibration, and hallucination mitigation
Doctoral thesis, 2025

Despite the success and potential of deep learning techniques, ensuring that such models can be deployed reliably remains a primary concern. This thesis tackles the reliability of deep models through the lens of out-of-distribution (OOD) detection, model calibration, and hallucination mitigation, with the aim of contributing to trustworthy artificial intelligence (AI) systems.

Paper A and Paper B utilize joint energy-based modeling (JEM) to develop a probabilistic classifier and a probabilistic regressor, respectively. Specifically, Paper A addresses the training instability of joint energy-based models by replacing stochastic gradient Langevin dynamics with sliced score matching, which results in a smoother training procedure without compromising OOD detection performance. Paper B extends the idea of JEM from classification to regression, leading to a better calibrated regressor.
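As background for the JEM idea mentioned above, the following is a minimal sketch of how the logits of a standard classifier can be reinterpreted as an energy function, following the usual JEM formulation in which the energy of an input is the negative log-sum-exp of its logits. The function names `jem_energy` and `class_posterior` are hypothetical and chosen for illustration; the sketch does not reproduce the training procedure of Paper A or Paper B.

```python
import numpy as np

def logsumexp(logits: np.ndarray) -> np.ndarray:
    """Numerically stable log-sum-exp over the last (class) axis."""
    m = logits.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(logits - m).sum(axis=-1, keepdims=True))).squeeze(-1)

def jem_energy(logits: np.ndarray) -> np.ndarray:
    """Energy E(x) = -logsumexp_y f(x)[y]; lower energy means higher unnormalized density."""
    return -logsumexp(logits)

def class_posterior(logits: np.ndarray) -> np.ndarray:
    """p(y | x) recovered from the same logits; this is the ordinary softmax."""
    return np.exp(logits - logsumexp(logits)[..., None])

# Illustrative logits for two inputs and three classes (made-up numbers).
example_logits = np.array([[4.0, 0.5, -1.0],
                           [0.1, 0.0, -0.1]])
energies = jem_energy(example_logits)
posteriors = class_posterior(example_logits)
```

Under this view the same network yields both a classifier and an unnormalized density over inputs, which is why the energy can double as an OOD score.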

Paper C focuses on large-scale OOD detection with standard discriminative classifiers and proposes a novel OOD score based on generalized entropy, utilizing only information from the probability space.
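To illustrate what a score computed purely in probability space can look like, here is a minimal sketch of a generalized-entropy-style OOD score. It assumes the form G(p) = sum_j p_j^gamma (1 - p_j)^gamma evaluated over the `top_m` largest softmax probabilities, with the negated value used as the score; the default values of `gamma` and `top_m`, and the function name `gen_score`, are assumptions for illustration and should be checked against Paper C.

```python
import numpy as np

def gen_score(probs: np.ndarray, gamma: float = 0.1, top_m: int = 100) -> np.ndarray:
    """Negative generalized entropy of softmax outputs.

    probs: array of shape (n_samples, n_classes), rows summing to 1.
    Higher scores indicate more confident (more in-distribution-like) predictions.
    """
    # Keep only the top_m largest probabilities per sample (assumed truncation).
    top = np.sort(probs, axis=-1)[..., -top_m:]
    entropy = np.sum(top ** gamma * (1.0 - top) ** gamma, axis=-1)
    return -entropy

# A peaked prediction should score higher than a flat one.
peaked = np.array([[0.97, 0.02, 0.01]])
flat = np.array([[0.34, 0.33, 0.33]])
peaked_score = gen_score(peaked)[0]
flat_score = gen_score(flat)[0]
```

The score needs only the predicted probabilities, so it can be applied to an already trained classifier without access to features or training data.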

Paper D leverages transfer learning and self-supervised learning techniques to devise an efficient framework in which only normal samples are required for detecting anomalies in chest X-rays.
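The general idea of feature-based anomaly detection with only normal samples can be sketched as a nearest-neighbor search in feature space: a test image is scored by its mean distance to the closest features of normal training images. The sketch below assumes the features have already been extracted by some pretrained network, which is not shown; the function name `knn_anomaly_score` and the choice of `k` are hypothetical, and this is not a reproduction of the pipeline in Paper D.

```python
import numpy as np

def knn_anomaly_score(normal_feats: np.ndarray, test_feats: np.ndarray, k: int = 3) -> np.ndarray:
    """Mean Euclidean distance from each test feature to its k nearest normal features.

    normal_feats: (n_normal, d) features of normal training images.
    test_feats:   (n_test, d) features of images to score.
    Larger scores suggest the image is more anomalous.
    """
    diffs = test_feats[:, None, :] - normal_feats[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)      # shape (n_test, n_normal)
    nearest = np.sort(dists, axis=1)[:, :k]     # k smallest distances per test sample
    return nearest.mean(axis=1)

# Toy 2-D "features": normal samples cluster near the origin.
normal = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
queries = np.array([[0.5, 0.5], [10.0, 10.0]])
scores = knn_anomaly_score(normal, queries, k=2)
```

Because no abnormal images are needed to build the reference set, such a detector can be fitted from normal scans alone.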

Paper E utilizes the powerful text-image alignment in contrastive vision-language models (VLMs) for zero-shot OOD detection.
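A common baseline for zero-shot OOD detection with contrastive VLMs scores an image by the maximum softmax probability over its cosine similarities to text embeddings of the in-distribution class names. The sketch below shows that baseline only, not the prompt augmentation method of Paper E. It assumes image and text embeddings have already been computed by a contrastive model; `zero_shot_id_score` and `temperature` are hypothetical names.

```python
import numpy as np

def zero_shot_id_score(image_emb: np.ndarray, text_embs: np.ndarray, temperature: float = 1.0) -> float:
    """Maximum softmax probability over image-text cosine similarities.

    image_emb: (d,) embedding of one image.
    text_embs: (n_classes, d) embeddings of prompts for the in-distribution classes.
    Higher scores suggest the image matches one known class well.
    """
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = txt @ img                      # cosine similarity per class
    z = sims / temperature
    z = z - z.max()                       # stabilize the softmax
    probs = np.exp(z) / np.exp(z).sum()
    return float(probs.max())

# Toy embeddings: three orthogonal "class prompts" in 3-D.
prompts = np.eye(3)
aligned_score = zero_shot_id_score(np.array([1.0, 0.0, 0.0]), prompts)
ambiguous_score = zero_shot_id_score(np.array([1.0, 1.0, 1.0]), prompts)
```

No in-distribution training images are needed, since the class prompts alone define what counts as in-distribution.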

Finally, Paper F leverages insights from OOD detection and proposes an energy-based decoding method to mitigate object hallucination in generative VLMs.

Lecture Hall SB-H4, Sven Hultins Gata 6, Gothenburg
Opponent: Professor Fredrik Lindsten, Linköping University, Linköping, Sweden.

Author

Xixi Liu

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Text Prompt Augmentation for Zero-shot Out-of-Distribution Detection

2024 European Conference on Computer Vision, Vol. 15059-15147 (2024), p. 364-380

Paper in proceedings

Deep Nearest Neighbors for Anomaly Detection in Chest X-Rays

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 14349 LNCS (2024), p. 293-302

Paper in proceedings

GEN: Pushing the Limits of Softmax-Based Out-of-Distribution Detection

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2023-June (2023), p. 23946-23955

Paper in proceedings

Effortless Training of Joint Energy-Based Models with Sliced Score Matching

Proceedings - International Conference on Pattern Recognition, Vol. 2022-August (2022), p. 2643-2649

Paper in proceedings

Joint Energy-based Model for Deep Probabilistic Regression

Proceedings - International Conference on Pattern Recognition, Vol. 2022-August (2022), p. 2693-2699

Paper in proceedings

Energy-Guided Decoding for Object Hallucination Mitigation

Deep learning techniques are widely used across domains including autonomous driving, content generation and recommendation, drug discovery, voice assistants, and renewable energy management. Ensuring the reliability of models deployed in real-world applications has become more critical than ever. This thesis aims to enhance the reliability of deep models for trustworthy artificial intelligence by addressing out-of-distribution (OOD) detection, model calibration, and hallucination mitigation.

The key results in this thesis reveal the following insights:
1) Training deep models utilizing joint energy-based modeling enhances OOD detection performance and results in better calibrated regressors and classifiers;
2) OOD detection can be effectively achieved by utilizing only information available in the probability space of discriminative classifiers;
3) Medical anomalies can be identified using only normal images. By utilizing transfer learning and self-supervised learning techniques, an efficient feature-based framework is developed to detect medical anomalies in chest X-rays. This approach outperforms reconstruction-based methods in terms of accuracy and effectiveness;
4) The knowledge of OOD detection within the framework of discriminative classifiers can be effectively transferred to contrastive vision-language models (VLMs), enabling zero-shot OOD detection;
5) The insights gained from OOD detection have the potential to address object hallucination in generative VLMs.

Subject categories (SSIF 2025)

Computer Vision and Learning Systems

Robotics and Automation

Signal Processing

ISBN

978-91-8103-158-4

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5616

Publisher

Chalmers


Online


More information

Last updated

2025-03-05