Towards Reliable Deep Foundation Models in OOD detection, model calibration, and hallucination mitigation
Doctoral thesis, 2025

Despite the success and potential of deep learning techniques, ensuring the reliable deployment of such models remains a primary concern. In this thesis, the reliability of deep models is tackled through the lens of out-of-distribution (OOD) detection, model calibration, and hallucination mitigation, with the aim of contributing to trustworthy artificial intelligence (AI) systems.

Paper A and Paper B utilize joint energy-based modeling (JEM) to develop a probabilistic classifier and regressor, respectively. Specifically, Paper A addresses the training instability of joint energy-based models by replacing stochastic gradient Langevin dynamics with sliced score matching, which results in a smoother training procedure without compromising OOD detection performance. Paper B extends the idea of JEM from classification to regression, leading to a better-calibrated regressor.
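As a rough illustration of the JEM view of a classifier, the sketch below reinterprets the logits f(x) as defining an unnormalized joint density, so that the class posterior is the usual softmax and the energy of an input is the negative log-sum-exp of its logits. This follows the general JEM formulation rather than the specific models trained in Paper A or Paper B; the function names and the toy logits are assumptions made for illustration only.

```python
import numpy as np

def logsumexp(logits):
    """Numerically stable log-sum-exp over the last axis."""
    m = np.max(logits, axis=-1, keepdims=True)
    return (m + np.log(np.sum(np.exp(logits - m), axis=-1, keepdims=True))).squeeze(-1)

def jem_quantities(logits):
    """Return (p(y|x), E(x)) from classifier logits (hypothetical helper).

    JEM reading: log p(x, y) = f(x)[y] - log Z, hence
    p(y|x) = softmax(f(x)) and E(x) = -logsumexp_y f(x)[y].
    """
    lse = logsumexp(logits)
    posterior = np.exp(logits - lse[..., None])
    energy = -lse
    return posterior, energy

# Toy logits (assumed values): one confident row, one nearly flat row.
logits = np.array([[8.0, 0.5, -1.0],
                   [0.1, 0.0, -0.1]])
posterior, energy = jem_quantities(logits)
print(posterior.sum(axis=-1))  # each row of the posterior sums to one
print(energy)                  # lower energy = higher unnormalized density
```

Under this reading, lower energy corresponds to higher unnormalized density, which is why the negative energy is often used as an OOD score.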

Paper C focuses on large-scale OOD detection with standard discriminative classifiers and proposes a novel OOD score based on generalized entropy, utilizing only information from the probability space.
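A minimal sketch of an entropy-style OOD score computed purely from softmax probabilities is given below. It assumes one common form of generalized entropy, G_gamma(p) = sum_j p_j^gamma (1 - p_j)^gamma, optionally restricted to the largest probabilities; the exact definition, the value of gamma, and the truncation used in Paper C may differ, and all names here are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Softmax over the last axis with max-subtraction for stability."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def generalized_entropy_score(probs, gamma=0.1, top_m=None):
    """Negative generalized entropy; higher means more in-distribution-like.

    Assumed form: G_gamma(p) = sum_j p_j**gamma * (1 - p_j)**gamma,
    summed over the top_m largest probabilities when top_m is given.
    """
    p = np.sort(probs, axis=-1)[..., ::-1]  # sort probabilities in descending order
    if top_m is not None:
        p = p[..., :top_m]
    return -np.sum(p**gamma * (1.0 - p)**gamma, axis=-1)

# Toy predictions (assumed values): a peaked and a uniform distribution.
confident = softmax(np.array([10.0, 0.0, 0.0, 0.0]))
uncertain = softmax(np.array([1.0, 1.0, 1.0, 1.0]))
s_conf = generalized_entropy_score(confident)
s_unc = generalized_entropy_score(uncertain)
print(s_conf, s_unc)
```

The peaked prediction receives the higher score, so thresholding the score separates confident from diffuse predictions without access to features or logits beyond the probabilities themselves.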

Paper D leverages transfer learning and self-supervised learning techniques to devise an efficient framework in which only normal samples are required for detecting anomalies in chest X-rays.
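The idea of detecting anomalies from normal samples alone can be sketched with a nearest-neighbor rule in feature space: features of normal images form a reference set, and a test image is scored by its distance to the closest reference features. The sketch below uses synthetic Gaussian vectors in place of features from a pretrained network; the feature dimension, the value of k, and the function name are assumptions, not details taken from Paper D.

```python
import numpy as np

def knn_anomaly_scores(train_feats, test_feats, k=2):
    """Mean Euclidean distance from each test feature to its k nearest normal features."""
    diffs = test_feats[:, None, :] - train_feats[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)   # shape: (n_test, n_train)
    nearest = np.sort(dists, axis=1)[:, :k]  # k smallest distances per test point
    return nearest.mean(axis=1)

# Synthetic stand-ins for deep features (assumed, for illustration only).
rng = np.random.default_rng(0)
normal_feats = rng.normal(0.0, 1.0, size=(200, 16))   # reference set of normal samples
test_normal = rng.normal(0.0, 1.0, size=(5, 16))      # unseen normal samples
test_anomalous = rng.normal(6.0, 1.0, size=(5, 16))   # shifted, anomalous samples

scores_normal = knn_anomaly_scores(normal_feats, test_normal)
scores_anom = knn_anomaly_scores(normal_feats, test_anomalous)
print(scores_normal.max(), scores_anom.min())
```

No anomalous example is needed to build the reference set, which matches the setting described above; only the threshold on the score has to be chosen.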

Paper E utilizes the powerful text-image alignment in contrastive vision-language models (VLMs) for zero-shot OOD detection.
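One generic way to exploit text-image alignment for zero-shot OOD detection is to compare an image embedding against text embeddings of the in-distribution class prompts and use the maximum softmax-scaled cosine similarity as the score. The sketch below shows this baseline on hand-constructed embeddings; it is not the prompt augmentation method of Paper E, and the temperature, embedding size, and function name are assumptions.

```python
import numpy as np

def max_softmax_similarity(image_emb, text_embs, temperature=0.1):
    """Maximum softmax probability over cosine similarities to class prompts."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = txt @ img                 # cosine similarity to each class prompt
    z = sims / temperature
    z = z - z.max()                  # stabilize the softmax
    probs = np.exp(z) / np.exp(z).sum()
    return probs.max()

# Hand-constructed embeddings (assumed): three class prompts in an 8-D space.
text_embs = np.eye(3, 8)
id_image = text_embs[0] + 0.05 * np.ones(8)  # close to the first class prompt
ood_image = np.zeros(8)
ood_image[7] = 1.0                           # orthogonal to every class prompt

s_id = max_softmax_similarity(id_image, text_embs)
s_ood = max_softmax_similarity(ood_image, text_embs)
print(s_id, s_ood)
```

An image unrelated to every prompt yields a near-uniform distribution over classes and therefore a low score, whereas an image aligned with one prompt yields a score close to one.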

Finally, Paper F leverages insights from OOD detection and proposes an energy-based decoding method to mitigate object hallucination in generative VLMs.

Lecture Hall SB-H4, Sven Hultins Gata 6, Gothenburg
Opponent: Professor Fredrik Lindsten, Linköping University, Linköping, Sweden.

Author

Xixi Liu

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Text Prompt Augmentation for Zero-shot Out-of-Distribution Detection

2024 European Conference on Computer Vision, Vol. 15059-15147 (2024), p. 364-380

Paper in proceeding

Deep Nearest Neighbors for Anomaly Detection in Chest X-Rays

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 14349 LNCS (2024), p. 293-302

Paper in proceeding

GEN: Pushing the Limits of Softmax-Based Out-of-Distribution Detection

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2023-June (2023), p. 23946-23955

Paper in proceeding

Effortless Training of Joint Energy-Based Models with Sliced Score Matching

Proceedings - International Conference on Pattern Recognition, Vol. 2022-August (2022), p. 2643-2649

Paper in proceeding

Joint Energy-based Model for Deep Probabilistic Regression

Proceedings - International Conference on Pattern Recognition, Vol. 2022-August (2022), p. 2693-2699

Paper in proceeding

Energy-Guided Decoding for Object Hallucination Mitigation

Deep learning techniques have been widely utilized across various domains, including autonomous driving, content generation and recommendation, drug discovery, voice assistants, and renewable energy management. Ensuring the reliability of deployed models in real-world applications has become more critical than ever. This thesis aims to enhance the reliability of deep models for trustworthy artificial intelligence by addressing out-of-distribution (OOD) detection, model calibration, and hallucination mitigation.

The key results in this thesis reveal the following insights:
1) Training deep models utilizing joint energy-based modeling enhances OOD detection performance and results in better calibrated regressors and classifiers;
2) OOD detection can be effectively achieved by utilizing only information available in the probability space of discriminative classifiers;
3) Medical anomalies can be identified using only normal images. By utilizing transfer learning and self-supervised learning techniques, an efficient feature-based framework is developed to detect medical anomalies in chest X-rays. This approach outperforms reconstruction-based methods in terms of accuracy and effectiveness;
4) The knowledge of OOD detection within the framework of discriminative classifiers can be effectively transferred to contrastive vision-language models (VLMs), enabling zero-shot OOD detection;
5) The insight gained from OOD detection has the potential to address object hallucination in generative VLMs.

Subject Categories (SSIF 2025)

Computer Vision and Learning Systems

Robotics and automation

Signal Processing

ISBN

978-91-8103-158-4

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5616

Publisher

Chalmers

Online

More information

Latest update

3/5/2025 2