Text Prompt Augmentation for Zero-shot Out-of-Distribution Detection
Paper in proceedings, 2024

Out-of-distribution (OOD) detection has been extensively studied for the reliable deployment of deep-learning models. Despite great progress in this research direction, most works focus on discriminative classifiers and perform OOD detection based on single-modal representations that consist of either visual or textual features. Moreover, they rely on training with in-distribution (ID) data. The emergence of vision-language models (e.g., CLIP) makes it possible to perform zero-shot OOD detection by leveraging multi-modal feature embeddings, and therefore to rely only on the labels defining the ID data. Several approaches have been devised, but these either need a given OOD label set, which might deviate from real OOD data, or fine-tune CLIP, which potentially has to be done separately for different ID datasets. In this paper, we first adapt various OOD scores developed for discriminative classifiers to CLIP. Further, we propose an enhanced method named TAG, based on Text prompt AuGmentation, to amplify the separation between ID and OOD data; it is simple but effective and can be applied to various score functions. Its performance is demonstrated on the CIFAR-100 and large-scale ImageNet-1k OOD detection benchmarks. On CIFAR-100, it consistently improves AUROC and FPR95 over four baseline OOD scores across five commonly used architectures, with average improvements of 6.35% in AUROC and 10.67% in FPR95. The results for ImageNet-1k follow a similar, but less pronounced, pattern.
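The abstract does not spell out TAG's exact mechanism, so the following is only a minimal sketch of the general setup it builds on: zero-shot OOD scoring with CLIP-style embeddings, using the maximum softmax probability (MSP) over image-text cosine similarities as the score, plus a simple form of text prompt augmentation (averaging the text embeddings of several prompt templates per class). All function names are mine, and the embeddings are synthetic stand-ins, not real CLIP features.

```python
import numpy as np

def normalize(x, axis=-1):
    """L2-normalize vectors along the given axis."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def average_prompt_embeddings(template_feats):
    """A simple text prompt augmentation: average the per-template text
    embeddings of each class, then renormalize.

    template_feats: array of shape (num_templates, num_classes, dim),
    one text embedding per (prompt template, class name) pair.
    """
    return normalize(template_feats.mean(axis=0))

def msp_score(image_feat, class_text_feats, temperature=0.1):
    """Maximum softmax probability over image-text cosine similarities.

    Higher scores suggest the image is in-distribution; thresholding
    this score yields a zero-shot OOD detector.
    """
    sims = normalize(image_feat) @ normalize(class_text_feats).T
    logits = sims / temperature
    logits -= logits.max()          # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return probs.max()

# Toy demo with hand-crafted embeddings (not real CLIP features):
class_feats = np.eye(3)                       # three orthonormal "class" text embeddings
id_image = np.array([1.0, 0.0, 0.0])          # perfectly aligned with class 0
ood_image = normalize(np.ones(3))             # equally similar to every class

id_score = msp_score(id_image, class_feats)   # near 1: confident ID
ood_score = msp_score(ood_image, class_feats) # exactly 1/3: uniform softmax
assert id_score > ood_score
```

In this toy setting, the ID image produces a sharply peaked softmax while the OOD image yields a uniform one, so MSP separates them; the paper's contribution lies in how prompt augmentation widens this gap for real CLIP features and other score functions.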

Zero-shot out-of-distribution detection

Vision-language models

Authors

Xixi Liu

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Christopher Zach

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

2024 European Conference on Computer Vision

0302-9743 (ISSN) 1611-3349 (eISSN)

Vol. 15059-15147
9783031729324 (ISBN)

The 18th European Conference on Computer Vision, ECCV 2024, Milan, Italy.

Subject Categories

Electrical Engineering, Electronic Engineering, Information Engineering

DOI

10.1007/978-3-031-73232-4

More information

Latest update

10/31/2024