Fact Recall, Heuristics or Pure Guesswork? Precise Interpretations of Language Models for Fact Completion
Paper in proceedings, 2025

Language models (LMs) can make a correct prediction based on many possible signals in a prompt, not all of which correspond to recall of factual associations. However, current interpretations of LMs fail to take this into account. For example, given the query “Astrid Lindgren was born in” with the corresponding completion “Sweden”, no distinction is made between whether the prediction was based on knowing where the author was born or on assuming that a person with a Swedish-sounding name was born in Sweden. In this paper, we present a model-specific recipe, PrISM, for constructing datasets with examples of four different prediction scenarios: generic language modeling, guesswork, heuristics recall, and exact fact recall. We apply two popular interpretability methods to the scenarios: causal tracing (CT) and information flow analysis. We find that both yield distinct results for each scenario. Results for the exact fact recall and generic language modeling scenarios confirm previous conclusions about the importance of mid-range MLP sublayers for fact recall, while results for guesswork and heuristics indicate a critical role of late MLP sublayers at the last token position. In summary, we contribute resources for a more extensive and granular study of fact completion in LMs, together with analyses that provide a more nuanced understanding of how LMs process fact-related queries.
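To illustrate the causal tracing (CT) idea mentioned in the abstract, here is a minimal NumPy sketch on a toy feed-forward stack. This is not the paper's actual setup: real causal tracing corrupts subject-token embeddings in a transformer and restores hidden states per layer and token position. All names (the toy model, layer count, noise scale) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "LM": a stack of tanh layers followed by a softmax readout.
L, d, vocab = 4, 8, 5
Ws = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(L)]
W_out = rng.normal(size=(d, vocab)) / np.sqrt(d)

def forward(x, patch=None):
    """Run the stack; optionally patch the hidden state at one layer.

    patch = (layer_index, stored_clean_hidden_state)
    """
    h = x
    states = []
    for i, W in enumerate(Ws):
        h = np.tanh(W @ h)
        if patch is not None and patch[0] == i:
            h = patch[1]            # restore the clean state at this layer
        states.append(h.copy())
    logits = W_out.T @ h
    p = np.exp(logits - logits.max())
    return p / p.sum(), states

# Clean run: record hidden states and the model's "correct" completion.
x_clean = rng.normal(size=d)
p_clean, clean_states = forward(x_clean)
target = int(p_clean.argmax())

# Corrupted run: add noise to the input (stands in for corrupting the subject).
x_corrupt = x_clean + rng.normal(scale=3.0, size=d)
p_corrupt, _ = forward(x_corrupt)

# Indirect effect per layer: how much restoring that layer's clean
# state recovers the probability of the original completion.
for i in range(L):
    p_patch, _ = forward(x_corrupt, patch=(i, clean_states[i]))
    print(f"layer {i}: indirect effect = {p_patch[target] - p_corrupt[target]:+.3f}")
```

Layers whose restored state most recovers the clean prediction are the ones the trace credits with carrying the fact; the paper's point is that this profile differs across the four prediction scenarios.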

knowledge representation

language models

interpretability

Authors

Denitsa Saynova

University of Gothenburg

Chalmers University of Technology, Computer Science and Engineering, Data Science and AI

Lovisa Hagström

University of Gothenburg

Chalmers University of Technology, Computer Science and Engineering, Data Science and AI

Moa Johansson

Data Science and AI

University of Gothenburg

Richard Johansson

University of Gothenburg

Chalmers University of Technology, Computer Science and Engineering, Data Science

Marco Kuhlmann

Linköping University

Findings of the Association for Computational Linguistics: ACL 2025

18322–18349
979-8-89176-256-5 (ISBN)

The 63rd Annual Meeting of the Association for Computational Linguistics
Vienna, Austria

Subject Categories (SSIF 2025)

Natural Language Processing

Computer Sciences

Artificial Intelligence

Infrastructure

Chalmers e-Commons (incl. C3SE, 2020-)

DOI

10.18653/v1/2025.findings-acl.942

Latest update

8/29/2025