Reasoning in Transformers – Mitigating Spurious Correlations and Reasoning Shortcuts
Paper in proceedings, 2024

Transformer language models are used for a wide variety of tasks, including some that require logical reasoning. However, a transformer model may easily learn spurious patterns in the data, short-circuiting actual reasoning. We investigate the extent to which transformers can be trained to a) approximate reasoning in propositional logic while b) avoiding known reasoning shortcuts due to spurious correlations in the training data. To do so, we use a dataset with a known spurious correlation between truth and, e.g., the number of rules in the problem. We augment the data with proofs and train two models based on generative transformers: WP-BART, trained to generate whole proofs at once, and SIP-BART, a neuro-symbolic model trained to generate individual proof steps in combination with a symbolic proof checker. We find that SIP-BART succeeds in avoiding reasoning shortcuts, while WP-BART does not. For SIP-BART, we then identify a few remaining errors arising from the use of a pre-trained language model, and analyse them qualitatively to create a taxonomy of four types of additional pitfalls.
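The neuro-symbolic setup described above can be illustrated with a minimal sketch: a generator proposes one proof step at a time, and a symbolic checker accepts or rejects each step before it is added to the proof. All names below (`Step`, `check`, `prove`, `naive_proposer`) and the rule representation are hypothetical stand-ins, not the paper's actual SIP-BART interface.

```python
# Hypothetical sketch of a step-wise prover with a symbolic checker.
# The "generator" here is a trivial rule-firing function standing in
# for a neural model; the checker is what blocks shortcut answers.

from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    premises: frozenset  # facts the step relies on
    conclusion: str      # fact the step derives


def check(step: Step, known: set, rules: dict) -> bool:
    """Symbolic checker: a step is valid only if its premises are already
    proven and some rule licenses exactly this conclusion."""
    return step.premises <= known and rules.get(step.premises) == step.conclusion


def prove(goal: str, facts: set, rules: dict, propose_step) -> bool:
    """Ask the generator for steps until the goal is derived; reject any
    step the checker refuses, so unsound shortcuts cannot slip through."""
    known = set(facts)
    while goal not in known:
        step = propose_step(known, goal)
        if step is None or not check(step, known, rules):
            return False  # generator is stuck or proposed an invalid step
        known.add(step.conclusion)
    return True


# Stand-in generator: fire any applicable rule whose conclusion is new.
rules = {frozenset({"a"}): "b", frozenset({"b"}): "c"}


def naive_proposer(known, goal):
    for premises, conclusion in rules.items():
        if premises <= known and conclusion not in known:
            return Step(premises, conclusion)
    return None


print(prove("c", {"a"}, rules, naive_proposer))  # → True
```

The design point, under these assumptions, is that the checker is the sole arbiter of validity: even a generator that has learned spurious correlations can only contribute to a proof through steps the symbolic component verifies.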

Authors

Daniel Enström

Göteborgs universitet

Viktor Kjellberg

Göteborgs universitet

Moa Johansson

Göteborgs universitet

Chalmers, Computer Science and Engineering, Data Science and AI

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

0302-9743 (ISSN) 1611-3349 (eISSN)

Vol. 14980 LNAI, pp. 207-221
978-3-031-71169-5 (ISBN)

18th International Conference on Neural-Symbolic Learning and Reasoning, NeSy 2024
Barcelona, Spain

Subject categories

Computer Science

DOI

10.1007/978-3-031-71170-1_18

More information

Last updated

2024-10-03