Reasoning in Transformers – Mitigating Spurious Correlations and Reasoning Shortcuts
Paper in proceeding, 2024

Transformer language models are used for a wide variety of tasks, including some that require logical reasoning. However, a transformer model may easily learn spurious patterns in the data, short-circuiting actual reasoning. We investigate to what extent transformers can be trained to a) approximate reasoning in propositional logic while b) avoiding known reasoning shortcuts via spurious correlations in the training data. To do so, we use a dataset with a known spurious correlation between truth and, for example, the number of rules in the problem. We augment the data with proofs and train two models based on generative transformers: WP-BART, trained to generate whole proofs at once, and a neuro-symbolic model, SIP-BART, trained to generate individual proof steps in combination with a symbolic proof checker. We find that SIP-BART succeeds in avoiding reasoning shortcuts, while WP-BART does not. For SIP-BART, we then identify a few remaining errors arising from the use of a pre-trained language model. These are analysed qualitatively to create a taxonomy of four different types of additional pitfalls.
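To make the step-wise idea in the abstract concrete, the following is a minimal sketch of a loop in which a generator proposes one proof step at a time and a symbolic checker accepts only valid steps. It is not the paper's implementation. It assumes problems are given as facts plus rules of the form "premises imply conclusion", and that an invalid step ends the attempt. The names `check_step`, `prove`, and `naive_proposer` are hypothetical, and `naive_proposer` is a deterministic stand-in for a learned model such as SIP-BART.

```python
from typing import Callable, FrozenSet, Optional, Set, Tuple

# Assumed representation: a rule is (premises, conclusion), meaning that
# the conclusion holds whenever all premises hold.
Rule = Tuple[FrozenSet[str], str]
Proposer = Callable[[Set[str], Set[Rule], str], Optional[Rule]]


def check_step(step: Rule, rules: Set[Rule], known: Set[str]) -> bool:
    """Symbolic check: the step must be a given rule whose premises are already proven."""
    premises, _ = step
    return step in rules and premises <= known


def prove(query: str, facts: Set[str], rules: Set[Rule],
          propose_step: Proposer, max_steps: int = 50) -> bool:
    """Build a proof one checked step at a time; fail on any invalid step."""
    known = set(facts)
    for _ in range(max_steps):
        if query in known:
            return True
        step = propose_step(known, rules, query)
        if step is None or not check_step(step, rules, known):
            return False  # the proposer gave up or proposed an invalid step
        known.add(step[1])
    return query in known


def naive_proposer(known: Set[str], rules: Set[Rule], query: str) -> Optional[Rule]:
    """Hypothetical stand-in for a learned generator: pick any applicable rule
    that derives something new (the query is ignored in this simple version)."""
    for premises, conclusion in sorted(rules, key=lambda r: (sorted(r[0]), r[1])):
        if premises <= known and conclusion not in known:
            return (premises, conclusion)
    return None


if __name__ == "__main__":
    rules = {(frozenset({"a"}), "b"), (frozenset({"b"}), "c")}
    print(prove("c", {"a"}, rules, naive_proposer))
    print(prove("d", {"a"}, rules, naive_proposer))
```

Because every step is verified against the given rules, the answer in this sketch depends on the proof found rather than on surface statistics such as the number of rules, which is the intuition behind pairing a generator with a checker.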

Authors

Daniel Enström

University of Gothenburg

Viktor Kjellberg

University of Gothenburg

Moa Johansson

University of Gothenburg

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

03029743 (ISSN) 16113349 (eISSN)

Vol. 14980 LNAI, pp. 207-221
9783031711695 (ISBN)

18th International Conference on Neural-Symbolic Learning and Reasoning, NeSy 2024
Barcelona, Spain

Subject Categories (SSIF 2011)

Computer Science

DOI

10.1007/978-3-031-71170-1_18

More information

Latest update

10/3/2024