AnaLog: Testing Analytical and Deductive Logic Learnability in Language Models

Samuel Ryb, Mario Giulianelli, Arabella Sinclair, Raquel Fernández


Abstract
We investigate the extent to which pre-trained language models acquire analytical and deductive logical reasoning capabilities as a side effect of learning word prediction. We present AnaLog, a natural language inference task designed to probe models for these capabilities, controlling for different invalid heuristics the models may adopt instead of learning the desired generalisations. We test four language models on AnaLog, finding that they have all learned, to a different extent, to encode information that is predictive of entailment beyond shallow heuristics such as lexical overlap and grammaticality. We closely analyse the best performing language model and show that while it performs more consistently than other language models across logical connectives and reasoning domains, it is still sensitive to lexical and syntactic variations in the realisation of logical statements.
Anthology ID:
2022.starsem-1.5
Volume:
Proceedings of the 11th Joint Conference on Lexical and Computational Semantics
Month:
July
Year:
2022
Address:
Seattle, Washington
Editors:
Vivi Nastase, Ellie Pavlick, Mohammad Taher Pilehvar, Jose Camacho-Collados, Alessandro Raganato
Venue:
*SEM
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
55–68
URL:
https://aclanthology.org/2022.starsem-1.5
DOI:
10.18653/v1/2022.starsem-1.5
Cite (ACL):
Samuel Ryb, Mario Giulianelli, Arabella Sinclair, and Raquel Fernández. 2022. AnaLog: Testing Analytical and Deductive Logic Learnability in Language Models. In Proceedings of the 11th Joint Conference on Lexical and Computational Semantics, pages 55–68, Seattle, Washington. Association for Computational Linguistics.
Cite (Informal):
AnaLog: Testing Analytical and Deductive Logic Learnability in Language Models (Ryb et al., *SEM 2022)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2022.starsem-1.5.pdf
Data:
GLUE