Joshua Bensemann
2022
AbductionRules: Training Transformers to Explain Unexpected Inputs
Nathan Young | Qiming Bao | Joshua Bensemann | Michael Witbrock
Findings of the Association for Computational Linguistics: ACL 2022
Transformers have recently been shown to be capable of reliably performing logical reasoning over facts and rules expressed in natural language, but abductive reasoning - inference to the best explanation of an unexpected observation - has been underexplored despite significant applications to scientific discovery, common-sense reasoning, and model interpretability. This paper presents AbductionRules, a group of natural language datasets designed to train and test generalisable abduction over natural-language knowledge bases. We use these datasets to finetune pretrained Transformers and discuss their performance, finding that our models learned generalisable abductive techniques but also learned to exploit the structure of our data. Finally, we discuss the viability of this approach to abductive reasoning and ways in which it may be improved in future work.
Eye Gaze and Self-attention: How Humans and Transformers Attend Words in Sentences
Joshua Bensemann | Alex Peng | Diana Benavides-Prado | Yang Chen | Neset Tan | Paul Michael Corballis | Patricia Riddle | Michael Witbrock
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Attention describes cognitive processes that are important to many human phenomena including reading. The term is also used to describe the way in which transformer neural networks perform natural language processing. While attention appears to be very different under these two contexts, this paper presents an analysis of the correlations between transformer attention and overt human attention during reading tasks. An extensive analysis of human eye tracking datasets showed that the dwell times of human eye movements were strongly correlated with the attention patterns occurring in the early layers of pre-trained transformers such as BERT. Additionally, the strength of a correlation was not related to the number of parameters within a transformer. This suggests that something about the transformers’ architecture determined how closely the two measures were correlated.
Co-authors
- Michael J. Witbrock 2
- Nathan Young 1
- Qiming Bao 1
- Alex Peng 1
- Diana Benavides-Prado 1