CogBERT: Cognition-Guided Pre-trained Language Models

Xiao Ding, Bowen Chen, Li Du, Bing Qin, Ting Liu


Abstract
We study the problem of integrating cognitive language processing signals (e.g., eye-tracking or EEG data) into pre-trained language models like BERT. Existing methods typically fine-tune pre-trained models on cognitive data, ignoring the semantic gap between the texts and cognitive signals. To fill the gap, we propose CogBERT, a framework that can induce fine-grained cognitive features from cognitive data and incorporate cognitive features into BERT by adaptively adjusting the weight of cognitive features for different NLP tasks. Extensive experiments show that: (1) Cognition-guided pre-trained models can consistently perform better than basic pre-trained models on ten NLP tasks. (2) Different cognitive features contribute differently to different NLP tasks. Based on this observation, we give a fine-grained explanation of why cognitive data is helpful for NLP. (3) Different transformer layers of pre-trained models should encode different cognitive features, with word-level cognitive features at the bottom and semantic-level cognitive features at the top. (4) Attention visualization demonstrates that CogBERT aligns with human gaze patterns and improves its natural language comprehension ability.
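The abstract does not spell out the fusion mechanism; the following is a minimal sketch, assuming a gated, per-token fusion of projected cognitive features (e.g., eye-tracking measures) with BERT hidden states. All names, shapes, and the gating choice are illustrative assumptions rather than the authors' implementation; see the linked PosoSAgapo/cogbert repository for the actual code.

```python
# Hypothetical sketch: adaptively weighting cognitive features into BERT hidden states.
import torch
import torch.nn as nn

class CognitiveFusionLayer(nn.Module):
    def __init__(self, hidden_size: int, num_cog_features: int):
        super().__init__()
        # Project raw cognitive signals (e.g., fixation duration, EEG bands)
        # into the transformer's hidden dimension.
        self.cog_proj = nn.Linear(num_cog_features, hidden_size)
        # Learned gate deciding, per token, how much cognitive information to mix in.
        self.gate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, hidden_states: torch.Tensor, cog_features: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size) from a BERT layer
        # cog_features:  (batch, seq_len, num_cog_features) aligned to tokens
        cog = self.cog_proj(cog_features)
        gate = torch.sigmoid(self.gate(torch.cat([hidden_states, cog], dim=-1)))
        # Adaptively weighted mixture of textual and cognitive representations.
        return gate * cog + (1.0 - gate) * hidden_states

# Usage sketch with dummy tensors.
layer = CognitiveFusionLayer(hidden_size=768, num_cog_features=5)
h = torch.randn(2, 16, 768)   # BERT hidden states
c = torch.randn(2, 16, 5)     # token-aligned cognitive features
out = layer(h, c)             # (2, 16, 768)
```

Such a gate could be applied at different transformer layers, which would be consistent with the paper's finding that word-level cognitive features are best encoded at the bottom layers and semantic-level features at the top.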
Anthology ID:
2022.coling-1.284
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
3210–3225
URL:
https://aclanthology.org/2022.coling-1.284
Cite (ACL):
Xiao Ding, Bowen Chen, Li Du, Bing Qin, and Ting Liu. 2022. CogBERT: Cognition-Guided Pre-trained Language Models. In Proceedings of the 29th International Conference on Computational Linguistics, pages 3210–3225, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
CogBERT: Cognition-Guided Pre-trained Language Models (Ding et al., COLING 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.coling-1.284.pdf
Code
 PosoSAgapo/cogbert
Data
CoNLL-2003, GLUE, QNLI