Abstract
Recent research shows that pre-trained language models, built to generate text conditioned on some context, learn to encode syntactic knowledge to a certain degree. This has motivated researchers to move beyond the sentence level and look into their ability to encode less studied discourse-level phenomena. In this paper, we add to the body of probing research by investigating discourse entity representations in large pre-trained language models in English. Motivated by early theories of discourse and key pieces of previous work, we focus on the information status of entities as discourse-new or discourse-old. We present two probing models, one based on binary classification and another on sequence labeling. The results of our experiments show that pre-trained language models do encode information on whether or not an entity has been introduced before in the discourse. However, this information alone is not sufficient to find the entities in a discourse, opening up interesting questions about the definition of entities for future work.
- Anthology ID:
- 2022.coling-1.73
- Volume:
- Proceedings of the 29th International Conference on Computational Linguistics
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Editors:
- Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
- Venue:
- COLING
- Publisher:
- International Committee on Computational Linguistics
- Pages:
- 875–886
- URL:
- https://aclanthology.org/2022.coling-1.73
- Cite (ACL):
- Sharid Loáiciga, Anne Beyer, and David Schlangen. 2022. New or Old? Exploring How Pre-Trained Language Models Represent Discourse Entities. In Proceedings of the 29th International Conference on Computational Linguistics, pages 875–886, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
- Cite (Informal):
- New or Old? Exploring How Pre-Trained Language Models Represent Discourse Entities (Loáiciga et al., COLING 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2022.coling-1.73.pdf
- Code:
- clp-research/new-old-discourse-entities
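The linked repository contains the authors' implementation. As a rough illustration of the binary-classification probe described in the abstract, here is a minimal sketch: a logistic-regression probe trained on frozen pre-trained LM representations of entity mentions, predicting discourse-new vs. discourse-old. The choice of GPT-2, mean-pooling over the mention span, and the toy labeled examples are all assumptions made for illustration, not the paper's exact setup.

```python
# Minimal sketch of a binary information-status probe (NOT the authors'
# implementation; see the linked repository for that). Assumes entity
# mention spans are already identified; data and labels are hypothetical.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
model.eval()

def mention_embedding(text: str, start: int, end: int) -> torch.Tensor:
    """Mean-pool the frozen LM's hidden states over a character span."""
    enc = tokenizer(text, return_tensors="pt", return_offsets_mapping=True)
    offsets = enc.pop("offset_mapping")[0]
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
    # Keep subword tokens whose character offsets overlap the mention span.
    mask = [(s < end and e > start) for s, e in offsets.tolist()]
    return hidden[torch.tensor(mask)].mean(dim=0)

# Toy data: (context, mention char span, label);
# label 1 = discourse-old (entity mentioned before), 0 = discourse-new.
examples = [
    ("A dog barked. The dog ran off.", (14, 21), 1),
    ("A dog barked. A cat watched.", (14, 19), 0),
]
feats = [mention_embedding(text, s, e) for text, (s, e), _ in examples]
X = torch.stack(feats).numpy()
y = [label for _, _, label in examples]

# The probe itself stays deliberately simple (a linear classifier), so that
# any predictive power comes from the frozen LM representations.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print(probe.predict(X))
```

A linear probe is the standard design choice in this line of work: if a simple classifier can separate discourse-new from discourse-old mentions, the information is plausibly encoded in the representations themselves rather than extracted by the probe.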