Lang Yu


2022

“No, They Did Not”: Dialogue Response Dynamics in Pre-trained Language Models
Sanghee J. Kim | Lang Yu | Allyson Ettinger
Proceedings of the 29th International Conference on Computational Linguistics

A critical component of competence in language is being able to identify relevant components of an utterance and reply appropriately. In this paper we examine the extent of such dialogue response sensitivity in pre-trained language models, conducting a series of experiments with a particular focus on sensitivity to dynamics involving phenomena of at-issueness and ellipsis. We find that models show clear sensitivity to a distinctive role of embedded clauses, and a general preference for responses that target main clause content of prior utterances. However, the results indicate mixed and generally weak trends with respect to capturing the full range of dynamics involved in targeting at-issue versus not-at-issue content. Additionally, models show fundamental limitations in their grasp of the dynamics governing ellipsis, and response selections show clear interference from superficial factors that outweigh the influence of principled discourse constraints.
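
As a rough illustration of the response-preference paradigm described above (a sketch under assumptions, not the paper's materials or code): one way to probe such sensitivity is to have a pre-trained language model score candidate responses to a dialogue turn by their conditional log-probability. The model choice (GPT-2 via Hugging Face transformers), the scoring function, and the example item are all illustrative assumptions.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def response_logprob(context: str, response: str) -> float:
    # Sum the log-probabilities of the response tokens given the context.
    # Assumes the context/response boundary survives joint tokenization,
    # which holds for simple whitespace-joined strings like these.
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.size(1)
    full_ids = tokenizer(context + " " + response, return_tensors="pt").input_ids
    with torch.no_grad():
        log_probs = torch.log_softmax(model(full_ids).logits, dim=-1)
    total = 0.0
    for i in range(ctx_len, full_ids.size(1)):
        # The token at position i is predicted from the logits at i - 1.
        total += log_probs[0, i - 1, full_ids[0, i]].item()
    return total

# Hypothetical item in the spirit of the paradigm: the embedded relative
# clause is not-at-issue, the main clause is at-issue.
context = "A: Maya, who traveled to France, speaks Spanish. B:"
print(response_logprob(context, "No, she does not speak Spanish."))    # targets main clause
print(response_logprob(context, "No, she did not travel to France."))  # targets embedded clause

A higher score for the first response would reflect the main-clause preference the abstract reports; the item itself is invented for illustration.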

2021

On the Interplay Between Fine-tuning and Composition in Transformers
Lang Yu | Allyson Ettinger
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

Assessing Phrasal Representation and Composition in Transformers
Lang Yu | Allyson Ettinger
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Deep transformer models have pushed performance on NLP tasks to new limits, suggesting sophisticated treatment of complex linguistic inputs, such as phrases. However, we have limited understanding of how these models handle the representation of phrases, and whether this reflects sophisticated composition of phrase meaning like that done by humans. In this paper, we present a systematic analysis of phrasal representations in state-of-the-art pre-trained transformers. We use tests leveraging human judgments of phrase similarity and meaning shift, and compare results before and after controlling for word overlap, to tease apart lexical effects versus composition effects. We find that phrase representation in these models relies heavily on word content, with little evidence of nuanced composition. We also identify variations in phrase representation quality across models, layers, and representation types, and make corresponding recommendations for usage of representations from these models.
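
As a minimal sketch of this style of analysis (under assumptions; not the paper's released code), the snippet below embeds phrases with a pre-trained BERT and compares them by cosine similarity, contrasting a pair with similar meaning but no word overlap against a pair with full overlap but shifted meaning. The model name, mean-pooling choice, and example phrases are illustrative.

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def phrase_embedding(phrase: str) -> torch.Tensor:
    # Mean-pool the final-layer token representations of the phrase,
    # dropping [CLS] and [SEP] so only phrase tokens contribute.
    inputs = tokenizer(phrase, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, hidden_dim)
    return hidden[1:-1].mean(dim=0)

cos = torch.nn.CosineSimilarity(dim=0)
a = phrase_embedding("guest house")
b = phrase_embedding("bed and breakfast")  # similar meaning, no word overlap
c = phrase_embedding("house guest")        # same words, shifted meaning

print(f"similar meaning, no overlap: {cos(a, b).item():.3f}")
print(f"same words, shifted meaning: {cos(a, c).item():.3f}")

If representations reflected composed meaning rather than word content, the first similarity would tend to exceed the second; the abstract's finding suggests the reverse for these models.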