Flexible Generation from Fragmentary Linguistic Input

Peng Qian, Roger Levy


Abstract
The dominant paradigm for high-performance models in novel NLP tasks today is direct specialization for the task via training from scratch or fine-tuning large pre-trained models. But does direct specialization capture how humans approach novel language tasks? We hypothesize that human performance is better characterized by flexible inference through composition of basic computational motifs available to the human language user. To test this hypothesis, we formulate a set of novel fragmentary text completion tasks, and compare the behavior of three direct-specialization models against a new model we introduce, GibbsComplete, which composes two basic computational motifs central to contemporary models: masked and autoregressive word prediction. We conduct three types of evaluation: human judgments of completion quality, satisfaction of syntactic constraints imposed by the input fragment, and similarity to human behavior in the structural statistics of the completions. With no task-specific parameter tuning, GibbsComplete performs comparably to direct-specialization models in the first two evaluations, and outperforms all direct-specialization models in the third evaluation. These results support our hypothesis that human behavior in novel language tasks and environments may be better characterized by flexible composition of basic computational motifs rather than by direct specialization.
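The abstract describes GibbsComplete only at a high level: a completion model built by composing masked and autoregressive word prediction rather than by task-specific training. Below is a minimal sketch of that kind of composition, not the authors' implementation (see the linked pqian11/fragment-completion repository for that): masked-LM proposals for each gap are rescored with an autoregressive LM inside a Gibbs-style resampling loop. The model choices (bert-base-cased, gpt2), the top-k proposal set, the sweep count, and the helper functions are illustrative assumptions.

```python
# Illustrative sketch (not the authors' code): fill gaps in a fragment by
# Gibbs-style resampling. Each gap gets masked-LM proposals from BERT,
# which are then rescored with GPT-2's autoregressive probability.
import torch
from transformers import (BertTokenizer, BertForMaskedLM,
                          GPT2Tokenizer, GPT2LMHeadModel)

mlm_tok = BertTokenizer.from_pretrained("bert-base-cased")
mlm = BertForMaskedLM.from_pretrained("bert-base-cased").eval()
ar_tok = GPT2Tokenizer.from_pretrained("gpt2")
ar = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def ar_logprob(text):
    # Total autoregressive log-probability of the string under GPT-2.
    ids = ar_tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = ar(ids, labels=ids).loss  # mean per-token NLL
    return -loss.item() * (ids.size(1) - 1)

def gibbs_complete(words, gaps, sweeps=5, top_k=20):
    # words: list of whitespace tokens with "[MASK]" at the gap indices.
    tokens = [w if i not in gaps else "the" for i, w in enumerate(words)]  # crude init
    for _ in range(sweeps):
        for pos in gaps:
            tokens[pos] = mlm_tok.mask_token
            enc = mlm_tok(" ".join(tokens), return_tensors="pt")
            mask_idx = (enc.input_ids[0] == mlm_tok.mask_token_id).nonzero()[0].item()
            with torch.no_grad():
                logits = mlm(**enc).logits[0, mask_idx]
            cand_ids = torch.topk(logits, top_k).indices
            # Rescore each masked-LM proposal with the autoregressive LM,
            # then resample the gap from the renormalized scores.
            scores = []
            for cid in cand_ids:
                cand = mlm_tok.convert_ids_to_tokens(cid.item())
                scores.append(ar_logprob(
                    " ".join(tokens[:pos] + [cand] + tokens[pos + 1:])))
            probs = torch.softmax(torch.tensor(scores), dim=0)
            tokens[pos] = mlm_tok.convert_ids_to_tokens(
                cand_ids[torch.multinomial(probs, 1).item()].item())
    return " ".join(tokens)

# Hypothetical usage: complete two gaps in a fragment.
# gibbs_complete("The senators [MASK] the bill that [MASK] yesterday".split(), [2, 6])
```

The point of the sketch is the compositional structure: no parameter is tuned for the completion task; the two pretrained prediction motifs are simply combined at inference time.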
Anthology ID:
2022.acl-long.563
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
8176–8196
URL:
https://aclanthology.org/2022.acl-long.563
DOI:
10.18653/v1/2022.acl-long.563
Cite (ACL):
Peng Qian and Roger Levy. 2022. Flexible Generation from Fragmentary Linguistic Input. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8176–8196, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Flexible Generation from Fragmentary Linguistic Input (Qian & Levy, ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.563.pdf
Code
pqian11/fragment-completion
Data
New York Times Annotated Corpus