Parsing entire discourses as very long strings: Capturing topic continuity in grounded language learning

Minh-Thang Luong; Michael C. Frank; Mark Johnson

doi:10.1162/tacl_a_00230

Abstract

Grounded language learning, the task of mapping from natural language to a representation of meaning, has attracted more and more interest in recent years. In most work on this topic, however, utterances in a conversation are treated independently and discourse structure information is largely ignored. In the context of language acquisition, this independence assumption discards cues that are important to the learner, e.g., the fact that consecutive utterances are likely to share the same referent (Frank et al., 2013). The current paper describes an approach to the problem of simultaneously modeling grounded language at the sentence and discourse levels. We combine ideas from parsing and grammar induction to produce a parser that can handle long input strings with thousands of tokens, creating parse trees that represent full discourses. By casting grounded language learning as a grammatical inference task, we use our parser to extend the work of Johnson et al. (2012), investigating the importance of discourse continuity in children’s language acquisition and its interaction with social cues. Our model boosts performance in a language acquisition task and yields good discourse segmentations compared with human annotators.

Anthology ID: Q13-1026
Volume: Transactions of the Association for Computational Linguistics, Volume 1
Year: 2013
Address: Cambridge, MA
Editors: Dekang Lin, Michael Collins
Venue: TACL
Publisher: MIT Press
Pages: 315–326
URL: https://aclanthology.org/Q13-1026
DOI: 10.1162/tacl_a_00230
Cite (ACL): Minh-Thang Luong, Michael C. Frank, and Mark Johnson. 2013. Parsing entire discourses as very long strings: Capturing topic continuity in grounded language learning. Transactions of the Association for Computational Linguistics, 1:315–326.
Cite (Informal): Parsing entire discourses as very long strings: Capturing topic continuity in grounded language learning (Luong et al., TACL 2013)
PDF: https://preview.aclanthology.org/nschneid-patch-5/Q13-1026.pdf
Data: RoboCup
