Abstract
We create a large-scale dialogue corpus that provides pragmatic paraphrases to advance technology for understanding the underlying intentions of users. While neural conversation models acquire the ability to generate fluent responses through training on a dialogue corpus, previous corpora have mainly focused on the literal meanings of utterances. However, in reality, people do not always present their intentions directly. For example, if a person says “I don’t have enough budget” to the operator of a reservation service, they in fact mean “please find a cheaper option for me.” Our corpus provides a total of 71,498 indirect–direct utterance pairs accompanied by a multi-turn dialogue history extracted from the MultiWOZ dataset. In addition, we propose three tasks to benchmark the ability of models to recognize and generate indirect and direct utterances. We also investigate the performance of state-of-the-art pre-trained models as baselines.
- Anthology ID:
- 2021.findings-emnlp.170
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2021
- Month:
- November
- Year:
- 2021
- Address:
- Punta Cana, Dominican Republic
- Editors:
- Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue:
- Findings
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1980–1989
- URL:
- https://aclanthology.org/2021.findings-emnlp.170
- DOI:
- 10.18653/v1/2021.findings-emnlp.170
- Cite (ACL):
- Junya Takayama, Tomoyuki Kajiwara, and Yuki Arase. 2021. DIRECT: Direct and Indirect Responses in Conversational Text Corpus. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 1980–1989, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- DIRECT: Direct and Indirect Responses in Conversational Text Corpus (Takayama et al., Findings 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/2021.findings-emnlp.170.pdf
- Code:
- junya-takayama/direct
- Data:
- MRPC, MultiWOZ, PAWS