Abstract
Discourse relation classification has proven to be a hard task, with rather low performance on several corpora that notably differ on the relation set they use. We propose to decompose the task into smaller, mostly binary tasks corresponding to various primitive concepts encoded into the discourse relation definitions. More precisely, we translate the discourse relations into a set of values for attributes based on distinctions used in the mappings between discourse frameworks proposed by Sanders et al. (2018). This arguably allows for a more robust representation of discourse relations, and enables us to address usually ignored aspects of discourse relation prediction, namely multiple labels and underspecified annotations. We show experimentally which of the conceptual primitives are harder to learn from the Penn Discourse Treebank English corpus, and propose a correspondence to predict the original labels, with preliminary empirical comparisons with a direct model.
- Anthology ID:
- W19-5950
- Volume:
- Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue
- Month:
- September
- Year:
- 2019
- Address:
- Stockholm, Sweden
- Venue:
- SIGDIAL
- SIG:
- SIGDIAL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 432–441
- URL:
- https://aclanthology.org/W19-5950
- DOI:
- 10.18653/v1/W19-5950
- Cite (ACL):
- Charlotte Roze, Chloé Braud, and Philippe Muller. 2019. Which aspects of discourse relations are hard to learn? Primitive decomposition for discourse relation classification. In Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue, pages 432–441, Stockholm, Sweden. Association for Computational Linguistics.
- Cite (Informal):
- Which aspects of discourse relations are hard to learn? Primitive decomposition for discourse relation classification (Roze et al., SIGDIAL 2019)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/W19-5950.pdf