Unsupervised Learning of Discourse-Aware Text Representation for Essay Scoring

Farjana Sultana Mim, Naoya Inoue, Paul Reisert, Hiroki Ouchi, Kentaro Inui


Abstract
Existing document embedding approaches mainly focus on capturing sequences of words in documents. However, some document classification and regression tasks, such as essay scoring, need to consider the discourse structure of documents. Although some prior approaches address this issue and use the discourse structure of text for document classification, they depend on computationally expensive parsers. In this paper, we propose an unsupervised approach for document embedding that captures discourse structure in terms of coherence and cohesion and does not require any expensive parser or annotation. Extrinsic evaluation results show that the document representation obtained from our approach improves the performance of essay Organization scoring and Argument Strength scoring.
Anthology ID:
P19-2053
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Month:
July
Year:
2019
Address:
Florence, Italy
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
378–385
URL:
https://aclanthology.org/P19-2053
DOI:
10.18653/v1/P19-2053
Cite (ACL):
Farjana Sultana Mim, Naoya Inoue, Paul Reisert, Hiroki Ouchi, and Kentaro Inui. 2019. Unsupervised Learning of Discourse-Aware Text Representation for Essay Scoring. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pages 378–385, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Unsupervised Learning of Discourse-Aware Text Representation for Essay Scoring (Mim et al., ACL 2019)
PDF:
https://preview.aclanthology.org/remove-xml-comments/P19-2053.pdf
Code:
FarjanaSultanaMim/DiscoShuffle