Abstract
The detection of quotations (i.e., reported speech, thought, and writing) has established itself as an NLP analysis task. However, state-of-the-art models have been developed on the basis of specific corpora and incorporate a high degree of corpus-specific assumptions and knowledge, which leads to fragmentation. In the spirit of task-agnostic modeling, we present a corpus-agnostic neural model for quotation detection and evaluate it on three corpora that vary in language, text genre, and structural assumptions. The model (a) approaches the state-of-the-art on the corpora when using established feature sets and (b) shows reasonable performance even when using solely word forms, which makes it applicable for non-standard (i.e., historical) corpora.
- Anthology ID:
- R19-1103
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
- Month:
- September
- Year:
- 2019
- Address:
- Varna, Bulgaria
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 888–894
- URL:
- https://aclanthology.org/R19-1103
- DOI:
- 10.26615/978-954-452-056-4_103
- Cite (ACL):
- Sean Papay and Sebastian Padó. 2019. Quotation Detection and Classification with a Corpus-Agnostic Model. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 888–894, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Quotation Detection and Classification with a Corpus-Agnostic Model (Papay & Padó, RANLP 2019)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/R19-1103.pdf