Abstract
This paper presents two novel datasets and a random-forest classifier to automatically predict literal vs. non-literal language usage for a highly frequent type of multi-word expression in a low-resource language, i.e., Estonian. We demonstrate the value of language-specific indicators induced from theoretical linguistic research, which outperform a high majority baseline when combined with language-independent features of non-literal language (such as abstractness).
- Anthology ID:
- N18-4002
- Volume:
- Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans, Louisiana, USA
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 9–16
- URL:
- https://aclanthology.org/N18-4002
- DOI:
- 10.18653/v1/N18-4002
- Cite (ACL):
- Eleri Aedmaa, Maximilian Köper, and Sabine Schulte im Walde. 2018. Combining Abstractness and Language-specific Theoretical Indicators for Detecting Non-Literal Usage of Estonian Particle Verbs. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 9–16, New Orleans, Louisiana, USA. Association for Computational Linguistics.
- Cite (Informal):
- Combining Abstractness and Language-specific Theoretical Indicators for Detecting Non-Literal Usage of Estonian Particle Verbs (Aedmaa et al., NAACL 2018)
- PDF:
- https://preview.aclanthology.org/auto-file-uploads/N18-4002.pdf