Abstract
This paper reports on a structured evaluation of feature-based Machine Learning algorithms for selecting the form of a referring expression in discourse context. Based on this evaluation, we selected seven feature sets from the literature, amounting to 65 distinct linguistic features. The features were then grouped into 9 broad classes. After building Random Forest models, we used Feature Importance Ranking and Sequential Forward Search methods to assess the “importance” of the features. Combining the results of the two methods, we propose a consensus feature set. The 6 features in our consensus set come from 4 different classes, namely grammatical role, inherent features of the referent, antecedent form and recency.
- Anthology ID:
- 2020.coling-main.403
- Volume:
- Proceedings of the 28th International Conference on Computational Linguistics
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Editors:
- Donia Scott, Nuria Bel, Chengqing Zong
- Venue:
- COLING
- Publisher:
- International Committee on Computational Linguistics
- Pages:
- 4575–4586
- URL:
- https://aclanthology.org/2020.coling-main.403
- DOI:
- 10.18653/v1/2020.coling-main.403
- Cite (ACL):
- Fahime Same and Kees van Deemter. 2020. A Linguistic Perspective on Reference: Choosing a Feature Set for Generating Referring Expressions in Context. In Proceedings of the 28th International Conference on Computational Linguistics, pages 4575–4586, Barcelona, Spain (Online). International Committee on Computational Linguistics.
- Cite (Informal):
- A Linguistic Perspective on Reference: Choosing a Feature Set for Generating Referring Expressions in Context (Same & van Deemter, COLING 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2020.coling-main.403.pdf
- Data
- OntoNotes 5.0
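
The abstract names Sequential Forward Search as one of the two methods used to assess feature importance. The sketch below illustrates the general greedy procedure only; it is not the authors' implementation. The feature names, the `TOY_GAINS` table, and the `toy_score` function are hypothetical stand-ins: in the paper's setting, the score would presumably come from evaluating a Random Forest model trained on the candidate subset.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def sequential_forward_search(
    features: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
    min_gain: float = 1e-9,
) -> Tuple[List[str], float]:
    """Greedily grow a feature subset, adding the single best feature per round.

    Stops when no remaining feature improves the score by at least `min_gain`.
    """
    remaining = sorted(set(features))  # sorted for deterministic tie-breaking
    selected: List[str] = []
    best_score = score(frozenset())
    while remaining:
        # Score every one-feature extension of the current subset.
        candidates = [(score(frozenset(selected + [f])), f) for f in remaining]
        top_score = max(s for s, _ in candidates)
        top_feature = next(f for s, f in candidates if s == top_score)
        if top_score - best_score < min_gain:
            break  # no candidate helps enough; keep the current subset
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Hypothetical per-feature contributions, invented for illustration only.
# They are NOT results from the paper.
TOY_GAINS = {
    "grammatical_role": 0.30,
    "antecedent_form": 0.20,
    "animacy": 0.15,
    "recency": 0.10,
    "noise": 0.0,
}


def toy_score(subset: FrozenSet[str]) -> float:
    """Stand-in for model accuracy: an additive score over the chosen features."""
    return sum(TOY_GAINS[f] for f in sorted(subset))


if __name__ == "__main__":
    chosen, final_score = sequential_forward_search(TOY_GAINS, toy_score)
    print(chosen, round(final_score, 2))
```

With a real model, `score` would typically be a cross-validated metric, which makes each round considerably more expensive than in this toy setting, since every candidate extension requires retraining.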