Abstract
We propose an approach to Named Entity Disambiguation that avoids a problem of standard work on the task (likewise affecting fully supervised, weakly supervised, or distantly supervised machine learning techniques): the treatment of name mentions referring to people with no (or very little) coverage in the textual training data is systematically incorrect. We propose to indirectly take into account the property information for the “non-prominent” name bearers, such as nationality and profession (e.g., for a Canadian law professor named Michael Jackson, with no Wikipedia article, it is very hard to obtain reliable textual training data). The target property information for the entities is directly available from name authority files, or inferrable, e.g., from listings of sportspeople etc. Our proposed approach employs topic modeling to exploit textual training data based on entities sharing the relevant properties. In experiments with a pilot implementation of the general approach, we show that the approach does indeed work well for name/referent pairs with limited textual coverage in the training data.
- Anthology ID:
- C16-1140
- Volume:
- Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
- Month:
- December
- Year:
- 2016
- Address:
- Osaka, Japan
- Editors:
- Yuji Matsumoto, Rashmi Prasad
- Venue:
- COLING
- Publisher:
- The COLING 2016 Organizing Committee
- Pages:
- 1481–1492
- URL:
- https://aclanthology.org/C16-1140
- Cite (ACL):
- Andrea Glaser and Jonas Kuhn. 2016. Named Entity Disambiguation for little known referents: a topic-based approach. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1481–1492, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal):
- Named Entity Disambiguation for little known referents: a topic-based approach (Glaser & Kuhn, COLING 2016)
- PDF:
- https://preview.aclanthology.org/landing_page/C16-1140.pdf