Named entity translation using anchor texts

Wang Ling, Pável Calado, Bruno Martins, Isabel Trancoso, Alan Black, Luísa Coheur


Abstract
This work describes a process for extracting Named Entity (NE) translations from the text of web links (anchor texts). It translates an NE by retrieving a list of web documents in the target language, extracting the anchor texts from the links to those documents, and selecting the best translation among them using a combination of features, some of which are specific to anchor texts. Experiments on a manually built corpus suggest that over 70% of the NEs, ranging from unpopular to popular entities, can be translated correctly using solely anchor texts. Tests on a Machine Translation task indicate that the system can be used to improve the translation quality of state-of-the-art statistical machine translation systems.
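The pipeline outlined in the abstract (retrieve target-language documents, collect the anchor texts of links pointing to them, pick the best candidate) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `rank_anchor_candidates`, the `(url, anchor_text)` input format, the example URLs, and above all the scoring, which uses plain frequency as a stand-in because the abstract does not specify the feature combination the authors use.

```python
from collections import Counter


def rank_anchor_candidates(links, target_docs):
    """Rank anchor texts of links that point at the retrieved documents.

    links: iterable of (target_url, anchor_text) pairs (hypothetical format).
    target_docs: set of target-language document URLs retrieved for the NE.
    """
    counts = Counter(
        anchor.strip()
        for url, anchor in links
        if url in target_docs and anchor.strip()
    )
    # Frequency only: a placeholder for the paper's feature combination,
    # which is not described in the abstract.
    return counts.most_common()


# Toy data: links found on the web, pointing at Portuguese-language pages.
links = [
    ("https://pt.example.org/Londres", "Londres"),
    ("https://pt.example.org/Londres", "Londres"),
    ("https://pt.example.org/Londres", "cidade de Londres"),
    ("https://pt.example.org/Paris", "Paris"),
]

# Suppose these documents were retrieved for the source NE "London".
ranked = rank_anchor_candidates(links, {"https://pt.example.org/Londres"})
best = ranked[0][0]
print(best)
```

In this toy setting the most frequent anchor text is taken as the translation candidate; a real system would replace the frequency count with a richer scoring function.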
Anthology ID:
2011.iwslt-papers.3
Volume:
Proceedings of the 8th International Workshop on Spoken Language Translation: Papers
Month:
December 8-9
Year:
2011
Address:
San Francisco, California
Editors:
Marcello Federico, Mei-Yuh Hwang, Margit Rödder, Sebastian Stüker
Venue:
IWSLT
SIG:
SIGSLT
Pages:
206–213
URL:
https://aclanthology.org/2011.iwslt-papers.3
Cite (ACL):
Wang Ling, Pável Calado, Bruno Martins, Isabel Trancoso, Alan Black, and Luísa Coheur. 2011. Named entity translation using anchor texts. In Proceedings of the 8th International Workshop on Spoken Language Translation: Papers, pages 206–213, San Francisco, California.
Cite (Informal):
Named entity translation using anchor texts (Ling et al., IWSLT 2011)
PDF:
https://preview.aclanthology.org/nschneid-patch-3/2011.iwslt-papers.3.pdf