Automating Interlingual Homograph Recognition with Parallel Sentences

Yi Han; Ryohei Sasano; Koichi Takeda

Abstract

Interlingual homographs are words that are spelled identically but have different meanings across languages. Recognizing interlingual homographs among form-identical words generally requires linguistic knowledge and extensive annotation work. In this paper, we propose an automatic interlingual homograph recognition method based on cross-lingual word embedding similarity and the co-occurrence of form-identical words in parallel sentences. We conduct experiments with various off-the-shelf language models combined with cross-lingual alignment operations and co-occurrence metrics on the Chinese-Japanese and English-Dutch language pairs. Experimental results demonstrate that the proposed method makes accurate and consistent predictions across languages.
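
The two signals named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the metrics, thresholds, or how the signals are combined, so the cosine similarity, the simple co-occurrence ratio, the threshold values, the rule that both signals must be low, and all function names below are assumptions chosen for illustration. The vectors and sentences are toy data.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cooccurrence_rate(word, parallel_pairs):
    """Among sentence pairs containing `word` on at least one side,
    return the fraction in which it appears on both sides.
    This ratio is an assumed stand-in for a co-occurrence metric."""
    either = both = 0
    for src_tokens, tgt_tokens in parallel_pairs:
        in_src = word in src_tokens
        in_tgt = word in tgt_tokens
        if in_src or in_tgt:
            either += 1
            if in_src and in_tgt:
                both += 1
    return both / either if either else 0.0


def is_homograph_candidate(word, src_vec, tgt_vec, parallel_pairs,
                           sim_threshold=0.5, cooc_threshold=0.5):
    """Flag a form-identical word when both signals are low.
    Thresholds and the AND rule are hypothetical, not taken from the paper."""
    sim = cosine(src_vec, tgt_vec)
    cooc = cooccurrence_rate(word, parallel_pairs)
    return sim < sim_threshold and cooc < cooc_threshold


# Toy English-Dutch data: "room" means "cream" in Dutch; "film" means the same in both.
pairs = [
    (["the", "room", "is", "small"], ["de", "kamer", "is", "klein"]),
    (["i", "like", "cream"], ["ik", "hou", "van", "room"]),
    (["the", "film", "starts"], ["de", "film", "begint"]),
]
room_flag = is_homograph_candidate("room", [1.0, 0.0], [0.0, 1.0], pairs)
film_flag = is_homograph_candidate("film", [1.0, 1.0], [1.0, 0.9], pairs)
```

In this toy setup `room` never appears on both sides of a sentence pair and its two vectors are orthogonal, so it is flagged, while `film` is not. In practice the vectors would come from aligned cross-lingual embeddings rather than hand-written values.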

Anthology ID: 2022.findings-aacl.20
Volume: Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022
Month: November
Year: 2022
Address: Online only
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 211–216
URL: https://aclanthology.org/2022.findings-aacl.20
Cite (ACL): Yi Han, Ryohei Sasano, and Koichi Takeda. 2022. Automating Interlingual Homograph Recognition with Parallel Sentences. In Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, pages 211–216, Online only. Association for Computational Linguistics.
Cite (Informal): Automating Interlingual Homograph Recognition with Parallel Sentences (Han et al., Findings 2022)
PDF: https://preview.aclanthology.org/remove-xml-comments/2022.findings-aacl.20.pdf
