Yi Han


2022

pdf
Automating Interlingual Homograph Recognition with Parallel Sentences
Yi Han | Ryohei Sasano | Koichi Takeda
Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022

Interlingual homographs are words that spell the same but possess different meanings across languages. Recognizing interlingual homographs from form-identical words generally needs linguistic knowledge and massive annotation work. In this paper, we propose an automatic interlingual homograph recognition method based on the cross-lingual word embedding similarity and co-occurrence of form-identical words in parallel sentences. We conduct experiments with various off-the-shelf language models coordinating with cross-lingual alignment operations and co-occurrence metrics on the Chinese-Japanese and English-Dutch language pairs. Experimental results demonstrate that our proposed method is able to make accurate and consistent predictions across languages.