Back-Transliteration of English Loanwords in Japanese

Yuying Ren


Abstract
We propose methods for back-transliterating English loanwords in Japanese from their Japanese written form (katakana or romaji) to their original English spelling. Our data is a Japanese-English loanword dictionary that we compiled ourselves. We employ two approaches: direct transliteration, which converts words from katakana straight to English, and indirect transliteration, which uses the English pronunciation as an intermediate step. We also compare katakana and romaji as input representations. We build six models in two configurations, with and without an English lexicon filter. Each configuration comprises three models: a pair n-gram model based on weighted finite-state transducers (WFSTs) and two sequence-to-sequence models, one using an LSTM and one using a transformer. The best-performing model is the pair n-gram model with a lexicon filter, transliterating directly from katakana to English.
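The abstract does not say how the lexicon filter is applied, so the following is only a minimal sketch under one plausible assumption: the transliteration model emits a ranked n-best list of English spellings, and the filter returns the highest-ranked candidate that appears in an English word list, falling back to the top candidate when none does. The function name `lexicon_filter`, the toy lexicon, and the n-best list are all hypothetical and are not taken from the paper.

```python
from typing import Iterable, Sequence


def lexicon_filter(candidates: Sequence[str], lexicon: Iterable[str]) -> str:
    """Pick the best-ranked candidate that is a known English word.

    `candidates` is assumed to be ordered best-first. If no candidate is
    in the lexicon, the top-ranked candidate is returned unchanged.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    known = {word.lower() for word in lexicon}  # case-insensitive lookup
    for cand in candidates:
        if cand.lower() in known:
            return cand
    return candidates[0]


# Hypothetical n-best output for the katakana input コンピューター, best first.
nbest = ["conputer", "computer", "komputer"]
toy_lexicon = {"computer", "table", "coffee"}

best = lexicon_filter(nbest, toy_lexicon)
print(best)
```

In this toy run the filter skips the top-ranked but misspelled candidate and returns the first one found in the lexicon. A model without the filter would correspond to simply taking `nbest[0]`.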
Anthology ID:
2023.cawl-1.6
Volume:
Proceedings of the Workshop on Computation and Written Language (CAWL 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Venue:
CAWL
Publisher:
Association for Computational Linguistics
Pages:
43–49
URL:
https://aclanthology.org/2023.cawl-1.6
Cite (ACL):
Yuying Ren. 2023. Back-Transliteration of English Loanwords in Japanese. In Proceedings of the Workshop on Computation and Written Language (CAWL 2023), pages 43–49, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Back-Transliteration of English Loanwords in Japanese (Ren, CAWL 2023)
PDF:
https://preview.aclanthology.org/nodalida-main-page/2023.cawl-1.6.pdf