Hayate Nakano
2025
Just One is Enough: An Existence-based Alignment Check for Robust Japanese Pronunciation Estimation
Hayate Nakano | Nobuhiro Kaji
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Neural models for Japanese pronunciation estimation often suffer from errors such as hallucinations (generating pronunciations that are not grounded in the input) and omissions (skipping parts of the input). Although attention-based alignment has been used to detect such errors, selecting reliable attention heads is difficult, and developing methods that can both detect and correct these errors remains challenging. In this paper, we propose a simple method called existence-based alignment check. In this approach, we consider alignment candidates independently extracted from all attention heads, and check whether at least one of these candidates satisfies two conditions derived from the linguistic properties of Japanese pronunciation: monotonicity and pronunciation length per character. We generate multiple hypotheses using beam search and use the alignment check as a filtering mechanism to correct hallucinations and omissions. We apply this method to a dataset of Japanese facility names and demonstrate that it improves pronunciation estimation accuracy by over 2.5%.
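The check described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. Several details are assumptions made only for illustration: each head's alignment candidate is taken as the per-symbol argmax over input characters, the per-character length bounds `min_len` and `max_len` are arbitrary placeholder values, and when no hypothesis passes the check the top beam hypothesis is returned. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

# One attention head: rows are output pronunciation symbols,
# columns are input characters (assumed layout).
Attention = Sequence[Sequence[float]]


def extract_alignment(attn: Attention) -> List[int]:
    """Assumption: align each output symbol to its highest-weight input character."""
    return [max(range(len(row)), key=row.__getitem__) for row in attn]


def is_monotonic(alignment: Sequence[int]) -> bool:
    """Aligned input positions must never move backwards."""
    return all(a <= b for a, b in zip(alignment, alignment[1:]))


def lengths_ok(alignment: Sequence[int], num_chars: int,
               min_len: int = 1, max_len: int = 4) -> bool:
    """Each input character must receive between min_len and max_len symbols.

    The bounds are illustrative placeholders, not values from the paper.
    """
    counts = [0] * num_chars
    for idx in alignment:
        counts[idx] += 1
    return all(min_len <= c <= max_len for c in counts)


def passes_check(heads: Sequence[Attention], num_chars: int,
                 min_len: int = 1, max_len: int = 4) -> bool:
    """Existence-based check: one head satisfying both conditions is enough."""
    for attn in heads:
        alignment = extract_alignment(attn)
        if is_monotonic(alignment) and lengths_ok(alignment, num_chars, min_len, max_len):
            return True
    return False


def select_hypothesis(beam: Sequence[Tuple[str, Sequence[Attention]]],
                      num_chars: int) -> str:
    """Return the first beam hypothesis (in score order) that passes the check.

    Assumption: fall back to the top hypothesis if none passes.
    """
    for pronunciation, heads in beam:
        if passes_check(heads, num_chars):
            return pronunciation
    return beam[0][0]


# Toy example: 2 input characters, 3 output symbols, 2 heads.
head_good = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]  # alignment [0, 0, 1]
head_bad = [[0.1, 0.9], [0.9, 0.1], [0.6, 0.4]]   # alignment [1, 0, 0], not monotonic
beam = [("hyp_a", [head_bad]), ("hyp_b", [head_bad, head_good])]
chosen = select_hypothesis(beam, num_chars=2)
```

In this toy setup, `hyp_a` is rejected because its only head yields a non-monotonic alignment, while `hyp_b` is accepted because one of its two heads satisfies both conditions. The point of the existence-based design is that no head has to be chosen in advance as the reliable one.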