Yuan-Han Li


2022

Using Grammatical and Semantic Correction Model to Improve Chinese-to-Taiwanese Machine Translation Fluency
Yuan-Han Li | Chung-Ping Young | Wen-Hsiang Lu
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

Chinese-to-Taiwanese machine translation currently faces three major issues: Taiwanese words with multiple pronunciations, unknown words, and grammatical and semantic transformation from Chinese to Taiwanese. Recent studies have mostly focused on the first two, while very few address grammatical and semantic transformation. However, some grammatical rules are exclusive to Taiwanese; if they are not translated properly, the result feels unnatural to native speakers and may distort the original meaning of the sentence, even when the words and pronunciations are correct. This study therefore collects and organizes several common Taiwanese sentence structures and grammar rules, then builds a grammatical and semantic correction model for Chinese-to-Taiwanese machine translation. The model detects and corrects grammatical and semantic discrepancies between the two languages, thereby improving translation fluency.