Abstract
Neural Machine Translation (NMT) models often use subword-level vocabularies to deal with rare or unknown words. Although some studies have shown the effectiveness of purely character-based models, these approaches result in computationally expensive models. In this work, we explore the benefits of quasi-character-level models for very low-resource languages and their ability to mitigate the effects of the catastrophic forgetting problem. First, we conduct an empirical study on the efficacy of these models, as a function of the vocabulary and training set size, for a range of languages, domains, and architectures. Next, we study the ability of these models to mitigate the effects of catastrophic forgetting in machine translation. Our work suggests that quasi-character-level models have practically the same generalization capabilities as character-based models but at lower computational costs. Furthermore, they appear to help achieve greater consistency between domains than standard subword-level models, although the catastrophic forgetting problem is not mitigated.
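The paper does not ship code with this entry; as a rough sketch of the idea, a quasi-character-level segmentation can be approximated by training an ordinary subword (BPE) model whose vocabulary is only marginally larger than the character inventory, so most tokens remain single characters. The `sentencepiece` library, the corpus path `corpus.txt`, and the vocabulary size of 350 below are assumptions for illustration, not the authors' setup.

```python
# Illustrative sketch only; not the authors' published code.
# A "quasi-character-level" vocabulary is approximated here as a
# standard BPE model with a vocabulary only slightly larger than
# the raw character set, so segmentation stays near character level.
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="corpus.txt",        # assumed path: one sentence per line
    model_prefix="quasi_char",
    model_type="bpe",
    vocab_size=350,            # assumed size, close to the character set
    character_coverage=1.0,    # keep the full character inventory
)

sp = spm.SentencePieceProcessor(model_file="quasi_char.model")
# Mostly single-character pieces, plus a few frequent merges.
print(sp.encode("translation", out_type=str))
```

Raising `vocab_size` toward typical subword settings (tens of thousands) recovers a standard subword-level model, which is the knob the abstract's vocabulary-size study turns on.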
- Anthology ID: 2022.amta-research.10
- Volume: Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
- Month: September
- Year: 2022
- Address: Orlando, USA
- Venue: AMTA
- Publisher: Association for Machine Translation in the Americas
- Pages: 131–143
- URL: https://aclanthology.org/2022.amta-research.10
- Cite (ACL): Salvador Carrión-Ponz and Francisco Casacuberta. 2022. On the Effectiveness of Quasi Character-Level Models for Machine Translation. In Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track), pages 131–143, Orlando, USA. Association for Machine Translation in the Americas.
- Cite (Informal): On the Effectiveness of Quasi Character-Level Models for Machine Translation (Carrión-Ponz & Casacuberta, AMTA 2022)
- PDF: https://preview.aclanthology.org/paclic-22-ingestion/2022.amta-research.10.pdf