Abstract
Neural Machine Translation (NMT) models often use subword-level vocabularies to deal with rare or unknown words. Although some studies have shown the effectiveness of purely character-based models, these approaches result in computationally expensive models. In this work, we explore the benefits of quasi-character-level models for very low-resource languages and their ability to mitigate the effects of the catastrophic forgetting problem. First, we conduct an empirical study on the efficacy of these models, as a function of the vocabulary and training set size, for a range of languages, domains, and architectures. Next, we study the ability of these models to mitigate the effects of catastrophic forgetting in machine translation. Our work suggests that quasi-character-level models have practically the same generalization capabilities as character-based models but at lower computational costs. Furthermore, they appear to help achieve greater consistency between domains than standard subword-level models, although the catastrophic forgetting problem is not mitigated.
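The paper does not ship code with this entry; as a rough sketch of the idea, a quasi-character-level segmentation can be approximated by training an ordinary subword (BPE) model whose vocabulary is only marginally larger than the character inventory, so most tokens remain single characters. The `sentencepiece` library, the corpus path `corpus.txt`, and the vocabulary size of 350 below are assumptions for illustration, not the authors' setup.

```python
# Illustrative sketch only; not the authors' published code.
# A "quasi-character-level" vocabulary is approximated here as a
# standard BPE model with a vocabulary only slightly larger than
# the raw character set, so segmentation stays near character level.
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="corpus.txt",        # assumed path: one sentence per line
    model_prefix="quasi_char",
    model_type="bpe",
    vocab_size=350,            # assumed size, close to the character set
    character_coverage=1.0,    # keep the full character inventory
)

sp = spm.SentencePieceProcessor(model_file="quasi_char.model")
# Mostly single-character pieces, plus a few frequent merges.
print(sp.encode("translation", out_type=str))
```

Raising `vocab_size` toward typical subword settings (tens of thousands) recovers a standard subword-level model, which is the knob the abstract's vocabulary-size study turns on.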
- Anthology ID: 2022.amta-research.10
- Volume: Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
- Month: September
- Year: 2022
- Address: Orlando, USA
- Venue: AMTA
- Publisher: Association for Machine Translation in the Americas
- Pages: 131–143
- URL: https://aclanthology.org/2022.amta-research.10
- Cite (ACL): Salvador Carrión-Ponz and Francisco Casacuberta. 2022. On the Effectiveness of Quasi Character-Level Models for Machine Translation. In Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track), pages 131–143, Orlando, USA. Association for Machine Translation in the Americas.
- Cite (Informal): On the Effectiveness of Quasi Character-Level Models for Machine Translation (Carrión-Ponz & Casacuberta, AMTA 2022)
- PDF: https://preview.aclanthology.org/paclic-22-ingestion/2022.amta-research.10.pdf