@inproceedings{kaing-etal-2024-robust,
    title = "Robust Neural Machine Translation for Abugidas by Glyph Perturbation",
    author = "Kaing, Hour  and
      Ding, Chenchen  and
      Tanaka, Hideki  and
      Utiyama, Masao",
    editor = "Graham, Yvette  and
      Purver, Matthew",
    booktitle = "Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)",
    month = mar,
    year = "2024",
    address = "St. Julian{'}s, Malta",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2024.eacl-short.27/",
    doi = "10.18653/v1/2024.eacl-short.27",
    pages = "311--318",
    abstract = "Neural machine translation (NMT) systems are vulnerable when trained on limited data. This is a common scenario in low-resource tasks in the real world. To increase robustness, a solution is to intently add realistic noise in the training phase. Noise simulation using text perturbation has been proven to be efficient in writing systems that use Latin letters. In this study, we further explore perturbation techniques on more complex abugida writing systems, for which the visual similarity of complex glyphs is considered to capture the essential nature of these writing systems. Besides the generated noise, we propose a training strategy to improve robustness. We conducted experiments on six languages: Bengali, Hindi, Myanmar, Khmer, Lao, and Thai. By overcoming the introduced noise, we obtained non-degenerate NMT systems with improved robustness for low-resource tasks for abugida glyphs."
}

Markdown (Informal)
[Robust Neural Machine Translation for Abugidas by Glyph Perturbation](https://preview.aclanthology.org/ingest-emnlp/2024.eacl-short.27/) (Kaing et al., EACL 2024)
ACL
- Hour Kaing, Chenchen Ding, Hideki Tanaka, and Masao Utiyama. 2024. Robust Neural Machine Translation for Abugidas by Glyph Perturbation. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers), pages 311–318, St. Julian’s, Malta. Association for Computational Linguistics.