Kung Hong


2024

CantonMT: Cantonese to English NMT Platform with Fine-Tuned Models using Real and Synthetic Back-Translation Data
Kung Hong | Lifeng Han | Riza Batista-Navarro | Goran Nenadic
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)

Neural Machine Translation (NMT) for low-resource languages remains a challenge for many NLP researchers. In this work, we apply a standard back-translation data augmentation methodology to a new translation direction, i.e., Cantonese-to-English. We present the models we fine-tuned using the limited amount of real data together with the synthetic data we generated through back-translation with three models: OpusMT, NLLB, and mBART. We carried out automatic evaluation using a range of metrics, including both lexical-based and embedding-based ones. Furthermore, we created a user-friendly interface, CantonMT, for the models included in this project and make it available to facilitate Cantonese-to-English MT research. Researchers can add more models to this platform via our open-source CantonMT toolkit, available at https://github.com/kenrickkung/CantoneseTranslation.
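
As an illustration of the back-translation idea summarised in the abstract (not code from the CantonMT toolkit), the minimal Python sketch below generates synthetic Cantonese-to-English pairs by machine-translating English monolingual sentences into Cantonese with an NLLB checkpoint from the Hugging Face Hub; the model name, language codes, and toy sentences are assumptions chosen for the example.

# Minimal sketch of back-translation for synthetic data generation.
# Assumption: Hugging Face `transformers` is installed; the checkpoint and
# the toy English sentences are illustrative, not taken from the paper.
from transformers import pipeline

# Back-translation model: English -> Cantonese (FLORES-200 code yue_Hant).
backtranslator = pipeline(
    "translation",
    model="facebook/nllb-200-distilled-600M",
    src_lang="eng_Latn",
    tgt_lang="yue_Hant",
)

english_monolingual = [
    "The weather is nice today.",
    "Machine translation for low-resource languages is hard.",
]

# Each synthetic pair is (Cantonese source, English target); such pairs can
# be mixed with the limited real parallel data when fine-tuning an MT model.
synthetic_pairs = [
    (out["translation_text"], en)
    for en, out in zip(english_monolingual, backtranslator(english_monolingual))
]
print(synthetic_pairs)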