The Ups and Downs of Training RoBERTa-based models on Smaller Datasets for Translation Tasks from Classical Chinese into Modern Standard Mandarin and Modern English
Stuart Michael McManus, Roslin Liu, Yuji Li, Leo Tam, Stephanie Qiu, Letian Yu
Abstract
The paper presents an investigation into the effectiveness of pre-trained language models, Siku-RoBERTa and RoBERTa, for Classical Chinese to Modern Standard Mandarin and Classical Chinese to English translation tasks. The English translation model resulted in unsatisfactory performance due to the small dataset, while the Modern Standard Mandarin model gave reasonable results.
- Anthology ID: 2023.alt-1.2
- Volume: Proceedings of ALT2023: Ancient Language Translation Workshop
- Month: September
- Year: 2023
- Address: Macau SAR, China
- Venue: alt
- Publisher: Asia-Pacific Association for Machine Translation
- Pages: 15–22
- URL: https://aclanthology.org/2023.alt-1.2
- Cite (ACL): Stuart Michael McManus, Roslin Liu, Yuji Li, Leo Tam, Stephanie Qiu, and Letian Yu. 2023. The Ups and Downs of Training RoBERTa-based models on Smaller Datasets for Translation Tasks from Classical Chinese into Modern Standard Mandarin and Modern English. In Proceedings of ALT2023: Ancient Language Translation Workshop, pages 15–22, Macau SAR, China. Asia-Pacific Association for Machine Translation.
- Cite (Informal): The Ups and Downs of Training RoBERTa-based models on Smaller Datasets for Translation Tasks from Classical Chinese into Modern Standard Mandarin and Modern English (McManus et al., alt 2023)
- PDF: https://preview.aclanthology.org/nschneid-patch-4/2023.alt-1.2.pdf
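For readers unfamiliar with the setup the abstract describes, the following is a minimal sketch of how a RoBERTa-based encoder-decoder translation model can be warm-started and fine-tuned with the Hugging Face transformers library. It is an illustration under stated assumptions, not the authors' released code: the checkpoint names `SIKU-BERT/sikuroberta` (a public Siku-RoBERTa) and `hfl/chinese-roberta-wwm-ext` (a Modern Chinese RoBERTa used here as the decoder) are assumed stand-ins, and the full training loop and data pipeline are omitted.

```python
# Minimal sketch (not the paper's implementation) of warm-starting a
# Classical Chinese -> Modern Standard Mandarin seq2seq model from
# RoBERTa-style checkpoints with Hugging Face transformers.
from transformers import AutoTokenizer, EncoderDecoderModel

ENC = "SIKU-BERT/sikuroberta"        # assumption: Siku-RoBERTa encoder checkpoint
DEC = "hfl/chinese-roberta-wwm-ext"  # assumption: Modern Chinese decoder checkpoint

src_tok = AutoTokenizer.from_pretrained(ENC)  # tokenizer for Classical Chinese source
tgt_tok = AutoTokenizer.from_pretrained(DEC)  # tokenizer for Mandarin target

# Warm-start a seq2seq model from the two pre-trained checkpoints; the decoder
# receives randomly initialised cross-attention layers that fine-tuning must learn.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(ENC, DEC)
model.config.decoder_start_token_id = tgt_tok.cls_token_id
model.config.pad_token_id = tgt_tok.pad_token_id
model.config.eos_token_id = tgt_tok.sep_token_id

# One toy training step: Classical Chinese source, Modern Mandarin reference.
src = src_tok("學而時習之，不亦說乎", return_tensors="pt")
tgt = tgt_tok("学了又时常温习，不也很愉快吗", return_tensors="pt")

loss = model(
    input_ids=src.input_ids,
    attention_mask=src.attention_mask,
    labels=tgt.input_ids,  # cross-entropy over target tokens
).loss
loss.backward()  # in practice, wrap this in an optimizer/Trainer loop
```

Warm-starting both encoder and decoder from pre-trained checkpoints is one common way to compensate for a small parallel corpus, which is the low-resource setting the abstract attributes to the weaker English-direction results.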