Abstract
Nepali, a low-resource language belonging to the Indo-Aryan language family and spoken in Nepal, India, Sikkim, and Burma, has comparatively little digital content and few resources, particularly in the legal domain. However, the need to translate legal documents is ever-increasing, given the growing volume of legal cases and the large population seeking to go abroad for higher education or employment. This underscores the need for an English-Nepali machine translation system for the legal domain. We address this problem with a Neural Machine Translation (NMT) system with an encoder-decoder architecture, designed specifically for legal Nepali-English translation. Leveraging a custom-built legal corpus of 125,000 parallel sentences, our system achieves encouraging BLEU scores of 7.98 in the Nepali → English direction and 6.63 in the English → Nepali direction.
- Anthology ID:
- 2024.sigul-1.7
- Volume:
- Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Maite Melero, Sakriani Sakti, Claudia Soria
- Venues:
- SIGUL | WS
- Publisher:
- ELRA and ICCL
- Pages:
- 53–58
- URL:
- https://aclanthology.org/2024.sigul-1.7
- Cite (ACL):
- Shabdapurush Poudel, Bal Krishna Bal, and Praveen Acharya. 2024. Bidirectional English-Nepali Machine Translation(MT) System for Legal Domain. In Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024, pages 53–58, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- Bidirectional English-Nepali Machine Translation(MT) System for Legal Domain (Poudel et al., SIGUL-WS 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/2024.sigul-1.7.pdf