The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation
Dung Nguyen Manh, Nam Le Hai, Anh T. V. Dau, Anh Minh Nguyen, Khanh Nghiem, Jin Guo, Nghi D. Q. Bui
- Anthology ID:
- 2023.nlposs-1.25
- Volume:
- Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Liling Tan, Dmitrijs Milajevs, Geeticka Chauhan, Jeremy Gwinnup, Elijah Rippeth
- Venues:
- NLPOSS | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 219–244
- URL:
- https://aclanthology.org/2023.nlposs-1.25
- DOI:
- 10.18653/v1/2023.nlposs-1.25
- Cite (ACL):
- Dung Nguyen Manh, Nam Le Hai, Anh T. V. Dau, Anh Minh Nguyen, Khanh Nghiem, Jin Guo, and Nghi D. Q. Bui. 2023. The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation. In Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023), pages 219–244, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation (Manh et al., NLPOSS-WS 2023)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2023.nlposs-1.25.pdf