Abstract
The availability of source code has been put forward as one of the most critical factors for improving the reproducibility of scientific research. This work studies trends in source code availability at major computational linguistics conferences, namely, ACL, EMNLP, LREC, NAACL, and COLING. We observe positive trends, especially in conferences that actively promote reproducibility. We follow this by conducting a reproducibility study of eight papers published in EMNLP 2021, finding that source code releases leave much to be desired. Moving forward, we suggest all conferences require self-contained artifacts and provide a venue to evaluate such artifacts at the time of publication. Authors can include small-scale experiments and explicit scripts to generate each result to improve the reproducibility of their work.
- Anthology ID:
- 2022.emnlp-main.150
- Volume:
- Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Editors:
- Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2350–2361
- URL:
- https://aclanthology.org/2022.emnlp-main.150
- DOI:
- 10.18653/v1/2022.emnlp-main.150
- Cite (ACL):
- Mohammad Arvan, Luís Pina, and Natalie Parde. 2022. Reproducibility in Computational Linguistics: Is Source Code Enough?. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2350–2361, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- Reproducibility in Computational Linguistics: Is Source Code Enough? (Arvan et al., EMNLP 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2022.emnlp-main.150.pdf