Luís Pina

2022

Reproducibility in Computational Linguistics: Is Source Code Enough?
Mohammad Arvan | Luís Pina | Natalie Parde
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

The availability of source code has been put forward as one of the most critical factors for improving the reproducibility of scientific research. This work studies trends in source code availability at major computational linguistics conferences, namely, ACL, EMNLP, LREC, NAACL, and COLING. We observe positive trends, especially in conferences that actively promote reproducibility. We follow this by conducting a reproducibility study of eight papers published in EMNLP 2021, finding that source code releases leave much to be desired. Moving forward, we suggest all conferences require self-contained artifacts and provide a venue to evaluate such artifacts at the time of publication. Authors can include small-scale experiments and explicit scripts to generate each result to improve the reproducibility of their work.

Reproducibility of Exploring Neural Text Simplification Models: A Review
Mohammad Arvan | Luís Pina | Natalie Parde
Proceedings of the 15th International Conference on Natural Language Generation: Generation Challenges

The reproducibility of NLP research has drawn increased attention over the last few years. Several tools, guidelines, and metrics have been introduced to address concerns about this problem; however, much work remains to ensure widespread adoption of effective reproducibility standards. In this work, we review the reproducibility of "Exploring Neural Text Simplification Models" by Nisioi et al. (2017), evaluating it from three main aspects: data, software artifacts, and automatic evaluations. We discuss the challenges and issues we faced during this process. Furthermore, we explore the adequacy of current reproducibility standards. Our code, trained models, and a Docker container of the environment used for training and evaluation are made publicly available.

Co-authors

Mohammad Arvan (2)
Natalie Parde (2)

Venues

EMNLP (1)
INLG (1)