Evaluation of Question Answer Generation for Portuguese: Insights and Datasets
Felipe Paula, Cassiana Roberta Lizzoni Michelin, Viviane Moreira
Abstract
Automatic question generation is an increasingly important task that can be applied in different settings, including educational purposes, data augmentation for question-answering (QA), and conversational systems. More specifically, we focus on question answer generation (QAG), which produces question-answer pairs given an input context. We adapt and apply QAG approaches to generate question-answer pairs for different domains and assess their capacity to generate accurate, diverse, and abundant question-answer pairs. Our analyses combine both qualitative and quantitative evaluations that allow insights into the quality and types of errors made by QAG methods. We also look into strategies for error filtering and their effects. Our work concentrates on Portuguese, a widely spoken language that is underrepresented in natural language processing research. To address the pressing need for resources, we generate and make available human-curated extractive QA datasets in three diverse domains.
- Anthology ID:
- 2024.findings-emnlp.306
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2024
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5315–5327
- URL:
- https://preview.aclanthology.org/fix-sig-urls/2024.findings-emnlp.306/
- DOI:
- 10.18653/v1/2024.findings-emnlp.306
- Cite (ACL):
- Felipe Paula, Cassiana Roberta Lizzoni Michelin, and Viviane Moreira. 2024. Evaluation of Question Answer Generation for Portuguese: Insights and Datasets. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 5315–5327, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- Evaluation of Question Answer Generation for Portuguese: Insights and Datasets (Paula et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/fix-sig-urls/2024.findings-emnlp.306.pdf