Extrinsic evaluation of question generation methods with user journey logs
Elie Antoine, Eléonore Besnehard, Frederic Bechet, Geraldine Damnati, Eric Kergosien, Arnaud Laborderie
Abstract
There is often a significant disparity between the performance of Natural Language Processing (NLP) tools as evaluated on benchmark datasets with metrics such as ROUGE or BLEU, and the actual user experience of these tools in real-world scenarios. This highlights the critical need for user-oriented studies that evaluate user experience with respect to the effectiveness of the developed methodologies. A primary challenge of such “ecological” user studies is that they assess a specific configuration of an NLP tool, making replication under identical conditions impractical. Consequently, their utility for the automated evaluation and comparison of different configurations of the same tool is limited. The objective of this study is to conduct an “ecological” evaluation of question generation in the context of an external task involving document linking. To this end, we conducted an “ecological” evaluation of a document linking tool in the context of the exploration of a Social Science archive, and from this evaluation we derive a “reference corpus” that can be used offline for the automated comparison of models and for quantitative tool assessment. This corpus is available at the following link: https://gitlab.lis-lab.fr/archival-public/autogestion-qa-linking-
- Anthology ID:
- 2024.humeval-1.6
- Volume:
- Proceedings of the Fourth Workshop on Human Evaluation of NLP Systems (HumEval) @ LREC-COLING 2024
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Simone Balloccu, Anya Belz, Rudali Huidrom, Ehud Reiter, Joao Sedoc, Craig Thomson
- Venues:
- HumEval | WS
- Publisher:
- ELRA and ICCL
- Pages:
- 63–70
- URL:
- https://aclanthology.org/2024.humeval-1.6
- Cite (ACL):
- Elie Antoine, Eléonore Besnehard, Frederic Bechet, Geraldine Damnati, Eric Kergosien, and Arnaud Laborderie. 2024. Extrinsic evaluation of question generation methods with user journey logs. In Proceedings of the Fourth Workshop on Human Evaluation of NLP Systems (HumEval) @ LREC-COLING 2024, pages 63–70, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- Extrinsic evaluation of question generation methods with user journey logs (Antoine et al., HumEval-WS 2024)
- PDF:
- https://aclanthology.org/2024.humeval-1.6.pdf