Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian

Fajri Koto, Timothy Baldwin, Jey Han Lau


Abstract
Story comprehension that involves complex causal and temporal relations is a critical task in NLP, but previous studies have focused predominantly on English, leaving open the question of how the findings generalize to other languages, such as Indonesian. In this paper, we follow the Story Cloze Test framework of Mostafazadeh et al. (2016) in evaluating story understanding in Indonesian, by constructing a four-sentence story with one correct ending and one incorrect ending. To investigate commonsense knowledge acquisition in language models, we experiment with: (1) a classification task to predict the correct ending; and (2) a generation task to complete the story with a single sentence. We investigate these tasks in two settings: (i) monolingual training and (ii) zero-shot cross-lingual transfer between Indonesian and English.
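The classification task described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `ClozeInstance` layout, the function names, and the toy word-overlap scorer are all assumptions made for illustration, and the example story is invented rather than taken from the dataset. In practice the scorer would be a trained language model.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical scorer type: higher score means the ending fits the story better.
Scorer = Callable[[List[str], str], float]


@dataclass
class ClozeInstance:
    context: List[str]   # the four-sentence story
    endings: List[str]   # two candidate endings
    label: int           # index of the correct ending


def predict(instance: ClozeInstance, score: Scorer) -> int:
    """Return the index of the ending that the scorer ranks highest."""
    scores = [score(instance.context, ending) for ending in instance.endings]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(instances: Sequence[ClozeInstance], score: Scorer) -> float:
    """Fraction of instances for which the predicted ending is the correct one."""
    correct = sum(predict(inst, score) == inst.label for inst in instances)
    return correct / len(instances)


def overlap_score(context: List[str], ending: str) -> float:
    """Toy stand-in for a model: count words shared by the story and the ending."""
    story_words = set(re.findall(r"\w+", " ".join(context).lower()))
    ending_words = set(re.findall(r"\w+", ending.lower()))
    return float(len(story_words & ending_words))


# Invented example, for illustration only (not from the dataset).
example = ClozeInstance(
    context=[
        "Rina bought a new bicycle.",
        "She rode the bicycle to school every day.",
        "One morning the bicycle had a flat tire.",
        "Rina pushed the bicycle to a repair shop.",
    ],
    endings=[
        "The mechanic fixed the tire on the bicycle.",
        "Rina went swimming in the ocean.",
    ],
    label=0,
)

print(predict(example, overlap_score), accuracy([example], overlap_score))
```

Swapping `overlap_score` for a model-based scorer leaves the evaluation loop unchanged, which is the main reason for keeping the scorer as a separate argument.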
Anthology ID:
2022.csrr-1.2
Volume:
Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Antoine Bosselut, Xiang Li, Bill Yuchen Lin, Vered Shwartz, Bodhisattwa Prasad Majumder, Yash Kumar Lal, Rachel Rudinger, Xiang Ren, Niket Tandon, Vilém Zouhar
Venue:
CSRR
Publisher:
Association for Computational Linguistics
Pages:
8–16
URL:
https://aclanthology.org/2022.csrr-1.2
DOI:
10.18653/v1/2022.csrr-1.2
Cite (ACL):
Fajri Koto, Timothy Baldwin, and Jey Han Lau. 2022. Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian. In Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022), pages 8–16, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian (Koto et al., CSRR 2022)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2022.csrr-1.2.pdf
Video:
https://preview.aclanthology.org/emnlp-22-attachments/2022.csrr-1.2.mp4
Code:
fajri91/indocloze