HkAmsters at CMCL 2022 Shared Task: Predicting Eye-Tracking Data from a Gradient Boosting Framework with Linguistic Features

Lavinia Salicchi, Rong Xiang, Yu-Yin Hsu


Abstract
Eye movement data are used in psycholinguistic studies to infer information about cognitive processes during reading. In this paper, we describe our method for Subtask 1 of the Cognitive Modeling and Computational Linguistics (CMCL) 2022 Shared Task, which involves data from multiple datasets in six languages. We compared different regression models, using as regression features the properties of the target word and of the word preceding it, together with the surprisal of the target word. Our final system, based on a gradient boosting regressor, achieved the lowest mean absolute error (MAE), making it the best-performing system in the competition.
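To make the general technique concrete, the following is a minimal sketch of gradient boosting for regression, scored with MAE. It is not the authors' system: the three features (target word length, previous word length, target word surprisal), the synthetic "reading time" values, the decision-stump base learner, and all function names are assumptions chosen for illustration. A real system would more likely rely on an established library implementation.

```python
import random


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_stump(X, residuals):
    """Return the one-split stump (feature, threshold, left mean, right mean)
    that minimizes squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    return best[1:]


def fit_gradient_boosting(X, y, n_rounds=40, learning_rate=0.3):
    """Squared-loss gradient boosting: each stump is fit to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [t - p for t, p in zip(y, preds)]
        j, thr, lmean, rmean = fit_stump(X, residuals)
        stumps.append((j, thr, lmean, rmean))
        preds = [p + learning_rate * (lmean if row[j] <= thr else rmean)
                 for row, p in zip(X, preds)]
    return base, stumps, learning_rate


def predict(model, X):
    """Sum the base prediction and the shrunken contribution of every stump."""
    base, stumps, learning_rate = model
    out = []
    for row in X:
        p = base
        for j, thr, lmean, rmean in stumps:
            p += learning_rate * (lmean if row[j] <= thr else rmean)
        out.append(p)
    return out


# Synthetic data (an assumption, not from the shared task):
# features are [target word length, previous word length, target surprisal].
random.seed(0)
X, y = [], []
for _ in range(100):
    target_len = random.randint(1, 12)
    prev_len = random.randint(1, 12)
    surprisal = random.uniform(0.0, 10.0)
    X.append([target_len, prev_len, surprisal])
    y.append(20.0 * target_len + 5.0 * prev_len + 10.0 * surprisal
             + random.gauss(0.0, 10.0))

X_train, y_train, X_test, y_test = X[:80], y[:80], X[80:], y[80:]
model = fit_gradient_boosting(X_train, y_train)

# Compare against a constant predictor (the training mean) on held-out rows.
train_mean = sum(y_train) / len(y_train)
baseline_mae = mae(y_test, [train_mean] * len(y_test))
boosted_mae = mae(y_test, predict(model, X_test))
print(f"baseline MAE: {baseline_mae:.2f}, boosted MAE: {boosted_mae:.2f}")
```

The comparison against a constant predictor only shows that the boosted model learns something from the features; it says nothing about the results reported in the paper.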
Anthology ID:
2022.cmcl-1.13
Volume:
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
CMCL
Publisher:
Association for Computational Linguistics
Pages:
114–120
URL:
https://aclanthology.org/2022.cmcl-1.13
DOI:
10.18653/v1/2022.cmcl-1.13
Cite (ACL):
Lavinia Salicchi, Rong Xiang, and Yu-Yin Hsu. 2022. HkAmsters at CMCL 2022 Shared Task: Predicting Eye-Tracking Data from a Gradient Boosting Framework with Linguistic Features. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, pages 114–120, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
HkAmsters at CMCL 2022 Shared Task: Predicting Eye-Tracking Data from a Gradient Boosting Framework with Linguistic Features (Salicchi et al., CMCL 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.cmcl-1.13.pdf
Video:
https://preview.aclanthology.org/ingestion-script-update/2022.cmcl-1.13.mp4