TorontoCL at CMCL 2021 Shared Task: RoBERTa with Multi-Stage Fine-Tuning for Eye-Tracking Prediction

Bai Li, Frank Rudzicz


Abstract
Eye movement data during reading is a useful source of information for understanding language comprehension processes. In this paper, we describe our submission to the CMCL 2021 shared task on predicting human reading patterns. Our model uses RoBERTa with a regression layer to predict 5 eye-tracking features. We train the model in two stages: we first fine-tune on the Provo corpus (another eye-tracking dataset), then fine-tune on the task data. We compare different Transformer models and apply ensembling methods to improve performance. Our final submission achieves an MAE score of 3.929, ranking 3rd out of the 13 teams that participated in this shared task.
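The evaluation metric and the ensembling step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which ensembling method was used, so the unweighted mean below is an assumption, and the function names `ensemble_mean` and `mean_absolute_error` and all numeric values are hypothetical. Each model is assumed to output one row per token with 5 predicted eye-tracking features.

```python
# Illustrative sketch only: assumes the ensemble is an unweighted mean of
# per-token predictions, which the abstract does not confirm.

NUM_FEATURES = 5  # the abstract states that 5 eye-tracking features are predicted


def ensemble_mean(model_predictions):
    """Average predictions across models, per token and per feature.

    model_predictions: list of models, each a list of token rows,
    each row a list of NUM_FEATURES floats.
    """
    n_models = len(model_predictions)
    n_tokens = len(model_predictions[0])
    return [
        [
            sum(model[t][f] for model in model_predictions) / n_models
            for f in range(NUM_FEATURES)
        ]
        for t in range(n_tokens)
    ]


def mean_absolute_error(predicted, target):
    """Mean absolute error over all tokens and all features."""
    errors = [
        abs(p - y)
        for pred_row, target_row in zip(predicted, target)
        for p, y in zip(pred_row, target_row)
    ]
    return sum(errors) / len(errors)


# Made-up predictions from two hypothetical models for two tokens.
model_a = [[1.0] * NUM_FEATURES, [2.0] * NUM_FEATURES]
model_b = [[3.0] * NUM_FEATURES, [4.0] * NUM_FEATURES]
gold = [[2.0] * NUM_FEATURES, [4.0] * NUM_FEATURES]

combined = ensemble_mean([model_a, model_b])
print(mean_absolute_error(combined, gold))  # prints 0.5
```

Averaging is only one possible choice; a weighted mean or a median over models would fit the same interface.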
Anthology ID:
2021.cmcl-1.9
Volume:
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Month:
June
Year:
2021
Address:
Online
Venue:
CMCL
Publisher:
Association for Computational Linguistics
Pages:
85–89
URL:
https://aclanthology.org/2021.cmcl-1.9
DOI:
10.18653/v1/2021.cmcl-1.9
Cite (ACL):
Bai Li and Frank Rudzicz. 2021. TorontoCL at CMCL 2021 Shared Task: RoBERTa with Multi-Stage Fine-Tuning for Eye-Tracking Prediction. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, pages 85–89, Online. Association for Computational Linguistics.
Cite (Informal):
TorontoCL at CMCL 2021 Shared Task: RoBERTa with Multi-Stage Fine-Tuning for Eye-Tracking Prediction (Li & Rudzicz, CMCL 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.cmcl-1.9.pdf
Code
 SPOClab-ca/cmcl-shared-task