Team Ohio State at CMCL 2021 Shared Task: Fine-Tuned RoBERTa for Eye-Tracking Data Prediction

Byung-Doh Oh


Abstract
This paper describes Team Ohio State’s approach to the CMCL 2021 Shared Task, the goal of which is to predict five eye-tracking features from naturalistic self-paced reading corpora. For this task, we fine-tune a pre-trained neural language model (RoBERTa; Liu et al., 2019) to predict each feature based on the contextualized representations. Moreover, motivated by previous eye-tracking studies, we include word length in characters and proportion of sentence processed as two additional input features. Our best model strongly outperforms the baseline and is also competitive with other systems submitted to the shared task. An ablation study shows that the word length feature contributes to making more accurate predictions, indicating the usefulness of features that are specific to the eye-tracking paradigm.
Anthology ID:
2021.cmcl-1.11
Volume:
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Month:
June
Year:
2021
Address:
Online
Editors:
Emmanuele Chersoni, Nora Hollenstein, Cassandra Jacobs, Yohei Oseki, Laurent Prévot, Enrico Santus
Venue:
CMCL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
97–101
Language:
URL:
https://aclanthology.org/2021.cmcl-1.11
DOI:
10.18653/v1/2021.cmcl-1.11
Bibkey:
Cite (ACL):
Byung-Doh Oh. 2021. Team Ohio State at CMCL 2021 Shared Task: Fine-Tuned RoBERTa for Eye-Tracking Data Prediction. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, pages 97–101, Online. Association for Computational Linguistics.
Cite (Informal):
Team Ohio State at CMCL 2021 Shared Task: Fine-Tuned RoBERTa for Eye-Tracking Data Prediction (Oh, CMCL 2021)
Copy Citation:
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2021.cmcl-1.11.pdf
Code
 byungdoh/cmcl21_st