Type and Complexity Signals in Multilingual Question Representations

Robin Kokot, Wessel Poelman


Abstract
This work investigates how a multilingual transformer model represents morphosyntactic properties of questions. We introduce the Question Type and Complexity (QTC) dataset with sentences across seven languages, annotated with type information and complexity metrics including dependency length, tree depth, and lexical density. Our evaluation extends probing methods to regression labels with selectivity controls to quantify gains in generalizability. We compare layer-wise probes on frozen Glot500-m (Imani et al., 2023) representations against subword TF-IDF baselines and a fine-tuned model. Results show that statistical features perform well at question classification in languages that mark questions explicitly and at structural complexity prediction, while neural probes lead on individual metrics. We use these results to evaluate when contextual representations outperform statistical baselines and whether parameter updates reduce the availability of pre-trained linguistic information.
Anthology ID:
2025.mrl-main.28
Volume:
Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025)
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
David Ifeoluwa Adelani, Catherine Arnett, Duygu Ataman, Tyler A. Chang, Hila Gonen, Rahul Raja, Fabian Schmidt, David Stap, Jiayi Wang
Venues:
MRL | WS
Publisher:
Association for Computational Linguistics
Pages:
411–425
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.mrl-main.28/
Cite (ACL):
Robin Kokot and Wessel Poelman. 2025. Type and Complexity Signals in Multilingual Question Representations. In Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025), pages 411–425, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Type and Complexity Signals in Multilingual Question Representations (Kokot & Poelman, MRL 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.mrl-main.28.pdf