MaiNLP at SemEval-2024 Task 1: Analyzing Source Language Selection in Cross-Lingual Textual Relatedness

Shijia Zhou; Huangyan Shan; Barbara Plank; Robert Litschko

doi:10.18653/v1/2024.semeval-1.259

Abstract

This paper presents our system developed for SemEval-2024 Task 1: Semantic Textual Relatedness (STR), Track C: Cross-lingual. The task aims to detect the semantic relatedness of two sentences in the same language. For the cross-lingual approach, we developed a set of linguistics-inspired models trained with several task-specific strategies. We 1) utilize language vectors for the selection of donor languages; 2) investigate the multi-source approach for training; 3) use transliteration of non-Latin scripts to study the impact of the “script gap”; and 4) employ machine translation for data augmentation. We additionally compare the performance of XLM-RoBERTa and Furina with the same training strategy. Our submission achieved first place in the C8 (Kinyarwanda) test.
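
As a rough illustration of the first strategy, the sketch below ranks candidate donor (source) languages by cosine similarity between language vectors. It is not the authors' implementation: the abstract does not specify the vector source or the similarity measure, so cosine similarity, the function names `cosine_similarity` and `rank_donor_languages`, and the toy vectors `TOY_VECTORS` and `TOY_TARGET` are all assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Guard against all-zero vectors, for which the measure is undefined.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_donor_languages(target_vector, candidate_vectors, k=2):
    """Return the k candidates most similar to the target, best first.

    candidate_vectors maps a language label to its feature vector.
    The result is a list of (label, similarity) pairs.
    """
    scored = [
        (lang, cosine_similarity(target_vector, vec))
        for lang, vec in candidate_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical binary feature vectors; the labels are placeholders,
# not real languages, and the values are not real typological data.
TOY_VECTORS = {
    "lang_a": [1, 0, 1, 1, 0],
    "lang_b": [0, 1, 0, 0, 1],
    "lang_c": [1, 0, 1, 0, 0],
}
TOY_TARGET = [1, 0, 1, 1, 1]

if __name__ == "__main__":
    for lang, score in rank_donor_languages(TOY_TARGET, TOY_VECTORS, k=2):
        print(f"{lang}: {score:.3f}")
```

With these toy vectors, `lang_a` ranks first and `lang_c` second. In a multi-source setting, the top-k languages returned here would be the ones whose training data is pooled, again under the assumptions stated above.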

Anthology ID: 2024.semeval-1.259
Volume: Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 1842–1853
URL: https://aclanthology.org/2024.semeval-1.259
DOI: 10.18653/v1/2024.semeval-1.259
Cite (ACL): Shijia Zhou, Huangyan Shan, Barbara Plank, and Robert Litschko. 2024. MaiNLP at SemEval-2024 Task 1: Analyzing Source Language Selection in Cross-Lingual Textual Relatedness. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1842–1853, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): MaiNLP at SemEval-2024 Task 1: Analyzing Source Language Selection in Cross-Lingual Textual Relatedness (Zhou et al., SemEval 2024)
PDF: https://preview.aclanthology.org/nschneid-patch-4/2024.semeval-1.259.pdf
Supplementary material: 2024.semeval-1.259.SupplementaryMaterial.txt
