Abstract
Sentence Ordering (SO) is a linguistic task which requires re-ordering of shuffled sentences into a coherent paragraph. SO has downstream applications, but also serves as a semantic probe for computational models as this capability is essential for understanding narrative structures, causal and temporal relations within texts. Despite its importance, prior research has been limited to predictable English language structures and has not thoroughly addressed the complexities of multilingual and varied narrative contexts. To fill this gap, we introduce a novel and comprehensive Multilingual Sentence Ordering task that extends SO to diverse narratives across 12 languages, including challenging code-switched texts. We have developed MultiSO, a new benchmark dataset that represents these challenges. Our findings reveal that both specialized sentence ordering models and advanced Large Language Models like GPT-4 face significant challenges with this task.
- Anthology ID:
- 2024.starsem-1.24
- Volume:
- Proceedings of the 13th Joint Conference on Lexical and Computational Semantics (*SEM 2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Danushka Bollegala, Vered Shwartz
- Venue:
- *SEM
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 308–313
- URL:
- https://aclanthology.org/2024.starsem-1.24
- Cite (ACL):
- Alexandre Salle and Shervin Malmasi. 2024. Multilingual and Code-Switched Sentence Ordering. In Proceedings of the 13th Joint Conference on Lexical and Computational Semantics (*SEM 2024), pages 308–313, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Multilingual and Code-Switched Sentence Ordering (Salle & Malmasi, *SEM 2024)
- PDF:
- https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.starsem-1.24.pdf