Detecting Diachronic Syntactic Developments in Presence of Bias Terms

Oliver Hellwig, Sven Sellmer


Abstract
Corpus-based studies of diachronic syntactic changes are typically guided by the results of previous qualitative research. When such results are missing or, as is the case for Vedic Sanskrit, are restricted to small parts of a transmitted corpus, an exploratory framework that detects such changes in a data-driven fashion can substantially support the research process. In this paper, we introduce a customized version of the infinite relational model that groups syntactic constituents based on their structural similarities and their diachronic distributions. We propose a simple way to control for register and intellectual affiliation, and discuss our findings for four syntactic structures in Vedic texts.
Anthology ID:
2022.lt4hala-1.2
Volume:
Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Rachele Sprugnoli, Marco Passarotti
Venue:
LT4HALA
Publisher:
European Language Resources Association
Pages:
10–19
URL:
https://aclanthology.org/2022.lt4hala-1.2
Cite (ACL):
Oliver Hellwig and Sven Sellmer. 2022. Detecting Diachronic Syntactic Developments in Presence of Bias Terms. In Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages, pages 10–19, Marseille, France. European Language Resources Association.
Cite (Informal):
Detecting Diachronic Syntactic Developments in Presence of Bias Terms (Hellwig & Sellmer, LT4HALA 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2022.lt4hala-1.2.pdf
Data
Universal Dependencies