DELTA: A Toolkit for Measuring Linguistic Diversity in Dependency-Parsed Corpora

Louis Estève, Kaja Dobrovoljc


Abstract
Despite growing interest in measuring linguistic diversity on the one hand and the increasing availability of cross-linguistically comparable parsed corpora on the other, tools for systematically measuring the diversity of specific linguistic phenomena on such data remain limited. To address this gap, we present DELTA, an open-source framework that integrates dependency tree querying with diversity computation, enabling systematic measurement across multiple linguistic levels (e.g., lexis, morphology, syntax) and multiple diversity dimensions (variety, balance, disparity). The pipeline processes CoNLL-U formatted corpora through configurable workflows, treating the format as a general-purpose tabular structure independent of specific annotation conventions. We validate DELTA on the Parallel Universal Dependencies multilingual dataset, demonstrating its capacity for corpus profiling and cross-corpus diversity comparison.
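To make two of the diversity dimensions named in the abstract concrete, the following is a minimal sketch and not DELTA's own code or API. It assumes variety is the number of distinct types and balance is Shannon evenness (entropy divided by its maximum), which are common choices but may differ from the measures DELTA implements. The function names (`column_counts`, `variety`, `balance`) and the sample sentences are hypothetical. Disparity is omitted because it requires a distance between types, which the abstract does not specify.

```python
import math
from collections import Counter

# Hypothetical two-sentence sample in CoNLL-U (10 tab-separated columns).
SAMPLE = (
    "# text = The dog barks .\n"
    "1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
    "\n"
    "# text = Cats sleep .\n"
    "1\tCats\tcat\tNOUN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tsleep\tsleep\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
)

UPOS, DEPREL = 3, 7  # zero-based column indices in CoNLL-U


def column_counts(conllu_text: str, column: int) -> Counter:
    """Count the values of one column, treating CoNLL-U as a plain table."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        fields = line.split("\t")
        if "-" in fields[0] or "." in fields[0]:
            continue  # skip multiword-token ranges and empty nodes
        counts[fields[column]] += 1
    return counts


def variety(counts: Counter) -> int:
    """Variety (assumed here): number of distinct types."""
    return len(counts)


def balance(counts: Counter) -> float:
    """Balance (assumed here): Shannon entropy divided by log(variety)."""
    n_types = len(counts)
    if n_types <= 1:
        return 0.0  # evenness is undefined for fewer than two types
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(n_types)


upos = column_counts(SAMPLE, UPOS)
print(variety(upos), round(balance(upos), 3))
```

Because the counting function only takes a column index, the same sketch applies to any CoNLL-U column (for example `DEPREL` instead of `UPOS`), which mirrors the abstract's point about treating the format as a general-purpose table.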
Anthology ID:
2026.eacl-demo.6
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Danilo Croce, Jochen Leidner, Nafise Sadat Moosavi
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
75–85
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-demo.6/
Cite (ACL):
Louis Estève and Kaja Dobrovoljc. 2026. DELTA: A Toolkit for Measuring Linguistic Diversity in Dependency-Parsed Corpora. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 75–85, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
DELTA: A Toolkit for Measuring Linguistic Diversity in Dependency-Parsed Corpora (Estève & Dobrovoljc, EACL 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-demo.6.pdf