@inproceedings{svoboda-sevcikova-2024-compounds,
    title = "Compounds in {U}niversal {D}ependencies: A Survey in Five {E}uropean Languages",
    author = "Svoboda, Emil  and
      {\v{S}}ev{\v{c}}{\'i}kov{\'a}, Magda",
    editor = "Hahn, Michael  and
      Sorokin, Alexey  and
      Kumar, Ritesh  and
      Shcherbakov, Andreas  and
      Otmakhova, Yulia  and
      Yang, Jinrui  and
      Serikov, Oleg  and
      Rani, Priya  and
      Ponti, Edoardo M.  and
      Murado{\u{g}}lu, Saliha  and
      Gao, Rena  and
      Cotterell, Ryan  and
      Vylomova, Ekaterina",
    booktitle = "Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP",
    month = mar,
    year = "2024",
    address = "St. Julian's, Malta",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2024.sigtyp-1.12/",
    pages = "88--99",
    abstract = "In Universal Dependencies, compounds, which we understand as words containing two or more roots, are represented according to tokenization, which reflects the orthographic conventions of the language. A closed compound (e.g. $\textit{waterfall}$) corresponds to a single word in Universal Dependencies while a hyphenated compound ($\textit{father-in-law}$) and an open compound ($\textit{apple pie}$) to multiple words. The aim of this paper is to open a discussion on how to move towards a more consistent annotation of compounds. The solution we argue for is to represent the internal structure of all compound types analogously to syntactic phrases, which would not only increase the comparability of compounding within and across languages, but also allow comparisons of compounds and syntactic phrases."
}