@inproceedings{striebel-etal-2024-domain,
  title     = {Domain-Weighted Batch Sampling for Neural Dependency Parsing},
  author    = {Striebel, Jacob and
               Dakota, Daniel and
               K{\"u}bler, Sandra},
  editor    = {Bhatia, Archna and
               Bouma, Gosse and
               Do{\u{g}}ru{\"o}z, A. Seza and
               Evang, Kilian and
               Garcia, Marcos and
               Giouli, Voula and
               Han, Lifeng and
               Nivre, Joakim and
               Rademaker, Alexandre},
  booktitle = {Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies ({MWE-UD}) @ {LREC-COLING} 2024},
  month     = may,
  year      = {2024},
  address   = {Torino, Italia},
  publisher = {ELRA and ICCL},
  url       = {https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.mwe-1.24/},
  pages     = {198--206},
  abstract  = {In neural dependency parsing, as well as in the broader field of NLP, domain adaptation remains a challenging problem. When adapting a parser to a target domain, there is a fundamental tension between the need to make use of out-of-domain data and the need to ensure that syntactic characteristic of the target domain are learned. In this work we explore a way to balance these two competing concerns, namely using domain-weighted batch sampling, which allows us to use all available training data, while controlling the probability of sampling in- and out-of-domain data when constructing training batches. We conduct experiments using ten natural language domains and find that domain-weighted batch sampling yields substantial performance improvements in all ten domains compared to a baseline of conventional randomized batch sampling.}
}
Markdown (Informal)
[Domain-Weighted Batch Sampling for Neural Dependency Parsing](https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.mwe-1.24/) (Striebel et al., MWE-UD 2024)
ACL