Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation

Masashi Yoshikawa, Hiroshi Noji, Koji Mineshima, Daisuke Bekki


Abstract
We propose a new domain adaptation method for Combinatory Categorial Grammar (CCG) parsing, based on the idea of automatically generating CCG corpora by exploiting cheaper resources of dependency trees. Our solution is conceptually simple and does not rely on a specific parser architecture, making it applicable to the current best-performing parsers. We conduct extensive parsing experiments with detailed discussion; on top of existing benchmark datasets on (1) biomedical texts and (2) question sentences, we create experimental datasets of (3) speech conversation and (4) math problems. When the proposed method is applied, an off-the-shelf CCG parser shows significant performance gains, improving from 90.7% to 96.6% on speech conversation, and from 88.5% to 96.8% on math problems.
Anthology ID:
P19-1013
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
129–139
URL:
https://aclanthology.org/P19-1013
DOI:
10.18653/v1/P19-1013
Cite (ACL):
Masashi Yoshikawa, Hiroshi Noji, Koji Mineshima, and Daisuke Bekki. 2019. Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 129–139, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation (Yoshikawa et al., ACL 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/P19-1013.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-1/P19-1013.mp4
Data:
Penn Treebank, Universal Dependencies