Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation
Masashi Yoshikawa, Hiroshi Noji, Koji Mineshima, Daisuke Bekki
Abstract
We propose a new domain adaptation method for Combinatory Categorial Grammar (CCG) parsing, based on the idea of automatically generating CCG corpora by exploiting cheaper resources of dependency trees. Our solution is conceptually simple and does not rely on a specific parser architecture, making it applicable to the current best-performing parsers. We conduct extensive parsing experiments with detailed discussion; on top of existing benchmark datasets on (1) biomedical texts and (2) question sentences, we create experimental datasets of (3) speech conversation and (4) math problems. When the proposed method is applied, an off-the-shelf CCG parser shows significant performance gains, improving from 90.7% to 96.6% on speech conversation, and from 88.5% to 96.8% on math problems.
- Anthology ID:
- P19-1013
- Volume:
- Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Anna Korhonen, David Traum, Lluís Màrquez
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 129–139
- URL:
- https://aclanthology.org/P19-1013
- DOI:
- 10.18653/v1/P19-1013
- Cite (ACL):
- Masashi Yoshikawa, Hiroshi Noji, Koji Mineshima, and Daisuke Bekki. 2019. Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 129–139, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation (Yoshikawa et al., ACL 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/P19-1013.pdf
- Data
- Penn Treebank, Universal Dependencies