Abstract
This paper focuses on a traditional relation extraction task in the context of limited annotated data and a narrow knowledge domain. We explore this task with a clinical corpus consisting of 200 breast cancer follow-up treatment letters in which 16 distinct types of relations are annotated. We experiment with an approach to extracting typed relations called window-bounded co-occurrence (WBC), which uses an adjustable context window around entity mentions of a relevant type, and compare its performance with a more typical intra-sentential co-occurrence baseline. We further introduce a new bag-of-concepts (BoC) approach to feature engineering based on state-of-the-art word embeddings and word synonyms. We demonstrate the competitiveness of BoC by comparing it with methods of higher complexity, and explore its effectiveness on this small dataset.

- Anthology ID: N19-3007
- Volume: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
- Month: June
- Year: 2019
- Address: Minneapolis, Minnesota
- Editors: Sudipta Kar, Farah Nadeem, Laura Burdick, Greg Durrett, Na-Rae Han
- Venue: NAACL
- Publisher: Association for Computational Linguistics
- Pages: 43–52
- URL: https://preview.aclanthology.org/add_missing_videos/N19-3007/
- DOI: 10.18653/v1/N19-3007
- Cite (ACL): Jiyu Chen, Karin Verspoor, and Zenan Zhai. 2019. A Bag-of-concepts Model Improves Relation Extraction in a Narrow Knowledge Domain with Limited Data. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 43–52, Minneapolis, Minnesota. Association for Computational Linguistics.
- Cite (Informal): A Bag-of-concepts Model Improves Relation Extraction in a Narrow Knowledge Domain with Limited Data (Chen et al., NAACL 2019)
- PDF: https://preview.aclanthology.org/add_missing_videos/N19-3007.pdf
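
The abstract describes window-bounded co-occurrence (WBC) only at a high level, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that the window is measured as the number of tokens between two entity mentions and that a pair becomes a relation candidate when its entity types match an allowed type pair. The names `Mention`, `token_gap`, and `wbc_candidates`, as well as the entity types `Drug`, `Dose`, and `Frequency`, are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Mention:
    """An entity mention with token offsets (hypothetical structure)."""
    text: str
    etype: str
    start: int  # index of the first token
    end: int    # index one past the last token


def token_gap(a, b):
    """Number of tokens strictly between two mentions (0 if adjacent or overlapping)."""
    first, second = (a, b) if a.start <= b.start else (b, a)
    return max(0, second.start - first.end)


def wbc_candidates(mentions, type_pairs, window):
    """Return (head, tail) pairs with an allowed type pair and a gap of at most `window` tokens.

    Assumption: sentence boundaries are ignored, so only the window limits pairing.
    """
    out = []
    for a, b in combinations(mentions, 2):
        for head, tail in ((a, b), (b, a)):
            if (head.etype, tail.etype) in type_pairs and token_gap(head, tail) <= window:
                out.append((head, tail))
    return out


# Toy example over the tokens: "Tamoxifen 20 mg daily"
mentions = [
    Mention("Tamoxifen", "Drug", 0, 1),
    Mention("20 mg", "Dose", 1, 3),
    Mention("daily", "Frequency", 3, 4),
]
type_pairs = {("Drug", "Dose"), ("Drug", "Frequency")}

narrow = wbc_candidates(mentions, type_pairs, window=1)
wide = wbc_candidates(mentions, type_pairs, window=2)
```

In this toy example, the narrow window keeps only the Drug and Dose pair, while widening the window by one token also admits the Drug and Frequency pair. An intra-sentential baseline would instead pair mentions whenever they share a sentence, regardless of distance.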