Abstract
In this work, we address the NER problem by splitting it into two logical sub-tasks: (1) Span Detection, which simply extracts entity mention spans irrespective of entity type; (2) Span Classification, which classifies the spans into their entity types. Further, we formulate both sub-tasks as question-answering (QA) problems and produce two leaner models which can be optimized separately for each sub-task. Experiments with four cross-domain datasets demonstrate that this two-step approach is both effective and time-efficient. Our system, SplitNER, outperforms baselines on OntoNotes5.0, WNUT17 and a cybersecurity dataset and gives on-par performance on BioNLP13CG. In all cases, it achieves a significant reduction in training time compared to its QA baseline counterpart. The effectiveness of our system stems from fine-tuning the BERT model twice, separately for span detection and classification. The source code can be found at https://github.com/c3sr/split-ner.
- Anthology ID:
- 2023.acl-short.36
- Volume:
- Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 416–426
- URL:
- https://aclanthology.org/2023.acl-short.36
- DOI:
- 10.18653/v1/2023.acl-short.36
- Cite (ACL):
- Jatin Arora and Youngja Park. 2023. Split-NER: Named Entity Recognition via Two Question-Answering-based Classifications. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 416–426, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- Split-NER: Named Entity Recognition via Two Question-Answering-based Classifications (Arora & Park, ACL 2023)
- PDF:
- https://preview.aclanthology.org/fix-volume-bibkeys/2023.acl-short.36.pdf
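The two-step decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: in the paper both stages are QA-formulated BERT models, whereas here each stage is replaced by a deliberately trivial stand-in (a capitalization rule and a toy lookup table) so that the control flow is runnable on its own. The names `detect_spans`, `classify_span`, `split_ner`, and `TOY_TYPES` are hypothetical and do not come from the paper or its repository.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # token offsets, end exclusive

# Hypothetical toy gazetteer standing in for a trained span classifier.
TOY_TYPES: Dict[str, str] = {"Toronto": "LOC", "IBM Research": "ORG"}


def detect_spans(tokens: List[str]) -> List[Span]:
    """Stage 1 stand-in: find mention spans without assigning any type.

    Assumption for illustration only: a mention is a maximal run of
    capitalized tokens. The paper uses a fine-tuned model here instead.
    """
    spans: List[Span] = []
    start = None
    for i, tok in enumerate(tokens):
        if tok[:1].isupper():
            if start is None:
                start = i
        elif start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tokens)))
    return spans


def classify_span(tokens: List[str], span: Span) -> str:
    """Stage 2 stand-in: assign an entity type to an already-detected span."""
    mention = " ".join(tokens[span[0]:span[1]])
    return TOY_TYPES.get(mention, "MISC")


def split_ner(tokens: List[str]) -> List[Tuple[str, str]]:
    """Run detection first, then classify each detected span independently."""
    return [
        (" ".join(tokens[start:end]), classify_span(tokens, (start, end)))
        for start, end in detect_spans(tokens)
    ]
```

Calling `split_ner("the meeting was held in Toronto by IBM Research".split())` runs the detector once and then the classifier once per detected span. Because the two stages only communicate through the list of spans, either stand-in could be swapped for a separately trained model without touching the other, which is the property the abstract relies on when it says the two models can be optimized separately.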