Abstract
We introduce a novel tagging scheme for discontinuous named entity recognition based on an explicit description of the inner structure of discontinuous mentions. We rely on a weighted finite state automaton for both marginal and maximum a posteriori inference. As such, our method is sound in the sense that (1) well-formedness of predicted tag sequences is ensured via the automaton structure and (2) there is an unambiguous mapping between well-formed sequences of tags and (discontinuous) mentions. We evaluate our approach on three English datasets in the biomedical domain and report results comparable to the state of the art with a much simpler and faster model.
- Anthology ID:
- 2024.emnlp-main.1087
- Volume:
- Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 19506–19518
- URL:
- https://aclanthology.org/2024.emnlp-main.1087
- DOI:
- 10.18653/v1/2024.emnlp-main.1087
- Cite (ACL):
- Caio Filippo Corro. 2024. A Fast and Sound Tagging Method for Discontinuous Named-Entity Recognition. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 19506–19518, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- A Fast and Sound Tagging Method for Discontinuous Named-Entity Recognition (Corro, EMNLP 2024)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2024.emnlp-main.1087.pdf