Enhancing NER by Harnessing Multiple Datasets with Conditional Variational Autoencoders

Taku Oi, Makoto Miwa


Abstract
We propose a novel method that integrates a Conditional Variational Autoencoder (CVAE) into a span-based Named Entity Recognition (NER) model. The CVAE models the information that labels share and do not share across multiple datasets, which eases training on those datasets. Experiments on multiple biomedical datasets show that the proposed method is effective and improves performance on the BioRED dataset.
Anthology ID:
2025.acl-short.87
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1107–1117
URL:
https://preview.aclanthology.org/landing_page/2025.acl-short.87/
Cite (ACL):
Taku Oi and Makoto Miwa. 2025. Enhancing NER by Harnessing Multiple Datasets with Conditional Variational Autoencoders. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 1107–1117, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Enhancing NER by Harnessing Multiple Datasets with Conditional Variational Autoencoders (Oi & Miwa, ACL 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.acl-short.87.pdf