Abstract
We present SlotGAN, a framework for training a mention detection model that only requires unlabeled text and a gazetteer. It consists of a generator trained to extract spans from an input sentence, and a discriminator trained to determine whether a span comes from the generator, or from the gazetteer. We evaluate the method on English newswire data and compare it against supervised, weakly-supervised, and unsupervised methods. We find that the performance of the method is lower than these baselines, because it tends to generate more and longer spans, and in some cases it relies only on capitalization. In other cases, it generates spans that are valid but differ from the benchmark. When evaluated with metrics based on overlap, we find that SlotGAN performs within 95% of the precision of a supervised method, and 84% of its recall. Our results suggest that the model can generate spans that overlap well, but an additional filtering mechanism is required.

- Anthology ID:
- 2022.spnlp-1.4
- Volume:
- Proceedings of the Sixth Workshop on Structured Prediction for NLP
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Andreas Vlachos, Priyanka Agrawal, André Martins, Gerasimos Lampouras, Chunchuan Lyu
- Venue:
- spnlp
- Publisher:
- Association for Computational Linguistics
- Pages:
- 32–39
- URL:
- https://aclanthology.org/2022.spnlp-1.4
- DOI:
- 10.18653/v1/2022.spnlp-1.4
- Cite (ACL):
- Daniel Daza, Michael Cochez, and Paul Groth. 2022. SlotGAN: Detecting Mentions in Text via Adversarial Distant Learning. In Proceedings of the Sixth Workshop on Structured Prediction for NLP, pages 32–39, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- SlotGAN: Detecting Mentions in Text via Adversarial Distant Learning (Daza et al., spnlp 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/2022.spnlp-1.4.pdf
- Data
- CoNLL 2003
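
The abstract contrasts standard evaluation with metrics based on overlap. The sketch below illustrates the difference under one common definition, which is an assumption here because the abstract does not state the exact criterion: a predicted span counts toward precision if it overlaps any gold span, and a gold span counts toward recall if any predicted span overlaps it. The function names and the `(start, end)` token-offset representation are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

# A span is a pair of token offsets (start, end), with end exclusive.
# This representation is an assumption made for illustration.
Span = Tuple[int, int]


def overlaps(a: Span, b: Span) -> bool:
    """Return True if the two half-open intervals share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def overlap_precision_recall(predicted: List[Span], gold: List[Span]) -> Tuple[float, float]:
    """Lenient scores: a span is matched if it overlaps any span on the other side."""
    matched_pred = sum(any(overlaps(p, g) for g in gold) for p in predicted)
    matched_gold = sum(any(overlaps(g, p) for p in predicted) for g in gold)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    return precision, recall


def exact_precision_recall(predicted: List[Span], gold: List[Span]) -> Tuple[float, float]:
    """Strict scores: a span is matched only if its boundaries are identical."""
    hits = len(set(predicted) & set(gold))
    precision = hits / len(set(predicted)) if predicted else 0.0
    recall = hits / len(set(gold)) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    gold = [(0, 2), (5, 6)]
    predicted = [(0, 3), (7, 8)]  # first span is one token too long
    print("exact:  ", exact_precision_recall(predicted, gold))
    print("overlap:", overlap_precision_recall(predicted, gold))
```

In this toy case the over-long prediction `(0, 3)` earns no credit under exact matching but is counted under the overlap criterion, which is the kind of gap the abstract describes for a model that tends to generate longer spans.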