Variational Sequential Labelers for Semi-Supervised Learning

Mingda Chen, Qingming Tang, Karen Livescu, Kevin Gimpel


Abstract
We introduce a family of multitask variational methods for semi-supervised sequence labeling. Each model in the family consists of a latent-variable generative model and a discriminative labeler. The generative model uses latent variables to define the conditional probability of a word given its context, drawing inspiration from the word prediction objectives commonly used to learn word embeddings. The labeler helps inject discriminative information into the latent space. We explore several latent variable configurations, including ones with hierarchical structure, which enables the model to account for both label-specific and word-specific information. Our models consistently outperform standard sequential baselines on 8 sequence labeling datasets, and improve further with unlabeled data.
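The multitask setup described in the abstract can be illustrated with a minimal sketch of a per-token loss that combines a word-prediction term, a KL regularizer on the latent variable, and an optional labeling term. This is not the authors' implementation: the diagonal Gaussian form of the posterior and prior, the function names `gaussian_kl` and `multitask_loss`, and the weights `kl_weight` and `label_weight` are all assumptions made for illustration.

```python
import math


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians.

    Each argument is a list of floats of the same length: the means and
    log-variances of q (assumed posterior) and p (assumed prior).
    """
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (
            lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0
        )
    return total


def multitask_loss(recon_nll, kl, label_nll=None, kl_weight=1.0, label_weight=1.0):
    """Combine the generative and discriminative terms for one token.

    recon_nll: negative log-likelihood of the word given the latent variable.
    kl:        KL divergence between posterior and prior over the latent variable.
    label_nll: negative log-likelihood of the gold label, or None when the
               token is unlabeled (hypothetical convention for this sketch).
    """
    loss = recon_nll + kl_weight * kl
    if label_nll is not None:
        loss += label_weight * label_nll
    return loss


# Labeled token: all three terms contribute.
kl = gaussian_kl([1.0], [0.0], [0.0], [0.0])
labeled = multitask_loss(recon_nll=2.0, kl=kl, label_nll=1.0)

# Unlabeled token: only the generative terms contribute.
unlabeled = multitask_loss(recon_nll=2.0, kl=kl)
```

Under these assumptions, unlabeled text still provides a training signal through the generative terms, which is one way to read the abstract's claim that the models improve further with unlabeled data.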
Anthology ID:
D18-1020
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
215–226
URL:
https://aclanthology.org/D18-1020
DOI:
10.18653/v1/D18-1020
Cite (ACL):
Mingda Chen, Qingming Tang, Karen Livescu, and Kevin Gimpel. 2018. Variational Sequential Labelers for Semi-Supervised Learning. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 215–226, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Variational Sequential Labelers for Semi-Supervised Learning (Chen et al., EMNLP 2018)
PDF:
https://preview.aclanthology.org/improve-issue-templates/D18-1020.pdf
Attachment:
 D18-1020.Attachment.pdf
Video:
 https://preview.aclanthology.org/improve-issue-templates/D18-1020.mp4
Code
 mingdachen/vsl
Data
 CoNLL, CoNLL 2003