Is Language Modeling Enough? Evaluating Effective Embedding Combinations
Rudolf Schneider, Tom Oberhauser, Paul Grundmann, Felix Alexander Gers, Alexander Loeser, Steffen Staab
Abstract
Universal embeddings, such as BERT or ELMo, are useful for a broad set of natural language processing tasks like text classification or sentiment analysis. Moreover, specialized embeddings also exist for tasks like topic modeling or named entity disambiguation. We study whether we can complement these universal embeddings with specialized embeddings. We conduct an in-depth evaluation of nine well-known natural language understanding tasks with SentEval. We also extend SentEval with two additional tasks from the medical domain. We present PubMedSection, a novel topic classification dataset focused on the biomedical domain. Our comprehensive analysis covers 11 tasks and combinations of six embeddings. We report that combined embeddings outperform state-of-the-art universal embeddings without any embedding fine-tuning. We observe that adding topic-model-based embeddings helps for most tasks and that differing pre-training tasks encode complementary features. Moreover, we present new state-of-the-art results on the MPQA and SUBJ tasks in SentEval.
- Anthology ID:
- 2020.lrec-1.583
- Volume:
- Proceedings of the Twelfth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 4739–4748
- Language:
- English
- URL:
- https://aclanthology.org/2020.lrec-1.583
- Cite (ACL):
- Rudolf Schneider, Tom Oberhauser, Paul Grundmann, Felix Alexander Gers, Alexander Loeser, and Steffen Staab. 2020. Is Language Modeling Enough? Evaluating Effective Embedding Combinations. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 4739–4748, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Is Language Modeling Enough? Evaluating Effective Embedding Combinations (Schneider et al., LREC 2020)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/2020.lrec-1.583.pdf
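The abstract's core idea is combining a universal embedding with a specialized one before feeding the result to a downstream classifier. A minimal sketch of such a combination, assuming simple feature-wise concatenation as the combination method (the dimensions and random vectors below are illustrative placeholders, not the paper's actual models):

```python
import numpy as np

# Hypothetical stand-ins for pre-computed sentence embeddings; in the paper,
# these would come from models such as BERT, ELMo, or a topic model.
rng = np.random.default_rng(0)
universal = rng.normal(size=(4, 768))    # e.g. BERT-style sentence vectors
specialized = rng.normal(size=(4, 100))  # e.g. topic-model-based vectors

def combine(*embeddings):
    """Combine per-sentence embeddings by concatenation along the feature axis."""
    return np.concatenate(embeddings, axis=1)

combined = combine(universal, specialized)
print(combined.shape)  # (4, 868): one 868-dim vector per sentence
```

The combined vectors would then be evaluated with SentEval's standard downstream classifiers, with no fine-tuning of the underlying embedding models.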