Amir Fayazi
2022
Efficient Entity Embedding Construction from Type Knowledge for BERT
Yukun Feng | Amir Fayazi | Abhinav Rastogi | Manabu Okumura
Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022
Recent work has shown advantages of incorporating knowledge graphs (KGs) into BERT for various NLP tasks. One common way is to feed entity embeddings as an additional input during pre-training. There are two limitations to such a method. First, to train the entity embeddings to include rich information of factual knowledge, it typically requires access to the entire KG. This is challenging for KGs with daily changes (e.g., Wikidata). Second, it requires a large-scale pre-training corpus with entity annotations and incurs a high computational cost during pre-training. In this work, we efficiently construct entity embeddings from type knowledge alone, which does not require access to the entire KG. Although the entity embeddings contain only local information, they perform very well when combined with context. Furthermore, we show that our entity embeddings, constructed from BERT’s input embeddings, can be directly incorporated into the fine-tuning phase without requiring any specialized pre-training. In addition, these entity embeddings can also be constructed on the fly without requiring a large memory footprint to store them. Finally, we propose task-specific models that incorporate our entity embeddings for entity linking, entity typing, and relation classification. Experiments show that our models have comparable or superior performance to existing models while being more resource efficient.
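As a rough illustration of the idea (a minimal sketch, not the paper's exact construction), one way to build an entity embedding from type knowledge alone is to average BERT's input (wordpiece) embeddings over an entity's type labels. The labels, the pooling scheme, and the helper name `entity_embedding` below are assumptions for illustration:

```python
# Minimal sketch: an on-the-fly entity embedding from type labels using
# BERT's input embedding table -- no pre-training, no access to the full KG.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
input_embeddings = bert.get_input_embeddings()  # wordpiece embedding table

def entity_embedding(type_labels):
    """Average input embeddings of all wordpieces, per label, then across
    an entity's type labels (e.g., Wikidata 'instance of' values)."""
    vecs = []
    for label in type_labels:
        ids = tokenizer(label, add_special_tokens=False,
                        return_tensors="pt").input_ids
        vecs.append(input_embeddings(ids).mean(dim=1))  # mean over wordpieces
    return torch.cat(vecs, dim=0).mean(dim=0)           # mean over labels

emb = entity_embedding(["human", "physicist"])
print(emb.shape)  # torch.Size([768]) for bert-base
```

Because the embedding is a deterministic function of a handful of type labels, it can be computed on demand during fine-tuning rather than stored for every entity.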
2019
Robust Zero-Shot Cross-Domain Slot Filling with Example Values
Darsh Shah | Raghav Gupta | Amir Fayazi | Dilek Hakkani-Tur
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Task-oriented dialog systems increasingly rely on deep learning-based slot filling models, usually needing extensive labeled training data for target domains. Often, however, little to no target domain training data may be available, or the training and target domain schemas may be misaligned, as is common for web forms on similar websites. Prior zero-shot slot filling models use slot descriptions to learn concepts, but are not robust to misaligned schemas. We propose utilizing both the slot description and a small number of examples of slot values, which may be easily available, to learn semantic representations of slots which are transferable across domains and robust to misaligned schemas. Our approach outperforms state-of-the-art models on two multi-domain datasets, especially in the low-data setting.
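To make the approach concrete (a hedged sketch under simplifying assumptions, not the paper's learned architecture), a slot can be represented by combining an encoding of its description with encodings of a few example values, and each utterance token scored against that representation. The simple averaging and cosine scoring here stand in for the model's learned components:

```python
# Minimal sketch: score utterance tokens against a slot representation built
# from the slot's description plus a few example values (assumed combination).
import torch
import torch.nn.functional as F
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

def encode(text):
    """Use the [CLS] vector as a fixed-size representation of a text span."""
    ids = tokenizer(text, return_tensors="pt")
    return encoder(**ids).last_hidden_state[:, 0]  # (1, hidden)

def slot_representation(description, example_values):
    """Average the description vector with the mean example-value vector;
    the paper learns this combination instead of fixing it."""
    desc = encode(description)
    exs = torch.cat([encode(v) for v in example_values]).mean(dim=0, keepdim=True)
    return (desc + exs) / 2

def token_scores(utterance, slot_vec):
    """High similarity suggests a token is part of a value for the slot."""
    ids = tokenizer(utterance, return_tensors="pt")
    tokens = encoder(**ids).last_hidden_state[0]          # (T, hidden)
    return F.cosine_similarity(tokens, slot_vec, dim=-1)  # (T,)

scores = token_scores(
    "book a table at eight pm",
    slot_representation("time of the reservation", ["7 pm", "noon", "8:30"]),
)
print(scores)
```

Since the slot representation depends only on the description and example values, an unseen target-domain slot needs no labeled training utterances, which is what enables zero-shot transfer across misaligned schemas.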
Co-authors
- Darsh Shah 1
- Raghav Gupta 1
- Dilek Hakkani-Tur 1
- Yukun Feng 1
- Abhinav Rastogi 1