Enriching Word Embeddings with Domain Knowledge for Readability Assessment

Zhiwei Jiang, Qing Gu, Yafeng Yin, Daoxu Chen


Abstract
In this paper, we present a method for learning word embeddings for readability assessment. Existing word embedding models typically focus on the syntactic or semantic relations between words and ignore reading difficulty, so they may not be suitable for readability assessment. Hence, we propose the knowledge-enriched word embedding (KEWE), which encodes knowledge of reading difficulty into the representation of words. Specifically, we extract knowledge of word-level difficulty from three perspectives to construct a knowledge graph, and develop two word embedding models that use the difficulty context derived from the knowledge graph to define their loss functions. We apply KEWE to readability assessment on both English and Chinese datasets, and the results demonstrate both the effectiveness and the potential of KEWE.
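The abstract does not specify the three difficulty perspectives or the exact loss functions, so the following is only a minimal sketch of the general idea, not the KEWE models themselves. It assumes a single hypothetical difficulty score per word, links words with similar scores into a small graph, treats graph neighbours as each word's "difficulty context", and uses a skip-gram-style logistic loss as a stand-in objective. The scores, the `threshold` parameter, and the function names are all illustrative assumptions.

```python
import math
from collections import defaultdict

# Hypothetical word-level difficulty scores (illustrative values, not from the paper).
difficulty = {"cat": 1.0, "dog": 1.2, "run": 1.5, "entropy": 4.6, "ontology": 4.8}


def build_difficulty_graph(scores, threshold=0.6):
    """Link two words when their difficulty scores differ by at most `threshold`."""
    graph = defaultdict(set)
    words = sorted(scores)
    for i, u in enumerate(words):
        for v in words[i + 1:]:
            if abs(scores[u] - scores[v]) <= threshold:
                graph[u].add(v)
                graph[v].add(u)
    return graph


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pair_loss(vec_w, vec_c, label):
    """Logistic loss for one (word, context) pair.

    label 1 means the context word is a graph neighbour of the word;
    label 0 means it is a sampled non-neighbour (negative example).
    """
    p = sigmoid(sum(a * b for a, b in zip(vec_w, vec_c)))
    return -math.log(p) if label == 1 else -math.log(1.0 - p)


graph = build_difficulty_graph(difficulty)

# Assumed definition: the difficulty context of a word is its set of graph neighbours.
difficulty_context = {w: sorted(graph[w]) for w in difficulty}
```

Under these assumptions, minimising `pair_loss` over neighbour pairs (label 1) and sampled non-neighbour pairs (label 0) would pull the vectors of similarly difficult words together, which is the kind of signal that purely syntactic or semantic context does not provide.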
Anthology ID:
C18-1031
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Emily M. Bender, Leon Derczynski, Pierre Isabelle
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
366–378
URL:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/C18-1031/
Cite (ACL):
Zhiwei Jiang, Qing Gu, Yafeng Yin, and Daoxu Chen. 2018. Enriching Word Embeddings with Domain Knowledge for Readability Assessment. In Proceedings of the 27th International Conference on Computational Linguistics, pages 366–378, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Enriching Word Embeddings with Domain Knowledge for Readability Assessment (Jiang et al., COLING 2018)
PDF:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/C18-1031.pdf