Jiaoyan Chen


2021

OntoEA: Ontology-guided Entity Alignment via Joint Knowledge Graph Embedding
Yuejia Xiang | Ziheng Zhang | Jiaoyan Chen | Xi Chen | Zhenxi Lin | Yefeng Zheng
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

An Industry Evaluation of Embedding-based Entity Alignment
Ziheng Zhang | Hualuo Liu | Jiaoyan Chen | Xi Chen | Bo Liu | YueJia Xiang | Yefeng Zheng
Proceedings of the 28th International Conference on Computational Linguistics: Industry Track

Embedding-based entity alignment has been widely investigated in recent years, but most proposed methods still rely on an ideal supervised learning setting with a large number of unbiased seed mappings for training and validation, which significantly limits their usage. In this study, we evaluate those state-of-the-art methods in an industrial context, where the impact of seed mappings with different sizes and different biases is explored. Besides the popular benchmarks from DBpedia and Wikidata, we contribute and evaluate a new industrial benchmark that is extracted from two heterogeneous knowledge graphs (KGs) under deployment for medical applications. The experimental results enable the analysis of the advantages and disadvantages of these alignment methods and the further discussion of suitable strategies for their industrial deployment.

Zero-shot Text Classification via Reinforced Self-training
Zhiquan Ye | Yuxia Geng | Jiaoyan Chen | Jingmin Chen | Xiaoxiao Xu | SuHang Zheng | Feng Wang | Jun Zhang | Huajun Chen
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Zero-shot learning has been a tough problem since no labeled data is available for unseen classes during training, especially for classes with low similarity. In this situation, transferring from seen classes to unseen classes is extremely hard. To tackle this problem, in this paper we propose a self-training based method to efficiently leverage unlabeled data. Traditional self-training methods use fixed heuristics to select instances from unlabeled data, and their performance varies across datasets. We propose a reinforcement learning framework to learn a data selection strategy automatically and provide more reliable selection. Experimental results on both benchmarks and a real-world e-commerce dataset show that our approach significantly outperforms previous methods in zero-shot text classification.