Zhao Zhang


2021

pdf bib
Many-to-English Machine Translation Tools, Data, and Pretrained Models
Thamme Gowda | Zhao Zhang | Chris Mattmann | Jonathan May
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

While there are more than 7000 languages in the world, most translation research efforts have targeted a few high resource languages. Commercial translation systems support only one hundred languages or fewer, and do not make these models available for transfer to low resource languages. In this work, we present useful tools for machine translation research: MTData, NLCodec and RTG. We demonstrate their usefulness by creating a multilingual neural machine translation model capable of translating from 500 source languages to English. We make this multilingual model readily downloadable and usable as a service, or as a parent model for transfer-learning to even lower-resource languages.

pdf bib
DyLex: Incorporating Dynamic Lexicons into BERT for Sequence Labeling
Baojun Wang | Zhao Zhang | Kun Xu | Guang-Yuan Hao | Yuyang Zhang | Lifeng Shang | Linlin Li | Xiao Chen | Xin Jiang | Qun Liu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Incorporating lexical knowledge into deep learning models has been proved to be very effective for sequence labeling tasks. However, previous works commonly have difficulty dealing with large-scale dynamic lexicons which often cause excessive matching noise and problems of frequent updates. In this paper, we propose DyLex, a plug-in lexicon incorporation approach for BERT based sequence labeling tasks. Instead of leveraging embeddings of words in the lexicon as in conventional methods, we adopt word-agnostic tag embeddings to avoid re-training the representation while updating the lexicon. Moreover, we employ an effective supervised lexical knowledge denoising method to smooth out matching noise. Finally, we introduce a col-wise attention based knowledge fusion mechanism to guarantee the pluggability of the proposed framework. Experiments on ten datasets of three tasks show that the proposed framework achieves new SOTA, even with very large scale lexicons.

2018

pdf bib
Knowledge Graph Embedding with Hierarchical Relation Structure
Zhao Zhang | Fuzhen Zhuang | Meng Qu | Fen Lin | Qing He
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

The rapid development of knowledge graphs (KGs), such as Freebase and WordNet, has changed the paradigm for AI-related applications. However, even though these KGs are impressively large, most of them are suffering from incompleteness, which leads to performance degradation of AI applications. Most existing researches are focusing on knowledge graph embedding (KGE) models. Nevertheless, those models simply embed entities and relations into latent vectors without leveraging the rich information from the relation structure. Indeed, relations in KGs conform to a three-layer hierarchical relation structure (HRS), i.e., semantically similar relations can make up relation clusters and some relations can be further split into several fine-grained sub-relations. Relation clusters, relations and sub-relations can fit in the top, the middle and the bottom layer of three-layer HRS respectively. To this end, in this paper, we extend existing KGE models TransE, TransH and DistMult, to learn knowledge representations by leveraging the information from the HRS. Particularly, our approach is capable to extend other KGE models. Finally, the experiment results clearly validate the effectiveness of the proposed approach against baselines.