Zihao Wang


2020

pdf bib
Semi-Supervised Bilingual Lexicon Induction with Two-way Interaction
Xu Zhao | Zihao Wang | Hao Wu | Yong Zhang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Semi-supervision is a promising paradigm for Bilingual Lexicon Induction (BLI) with limited annotations. However, previous semisupervised methods do not fully utilize the knowledge hidden in annotated and nonannotated data, which hinders further improvement of their performance. In this paper, we propose a new semi-supervised BLI framework to encourage the interaction between the supervised signal and unsupervised alignment. We design two message-passing mechanisms to transfer knowledge between annotated and non-annotated data, named prior optimal transport and bi-directional lexicon update respectively. Then, we perform semi-supervised learning based on a cyclic or a parallel parameter feeding routine to update our models. Our framework is a general framework that can incorporate any supervised and unsupervised BLI methods based on optimal transport. Experimental results on MUSE and VecMap datasets show significant improvement of our models. Ablation study also proves that the two-way interaction between the supervised signal and unsupervised alignment accounts for the gain of the overall performance. Results on distant language pairs further illustrate the advantage and robustness of our proposed method.

pdf bib
A Relaxed Matching Procedure for Unsupervised BLI
Xu Zhao | Zihao Wang | Yong Zhang | Hao Wu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Recently unsupervised Bilingual Lexicon Induction(BLI) without any parallel corpus has attracted much research interest. One of the crucial parts in methods for the BLI task is the matching procedure. Previous works impose a too strong constraint on the matching and lead to many counterintuitive translation pairings. Thus We propose a relaxed matching procedure to find a more precise matching between two languages. We also find that aligning source and target language embedding space bidirectionally will bring significant improvement. We follow the previous iterative framework to conduct experiments. Results on standard benchmark demonstrate the effectiveness of our proposed method, which substantially outperforms previous unsupervised methods.

2019

pdf bib
Tackling Long-Tailed Relations and Uncommon Entities in Knowledge Graph Completion
Zihao Wang | Kwunping Lai | Piji Li | Lidong Bing | Wai Lam
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

For large-scale knowledge graphs (KGs), recent research has been focusing on the large proportion of infrequent relations which have been ignored by previous studies. For example few-shot learning paradigm for relations has been investigated. In this work, we further advocate that handling uncommon entities is inevitable when dealing with infrequent relations. Therefore, we propose a meta-learning framework that aims at handling infrequent relations with few-shot learning and uncommon entities by using textual descriptions. We design a novel model to better extract key information from textual descriptions. Besides, we also develop a novel generative model in our framework to enhance the performance by generating extra triplets during the training stage. Experiments are conducted on two datasets from real-world KGs, and the results show that our framework outperforms previous methods when dealing with infrequent relations and their accompanying uncommon entities.

2018

pdf bib
Responding E-commerce Product Questions via Exploiting QA Collections and Reviews
Qian Yu | Wai Lam | Zihao Wang
Proceedings of the 27th International Conference on Computational Linguistics

Providing instant responses for product questions in E-commerce sites can significantly improve satisfaction of potential consumers. We propose a new framework for automatically responding product questions newly posed by users via exploiting existing QA collections and review collections in a coordinated manner. Our framework can return a ranked list of snippets serving as the automated response for a given question, where each snippet can be a sentence from reviews or an existing question-answer pair. One major subtask in our framework is question-based response review ranking. Learning for response review ranking is challenging since there is no labeled response review available. The collection of existing QA pairs are exploited as distant supervision for learning to rank responses. With proposed distant supervision paradigm, the learned response ranking model makes use of the knowledge in the QA pairs and the corresponding retrieved review lists. Extensive experiments on datasets collected from a real-world commercial E-commerce site demonstrate the effectiveness of our proposed framework.

2017

pdf bib
Deep Recurrent Generative Decoder for Abstractive Text Summarization
Piji Li | Wai Lam | Lidong Bing | Zihao Wang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We propose a new framework for abstractive text summarization based on a sequence-to-sequence oriented encoder-decoder model equipped with a deep recurrent generative decoder (DRGN). Latent structure information implied in the target summaries is learned based on a recurrent latent random model for improving the summarization quality. Neural variational inference is employed to address the intractable posterior inference for the recurrent latent variables. Abstractive summaries are generated based on both the generative latent variables and the discriminative deterministic states. Extensive experiments on some benchmark datasets in different languages show that DRGN achieves improvements over the state-of-the-art methods.