Zhiwei Yu


2024

Distribution Shifts Are Bottlenecks: Extensive Evaluation for Grounding Language Models to Knowledge Bases
Yiheng Shu | Zhiwei Yu
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop

Grounding language models (LMs) to knowledge bases (KBs) helps to obtain rich and accurate facts. However, it remains challenging because of the enormous size, complex structure, and partial observability of KBs. One reason is that current benchmarks fail to reflect robustness challenges and fairly evaluate models. This paper analyzes whether these robustness challenges arise from distribution shifts, including environmental, linguistic, and modal aspects. This affects the ability of LMs to cope with unseen schema, adapt to language variations, and perform few-shot learning. Thus, the paper proposes extensive evaluation protocols and conducts experiments to demonstrate that, despite utilizing our proposed data augmentation method, both advanced small and large language models exhibit poor robustness in these aspects. We conclude that current LMs are too fragile to navigate in complex environments due to distribution shifts. This underscores the need for future research focusing on data collection, evaluation protocols, and learning paradigms.

2022

On the Effectiveness of Sentence Encoding for Intent Detection Meta-Learning
Tingting Ma | Qianhui Wu | Zhiwei Yu | Tiejun Zhao | Chin-Yew Lin
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Recent studies on few-shot intent detection have attempted to formulate the task as a meta-learning problem, where a meta-learning model is trained with a certain capability to quickly adapt to newly specified few-shot tasks with potentially unseen intent categories. Prototypical networks have been commonly used in this setting, with the hope that good prototypical representations could be learned to capture the semantic similarity between the query and a few labeled instances. This intuition naturally leaves a question of whether or not a good sentence representation scheme could suffice for the task without further domain-specific adaptation. In this paper, we conduct empirical studies on a number of general-purpose sentence embedding schemes, showing that good sentence embeddings without any fine-tuning on intent detection data could produce a non-trivially strong performance. Inspired by the results from our qualitative analysis, we propose a frustratingly easy modification, which leads to consistent improvements over all sentence encoding schemes, including those from the state-of-the-art prototypical network variants with task-specific fine-tuning.

TIARA: Multi-grained Retrieval for Robust Question Answering over Large Knowledge Base
Yiheng Shu | Zhiwei Yu | Yuhan Li | Börje Karlsson | Tingting Ma | Yuzhong Qu | Chin-Yew Lin
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Pre-trained language models (PLMs) have shown their effectiveness in multiple scenarios. However, KBQA remains challenging, especially regarding coverage and generalization settings. This is due to two main factors: i) understanding the semantics of both questions and relevant knowledge from the KB; ii) generating executable logical forms with both semantic and syntactic correctness. In this paper, we present a new KBQA model, TIARA, which addresses those issues by applying multi-grained retrieval to help the PLM focus on the most relevant KB context, viz., entities, exemplary logical forms, and schema items. Moreover, constrained decoding is used to control the output space and reduce generation errors. Experiments over important benchmarks demonstrate the effectiveness of our approach. TIARA outperforms previous SOTA, including those using PLMs or oracle entity annotations, by at least 4.1 and 1.1 F1 points on GrailQA and WebQuestionsSP, respectively. Specifically on GrailQA, TIARA outperforms previous models in all categories, with an improvement of 4.7 F1 points in zero-shot generalization.

2021

ReTraCk: A Flexible and Efficient Framework for Knowledge Base Question Answering
Shuang Chen | Qian Liu | Zhiwei Yu | Chin-Yew Lin | Jian-Guang Lou | Feng Jiang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

We present Retriever-Transducer-Checker (ReTraCk), a neural semantic parsing framework for large-scale knowledge base question answering (KBQA). ReTraCk is designed as a modular framework to maintain high flexibility. It includes a retriever to retrieve relevant KB items efficiently, a transducer to generate logical forms with syntax correctness guarantees, and a checker to improve the transduction procedure. ReTraCk is ranked first in overall performance on the GrailQA leaderboard and obtains highly competitive performance on the typical WebQuestionsSP benchmark. Our system can interact with users in a timely manner, demonstrating the efficiency of the proposed framework.

2020

Homophonic Pun Generation with Lexically Constrained Rewriting
Zhiwei Yu | Hongyu Zang | Xiaojun Wan
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Punning is a creative way to make conversation enjoyable and literary writing elegant. In this paper, we focus on the task of generating a pun sentence given a pair of homophones. We first find the constraint words supporting the semantic incongruity for a sentence. Then we rewrite the sentence with explicit positive and negative constraints. Our model achieves state-of-the-art results in both automatic and human evaluations. We further conduct an error analysis and discuss the challenges for computational pun models.

Routing Enforced Generative Model for Recipe Generation
Zhiwei Yu | Hongyu Zang | Xiaojun Wan
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

One of the most challenging parts of recipe generation is dealing with the complex restrictions among the input ingredients. Previous research simplifies the problem by treating the inputs independently and generating recipes containing as much information as possible. In this work, we propose a routing method to dive into the content selection under the internal restrictions. The routing enforced generative model (RGM) can generate appropriate recipes according to the given ingredients and user preferences. Our model yields new state-of-the-art results on the recipe generation task with significant improvements on BLEU, F1 and human evaluation.

2019

Automated Chess Commentator Powered by Neural Chess Engine
Hongyu Zang | Zhiwei Yu | Xiaojun Wan
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In this paper, we explore a new approach for automated chess commentary generation, which aims to generate chess commentary texts in different categories (e.g., description, comparison, planning, etc.). We introduce a neural chess engine into text generation models to help with encoding boards, predicting moves, and analyzing situations. By jointly training the neural chess engine and the generation models for different categories, the models become more effective. We conduct experiments on 5 categories in a benchmark Chess Commentary dataset and achieve inspiring results in both automatic and human evaluations.

How to Avoid Sentences Spelling Boring? Towards a Neural Approach to Unsupervised Metaphor Generation
Zhiwei Yu | Xiaojun Wan
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Metaphor generation attempts to replicate human creativity with language, which is an attractive but challenging text generation task. Previous efforts mainly focus on template-based or rule-based methods and result in a lack of linguistic subtlety. In order to create novel metaphors, we propose a neural approach to metaphor generation and explore the shared inferential structure of a metaphorical usage and a literal usage of a verb. Our approach does not require any manually annotated metaphors for training. We extract the metaphorically used verbs with their metaphorical senses in an unsupervised way and train a neural language model from a wiki corpus. Then we generate metaphors conveying the assigned metaphorical senses with an improved decoding algorithm. Automatic metrics and human evaluations demonstrate that our approach can generate metaphors with good readability and creativity.

2018

A Neural Approach to Pun Generation
Zhiwei Yu | Jiwei Tan | Xiaojun Wan
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Automatic pun generation is an interesting and challenging text generation task. Previous efforts rely on templates or laboriously manually annotated pun datasets, which heavily constrains the quality and diversity of generated puns. Since sequence-to-sequence models provide an effective technique for text generation, it is promising to investigate these models on the pun generation task. In this paper, we propose neural network models for homographic pun generation, and they can generate puns without requiring any pun data for training. We first train a conditional neural language model from a general text corpus, and then generate puns from the language model with an elaborately designed decoding algorithm. Automatic and human evaluations show that our models are able to generate homographic puns of good readability and quality.

2016

Planting Trees in the Desert: Delexicalized Tagging and Parsing Combined
Daniel Zeman | David Mareček | Zhiwei Yu | Zdeněk Žabokrtský
Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation: Oral Papers

If You Even Don’t Have a Bit of Bible: Learning Delexicalized POS Taggers
Zhiwei Yu | David Mareček | Zdeněk Žabokrtský | Daniel Zeman
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Part-of-speech (POS) induction is one of the most popular tasks in research on unsupervised NLP. Various unsupervised and semi-supervised methods have been proposed to tag an unseen language. However, many of them require some partial understanding of the target language because they rely on dictionaries or parallel corpora such as the Bible. In this paper, we propose a different method named delexicalized tagging, for which we only need a raw corpus of the target language. We transfer tagging models trained on annotated corpora of one or more resource-rich languages. We employ language-independent features such as word length, frequency, neighborhood entropy, character classes (alphabetic vs. numeric vs. punctuation), etc. We demonstrate that such features can, to a certain extent, serve as predictors of the part of speech, represented by the universal POS tag.