Yi Zhang

2022

Pre-trained language models have been recently shown to benefit task-oriented dialogue (TOD) systems. Despite their success, existing methods often formulate this task as a cascaded generation problem which can lead to error accumulation across different sub-tasks and greater data annotation overhead. In this study, we present PPTOD, a unified plug-and-play model for task-oriented dialogue. In addition, we introduce a new dialogue multi-task pre-training strategy that allows the model to learn the primary TOD task completion skills from heterogeneous dialog corpora. We extensively test our model on three benchmark TOD tasks, including end-to-end dialogue modelling, dialogue state tracking, and intent classification. Experimental results show that PPTOD achieves new state of the art on all evaluated tasks in both high-resource and low-resource scenarios. Furthermore, comparisons against previous SOTA methods show that the responses generated by PPTOD are more factually correct and semantically coherent as judged by human annotators.

In text classification tasks, useful information is encoded in the label names. Label semantic aware systems have leveraged this information for improved text classification performance during fine-tuning and prediction. However, use of label-semantics during pre-training has not been extensively explored. We therefore propose Label Semantic Aware Pre-training (LSAP) to improve the generalization and data efficiency of text classification systems. LSAP incorporates label semantics into pre-trained generative models (T5 in our case) by performing secondary pre-training on labeled sentences from a variety of domains. As domain-general pre-training requires large amounts of data, we develop a filtering and labeling pipeline to automatically create sentence-label pairs from unlabeled text. We perform experiments on intent (ATIS, Snips, TOPv2) and topic classification (AG News, Yahoo! Answers). LSAP obtains significant accuracy improvements over state-of-the-art models for few-shot text classification while maintaining performance comparable to state of the art in high-resource settings.

Many users turn to document retrieval systems (e.g. search engines) to seek answers to controversial or open-ended questions. However, classical document retrieval systems fall short at delivering users a set of direct and diverse responses in such cases, which requires identifying responses within web documents in the context of the query, and aggregating the responses based on their different perspectives. The goal of this work is to survey and study the user information needs for building a multi-perspective search engine of such. We examine the challenges of synthesizing such language understanding objectives with document retrieval, and study a new perspective-oriented document retrieval paradigm. We discuss and assess the inherent natural language understanding challenges one needs to address in order to achieve the goal. Following the design challenges and principles, we propose and evaluate a practical prototype pipeline system. We use the prototype system to conduct a user survey in order to assess the utility of our paradigm, as well as understanding the user information needs when issuing controversial and open-ended queries to a search engine.

Dialogue meaning representation formulates natural language utterance semantics in their conversational context in an explicit and machine-readable form. Previous work typically follows the intent-slot framework, which is easy for annotation yet limited in scalability for complex linguistic expressions. A line of works alleviates the representation issue by introducing hierarchical structures but challenging to express complex compositional semantics, such as negation and coreference. We propose Dialogue Meaning Representation (DMR), a pliable and easily extendable representation for task-oriented dialogue. Our representation contains a set of nodes and edges to represent rich compositional semantics. Moreover, we propose an inheritance hierarchy mechanism focusing on domain extensibility. Additionally, we annotated DMR-FastFood, a multi-turn dialogue dataset with more than 70k utterances, with DMR. We propose two evaluation tasks to evaluate different dialogue models and a novel coreference resolution model GNNCoref for the graph-based coreference resolution task. Experiments show that DMR can be parsed well with pre-trained Seq2Seq models, and GNNCoref outperforms the baseline models by a large margin.The dataset and code are available at https://github.com/amazon-research/dialogue-meaning-representation

pdf abs
Injecting Domain Knowledge in Language Models for Task-oriented Dialogue Systems
Denis Emelin | Daniele Bonadiman | Sawsan Alqahtani | Yi Zhang | Saab Mansour
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Pre-trained language models (PLM) have advanced the state-of-the-art across NLP applications, but lack domain-specific knowledge that does not naturally occur in pre-training data. Previous studies augmented PLMs with symbolic knowledge for different downstream NLP tasks. However, knowledge bases (KBs) utilized in these studies are usually large-scale and static, in contrast to small, domain-specific, and modifiable knowledge bases that are prominent in real-world task-oriented dialogue (TOD) systems. In this paper, we showcase the advantages of injecting domain-specific knowledge prior to fine-tuning on TOD tasks. To this end, we utilize light-weight adapters that can be easily integrated with PLMs and serve as a repository for facts learned from different KBs. To measure the efficacy of proposed knowledge injection methods, we introduce Knowledge Probing using Response Selection (KPRS) – a probe designed specifically for TOD models. Experiments on KPRS and the response generation task show improvements of knowledge injection with adapters over strong baselines.

This paper describes the NiuTrans neural machine translation systems of the WMT22 General MT constrained task. We participate in four directions, including Chinese→English, English→Croatian, and Livonian↔English. Our models are based on several advanced Transformer variants, e.g., Transformer-ODE, Universal Multiscale Transformer (UMST). The main workflow consists of data filtering, large-scale data augmentation (i.e., iterative back-translation, iterative knowledge distillation), and specific-domain fine-tuning. Moreover, we try several multi-domain methods, such as a multi-domain model structure and a multi-domain data clustering method, to rise to this year’s newly proposed multi-domain test set challenge. For low-resource scenarios, we build a multi-language translation model to enhance the performance, and try to use the pre-trained language model (mBERT) to initialize the translation model.

2021

pdf abs
Translation as Cross-Domain Knowledge: Attention Augmentation for Unsupervised Cross-Domain Segmenting and Labeling Tasks
Ruixuan Luo | Yi Zhang | Sishuo Chen | Xu Sun
Findings of the Association for Computational Linguistics: EMNLP 2021

The nature of no word delimiter or inflection that can indicate segment boundaries or word semantics increases the difficulty of Chinese text understanding, and also intensifies the demand for word-level semantic knowledge to accomplish the tagging goal in Chinese segmenting and labeling tasks. However, for unsupervised Chinese cross-domain segmenting and labeling tasks, the model trained on the source domain frequently suffers from the deficient word-level semantic knowledge of the target domain. To address this issue, we propose a novel paradigm based on attention augmentation to introduce crucial cross-domain knowledge via a translation system. The proposed paradigm enables the model attention to draw cross-domain knowledge indicated by the implicit word-level cross-lingual alignment between the input and its corresponding translation. Aside from the model requiring cross-lingual input, we also establish an off-the-shelf model which eludes the dependency on cross-lingual translations. Experiments demonstrate that our proposal significantly advances the state-of-the-art results of cross-domain Chinese segmenting and labeling tasks.

pdf abs
ODIST: Open World Classification via Distributionally Shifted Instances
Lei Shu | Yassine Benajiba | Saab Mansour | Yi Zhang
Findings of the Association for Computational Linguistics: EMNLP 2021

In this work, we address the open-world classification problem with a method called ODIST, open world classification via distributionally shifted instances. This novel and straightforward method can create out-of-domain instances from the in-domain training instances with the help of a pre-trained generative language model. Experimental results show that ODIST performs better than state-of-the-art decision boundary finding method.

pdf abs
Using Optimal Transport as Alignment Objective for fine-tuning Multilingual Contextualized Embeddings
Sawsan Alqahtani | Garima Lalwani | Yi Zhang | Salvatore Romeo | Saab Mansour
Findings of the Association for Computational Linguistics: EMNLP 2021

Recent studies have proposed different methods to improve multilingual word representations in contextualized settings including techniques that align between source and target embedding spaces. For contextualized embeddings, alignment becomes more complex as we additionally take context into consideration. In this work, we propose using Optimal Transport (OT) as an alignment objective during fine-tuning to further improve multilingual contextualized representations for downstream cross-lingual transfer. This approach does not require word-alignment pairs prior to fine-tuning that may lead to sub-optimal matching and instead learns the word alignments within context in an unsupervised manner. It also allows different types of mappings due to soft matching between source and target sentences. We benchmark our proposed method on two tasks (XNLI and XQuAD) and achieve improvements over baselines as well as competitive results compared to similar recent works.

pdf abs
What is Your Article Based On? Inferring Fine-grained Provenance
Yi Zhang | Zachary Ives | Dan Roth
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

When evaluating an article and the claims it makes, a critical reader must be able to assess where the information presented comes from, and whether the various claims are mutually consistent and support the conclusion. This motivates the study of claim provenance, which seeks to trace and explain the origins of claims. In this paper, we introduce new techniques to model and reason about the provenance of multiple interacting claims, including how to capture fine-grained information about the context. Our solution hinges on first identifying the sentences that potentially contain important external information. We then develop a query generator with our novel rank-aware cross attention mechanism, which aims at generating metadata for the source article, based on the context and the signals collected from a search engine. This establishes relevant search queries, and it allows us to obtain source article candidates for each identified sentence and propose an ILP based algorithm to infer the best sources. We experiment with a newly created evaluation dataset, Politi-Prov, based on fact-checking articles from www.politifact.com; our experimental results show that our solution leads to a significant improvement over baselines.

pdf abs
Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates
Yuqing Xie | Yi-An Lai | Yuanjun Xiong | Yi Zhang | Stefano Soatto
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Behavior of deep neural networks can be inconsistent between different versions. Regressions during model update are a common cause of concern that often over-weigh the benefits in accuracy or efficiency gain. This work focuses on quantifying, reducing and analyzing regression errors in the NLP model updates. Using negative flip rate as regression measure, we show that regression has a prevalent presence across tasks in the GLUE benchmark. We formulate the regression-free model updates into a constrained optimization problem, and further reduce it into a relaxed form which can be approximately optimized through knowledge distillation training method. We empirically analyze how model ensemble reduces regression. Finally, we conduct CheckList behavioral testing to understand the distribution of regressions across linguistic phenomena, and the efficacy of ensemble and distillation methods.

pdf abs
A Comparative Study on Schema-Guided Dialogue State Tracking
Jie Cao | Yi Zhang
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Frame-based state representation is widely used in modern task-oriented dialog systems to model user intentions and slot values. However, a fixed design of domain ontology makes it difficult to extend to new services and APIs. Recent work proposed to use natural language descriptions to define the domain ontology instead of tag names for each intent or slot, thus offering a dynamic set of schema. In this paper, we conduct in-depth comparative studies to understand the use of natural language description for schema in dialog state tracking. Our discussion mainly covers three aspects: encoder architectures, impact of supplementary training, and effective schema description styles. We introduce a set of newly designed bench-marking descriptions and reveal the model robustness on both homogeneous and heterogeneous description styles in training and evaluation.

pdf abs
A Global Past-Future Early Exit Method for Accelerating Inference of Pre-trained Language Models
Kaiyuan Liao | Yi Zhang | Xuancheng Ren | Qi Su | Xu Sun | Bin He
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Early exit mechanism aims to accelerate the inference speed of large-scale pre-trained language models. The essential idea is to exit early without passing through all the inference layers at the inference stage. To make accurate predictions for downstream tasks, the hierarchical linguistic information embedded in all layers should be jointly considered. However, much of the research up to now has been limited to use local representations of the exit layer. Such treatment inevitably loses information of the unused past layers as well as the high-level features embedded in future layers, leading to sub-optimal performance. To address this issue, we propose a novel Past-Future method to make comprehensive predictions from a global perspective. We first take into consideration all the linguistic information embedded in the past layers and then take a further step to engage the future information which is originally inaccessible for predictions. Extensive experiments demonstrate that our method outperforms previous early exit methods by a large margin, yielding better and robust performance.

pdf abs
Learning to Decompose and Organize Complex Tasks
Yi Zhang | Sujay Kumar Jauhar | Julia Kiseleva | Ryen White | Dan Roth
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

People rely on digital task management tools, such as email or to-do apps, to manage their tasks. Some of these tasks are large and complex, leading to action paralysis and feelings of being overwhelmed on the part of the user. The micro-productivity literature has shown that such tasks could benefit from being decomposed and organized, in order to reduce user cognitive load. Thus in this paper, we propose a novel end-to-end pipeline that consumes a complex task and induces a dependency graph from unstructured text to represent sub-tasks and their relationships. Our solution first finds nodes for sub-tasks from multiple ‘how-to’ articles on the web by injecting a neural text generator with three key desiderata – relevance, abstraction, and consensus. Then we resolve and infer edges between these subtask nodes by learning task dependency relations. We collect a new dataset of complex tasks with their sub-task graph to develop and evaluate our solutions. Both components of our graph induction solution are evaluated in experiments, demonstrating that our models outperform a state-of-the-art text generator significantly. Our generalizable and scalable end-to-end solution has important implications for boosting user productivity and assisting with digital task management.

2020

Conventional knowledge graph embedding (KGE) often suffers from limited knowledge representation, leading to performance degradation especially on the low-resource problem. To remedy this, we propose to enrich knowledge representation via pretrained language models by leveraging world knowledge from pretrained models. Specifically, we present a universal training framework named Pretrain-KGE consisting of three phases: semantic-based fine-tuning phase, knowledge extracting phase and KGE training phase. Extensive experiments show that our proposed Pretrain-KGE can improve results over KGE models, especially on solving the low-resource problem.

pdf abs
Context Analysis for Pre-trained Masked Language Models
Yi-An Lai | Garima Lalwani | Yi Zhang
Findings of the Association for Computational Linguistics: EMNLP 2020

Pre-trained language models that learn contextualized word representations from a large un-annotated corpus have become a standard component for many state-of-the-art NLP systems. Despite their successful applications in various downstream NLP tasks, the extent of contextual impact on the word representation has not been explored. In this paper, we present a detailed analysis of contextual impact in Transformer- and BiLSTM-based masked language models. We follow two different approaches to evaluate the impact of context: a masking based approach that is architecture agnostic, and a gradient based approach that requires back-propagation through networks. The findings suggest significant differences on the contextual impact between the two model architectures. Through further breakdown of analysis by syntactic categories, we find the contextual impact in Transformer-based MLM aligns well with linguistic intuition. We further explore the Transformer attention pruning based on our findings in contextual analysis.

pdf abs
Learning to Classify Intents and Slot Labels Given a Handful of Examples
Jason Krone | Yi Zhang | Mona Diab
Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI

Intent classification (IC) and slot filling (SF) are core components in most goal-oriented dialogue systems. Current IC/SF models perform poorly when the number of training examples per class is small. We propose a new few-shot learning task, few-shot IC/SF, to study and improve the performance of IC and SF models on classes not seen at training time in ultra low resource scenarios. We establish a few-shot IC/SF benchmark by defining few-shot splits for three public IC/SF datasets, ATIS, TOP, and Snips. We show that two popular few-shot learning algorithms, model agnostic meta learning (MAML) and prototypical networks, outperform a fine-tuning baseline on this benchmark. Prototypical networks achieves significant gains in IC performance on the ATIS and TOP datasets, while both prototypical networks and MAML outperform the baseline with respect to SF on all three datasets. In addition, we demonstrate that joint training as well as the use of pre-trained language models, ELMo and BERT in our case, are complementary to these few-shot learning methods and yield further gains.

pdf abs
Parallel Data Augmentation for Formality Style Transfer
Yi Zhang | Tao Ge | Xu Sun
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The main barrier to progress in the task of Formality Style Transfer is the inadequacy of training data. In this paper, we study how to augment parallel data and propose novel and simple data augmentation methods for this task to obtain useful sentence pairs with easily accessible models and systems. Experiments demonstrate that our augmented parallel data largely helps improve formality style transfer when it is used to pre-train the model, leading to the state-of-the-art results in the GYAFC benchmark dataset.

pdf abs
“Who said it, and Why?” Provenance for Natural Language Claims
Yi Zhang | Zachary Ives | Dan Roth
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

In an era where generating content and publishing it is so easy, we are bombarded with information and are exposed to all kinds of claims, some of which do not always rank high on the truth scale. This paper suggests that the key to a longer-term, holistic, and systematic approach to navigating this information pollution is capturing the provenance of claims. To do that, we develop a formal definition of provenance graph for a given natural language claim, aiming to understand where the claim may come from and how it has evolved. To construct the graph, we model provenance inference, formulated mainly as an information extraction task and addressed via a textual entailment model. We evaluate our approach using two benchmark datasets, showing initial success in capturing the notion of provenance and its effectiveness on the application of claim verification.

pdf abs
Diversity, Density, and Homogeneity: Quantitative Characteristic Metrics for Text Collections
Yi-An Lai | Xuan Zhu | Yi Zhang | Mona Diab
Proceedings of the Twelfth Language Resources and Evaluation Conference

Summarizing data samples by quantitative measures has a long history, with descriptive statistics being a case in point. However, as natural language processing methods flourish, there are still insufficient characteristic metrics to describe a collection of texts in terms of the words, sentences, or paragraphs they comprise. In this work, we propose metrics of diversity, density, and homogeneity that quantitatively measure the dispersion, sparsity, and uniformity of a text collection. We conduct a series of simulations to verify that each metric holds desired properties and resonates with human intuitions. Experiments on real-world datasets demonstrate that the proposed characteristic metrics are highly correlated with text classification performance of a renowned model, BERT, which could inspire future applications.

2019

pdf abs
Goal-Embedded Dual Hierarchical Model for Task-Oriented Dialogue Generation
Yi-An Lai | Arshit Gupta | Yi Zhang
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Hierarchical neural networks are often used to model inherent structures within dialogues. For goal-oriented dialogues, these models miss a mechanism adhering to the goals and neglect the distinct conversational patterns between two interlocutors. In this work, we propose Goal-Embedded Dual Hierarchical Attentional Encoder-Decoder (G-DuHA) able to center around goals and capture interlocutor-level disparity while modeling goal-oriented dialogues. Experiments on dialogue generation, response generation, and human evaluations demonstrate that the proposed model successfully generates higher-quality, more diverse and goal-centric dialogues. Moreover, we apply data augmentation via goal-oriented dialogue generation for task-oriented dialog systems with better performance achieved.

pdf abs
Amazon at MRP 2019: Parsing Meaning Representations with Lexical and Phrasal Anchoring
Jie Cao | Yi Zhang | Adel Youssef | Vivek Srikumar
Proceedings of the Shared Task on Cross-Framework Meaning Representation Parsing at the 2019 Conference on Natural Language Learning

This paper describes the system submission of our team Amazon to the shared task on Cross Framework Meaning Representation Parsing (MRP) at the 2019 Conference for Computational Language Learning (CoNLL). Via extensive analysis of implicit alignments in AMR, we recategorize five meaning representations (MRs) into two classes: Lexical- Anchoring and Phrasal-Anchoring. Then we propose a unified graph-based parsing framework for the lexical-anchoring MRs, and a phrase-structure parsing for one of the phrasal- anchoring MRs, UCCA. Our system submission ranked 1st in the AMR subtask, and later improvements show promising results on other frameworks as well.

pdf abs
Evidence-based Trustworthiness
Yi Zhang | Zachary Ives | Dan Roth
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

The information revolution brought with it information pollution. Information retrieval and extraction help us cope with abundant information from diverse sources. But some sources are of anonymous authorship, and some are of uncertain accuracy, so how can we determine what we should actually believe? Not all information sources are equally trustworthy, and simply accepting the majority view is often wrong. This paper develops a general framework for estimating the trustworthiness of information sources in an environment where multiple sources provide claims and supporting evidence, and each claim can potentially be produced by multiple sources. We consider two settings: one in which information sources directly assert claims, and a more realistic and challenging one, in which claims are inferred from evidence provided by sources, via (possibly noisy) NLP techniques. Our key contribution is to develop a family of probabilistic models that jointly estimate the trustworthiness of sources, and the credibility of claims they assert. This is done while accounting for the (possibly noisy) NLP needed to infer claims from evidence supplied by sources. We evaluate our framework on several datasets, showing strong results and significant improvement over baselines.

pdf abs
Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue Data
Denis Peskov | Nancy Clarke | Jason Krone | Brigi Fodor | Yi Zhang | Adel Youssef | Mona Diab
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The need for high-quality, large-scale, goal-oriented dialogue datasets continues to grow as virtual assistants become increasingly wide-spread. However, publicly available datasets useful for this area are limited either in their size, linguistic diversity, domain coverage, or annotation granularity. In this paper, we present strategies toward curating and annotating large scale goal oriented dialogue data. We introduce the MultiDoGO dataset to overcome these limitations. With a total of over 81K dialogues harvested across six domains, MultiDoGO is over 8 times the size of MultiWOZ, the other largest comparable dialogue dataset currently available to the public. Over 54K of these harvested conversations are annotated for intent classes and slot labels. We adopt a Wizard-of-Oz approach wherein a crowd-sourced worker (the “customer”) is paired with a trained annotator (the “agent”). The data curation process was controlled via biases to ensure a diversity in dialogue flows following variable dialogue policies. We provide distinct class label tags for agents vs. customer utterances, along with applicable slot labels. We also compare and contrast our strategies on annotation granularity, i.e. turn vs. sentence level. Furthermore, we compare and contrast annotations curated by leveraging professional annotators vs the crowd. We believe our strategies for eliciting and annotating such a dialogue dataset scales across modalities and domains and potentially languages in the future. To demonstrate the efficacy of our devised strategies we establish neural baselines for classification on the agent and customer utterances as well as slot labeling for each domain.

2018

pdf
A Chinese Dataset with Negative Full Forms for General Abbreviation Prediction
Yi Zhang | Xu Sun
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf abs
Does Higher Order LSTM Have Better Accuracy for Segmenting and Labeling Sequence Data?
Yi Zhang | Xu Sun | Shuming Ma | Yang Yang | Xuancheng Ren
Proceedings of the 27th International Conference on Computational Linguistics

Existing neural models usually predict the tag of the current token independent of the neighboring tags. The popular LSTM-CRF model considers the tag dependencies between every two consecutive tags. However, it is hard for existing neural models to take longer distance dependencies between tags into consideration. The scalability is mainly limited by the complex model structures and the cost of dynamic programming during training. In our work, we first design a new model called “high order LSTM” to predict multiple tags for the current token which contains not only the current tag but also the previous several tags. We call the number of tags in one prediction as “order”. Then we propose a new method called Multi-Order BiLSTM (MO-BiLSTM) which combines low order and high order LSTMs together. MO-BiLSTM keeps the scalability to high order models with a pruning technique. We evaluate MO-BiLSTM on all-phrase chunking and NER datasets. Experiment results show that MO-BiLSTM achieves the state-of-the-art result in chunking and highly competitive results in two NER datasets.

pdf abs
Learning Sentiment Memories for Sentiment Modification without Parallel Data
Yi Zhang | Jingjing Xu | Pengcheng Yang | Xu Sun
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

The task of sentiment modification requires reversing the sentiment of the input and preserving the sentiment-independent content. However, aligned sentences with the same content but different sentiments are usually unavailable. Due to the lack of such parallel data, it is hard to extract sentiment independent content and reverse the sentiment in an unsupervised way. Previous work usually can not reconcile sentiment transformation and content preservation. In this paper, motivated by the fact the non-emotional context (e.g., “staff”) provides strong cues for the occurrence of emotional words (e.g., “friendly”), we propose a novel method that automatically extracts appropriate sentiment information from learned sentiment memories according to the specific context. Experiments show that our method substantially improves the content preservation degree and achieves the state-of-the-art performance.

pdf abs
A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation
Jingjing Xu | Xuancheng Ren | Yi Zhang | Qi Zeng | Xiaoyan Cai | Xu Sun
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Narrative story generation is a challenging problem because it demands the generated sentences with tight semantic connections, which has not been well studied by most existing generative models. To address this problem, we propose a skeleton-based model to promote the coherence of generated stories. Different from traditional models that generate a complete sentence at a stroke, the proposed model first generates the most critical phrases, called skeleton, and then expands the skeleton to a complete and fluent sentence. The skeleton is not manually defined, but learned by a reinforcement learning method. Compared to the state-of-the-art models, our skeleton-based model can generate significantly more coherent text according to human evaluation and automatic evaluation. The G-score is improved by 20.1% in human evaluation.

pdf bib abs
Scalable Wide and Deep Learning for Computer Assisted Coding
Marilisa Amoia | Frank Diehl | Jesus Gimenez | Joel Pinto | Raphael Schumann | Fabian Stemmer | Paul Vozila | Yi Zhang
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

In recent years the use of electronic medical records has accelerated resulting in large volumes of medical data when a patient visits a healthcare facility. As a first step towards reimbursement healthcare institutions need to associate ICD-10 billing codes to these documents. This is done by trained clinical coders who may use a computer assisted solution for shortlisting of codes. In this work, we present our work to build a machine learning based scalable system for predicting ICD-10 codes from electronic medical records. We address data imbalance issues by implementing two system architectures using convolutional neural networks and logistic regression models. We illustrate the pros and cons of those system designs and show that the best performance can be achieved by leveraging the advantages of both using a system combination approach.

2014

pdf abs
Information Extraction from German Patient Records via Hybrid Parsing and Relation Extraction Strategies
Hans-Ulrich Krieger | Christian Spurk | Hans Uszkoreit | Feiyu Xu | Yi Zhang | Frank Müller | Thomas Tolxdorff
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper, we report on first attempts and findings to analyzing German patient records, using a hybrid parsing architecture and a combination of two relation extraction strategies. On a practical level, we are interested in the extraction of concepts and relations among those concepts, a necessary cornerstone for building medical information systems. The parsing pipeline consists of a morphological analyzer, a robust chunk parser adapted to Latin phrases used in medical diagnosis, a repair rule stage, and a probabilistic context-free parser that respects the output from the chunker. The relation extraction stage is a combination of two systems: SProUT, a shallow processor which uses hand-written rules to discover relation instances from local text units and DARE which extracts relation instances from complete sentences, using rules that are learned in a bootstrapping process, starting with semantic seeds. Two small experiments have been carried out for the parsing pipeline and the relation extraction stage.

Extracting instances of sentiment-oriented relations from user-generated web documents is important for online marketing analysis. Unlike previous work, we formulate this extraction task as a structured prediction problem and design the corresponding inference as an integer linear program. Our latent structural SVM based model can learn from training corpora that do not contain explicit annotations of sentiment-bearing expressions, and it can simultaneously recognize instances of both binary (polarity) and ternary (comparative) relations with regard to entity mentions of interest. The empirical evaluation shows that our approach significantly outperforms state-of-the-art systems across domains (cameras and movies) and across genres (reviews and forum posts). The gold standard corpus that we built will also be a valuable resource for the community.

2013

pdf
Deep Context-Free Grammar for Chinese with Broad-Coverage
Xiangli Wang | Yi Zhang | Yusuke Miyao | Takuya Matsuzaki | Junichi Tsujii
Proceedings of the Seventh SIGHAN Workshop on Chinese Language Processing

2012

pdf abs
Joint Grammar and Treebank Development for Mandarin Chinese with HPSG
Yi Zhang | Rui Wang | Yu Chen
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present the ongoing development of MCG, a linguistically deep and precise grammar for Mandarin Chinese together with its accompanying treebank, both based on the linguistic framework of HPSG, and using MRS as the semantic representation. We highlight some key features of our grammar design, and review a number of challenging phenomena, with comparisons to alternative linguistic treatments and implementations. One of the distinguishing characteristics of our approach is the tight integration of grammar and treebank development. The two-step treebank annotation procedure benefits from the efficiency of the discriminant-based annotation approach, while giving the annotators full freedom of producing extra-grammatical structures. This not only allows the creation of a precise and full-coverage treebank with an imperfect grammar, but also provides prompt feedback for grammarians to identify the errors in the grammar design and implementation. Preliminary evaluation and error analysis shows that the grammar already covers most of the core phenomena for Mandarin Chinese, and the treebank annotation procedure reaches a stable speed of 35 sentences per hour with satisfying quality.

pdf abs
CLIMB grammars: three projects using metagrammar engineering
Antske Fokkens | Tania Avgustinova | Yi Zhang
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper introduces the CLIMB (Comparative Libraries of Implementations with Matrix Basis) methodology and grammars. The basic idea behind CLIMB is to use code generation as a general methodology for grammar development in order to create a more systematic approach to grammar development. The particular method used in this paper is closely related to the LinGO Grammar Matrix. Like the Grammar Matrix, resulting grammars are HPSG grammars that can map bidirectionally between strings and MRS representations. The main purpose of this paper is to provide insight into the process of using CLIMB for grammar development. In addition, we describe three projects that make use of this methodology or have concrete plans to adapt CLIMB in the future: CLIMB for Germanic languages, CLIMB for Slavic languages and CLIMB to combine two grammars of Mandarin Chinese. We present the first results that indicate feasibility and development time improvements for creating a medium to large coverage precision grammar.

pdf
Sentence Realization with Unlexicalized Tree Linearization Grammars
Rui Wang | Yi Zhang
Proceedings of COLING 2012: Posters

2011

pdf
An Empirical Comparison of Unknown Word Prediction Methods
Kostadin Cholakov | Gertjan van Noord | Valia Kordoni | Yi Zhang
Proceedings of 5th International Joint Conference on Natural Language Processing

pdf
Parser Evaluation over Local and Non-Local Deep Dependencies in a Large Corpus
Emily M. Bender | Dan Flickinger | Stephan Oepen | Yi Zhang
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf
Spring Cleaning and Grammar Compression: Two Techniques for Detection of Redundancy in HPSG Grammars
Antske Fokkens | Yi Zhang | Emily M. Bender
Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation

pdf
Minimally Supervised Domain-Adaptive Parse Reranking for Relation Extraction
Feiyu Xu | Hong Li | Yi Zhang | Hans Uszkoreit | Sebastian Krause
Proceedings of the 12th International Conference on Parsing Technologies

pdf
Large-Scale Corpus-Driven PCFG Approximation of an HPSG
Yi Zhang | Hans-Ulrich Krieger
Proceedings of the 12th International Conference on Parsing Technologies

pdf
Statistical Machine Transliteration with Multi-to-Multi Joint Source Channel Model
Yu Chen | Rui Wang | Yi Zhang
Proceedings of the 3rd Named Entities Workshop (NEWS 2011)

pdf
Engineering a Deep HPSG for Mandarin Chinese
Yi Zhang | Rui Wang | Yu Chen
Proceedings of the 9th Workshop on Asian Language Resources

pdf
Adaptability of Lexical Acquisition for Large-scale Grammars
Kostadin Cholakov | Gertjan van Noord | Valia Kordoni | Yi Zhang
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

2010

pdf
Discriminative Parse Reranking for Chinese with Homogeneous and Heterogeneous Annotations
Weiwei Sun | Rui Wang | Yi Zhang
CIPS-SIGHAN Joint Conference on Chinese Language Processing

pdf
Constraining robust constructions for broad-coverage parsing with precision grammars
Bart Cramer | Yi Zhang
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

pdf
Discriminant Ranking for Efficient Treebanking
Yi Zhang | Valia Kordoni
Coling 2010: Posters

pdf
MARS: A Specialized RTE System for Parser Evaluation
Rui Wang | Yi Zhang
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf abs
Semantic Feature Engineering for Enhancing Disambiguation Performance in Deep Linguistic Processing
Danielle Ben-Gera | Yi Zhang | Valia Kordoni
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The task of parse disambiguation has gained in importance over the last decade as the complexity of grammars used in deep linguistic processing has been increasing. In this paper we propose to employ the fine-grained HPSG formalism in order to investigate the contribution of deeper linguistic knowledge to the task of ranking the different trees the parser outputs. In particular, we focus on the incorporation of semantic features in the disambiguation component and the stability of our model cross domains. Our work is carried out within DELPH-IN (http://www.delph-in.net), using the LinGo Redwoods and the WeScience corpora, parsed with the English Resource Grammar and the PET parser.

pdf abs
Disambiguating Compound Nouns for a Dynamic HPSG Treebank of Wall Street Journal Texts
Valia Kordoni | Yi Zhang
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The aim of this paper is twofold. We focus, on the one hand, on the task of dynamically annotating English compound nouns, and on the other hand we propose disambiguation methods and techniques which facilitate the annotation task. Both the aforementioned are part of a larger on-going effort which aims to create HPSG annotation for the texts from theWall Street Journal (henceforward WSJ) sections of the Penn Treebank (henceforward PTB) with the help of a hand-written large-scale and wide-coverage grammar of English, the English Resource Grammar (henceforward ERG; Flickinger (2002)). As we show in this paper, such annotations are very rich linguistically, since apart from syntax they also incorporate semantics, which does not only ensure that the treebank is guaranteed to be a truly sharable, re-usable and multi-functional linguistic resource, but also calls for the necessity of a better disambiguation of the internal (syntactic) structure of larger units of words, such as compound nouns, since this has an impact on the representation of their meaning, which is of utmost interest if the linguistic annotation of a given corpus is to be further understood as the practice of adding interpretative linguistic information of the highest quality in order to give added value to the corpus.

pdf abs
Hybrid Constituent and Dependency Parsing with Tsinghua Chinese Treebank
Rui Wang | Yi Zhang
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper, we describe our hybrid parsing model on the Mandarin Chinese processing. In particular, we work on the Tsinghua Chinese Treebank (TCT), whose annotation has both constitutes and the head information of each constitute. The model we design combines the mainstream constitute parsing and dependency parsing. We present in detail 1) how to (partially) encode the head information into the constitute parsing, 2) how to encode constitute information into the dependency parsing, and 3) how to restore the head information using the dependency structure. For each of them, we take different strategies to deal with different cases. In an open shared task evaluation, we achieve an f1-score of 85.23% for the constitute parsing, 82.35% with partial head information, and 74.27% with complete head information. The error analysis shows the challenge of restoring multiple-headed constitutes and also some potentials to use the dependency structure to guide the constitute parsing, which will be our future work to explore.

pdf bib
Chart Mining-based Lexical Acquisition with Precision Grammars
Yi Zhang | Timothy Baldwin | Valia Kordoni | David Martinez | Jeremy Nicholson
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2009

pdf
Recognizing Textual Relatedness with Predicate-Argument Structures
Rui Wang | Yi Zhang
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf
Chinese Novelty Mining
Yi Zhang | Flora S. Tsai
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf
Hybrid Multilingual Parsing with HPSG for SRL
Yi Zhang | Rui Wang | Stephan Oepen
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task

pdf
Construction of a German HPSG grammar from a detailed treebank
Bart Cramer | Yi Zhang
Proceedings of the 2009 Workshop on Grammar Engineering Across Frameworks (GEAF 2009)

pdf
Annotating Wall Street Journal Texts Using a Hand-Crafted Deep Linguistic Grammar
Valia Kordoni | Yi Zhang
Proceedings of the Third Linguistic Annotation Workshop (LAW III)

pdf
An Extensible Crosslinguistic Readability Framework
Jesse Kirchner | Justin Nuger | Yi Zhang
Proceedings of the 2nd Workshop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora (BUCC)

pdf
Using Treebanking Discriminants as Parse Disambiguation Features
Md. Faisal Mahbub Chowdhury | Yi Zhang | Valia Kordoni
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

pdf bib
Exploiting the Russian National Corpus in the Development of a Russian Resource Grammar
Tania Avgustinova | Yi Zhang
Proceedings of the Workshop on Adaptation of Language Resources and Technology to New Domains

pdf
Enabling Adaptation of Lexicalised Grammars to New Domains
Valia Kordoni | Yi Zhang
Proceedings of the Workshop on Adaptation of Language Resources and Technology to New Domains

pdf
A Non-negative Matrix Tri-factorization Approach to Sentiment Classification with Lexical Prior Knowledge
Tao Li | Yi Zhang | Vikas Sindhwani
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf
Cross-Domain Dependency Parsing Using a Deep Linguistic Grammar
Yi Zhang | Rui Wang
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

pdf
Mapping between Compositional Semantic Representations and Lexical Semantic Resources: Towards Accurate Deep Semantic Parsing
Sergio Roa | Valia Kordoni | Yi Zhang
Proceedings of ACL-08: HLT, Short Papers

pdf abs
Evaluating and Extending the Coverage of HPSG Grammars: A Case Study for German
Jeremy Nicholson | Valia Kordoni | Yi Zhang | Timothy Baldwin | Rebecca Dridan
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this work, we examine and attempt to extend the coverage of a German HPSG grammar. We use the grammar to parse a corpus of newspaper text and evaluate the proportion of sentences which have a correct attested parse, and analyse the cause of errors in terms of lexical or constructional gaps which prevent parsing. Then, using a maximum entropy model, we evaluate prediction of lexical types in the HPSG type hierarchy for unseen lexemes. By automatically adding entries to the lexicon, we observe that we can increase coverage without substantially decreasing precision.

pdf abs
Robust Parsing with a Large HPSG Grammar
Yi Zhang | Valia Kordoni
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we propose a partial parsing model which achieves robust parsing with a large HPSG grammar. Constraint-based precision grammars, like the HPSG grammar we are using for the experiments reported in this paper, typically lack robustness, especially when applied to real world texts. To maximally recover the linguistic knowledge from an unsuccessful parse, a proper selection model must be used. Also, the efficiency challenges usually presented by the selection model must be answered. Building on the work reported in (Zhang et al., 2007), we further propose a new partial parsing model that splits the parsing process into two stages, both of which use the bottom-up chart-based parsing algorithm. The algorithm is implemented and a preliminary experiment shows promising results.

pdf
Towards Domain-Independent Deep Linguistic Processing: Ensuring Portability and Re-Usability of Lexicalised Grammars
Kostadin Cholakov | Valia Kordoni | Yi Zhang
Coling 2008: Proceedings of the workshop on Grammar Engineering Across Frameworks

pdf
Hybrid Learning of Dependency Structures from Heterogeneous Linguistic Resources
Yi Zhang | Rui Wang | Hans Uszkoreit
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning

2007

pdf
Partial Parse Selection for Robust Deep Processing
Yi Zhang | Valia Kordoni | Erin Fitzgerald
ACL 2007 Workshop on Deep Linguistic Processing

pdf
The Corpus and the Lexicon: Standardising Deep Lexical Acquisition Evaluation
Yi Zhang | Timothy Baldwin | Valia Kordoni
ACL 2007 Workshop on Deep Linguistic Processing

pdf
Efficiency in Unification-Based N-Best Parsing
Yi Zhang | Stephan Oepen | John Carroll
Proceedings of the Tenth International Conference on Parsing Technologies

pdf
Validation and Evaluation of Automatically Acquired Multiword Expressions for Grammar Engineering
Aline Villavicencio | Valia Kordoni | Yi Zhang | Marco Idiart | Carlos Ramisch
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

pdf abs
Automated Deep Lexical Acquisition for Robust Open Texts Processing
Yi Zhang | Valia Kordoni
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper, we report on methods to detect and repair lexical errors for deep grammars. The lack of coverage has for long been the major problem for deep processing. The existence of various errors in the hand-crafted large grammars prevents their usage in real applications. The manual detection and repair of errors requires asignificant amount of human effort. An experiment with the British National Corpus shows about 70% of the sentences contain unknownword(s) for the English Resource Grammar. With the help of error mining methods, many lexical errors are discovered, which cause a large part of the parsing failures. Moreover, with a lexical type predictor based on a maximum entropy model, new lexical entries are automatically generated. The contribution of various features for the model is evaluated. With the disambiguated full parsing results, the precision of the predictor is enhanced significantly.

pdf
Automated Multiword Expression Prediction for Grammar Engineering
Yi Zhang | Valia Kordoni | Aline Villavicencio | Marco Idiart
Proceedings of the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties