Vincent Ng


2024

pdf
Universal Anaphora: The First Three Years
Massimo Poesio | Maciej Ogrodniczuk | Vincent Ng | Sameer Pradhan | Juntao Yu | Nafise Sadat Moosavi | Silviu Paun | Amir Zeldes | Anna Nedoluzhko | Michal Novák | Martin Popel | Zdeněk Žabokrtský | Daniel Zeman
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

The aim of the Universal Anaphora initiative is to push forward the state of the art in anaphora and anaphora resolution by expanding the aspects of anaphoric interpretation which are or can be reliably annotated in anaphoric corpora, producing unified standards to annotate and encode these annotations, delivering datasets encoded according to these standards, and developing methods for evaluating models that carry out this type of interpretation. Although several papers on aspects of the initiative have appeared, no overall description of the initiative’s goals, proposals, and achievements has been published yet except as an online draft. This paper aims to fill this gap and to discuss the initiative’s progress so far.

pdf
ICLE++: Modeling Fine-Grained Traits for Holistic Essay Scoring
Shengjie Li | Vincent Ng
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

The majority of the recently developed models for automated essay scoring (AES) are evaluated solely on the ASAP corpus. However, ASAP is not without its limitations. For instance, it is not clear whether models trained on ASAP can generalize well when evaluated on other corpora. In light of these limitations, we introduce ICLE++, a corpus of persuasive student essays annotated with both holistic scores and trait-specific scores. Not only can ICLE++ be used to test the generalizability of AES models trained on ASAP, but it can also facilitate the evaluation of models developed for newer AES problems such as multi-trait scoring and cross-prompt scoring. We believe that ICLE++, which represents a culmination of our long-term effort in annotating the essays in the ICLE corpus, contributes to the set of much-needed annotated corpora for AES research.

2023

pdf bib
Proceedings of The Sixth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2023)
Maciej Ogrodniczuk | Vincent Ng | Sameer Pradhan | Massimo Poesio
Proceedings of The Sixth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2023)

pdf
PairSpanBERT: An Enhanced Language Model for Bridging Resolution
Hideo Kobayashi | Yufang Hou | Vincent Ng
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present PairSpanBERT, a SpanBERT-based pre-trained model specialized for bridging resolution. To this end, we design a novel pre-training objective that aims to learn the contexts in which two mentions are implicitly linked to each other from a large amount of data automatically generated either heuristically or via distant supervision with a knowledge graph. Despite the noise inherent in the automatically generated data, we achieve the best results reported to date on three evaluation datasets for bridging resolution when replacing SpanBERT with PairSpanBERT in a state-of-the-art resolver that jointly performs entity coreference resolution and bridging resolution.

2022

pdf
Legal Judgment Prediction via Event Extraction with Constraints
Yi Feng | Chuanyi Li | Vincent Ng
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

While significant progress has been made on the task of Legal Judgment Prediction (LJP) in recent years, the incorrect predictions made by SOTA LJP models can be attributed in part to their failure to (1) locate the key event information that determines the judgment, and (2) exploit the cross-task consistency constraints that exist among the subtasks of LJP. To address these weaknesses, we propose EPM, an Event-based Prediction Model with constraints, which surpasses existing SOTA models in performance on a standard LJP dataset.

pdf
Constrained Multi-Task Learning for Bridging Resolution
Hideo Kobayashi | Yufang Hou | Vincent Ng
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We examine the extent to which supervised bridging resolvers can be improved without employing additional labeled bridging data by proposing a novel constrained multi-task learning framework for bridging resolution, within which we (1) design cross-task consistency constraints to guide the learning process; (2) pre-train the entity coreference model in the multi-task framework on the large amount of publicly available coreference data; and (3) integrate prior knowledge encoded in rule-based resolvers. Our approach achieves state-of-the-art results on three standard evaluation corpora.

pdf bib
Proceedings of the CODI-CRAC 2022 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue
Juntao Yu | Sopan Khosla | Ramesh Manuvinakurike | Lori Levin | Vincent Ng | Massimo Poesio | Michael Strube | Carolyn Rosé
Proceedings of the CODI-CRAC 2022 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

pdf bib
The CODI-CRAC 2022 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue
Juntao Yu | Sopan Khosla | Ramesh Manuvinakurike | Lori Levin | Vincent Ng | Massimo Poesio | Michael Strube | Carolyn Rosé
Proceedings of the CODI-CRAC 2022 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

The CODI-CRAC 2022 Shared Task on Anaphora Resolution in Dialogues is the second edition of an initiative focused on detecting different types of anaphoric relations in conversations of different kinds. Using five conversational datasets, four of which have been newly annotated with a wide range of anaphoric relations (identity, bridging references, and discourse deixis), we defined multiple tasks focusing individually on these key relations. The second edition of the shared task maintained the focus on these relations and used the same datasets as in 2021, but new test data were annotated, the 2021 data were checked, and new subtasks were added. In this paper, we discuss the annotation schemes, the datasets, and the evaluation scripts used to assess system performance on these tasks, and provide a brief summary of the participating systems and the results obtained across 230 runs from three teams, with most submissions achieving significantly better results than our baseline methods.

pdf
Neural Anaphora Resolution in Dialogue Revisited
Shengjie Li | Hideo Kobayashi | Vincent Ng
Proceedings of the CODI-CRAC 2022 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

We present the systems that we developed for all three tracks of the CODI-CRAC 2022 shared task, namely the anaphora resolution track, the bridging resolution track, and the discourse deixis resolution track. Combining an effective encoding of the input using the SpanBERT-Large encoder with an extensive hyperparameter search process, our systems achieved the highest scores in all phases of all three tracks.

pdf
DiscoSense: Commonsense Reasoning with Discourse Connectives
Prajjwal Bhargava | Vincent Ng
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

We present DiscoSense, a benchmark for commonsense reasoning via understanding a wide variety of discourse connectives. We generate compelling distractors in DiscoSense using Conditional Adversarial Filtering, an extension of Adversarial Filtering that employs conditional generation. We show that state-of-the-art pre-trained language models struggle to perform well on DiscoSense, which makes this dataset ideal for evaluating next-generation commonsense reasoning systems.

pdf
End-to-End Neural Discourse Deixis Resolution in Dialogue
Shengjie Li | Vincent Ng
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

We adapt Lee et al.’s (2018) span-based entity coreference model to the task of end-to-end discourse deixis resolution in dialogue, specifically by proposing extensions to their model that exploit task-specific characteristics. The resulting model, dd-utt, achieves state-of-the-art results on the four datasets in the CODI-CRAC 2021 shared task.

pdf bib
Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference
Maciej Ogrodniczuk | Sameer Pradhan | Anna Nedoluzhko | Vincent Ng | Massimo Poesio
Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference

pdf
End-to-End Neural Bridging Resolution
Hideo Kobayashi | Yufang Hou | Vincent Ng
Proceedings of the 29th International Conference on Computational Linguistics

The state of bridging resolution research is rather unsatisfactory: not only are state-of-the-art resolvers evaluated in unrealistic settings, but the neural models underlying these resolvers are weaker than those used for entity coreference resolution. In light of these problems, we evaluate bridging resolvers in an end-to-end setting, strengthen them with better encoders, and attempt to gain a better understanding of them via perturbation experiments and a manual analysis of their outputs.

2021

pdf
Bridging Resolution: Making Sense of the State of the Art
Hideo Kobayashi | Vincent Ng
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

While Yu and Poesio (2020) have recently demonstrated the superiority of their neural multi-task learning (MTL) model to rule-based approaches for bridging anaphora resolution, there is little understanding of (1) how it is better than the rule-based approaches (e.g., are the two approaches making similar or complementary mistakes?) and (2) what should be improved. To shed light on these issues, we (1) propose a hybrid rule-based and MTL approach that would enable a better understanding of their comparative strengths and weaknesses; and (2) perform a manual analysis of the errors made by the MTL model.

pdf
Constrained Multi-Task Learning for Event Coreference Resolution
Jing Lu | Vincent Ng
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

We propose a neural event coreference model in which event coreference is jointly trained with five tasks: trigger detection, entity coreference, anaphoricity determination, realis detection, and argument extraction. To guide the learning of this complex model, we incorporate cross-task consistency constraints into the learning process as soft constraints via designing penalty functions. In addition, we propose the novel idea of viewing entity coreference and event coreference as a single coreference task, which we believe is a step towards a unified model of coreference resolution. The resulting model achieves state-of-the-art results on the KBP 2017 event coreference dataset.

pdf
Conundrums in Event Coreference Resolution: Making Sense of the State of the Art
Jing Lu | Vincent Ng
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Despite recent promising results on the application of span-based models for event reference interpretation, there is a lack of understanding of what has been improved. We present an empirical analysis of a state-of-the-art span-based event reference system with the goal of providing the general NLP audience with a better understanding of the state of the art and event reference researchers with directions for future research.

pdf bib
Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference
Maciej Ogrodniczuk | Sameer Pradhan | Massimo Poesio | Yulia Grishina | Vincent Ng
Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference

pdf bib
Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue
Sopan Khosla | Ramesh Manuvinakurike | Vincent Ng | Massimo Poesio | Michael Strube | Carolyn Rosé
Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

pdf bib
The CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue
Sopan Khosla | Juntao Yu | Ramesh Manuvinakurike | Vincent Ng | Massimo Poesio | Michael Strube | Carolyn Rosé
Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

In this paper, we provide an overview of the CODI-CRAC 2021 Shared Task on Anaphora Resolution in Dialogue. The shared task focuses on detecting anaphoric relations in different genres of conversations. Using five conversational datasets, four of which have been newly annotated with a wide range of anaphoric relations (identity, bridging references, and discourse deixis), we defined multiple subtasks focusing individually on these key relations. We discuss the evaluation scripts used to assess system performance on these subtasks, and provide a brief summary of the participating systems and the results obtained across ?? runs from 5 teams, with most submissions achieving significantly better results than our baseline methods.

pdf bib
Neural Anaphora Resolution in Dialogue
Hideo Kobayashi | Shengjie Li | Vincent Ng
Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

We describe the systems that we developed for the three tracks of the CODI-CRAC 2021 shared task, namely entity coreference resolution, bridging resolution, and discourse deixis resolution. Our team ranked second for entity coreference resolution, first for bridging resolution, and first for discourse deixis resolution.

pdf
The CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis Resolution in Dialogue: A Cross-Team Analysis
Shengjie Li | Hideo Kobayashi | Vincent Ng
Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

The CODI-CRAC 2021 shared task is the first shared task that focuses exclusively on anaphora resolution in dialogue and provides three tracks, namely entity coreference resolution, bridging resolution, and discourse deixis resolution. We perform a cross-team analysis of the systems that participated in the shared task in each of these tracks.

pdf
Don’t Miss the Potential Customers! Retrieving Similar Ads to Improve User Targeting
Yi Feng | Ting Wang | Chuanyi Li | Vincent Ng | Jidong Ge | Bin Luo | Yucheng Hu | Xiaopeng Zhang
Findings of the Association for Computational Linguistics: EMNLP 2021

User targeting is an essential task in the modern advertising industry: given a package of ads for a particular category of products (e.g., green tea), identify the online users to whom the ad package should be targeted. An (ad-package-specific) user targeting model is typically trained using historical clickthrough data: positive instances correspond to users who have clicked on an ad in the package before, whereas negative instances correspond to users who have not clicked on any ads in the package that were displayed to them. Collecting a sufficient amount of positive training data for training an accurate user targeting model, however, is by no means trivial. This paper focuses on the development of a method for automatic augmentation of the set of positive training instances. Experimental results on two datasets, including a real-world company dataset, demonstrate the effectiveness of our proposed method.

2020

pdf
Unsupervised Argumentation Mining in Student Essays
Isaac Persing | Vincent Ng
Proceedings of the Twelfth Language Resources and Evaluation Conference

State-of-the-art systems for argumentation mining are supervised, thus relying on training data containing manually annotated argument components and the relationships between them. To eliminate the reliance on annotated data, we present a novel approach to unsupervised argument mining. The key idea is to bootstrap from a small set of argument components automatically identified using simple heuristics in combination with reliable contextual cues. Results on Stab and Gurevych’s corpus of 402 essays show that our unsupervised approach rivals two supervised baselines in performance and achieves 73.5-83.7% of the performance of a state-of-the-art neural approach.

pdf
Aspect-Based Sentiment Analysis as Fine-Grained Opinion Mining
Gerardo Ocampo Diaz | Xuanming Zhang | Vincent Ng
Proceedings of the Twelfth Language Resources and Evaluation Conference

We show how the general fine-grained opinion mining concepts of opinion target and opinion expression are related to aspect-based sentiment analysis (ABSA) and discuss their benefits for resource creation over popular ABSA annotation schemes. Specifically, we first discuss why modeling opinions solely in terms of (entity, aspect) pairs inadequately captures the meaning of the sentiment originally expressed by authors and how opinion expressions and opinion targets can be used to avoid this loss of information. We then design a meaning-preserving annotation scheme and apply it to two popular ABSA datasets, the 2016 SemEval ABSA Restaurant and Laptop datasets. Finally, we discuss the importance of opinion expressions and opinion targets for next-generation ABSA systems. We make our datasets publicly available for download.

pdf bib
Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference
Maciej Ogrodniczuk | Vincent Ng | Yulia Grishina | Sameer Pradhan
Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference

pdf
Event Coreference Resolution with Non-Local Information
Jing Lu | Vincent Ng
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

We present two extensions to a state-of-the-art joint model for event coreference resolution, which involve incorporating (1) a supervised topic model for improving trigger detection by providing global context, and (2) a preprocessing module that seeks to improve event coreference by discarding unlikely candidate antecedents of an event mention using discourse contexts computed based on salient entities. The resulting model yields the best results reported to date on the KBP 2017 English and Chinese datasets.

pdf
Bridging Resolution: A Survey of the State of the Art
Hideo Kobayashi | Vincent Ng
Proceedings of the 28th International Conference on Computational Linguistics

Bridging reference resolution is an anaphora resolution task that is arguably more challenging and less studied than entity coreference resolution. Given that significant progress has been made on coreference resolution in recent years, we believe that bridging resolution will receive increasing attention in the NLP community. Nevertheless, progress on bridging resolution is currently hampered in part by the scarcity of large annotated corpora for model training as well as the lack of standardized evaluation protocols. This paper presents a survey of the current state of research on bridging reference resolution and discusses future research directions.

pdf
Conundrums in Entity Coreference Resolution: Making Sense of the State of the Art
Jing Lu | Vincent Ng
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Despite the significant progress on entity coreference resolution observed in recent years, there is a general lack of understanding of what has been improved. We present an empirical analysis of state-of-the-art resolvers with the goal of providing the general NLP audience with a better understanding of the state of the art and coreference researchers with directions for future research.

pdf
Identifying Exaggerated Language
Li Kong | Chuanyi Li | Jidong Ge | Bin Luo | Vincent Ng
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

While hyperbole is one of the most prevalent rhetorical devices, it is arguably one of the least studied devices in the figurative language processing community. We contribute to the study of hyperbole by (1) creating a corpus focusing on sentence-level hyperbole detection, (2) performing a statistical and manual analysis of our corpus, and (3) addressing the automatic hyperbole detection task.

2019

pdf bib
Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference
Maciej Ogrodniczuk | Sameer Pradhan | Yulia Grishina | Vincent Ng
Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference

pdf
Give Me More Feedback II: Annotating Thesis Strength and Related Attributes in Student Essays
Zixuan Ke | Hrishikesh Inamdar | Hui Lin | Vincent Ng
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

While the vast majority of existing work on automated essay scoring has focused on holistic scoring, researchers have recently begun work on scoring specific dimensions of essay quality. Nevertheless, progress on dimension-specific essay scoring is limited in part by the lack of annotated corpora. To facilitate advances in this area, we design a scoring rubric for scoring a core, yet unexplored dimension of persuasive essay quality, thesis strength, and annotate a corpus of essays with thesis strength scores. We additionally identify the attributes that could impact thesis strength and annotate the essays with the values of these attributes, which, when predicted by computational models, could provide further feedback to students on why their essays receive a particular thesis strength score.

pdf bib
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Kentaro Inui | Jing Jiang | Vincent Ng | Xiaojun Wan
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

pdf
Improving Event Coreference Resolution by Learning Argument Compatibility from Unlabeled Data
Yin Jou Huang | Jing Lu | Sadao Kurohashi | Vincent Ng
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Argument compatibility is a linguistic condition that is frequently incorporated into modern event coreference resolution systems. If two event mentions have incompatible arguments in any of the argument roles, they cannot be coreferent. On the other hand, if these mentions have compatible arguments, then this may be used as information towards deciding their coreferent status. One of the key challenges in leveraging argument compatibility lies in the paucity of labeled data. In this work, we propose a transfer learning framework for event coreference resolution that utilizes a large amount of unlabeled data to learn argument compatibility of event mentions. In addition, we adopt an interactive inference network based model to better capture the compatible and incompatible relations between the context words of event mentions. Our experiments on the KBP 2017 English dataset confirm the effectiveness of our model in learning argument compatibility, which in turn improves the performance of the overall event coreference model.

2018

pdf
Give Me More Feedback: Annotating Argument Persuasiveness and Related Attributes in Student Essays
Winston Carlile | Nishant Gurrapadi | Zixuan Ke | Vincent Ng
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

While argument persuasiveness is one of the most important dimensions of argumentative essay quality, it is relatively little studied in automated essay scoring research. Progress on scoring argument persuasiveness is hindered in part by the scarcity of annotated corpora. We present the first corpus of essays that are simultaneously annotated with argument components, argument persuasiveness scores, and attributes of argument components that impact an argument’s persuasiveness. This corpus could trigger the development of novel computational models concerning argument persuasiveness that provide useful feedback to students on why their arguments are (un)persuasive in addition to how persuasive they are.

pdf
Modeling and Prediction of Online Product Review Helpfulness: A Survey
Gerardo Ocampo Diaz | Vincent Ng
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

As the amount of free-form user-generated reviews on e-commerce websites continues to grow, there is an increasing need for automatic mechanisms that sift through the vast amounts of user reviews and identify quality content. Review helpfulness modeling is a task which studies the mechanisms that affect review helpfulness and attempts to accurately predict it. This paper provides an overview of the most relevant work in helpfulness prediction and understanding in the past decade, discusses the insights gained from said work, and provides guidelines for future research.

pdf bib
Proceedings of the First Workshop on Computational Models of Reference, Anaphora and Coreference
Massimo Poesio | Vincent Ng | Maciej Ogrodniczuk
Proceedings of the First Workshop on Computational Models of Reference, Anaphora and Coreference

pdf bib
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications
Yuen-Hsien Tseng | Hsin-Hsi Chen | Vincent Ng | Mamoru Komachi
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications

pdf
Modeling Trolling in Social Media Conversations
Luis Gerardo Mojica de la Vega | Vincent Ng
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf
Improving Unsupervised Keyphrase Extraction using Background Knowledge
Yang Yu | Vincent Ng
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

pdf bib
Proceedings of the 2nd Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2017)
Maciej Ogrodniczuk | Vincent Ng
Proceedings of the 2nd Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2017)

pdf
Lightly-Supervised Modeling of Argument Persuasiveness
Isaac Persing | Vincent Ng
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

We propose the first lightly-supervised approach to scoring an argument’s persuasiveness. Key to our approach is the novel hypothesis that lightly-supervised persuasiveness scoring is possible by explicitly modeling the major errors that negatively impact persuasiveness. In an evaluation on a new annotated corpus of online debate arguments, our approach rivals its fully-supervised counterparts in performance by four scoring metrics when using only 10% of the available training instances.

pdf
Joint Learning for Event Coreference Resolution
Jing Lu | Vincent Ng
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

While joint models have been developed for many NLP tasks, the vast majority of event coreference resolvers, including the top-performing resolvers competing in the recent TAC KBP 2016 Event Nugget Detection and Coreference task, are pipeline-based, where the propagation of errors from the trigger detection component to the event coreference component is a major performance limiting factor. To address this problem, we propose a model for jointly learning event coreference, trigger detection, and event anaphoricity. Our joint model is novel in its choice of tasks and its features for capturing cross-task interactions. To our knowledge, this is the first attempt to train a mention-ranking model and employ event anaphoricity for event coreference. Our model achieves the best results to date on the KBP 2016 English and Chinese datasets.

2016

pdf bib
Proceedings of the Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2016)
Maciej Ogrodniczuk | Vincent Ng
Proceedings of the Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2016)

pdf bib
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)
Hsin-Hsi Chen | Yuen-Hsien Tseng | Vincent Ng | Xiaofei Lu
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)

pdf
End-to-End Argumentation Mining in Student Essays
Isaac Persing | Vincent Ng
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf
Chinese Zero Pronoun Resolution with Deep Neural Networks
Chen Chen | Vincent Ng
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf
Modeling Stance in Student Essays
Isaac Persing | Vincent Ng
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf
Event Coreference Resolution with Multi-Pass Sieves
Jing Lu | Vincent Ng
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Multi-pass sieve approaches have been successfully applied to entity coreference resolution and many other tasks in natural language processing (NLP), owing in part to the ease of designing high-precision rules for these tasks. However, the same is not true for event coreference resolution: typically lying towards the end of the standard information extraction pipeline, an event coreference resolver assumes as input the noisy outputs of its upstream components such as the trigger identification component and the entity coreference resolution component. The difficulty in designing high-precision rules makes it challenging to successfully apply a multi-pass sieve approach to event coreference resolution. In this paper, we investigate this challenge, proposing the first multi-pass sieve approach to event coreference resolution. When evaluated on the version of the KBP 2015 corpus available to the participants of EN Task 2 (Event Nugget Detection and Coreference), our approach achieves an Avg F-score of 40.32%, outperforming the best participating system by 0.67% in Avg F-score.

pdf
Markov Logic Networks for Text Mining: A Qualitative and Empirical Comparison with Integer Linear Programming
Luis Gerardo Mojica de la Vega | Vincent Ng
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Joint inference approaches such as Integer Linear Programming (ILP) and Markov Logic Networks (MLNs) have recently been successfully applied to many natural language processing (NLP) tasks, often outperforming their pipeline counterparts. However, MLNs are arguably much less popular among NLP researchers than ILP. While NLP researchers who desire to employ these joint inference frameworks do not necessarily have to understand their theoretical underpinnings, it is imperative that they understand which of them should be applied under what circumstances. With the goal of helping NLP researchers better understand the relative strengths and weaknesses of MLNs and ILP, we compare them along different dimensions of interest, such as expressiveness, ease of use, scalability, and performance. To our knowledge, this is the first systematic comparison of ILP and MLNs on an NLP task.

pdf
Joint Inference for Event Coreference Resolution
Jing Lu | Deepak Venugopal | Vibhav Gogate | Vincent Ng
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Event coreference resolution is a challenging problem since it relies on several components of the information extraction pipeline that typically yield noisy outputs. We hypothesize that exploiting the inter-dependencies between these components can significantly improve the performance of an event coreference resolver, and subsequently propose a novel joint inference based event coreference resolver using Markov Logic Networks (MLNs). However, the rich features that are important for this task are typically very hard to explicitly encode as MLN formulas since they significantly increase the size of the MLN, thereby making joint inference and learning infeasible. To address this problem, we propose a novel solution where we implicitly encode rich features into our model by augmenting the MLN distribution with low dimensional unit clauses. Our approach achieves state-of-the-art results on two standard evaluation corpora.

bib
Advanced Markov Logic Techniques for Scalable Joint Inference in NLP
Deepak Venugopal | Vibhav Gogate | Vincent Ng
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

In the early days of the statistical NLP era, many language processing tasks were tackled using the so-called pipeline architecture: the given task is broken into a series of sub-tasks such that the output of one sub-task is an input to the next sub-task in the sequence. The pipeline architecture is appealing for various reasons, including modularity, modeling convenience, and manageable computational complexity. However, it suffers from the error propagation problem: errors made in one sub-task are propagated to the next sub-task in the sequence, leading to poor accuracy on that sub-task, which in turn leads to more errors downstream. Another disadvantage associated with it is the lack of feedback: errors made in a sub-task are often not corrected using knowledge uncovered while solving another sub-task down the pipeline. Realizing these weaknesses, researchers have turned to joint inference approaches in recent years. One such approach involves the use of Markov logic, which is defined as a set of weighted first-order logic formulas and, at a high level, unifies first-order logic with probabilistic graphical models. It is an ideal modeling language (knowledge representation) for compactly representing relational and uncertain knowledge in NLP. In a typical use case of MLNs in NLP, the application designer describes the background knowledge using a few first-order logic sentences and then uses software packages such as Alchemy, Tuffy, and Markov the beast to perform learning and inference (prediction) over the MLN. However, despite their obvious advantages, over the years researchers and practitioners have found it difficult to use MLNs effectively in many NLP applications. The main reason for this is that it is hard to scale inference and learning algorithms for MLNs to the large datasets and complex models that are typical in NLP. In this tutorial, we will introduce the audience to recent advances in scaling up inference and learning in MLNs as well as new approaches to make MLNs a "black box" for NLP applications (with only minor tuning required on the part of the user). Specifically, we will introduce attendees to a key idea that has emerged in the MLN research community over the last few years, lifted inference, which refers to inference techniques that take advantage of symmetries (e.g., synonyms), both exact and approximate, in the MLN. We will describe how these next-generation inference techniques can be used to perform effective joint inference. We will also present our new software package for inference and learning in MLNs, Alchemy 2.0, which is based on lifted inference, focusing primarily on how it can be used to scale up inference and learning in large models and datasets for applications such as semantic similarity determination, information extraction, and question answering.

2015

pdf
Recovering Traceability Links in Requirements Documents
Zeheng Li | Mingrui Chen | LiGuo Huang | Vincent Ng
Proceedings of the Nineteenth Conference on Computational Natural Language Learning

pdf
UTD: Ensemble-Based Spatial Relation Extraction
Jennifer D’Souza | Vincent Ng
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

pdf bib
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing
Liang-Chih Yu | Zhifang Sui | Yue Zhang | Vincent Ng
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing

pdf
Sieve-Based Spatial Relation Extraction with Expanding Parse Trees
Jennifer D’Souza | Vincent Ng
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf
Modeling Argument Strength in Student Essays
Isaac Persing | Vincent Ng
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf
Sieve-Based Entity Linking for the Biomedical Domain
Jennifer D’Souza | Vincent Ng
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

pdf
Chinese Zero Pronoun Resolution: A Joint Unsupervised Discourse-Aware Model Rivaling State-of-the-Art Resolvers
Chen Chen | Vincent Ng
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

pdf
Chinese Event Coreference Resolution: An Unsupervised Probabilistic Model Rivaling Supervised Resolvers
Chen Chen | Vincent Ng
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

pdf
Automatic Keyphrase Extraction: A Survey of the State of the Art
Kazi Saidul Hasan | Vincent Ng
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf
Modeling Prompt Adherence in Student Essays
Isaac Persing | Vincent Ng
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf
Scoring Coreference Partitions of Predicted Mentions: A Reference Implementation
Sameer Pradhan | Xiaoqiang Luo | Marta Recasens | Eduard Hovy | Vincent Ng | Michael Strube
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf
SinoCoreferencer: An End-to-End Chinese Event Coreference Resolver
Chen Chen | Vincent Ng
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Compared to entity coreference resolution, there is a relatively small amount of work on event coreference resolution, and much of it has been done for English. In fact, to our knowledge, there are no publicly available results on Chinese event coreference resolution. This paper describes the design, implementation, and evaluation of SinoCoreferencer, an end-to-end, state-of-the-art, ACE-style Chinese event coreference system. We have made SinoCoreferencer publicly available in the hope of facilitating the development of high-level Chinese natural language applications that can potentially benefit from event coreference information.

pdf
Annotating Inter-Sentence Temporal Relations in Clinical Notes
Jennifer D’Souza | Vincent Ng
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Owing in part to the surge of interest in temporal relation extraction, a number of datasets manually annotated with temporal relations between event-event pairs and event-time pairs have been produced recently. However, it is not uncommon to find missing annotations in these manually annotated datasets. Many researchers attributed this problem to “annotator fatigue”. While some of these missing relations can be recovered automatically, many of them cannot. Our goals in this paper are to (1) manually annotate certain types of missing links that cannot be automatically recovered in the i2b2 Clinical Temporal Relations Challenge Corpus, one of the recently released evaluation corpora for temporal relation extraction; and (2) empirically determine the usefulness of these additional annotations. We will make our annotations publicly available, in hopes of enabling a more accurate evaluation of temporal relation extraction systems.

pdf
Ensemble-Based Medical Relation Classification
Jennifer D’Souza | Vincent Ng
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf
Why are You Taking this Stance? Identifying and Classifying Reasons in Ideological Debates
Kazi Saidul Hasan | Vincent Ng
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf
Chinese Zero Pronoun Resolution: An Unsupervised Probabilistic Model Rivaling Supervised Resolvers
Chen Chen | Vincent Ng
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf
Relieving the Computational Bottleneck: Joint Inference for Event Extraction with High-Dimensional Features
Deepak Venugopal | Chen Chen | Vibhav Gogate | Vincent Ng
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf
Vote Prediction on Comments in Social Polls
Isaac Persing | Vincent Ng
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

pdf
Simple Yet Powerful Native Language Identification on TOEFL11
Ching-Yi Wu | Po-Hsiang Lai | Yang Liu | Vincent Ng
Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications

pdf
Frame Semantics for Stance Classification
Kazi Saidul Hasan | Vincent Ng
Proceedings of the Seventeenth Conference on Computational Natural Language Learning

pdf
Chinese Zero Pronoun Resolution: Some Recent Advances
Chen Chen | Vincent Ng
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf
Chinese Event Coreference Resolution: Understanding the State of the Art
Chen Chen | Vincent Ng
Proceedings of the Sixth International Joint Conference on Natural Language Processing

pdf
Stance Classification of Ideological Debates: Data, Models, Features, and Constraints
Kazi Saidul Hasan | Vincent Ng
Proceedings of the Sixth International Joint Conference on Natural Language Processing

pdf
Linguistically Aware Coreference Evaluation Metrics
Chen Chen | Vincent Ng
Proceedings of the Sixth International Joint Conference on Natural Language Processing

pdf
Classifying Temporal Relations with Rich Linguistic Knowledge
Jennifer D’Souza | Vincent Ng
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf
Modeling Thesis Clarity in Student Essays
Isaac Persing | Vincent Ng
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf
Extra-Linguistic Constraints on Stance Recognition in Ideological Debates
Kazi Saidul Hasan | Vincent Ng
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

pdf
Combining the Best of Two Worlds: A Hybrid Approach to Multilingual Coreference Resolution
Chen Chen | Vincent Ng
Joint Conference on EMNLP and CoNLL - Shared Task

pdf
Joint Modeling for Chinese Event Extraction with Rich Linguistic Features
Chen Chen | Vincent Ng
Proceedings of COLING 2012

pdf
Chinese Noun Phrase Coreference Resolution: Insights into the State of the Art
Chen Chen | Vincent Ng
Proceedings of COLING 2012: Posters

pdf
Predicting Stance in Ideological Debate with Rich Linguistic Knowledge
Kazi Saidul Hasan | Vincent Ng
Proceedings of COLING 2012: Posters

pdf
Resolving Complex Cases of Definite Pronouns: The Winograd Schema Challenge
Altaf Rahman | Vincent Ng
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

pdf
Learning the Fine-Grained Information Status of Discourse Entities
Altaf Rahman | Vincent Ng
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

pdf
Translation-Based Projection for Multilingual Coreference Resolution
Altaf Rahman | Vincent Ng
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2011

pdf
Coreference Resolution with World Knowledge
Altaf Rahman | Vincent Ng
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf
Learning the Information Status of Noun Phrases in Spoken Dialogues
Altaf Rahman | Vincent Ng
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf
Syntactic Parsing for Ranking-Based Coreference Resolution
Altaf Rahman | Vincent Ng
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

pdf
Supervised Noun Phrase Coreference Research: The First Fifteen Years
Vincent Ng
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf
Modeling Organization in Student Essays
Isaac Persing | Alan Davis | Vincent Ng
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

pdf
Inducing Fine-Grained Semantic Classes via Hierarchical and Collective Classification
Altaf Rahman | Vincent Ng
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

pdf
Conundrums in Unsupervised Keyphrase Extraction: Making Sense of the State-of-the-Art
Kazi Saidul Hasan | Vincent Ng
Coling 2010: Posters

2009

pdf
Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification
Sajib Dasgupta | Vincent Ng
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf
Semi-Supervised Cause Identification from Aviation Safety Reports
Isaac Persing | Vincent Ng
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf
Learning-Based Named Entity Recognition for Morphologically-Rich, Resource-Scarce Languages
Kazi Saidul Hasan | Md. Altaf ur Rahman | Vincent Ng
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

pdf
Weakly Supervised Part-of-Speech Tagging for Morphologically-Rich, Resource-Scarce Languages
Kazi Saidul Hasan | Vincent Ng
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

pdf
Topic-wise, Sentiment-wise, or Otherwise? Identifying the Hidden Dimension for Unsupervised Text Classification
Sajib Dasgupta | Vincent Ng
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf
Supervised Models for Coreference Resolution
Altaf Rahman | Vincent Ng
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf
Discriminative Models for Semi-Supervised Natural Language Learning
Sajib Dasgupta | Vincent Ng
Proceedings of the NAACL HLT 2009 Workshop on Semi-supervised Learning for Natural Language Processing

pdf
Graph-Cut-Based Anaphoricity Determination for Coreference Resolution
Vincent Ng
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2008

pdf
Book Reviews: Semisupervised Learning for Computational Linguistics by Steven Abney
Vincent Ng
Computational Linguistics, Volume 34, Number 3, September 2008

pdf
Unsupervised Models for Coreference Resolution
Vincent Ng
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

2007

pdf
High-Performance, Language-Independent Morphological Segmentation
Sajib Dasgupta | Vincent Ng
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

pdf
Unsupervised Part-of-Speech Acquisition for Resource-Scarce Languages
Sajib Dasgupta | Vincent Ng
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

pdf
Semantic Class Induction and Coreference Resolution
Vincent Ng
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

pdf
Examining the Role of Linguistic Knowledge Sources in the Automatic Identification and Classification of Reviews
Vincent Ng | Sajib Dasgupta | S. M. Niaz Arifin
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

2005

pdf
Machine Learning for Coreference Resolution: From Local Classification to Global Ranking
Vincent Ng
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

pdf
Learning Noun Phrase Anaphoricity to Improve Coreference Resolution: Issues in Representation and Optimization
Vincent Ng
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

2003

pdf
Weakly Supervised Natural Language Learning Without Redundant Views
Vincent Ng | Claire Cardie
Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics

pdf
Bootstrapping Coreference Classifiers with Multiple Machine Learning Algorithms
Vincent Ng | Claire Cardie
Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing

2002

pdf
Improving Machine Learning Approaches to Coreference Resolution
Vincent Ng | Claire Cardie
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

pdf
Identifying Anaphoric and Non-Anaphoric Noun Phrases to Improve Coreference Resolution
Vincent Ng | Claire Cardie
COLING 2002: The 19th International Conference on Computational Linguistics

pdf
Combining Sample Selection and Error-Driven Pruning for Machine Learning of Coreference Rules
Vincent Ng | Claire Cardie
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

2001

pdf
Multidocument Summarization via Information Extraction
Michael White | Tanya Korelsky | Claire Cardie | Vincent Ng | David Pierce | Kiri Wagstaff
Proceedings of the First International Conference on Human Language Technology Research

2000

pdf
Examining the Role of Statistical and Linguistic Knowledge Sources in a General-Knowledge Question-Answering System
Claire Cardie | Vincent Ng | David Pierce | Chris Buckley
Sixth Applied Natural Language Processing Conference