Ralph Grishman

Also published as: R. Grishman

2021

pdf abs
Learning Relatedness between Types with Prototypes for Relation Extraction
Lisheng Fu | Ralph Grishman
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Relation schemas are often pre-defined for each relation dataset. Relation types from different datasets can be related and have overlapping semantics. We hypothesize that we can combine these datasets according to the semantic relatedness between the relation types to overcome the lack of training data. It is often easy to discover the connection between relation types based on relation names or annotation guides, but hard to measure the exact similarity and take advantage of the connection between relation types from different datasets. We propose to use prototypical examples to represent each relation type and use these examples to augment related types from a different dataset. With this type augmentation, we obtain further improvement on ACE05 over a strong baseline which uses multi-task learning between datasets to obtain a better feature representation for relations. We make our implementation publicly available: https://github.com/fufrank5/relatedness

2018

pdf abs
A Case Study on Learning a Unified Encoder of Relations
Lisheng Fu | Bonan Min | Thien Huu Nguyen | Ralph Grishman
Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text

Typical relation extraction models are trained on a single corpus annotated with a pre-defined relation schema. An individual corpus is often small, and the models may be biased toward or overfitted to that corpus. We hypothesize that we can learn a better representation by combining multiple relation datasets. We attempt to use a shared encoder to learn the unified feature representation and to augment it with regularization by adversarial training. The additional corpora feeding the encoder can help to learn a better feature representation layer even though the relation schemas are different. We use the ACE05 and ERE datasets as our case study for experiments. The multi-task model obtains significant improvement on both datasets.

2017

pdf abs
Domain Adaptation for Relation Extraction with Domain Adversarial Neural Network
Lisheng Fu | Thien Huu Nguyen | Bonan Min | Ralph Grishman
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Relations are expressed in many domains such as newswire, weblogs and phone conversations. Trained on a source domain, a relation extractor’s performance degrades when applied to target domains other than the source. A common yet labor-intensive method for domain adaptation is to construct a target-domain-specific labeled dataset for adapting the extractor. In response, we present an unsupervised domain adaptation method which only requires labels from the source domain. Our method is a joint model consisting of a CNN-based relation classifier and a domain-adversarial classifier. The two components are optimized jointly to learn a domain-independent representation for prediction on the target domain. Our model outperforms the state-of-the-art on all three test domains of ACE 2005.

2016

pdf
A Two-stage Approach for Extending Event Detection to New Types via Neural Networks
Thien Huu Nguyen | Lisheng Fu | Kyunghyun Cho | Ralph Grishman
Proceedings of the 1st Workshop on Representation Learning for NLP

pdf
Modeling Skip-Grams for Event Detection with Convolutional Neural Networks
Thien Huu Nguyen | Ralph Grishman
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf
Joint Event Extraction via Recurrent Neural Networks
Thien Huu Nguyen | Kyunghyun Cho | Ralph Grishman
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf abs
Entity Linking with a Paraphrase Flavor
Maria Pershina | Yifan He | Ralph Grishman
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The task of Named Entity Linking is to link entity mentions in the document to their correct entries in a knowledge base and to cluster NIL mentions. Ambiguous, misspelled, and incomplete entity mention names are the main challenges in the linking process. We propose a novel approach that combines two state-of-the-art models, one for entity disambiguation and one for paraphrase detection, to overcome these challenges. We consider name variations as paraphrases of the same entity mention and adopt a paraphrase model for this task. Our approach utilizes a graph-based disambiguation model based on Personalized PageRank, and then refines and clusters its output using the paraphrase similarity between entity mention strings. It achieves a competitive performance of 80.5% in B3+F clustering score on diagnostic TAC EDL 2014 data.
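
The abstract above refers to Personalized PageRank. As a rough illustration of that general technique only (not the authors' disambiguation model), the following sketch runs power iteration on a small directed graph. The toy graph, the function name `personalized_pagerank`, and the parameter values are assumptions made for this example.

```python
def personalized_pagerank(graph, seeds, alpha=0.85, iterations=50):
    """Power-iteration sketch of Personalized PageRank.

    graph: dict mapping each node to a list of its out-neighbours.
    seeds: set of nodes that the random walk teleports back to.
    """
    nodes = list(graph)
    # Teleport distribution: uniform over the seed nodes, zero elsewhere.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(teleport)
    for _ in range(iterations):
        # With probability (1 - alpha) the walk jumps back to a seed.
        nxt = {n: (1.0 - alpha) * teleport[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                # Spread this node's mass evenly over its out-links.
                share = alpha * scores[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: return its mass to the teleport distribution.
                for m in nodes:
                    nxt[m] += alpha * scores[n] * teleport[m]
        scores = nxt
    return scores


# Hypothetical toy graph; in an entity-linking setting the nodes might be
# candidate knowledge-base entries, but that mapping is an assumption here.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
ppr_scores = personalized_pagerank(toy_graph, seeds={"a"})
```

Nodes that cannot be reached from the seed set (node `d` here) keep a score of zero, which is what makes the ranking "personalized" to the seeds.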

The Knowledge Base Population (KBP) evaluation track of the Text Analysis Conferences (TAC) has been held for the past 3 years. One of the two tasks of KBP is slot filling: finding within a large corpus the values of a set of attributes of given people and organizations. This task has proven very challenging, with top systems rarely exceeding 30% F-measure. In this paper, we present an error analysis and classification for those answers which could be found by a manual corpus search but were not found by any of the systems participating in the 2010 evaluation. The most common sources of failure were limitations on inference, errors in coreference (particularly with nominal anaphors), and errors in named entity recognition. We relate the types of errors to the characteristics of the task and show the wide diversity of problems that must be addressed to improve overall performance.

2011

pdf bib
INVITED TALK 1: The Knowledge Base Population Task: Challenges for Information Extraction
Ralph Grishman
Proceedings of the RANLP 2011 Workshop on Information Extraction and Knowledge Acquisition

pdf bib
Fine-grained Entity Set Refinement with User Feedback
Bonan Min | Ralph Grishman
Proceedings of the RANLP 2011 Workshop on Information Extraction and Knowledge Acquisition

pdf bib
Acquiring Topic Features to improve Event Extraction: in Pre-selected and Balanced Collections
Shasha Liao | Ralph Grishman
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

pdf
Semi-supervised Relation Extraction with Large-scale Word Clustering
Ang Sun | Ralph Grishman | Satoshi Sekine
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf
Knowledge Base Population: Successful Approaches and Challenges
Heng Ji | Ralph Grishman
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf
Can Document Selection Help Semi-supervised Learning? A Case Study On Event Extraction
Shasha Liao | Ralph Grishman
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf
Exploiting Syntactic and Distributional Information for Spelling Correction with Web-Scale N-gram Models
Wei Xu | Joel Tetreault | Martin Chodorow | Ralph Grishman | Le Zhao
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf
Using Prediction from Sentential Scope to Build a Pseudo Co-Testing Learner for Event Extraction
Shasha Liao | Ralph Grishman
Proceedings of 5th International Joint Conference on Natural Language Processing

pdf
Passage Retrieval for Information Extraction using Distant Supervision
Wei Xu | Ralph Grishman | Le Zhao
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

pdf
Large Corpus-based Semantic Feature Extraction for Pronoun Coreference
Shasha Liao | Ralph Grishman
Proceedings of the Second Workshop on NLP Challenges in the Information Explosion Era (NLPIX 2010)

pdf
Filtered Ranking for Bootstrapping in Event Extraction
Shasha Liao | Ralph Grishman
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

pdf
Semi-supervised Semantic Pattern Discovery with Guidance from Unsupervised Pattern Clusters
Ang Sun | Ralph Grishman
Coling 2010: Posters

pdf
Using Document Level Cross-Event Inference to Improve Event Extraction
Shasha Liao | Ralph Grishman
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf
Utility Evaluation of Cross-document Information Extraction
Heng Ji | Zheng Chen | Jonathan Feldman | Antonio Gonzalez | Ralph Grishman | Vivek Upadhyay
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf abs
The Impact of Task and Corpus on Event Extraction Systems
Ralph Grishman
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The term event extraction covers a wide range of information extraction tasks, and methods developed and evaluated for one task may prove quite unsuitable for another. Understanding these task differences is essential to making broad progress in event extraction. We look back at the MUC and ACE tasks in terms of one characteristic, the breadth of the scenario: how wide a range of information is subsumed in a single extraction task. We examine how this affects strategies for collecting information and methods for semi-supervised training of new extractors. We also consider the heterogeneity of corpora: how varied the topics of documents in a corpus are. Extraction systems may be intended in principle for general news but are typically evaluated on topic-focused corpora, and this evaluation context may affect system design. As one case study, we examine the task of identifying physical attack events in news corpora, observing the effect on system performance of shifting from an attack-event-rich corpus to a more varied corpus and considering how the impact of this shift may be mitigated.

2009

pdf
Cross-document Event Extraction and Tracking: Task, Evaluation, Techniques and Challenges
Heng Ji | Ralph Grishman | Zheng Chen | Prashant Gupta
Proceedings of the International Conference RANLP-2009

pdf
Updating a Name Tagger Using Contemporary Unlabeled Data
Cristina Mota | Ralph Grishman
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

pdf
A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression
Wei Xu | Ralph Grishman
Proceedings of the 2009 Workshop on Language Generation and Summarisation (UCNLG+Sum 2009)

2008

pdf abs
Is this NE tagger getting old?
Cristina Mota | Ralph Grishman
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper focuses on the influence of changing the text time frame on the performance of a named entity tagger. We followed a twofold approach to investigate this subject: on the one hand, we analyzed a corpus that spans 8 years, and, on the other hand, we assessed the performance of a name tagger trained and tested on that corpus. We created 8 samples from the corpus, each drawn from the articles for a particular year. In terms of corpus analysis, we calculated the corpus similarity and names shared between samples. To see the effect on tagger performance, we implemented a semi-supervised name tagger based on co-training; then, we trained and tested our tagger on those samples. We observed that corpus similarity, names shared between samples, and tagger performance all decay as the time gap between the samples increases. Furthermore, we observed that the corpus similarity and names shared correlate with the tagger F-measure. These results show that named entity recognition systems may become obsolete in a short period of time.

pdf
Refining Event Extraction through Cross-Document Inference
Heng Ji | Ralph Grishman
Proceedings of ACL-08: HLT

2007

pdf
Question Answering Using Integrated Information Retrieval and Information Extraction
Barry Schiffman | Kathleen McKeown | Ralph Grishman | James Allan
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

2006

pdf
Data Selection in Semi-supervised Learning for Name Tagging
Heng Ji | Ralph Grishman
Proceedings of the Workshop on Information Extraction Beyond The Document

pdf
Re-Ranking Algorithms for Name Tagging
Heng Ji | Cynthia Rudin | Ralph Grishman
Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing

pdf
Analysis and Repair of Name Tagger Errors
Heng Ji | Ralph Grishman
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

2005

pdf
Improving Name Tagging by Reference Resolution and Relation Detection
Heng Ji | Ralph Grishman
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

pdf
Extracting Relations with Integrated Information Using Kernel Methods
Shubin Zhao | Ralph Grishman
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

pdf
Using Semantic Relations to Refine Coreference Decisions
Heng Ji | David Westbrook | Ralph Grishman
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2004

pdf
Discriminative Slot Detection Using Kernel Methods
Shubin Zhao | Adam Meyers | Ralph Grishman
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

pdf
Cross-lingual Information Extraction System Evaluation
Kiyoshi Sudo | Satoshi Sekine | Ralph Grishman
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

pdf
Discovering Relations among Named Entities from Large Corpora
Takaaki Hasegawa | Satoshi Sekine | Ralph Grishman
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

pdf
Applying Coreference to Improve Name Recognition
Heng Ji | Ralph Grishman
Proceedings of the Conference on Reference Resolution and Its Applications

2003

pdf
An Improved Extraction Pattern Representation Model for Automatic IE Pattern Acquisition
Kiyoshi Sudo | Satoshi Sekine | Ralph Grishman
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

pdf
pre-CODIE–Crosslingual On-Demand Information Extraction
Kiyoshi Sudo | Satoshi Sekine | Ralph Grishman
Companion Volume of the Proceedings of HLT-NAACL 2003 - Demonstrations

2002

pdf
Formal Mechanisms for Capturing Regularizations
Adam Meyers | Ralph Grishman | Michiko Kosaka
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
Summarization System Integrated with Named Entity Tagging and IE Pattern Discovery
Chikashi Nobata | Satoshi Sekine | Hitoshi Isahara | Ralph Grishman
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
Standards & best practice for multilingual computational lexicons: ISLE MILE and more
Nicoletta Calzolari | Ralph Grishman | Martha Palmer
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
Towards Best Practice for Multiword Expressions in Computational Lexicons
Nicoletta Calzolari | Charles J. Fillmore | Ralph Grishman | Nancy Ide | Alessandro Lenci | Catherine MacLeod | Antonio Zampolli
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
Diversity of Scenarios in Information Extraction
Silja Huttunen | Roman Yangarber | Ralph Grishman
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
Unsupervised Learning of Generalized Names
Roman Yangarber | Winston Lin | Ralph Grishman
COLING 2002: The 19th International Conference on Computational Linguistics

pdf
Complexity of Event Structure in IE Scenarios
Silja Huttunen | Roman Yangarber | Ralph Grishman
COLING 2002: The 19th International Conference on Computational Linguistics

2001

pdf
Automatic Pattern Acquisition for Japanese Information Extraction
Kiyoshi Sudo | Satoshi Sekine | Ralph Grishman
Proceedings of the First International Conference on Human Language Technology Research

pdf
Covering Treebanks with GLARF
A. Meyers | Ralph Grishman | Michiko Kosaka | Shubin Zhao
Proceedings of the ACL 2001 Workshop on Sharing Tools and Resources

2000

pdf
Chart-Based Transfer Rule Application in Machine Translation
Adam Meyers | Michiko Kosaka | Ralph Grishman
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

pdf
Automatic Acquisition of Domain Knowledge for Information Extraction
Roman Yangarber | Ralph Grishman | Pasi Tapanainen | Silja Huttunen
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

pdf
A Treebank of Spanish and its Application to Parsing
Antonio Moreno | Ralph Grishman | Susana López | Fernando Sánchez | Satoshi Sekine
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf
The American National Corpus: A Standardized Resource for American English
Catherine Macleod | Nancy Ide | Ralph Grishman
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf
Unsupervised Discovery of Scenario-Level Patterns for Information Extraction
Roman Yangarber | Ralph Grishman | Pasi Tapanainen
Sixth Applied Natural Language Processing Conference

1998

pdf
Deriving Transfer Rules from Dominance-Preserving Alignments
Adam Meyers | Roman Yangarber | Ralph Grishman | Catherine Macleod | Antonio Moreno-Sandoval
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

pdf
Exploiting Diverse Knowledge Sources via Maximum Entropy in Named Entity Recognition
Andrew Borthwick | John Sterling | Eugene Agichtein | Ralph Grishman
Sixth Workshop on Very Large Corpora

pdf
A Decision Tree Method for Finding and Classifying Names in Japanese Texts
Satoshi Sekine | Ralph Grishman | Hiroyuki Shinnou
Sixth Workshop on Very Large Corpora

pdf
NYU: Description of the Proteus/PET System as Used for MUC-7 ST
Roman Yangarber | Ralph Grishman
Seventh Message Understanding Conference (MUC-7): Proceedings of a Conference Held in Fairfax, Virginia, April 29 - May 1, 1998

pdf
NYU: Description of the MENE Named Entity System as Used in MUC-7
Andrew Borthwick | John Sterling | Eugene Agichtein | Ralph Grishman
Seventh Message Understanding Conference (MUC-7): Proceedings of a Conference Held in Fairfax, Virginia, April 29 - May 1, 1998

pdf
Deriving Transfer Rules from Dominance-Preserving Alignments
Adam Meyers | Roman Yangarber | Ralph Grishman | Catherine Macleod | Antonio Moreno-Sandoval
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

pdf
Research in Information Extraction: 1996-98
Ralph Grishman
TIPSTER TEXT PROGRAM PHASE III: Proceedings of a Workshop held at Baltimore, Maryland, October 13-15, 1998

pdf
Transforming Examples into Patterns for Information Extraction
Roman Yangarber | Ralph Grishman
TIPSTER TEXT PROGRAM PHASE III: Proceedings of a Workshop held at Baltimore, Maryland, October 13-15, 1998

pdf abs
A multilingual procedure for dictionary-based sentence alignment
Adam Meyers | Michiko Kosaka | Ralph Grishman
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers

This paper describes a sentence alignment technique based on a machine-readable dictionary. Alignment takes place in a single pass through the text, based on the scores of matches between pairs of source and target sentences. Pairings consisting of sets of matches are evaluated using a version of the Gale-Shapley solution to the stable marriage problem. An algorithm is described which can handle N-to-1 (or 1-to-N) matches, for N ≥ 0, i.e., deletions, 1-to-1 (including scrambling), and 1-to-many matches. A simple frequency-based method for acquiring supplemental dictionary entries is also discussed. We achieve high quality alignments using available bilingual dictionaries, both for closely related language pairs (Spanish/English) and more distantly related pairs (Japanese/English).
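
The abstract above mentions the Gale-Shapley solution to the stable marriage problem. The sketch below shows the classic one-to-one form of that algorithm only; it is not the paper's N-to-1 variant. The function name `gale_shapley`, the sentence identifiers, and the preference lists (which in practice might be derived from match scores) are all hypothetical.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Classic one-to-one stable matching.

    proposer_prefs / receiver_prefs map each item to a list of candidates
    on the other side, ordered from most to least preferred.
    Returns a dict mapping receiver -> proposer.
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of next receiver to try
    engaged = {}                                  # receiver -> proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p            # r was free: accept the proposal
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p            # r prefers p: swap partners
            free.append(current)
        else:
            free.append(p)            # r rejects p, who will try again
    return engaged


# Hypothetical toy data: source sentences s1..s3 propose to target sentences t1..t3.
source_prefs = {"s1": ["t1", "t2", "t3"],
                "s2": ["t1", "t3", "t2"],
                "s3": ["t2", "t1", "t3"]}
target_prefs = {"t1": ["s2", "s1", "s3"],
                "t2": ["s1", "s3", "s2"],
                "t3": ["s3", "s2", "s1"]}
matching = gale_shapley(source_prefs, target_prefs)
```

The result is stable in the sense that no source and target sentence both prefer each other over the partners they were assigned.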

1996

pdf
The Role of Syntax in Information Extraction
Ralph Grishman
TIPSTER TEXT PROGRAM PHASE II: Proceedings of a Workshop held at Vienna, Virginia, May 6-8, 1996

pdf
Building an Architecture: A CAWG Saga
Ralph Grishman
TIPSTER TEXT PROGRAM PHASE II: Proceedings of a Workshop held at Vienna, Virginia, May 6-8, 1996

pdf
TIPSTER Text Phase II Architecture Design Version 2.1p 19 June 1996
Ralph Grishman
TIPSTER TEXT PROGRAM PHASE II: Proceedings of a Workshop held at Vienna, Virginia, May 6-8, 1996

pdf
Design of the MUC-6 Evaluation
Ralph Grishman | Beth Sundheim
TIPSTER TEXT PROGRAM PHASE II: Proceedings of a Workshop held at Vienna, Virginia, May 6-8, 1996

pdf
Alignment of Shared Forests for Bilingual Corpora
Adam Meyers | Roman Yangarber | Ralph Grishman
COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics

pdf
Message Understanding Conference-6: A Brief History
Ralph Grishman | Beth Sundheim
COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics

pdf
The Influence of Tagging on the Classification of Lexical Complements
Catherine Macleod | Adam Meyers | Ralph Grishman
COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics

1995

pdf bib
Design of the MUC-6 Evaluation
Ralph Grishman | Beth Sundheim
Sixth Message Understanding Conference (MUC-6): Proceedings of a Conference Held in Columbia, Maryland, November 6-8, 1995

pdf
The NYU System for MUC-6 or Where’s the Syntax?
Ralph Grishman
Sixth Message Understanding Conference (MUC-6): Proceedings of a Conference Held in Columbia, Maryland, November 6-8, 1995

pdf abs
A Corpus-based Probabilistic Grammar with Only Two Non-terminals
Satoshi Sekine | Ralph Grishman
Proceedings of the Fourth International Workshop on Parsing Technologies

The availability of large, syntactically-bracketed corpora such as the Penn Tree Bank affords us the opportunity to automatically build or train broad-coverage grammars, and in particular to train probabilistic grammars. A number of recent parsing experiments have also indicated that grammars whose production probabilities are dependent on the context can be more effective than context-free grammars in selecting a correct parse. To make maximal use of context, we have automatically constructed, from the Penn Tree Bank version 2, a grammar in which the symbols S and NP are the only real non-terminals, and the other non-terminals or grammatical nodes are in effect embedded into the right-hand-sides of the S and NP rules. For example, one of the rules extracted from the tree bank would be S -> NP VBX JJ CC VBX NP [1] (where NP is a non-terminal and the other symbols are terminals – part-of-speech tags of the Tree Bank). The most common structure in the Tree Bank associated with this expansion is (S NP (VP (VP VBX (ADJ JJ) CC (VP VBX NP)))) [2]. So if our parser uses rule [1] in parsing a sentence, it will generate structure [2] for the corresponding part of the sentence. Using 94% of the Penn Tree Bank for training, we extracted 32,296 distinct rules (23,386 for S, and 8,910 for NP). We also built a smaller version of the grammar based on higher frequency patterns for use as a back-up when the larger grammar is unable to produce a parse due to memory limitations. We applied this parser to 1,989 Wall Street Journal sentences (separate from the training set and with no limit on sentence length). Of the parsed sentences (1,899), the percentage of no-crossing sentences is 33.9%, and Parseval recall and precision are 73.43% and 72.61%.
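
The abstract above reports Parseval recall and precision and the share of no-crossing sentences. As a simplified illustration of how such bracket-based measures are commonly computed (not the evaluation code used in the paper), the sketch below compares unlabeled constituent spans. The span encoding as half-open `(start, end)` token offsets and the example brackets are assumptions.

```python
def parseval(gold, predicted):
    """Unlabeled bracket precision and recall over sets of (start, end) spans."""
    gold_set, pred_set = set(gold), set(predicted)
    matched = len(gold_set & pred_set)
    precision = matched / len(pred_set) if pred_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    return precision, recall


def crosses(a, b):
    """True if spans a and b overlap without one containing the other."""
    (s1, e1), (s2, e2) = a, b
    return s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1


def has_crossing(gold, predicted):
    """True if any predicted bracket crosses any gold bracket."""
    return any(crosses(p, g) for p in predicted for g in gold)


# Hypothetical brackets for a five-token sentence.
gold_spans = [(0, 5), (0, 2), (2, 5), (3, 5)]
pred_spans = [(0, 5), (0, 3), (3, 5)]
precision, recall = parseval(gold_spans, pred_spans)
crossing = has_crossing(gold_spans, pred_spans)
```

In this toy case two of the three predicted brackets match the gold brackets, and the predicted span `(0, 3)` crosses the gold span `(2, 5)`, so the sentence would not count as a no-crossing sentence.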