Martha Palmer

Also published as: Martha S. Palmer, Martha Stone Palmer

2024

This paper reports the first release of the UMR (Uniform Meaning Representation) data set. UMR is a graph-based meaning representation formalism consisting of a sentence-level graph and a document-level graph. The sentence-level graph represents predicate-argument structures, named entities, word senses, aspectuality of events, as well as person and number information for entities. The document-level graph represents coreferential, temporal, and modal relations that go beyond sentence boundaries. UMR is designed to capture the commonalities and variations across languages and this is done through the use of a common set of abstract concepts, relations, and attributes as well as concrete concepts derived from words from invidual languages. This UMR release includes annotations for six languages (Arapaho, Chinese, English, Kukama, Navajo, Sanapana) that vary greatly in terms of their linguistic properties and resource availability. We also describe on-going efforts to enlarge this data set and extend it to other genres and modalities. We also briefly describe the available infrastructure (UMR annotation guidelines and tools) that others can use to create similar data sets.

This paper introduces GLAMR, an Abstract Meaning Representation (AMR) interpretation of Generative Lexicon (GL) semantic components. It includes a structured subeventual interpretation of linguistic predicates, and encoding of the opposition structure of property changes of event arguments. Both of these features are recently encoded in VerbNet (VN), and form the scaffolding for the semantic form associated with VN frame files. We develop a new syntax, concepts, and roles for subevent structure based on VN for connecting subevents to atomic predicates. Our proposed extension is compatible with current AMR specification. We also present an approach to automatically augment AMR graphs by inserting subevent structure of the predicates and identifying the subevent arguments from the semantic roles. A pilot annotation of GLAMR graphs of 65 documents (486 sentences), based on procedural texts as a source, is presented as a public dataset. The annotation includes subevents, argument property change, and document-level anaphoric links. Finally, we provide baseline models for converting text to GLAMR and vice versa, along with the application of GLAMR for generating enriched paraphrases with details on subevent transformation and arguments that are not present in the surface form of the texts.

pdf abs
Linear Cross-document Event Coreference Resolution with X-AMR
Shafiuddin Rehan Ahmed | George Arthur Baker | Evi Judge | Michael Reagan | Kristin Wright-Bettner | Martha Palmer | James H. Martin
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Event Coreference Resolution (ECR) as a pairwise mention classification task is expensive both for automated systems and manual annotations. The task’s quadratic difficulty is exacerbated when using Large Language Models (LLMs), making prompt engineering for ECR prohibitively costly. In this work, we propose a graphical representation of events, X-AMR, anchored around individual mentions using a cross-document version of Abstract Meaning Representation. We then linearize the ECR with a novel multi-hop coreference algorithm over the event graphs. The event graphs simplify ECR, making it a) LLM cost-effective, b) compositional and interpretable, and c) easily annotated. For a fair assessment, we first enrich an existing ECR benchmark dataset with these event graphs using an annotator-friendly tool we introduce. Then, we employ GPT-4, the newest LLM by OpenAI, for these annotations. Finally, using the ECR algorithm, we assess GPT-4 against humans and analyze its limitations. Through this research, we aim to advance the state-of-the-art for efficient ECR and shed light on the potential shortcomings of current LLMs at this task. Code and annotations: https://github.com/ahmeshaf/gpt_coref

pdf abs
ReCAP: Semantic Role Enhanced Caption Generation
Abhidip Bhattacharyya | Martha Palmer | Christoffer Heckman
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Even though current vision language (V+L) models have achieved success in generating image captions, they often lack specificity and overlook various aspects of the image. Additionally, the attention learned through weak supervision operates opaquely and is difficult to control. To address these limitations, we propose the use of semantic roles as control signals in caption generation. Our hypothesis is that, by incorporating semantic roles as signals, the generated captions can be guided to follow specific predicate argument structures. To validate the effectiveness of our approach, we conducted experiments using data and compared the results with a baseline model VL-BART(CITATION). The experiments showed a significant improvement, with a gain of 45% in Smatch score (Standard NLP evaluation metric for semantic representations), demonstrating the efficacy of our approach. By focusing on specific objects and their associated semantic roles instead of providing a general description, our framework produces captions that exhibit enhanced quality, diversity, and controllability.

pdf abs
Adapting Abstract Meaning Representation Parsing to the Clinical Narrative – the SPRING THYME parser
Jon Cai | Kristin Wright-Bettner | Martha Palmer | Guergana Savova | James Martin
Proceedings of the 6th Clinical Natural Language Processing Workshop

This paper is dedicated to the design and evaluation of the first AMR parser tailored for clinical notes. Our objective was to facilitate the precise transformation of the clinical notes into structured AMR expressions, thereby enhancing the interpretability and usability of clinical text data at scale. Leveraging the colon cancer dataset from the Temporal Histories of Your Medical Events (THYME) corpus, we adapted a state-of-the-art AMR parser utilizing continuous training. Our approach incorporates data augmentation techniques to enhance the accuracy of AMR structure predictions. Notably, through this learning strategy, our parser achieved an impressive F1 score of 88% on the THYME corpus’s colon cancer dataset. Moreover, our research delved into the efficacy of data required for domain adaptation within the realm of clinical notes, presenting domain adaptation data requirements for AMR parsing. This exploration not only underscores the parser’s robust performance but also highlights its potential in facilitating a deeper understanding of clinical narratives through structured semantic representations.

This paper presents the first integration of PropBank role information into Wikidata, in order to provide a novel resource for information extraction, one combining Wikidata’s ontological metadata with PropBank’s rich argument structure encoding for event classes. We discuss a technique for PropBank augmentation to existing eventive Wikidata items, as well as identification of gaps in Wikidata’s coverage based on manual examination of over 11,300 PropBank rolesets. We propose five new Wikidata properties to integrate PropBank structure into Wikidata so that the annotated mappings can be added en masse. We then outline the methodology and challenges of this integration, including annotation with the combined resources.

pdf abs
X-AMR Annotation Tool
Shafiuddin Rehan Ahmed | Jon Cai | Martha Palmer | James H. Martin
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

This paper presents a novel Cross-document Abstract Meaning Representation (X-AMR) annotation tool designed for annotating key corpus-level event semantics. Leveraging machine assistance through the Prodigy Annotation Tool, we enhance the user experience, ensuring ease and efficiency in the annotation process. Through empirical analyses, we demonstrate the effectiveness of our tool in augmenting an existing event corpus, highlighting its advantages when integrated with GPT-4. Code and annotations: href{https://anonymous.4open.science/r/xamr-9ED0}{anonymous.4open.science/r/xamr-9ED0} footnote Demo: {href{https://youtu.be/TuirftxciNE}{https://youtu.be/TuirftxciNE}} footnote Live Link: {href{https://tinyurl.com/mrxmafwh}{https://tinyurl.com/mrxmafwh}}

pdf abs
Expanding Russian PropBank: Challenges and Insights for Developing New SRL Resources
Skatje Myers | Roman Khamov | Adam Pollins | Rebekah Tozier | Olga Babko-Malaya | Martha Palmer
Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024

Semantic role labeling (SRL) resources, such as Proposition Bank (PropBank), provide useful input to downstream applications. In this paper we present some challenges and insights we learned while expanding the previously developed Russian PropBank. This new effort involved annotation and adjudication of all predicates within a subset of the prior work in order to provide a test corpus for future applications. We discuss a number of new issues that arose while developing our PropBank for Russian as well as our solutions. Framing issues include: distinguishing between morphological processes that warrant new frames, differentiating between modal verbs and predicate verbs, and maintaining accurate representations of a given language’s semantics. Annotation issues include disagreements derived from variability in Universal Dependency parses and semantic ambiguity within the text. Finally, we demonstrate how Russian sentence structures reveal inherent limitations to PropBank’s ability to capture semantic data. These discussions should prove useful to anyone developing a PropBank or similar SRL resources for a new language.

pdf bib abs
My Big, Fat 50-Year Journey
Martha Palmer
Computational Linguistics, Volume 50, Issue 1 - March 2024

My most heartfelt thanks to ACL for this tremendous honor. I’m completely thrilled. I cannot tell you how surprised I was when I got Iryna’s email. It is amazing that my first ACL conference since 2019 in Florence includes this award. What a wonderful way to be back with all of my friends and family here at ACL. I’m going to tell you about my big fat 50-year journey. What have I been doing for the last 50 years? Well, finding meaning, quite literally in words. Or in other words, exploring how computational lexical semantics can support natural language understanding. This is going to be quick. Hold onto your hats, here we go.

2023

pdf abs
Learning Semantic Role Labeling from Compatible Label Sequences
Tao Li | Ghazaleh Kazeminejad | Susan Brown | Vivek Srikumar | Martha Palmer
Findings of the Association for Computational Linguistics: EMNLP 2023

Semantic role labeling (SRL) has multiple disjoint label sets, e.g., VerbNet and PropBank. Creating these datasets is challenging, therefore a natural question is how to use each one to help the other. Prior work has shown that cross-task interaction helps, but only explored multitask learning so far. A common issue with multi-task setup is that argument sequences are still separately decoded, running the risk of generating structurally inconsistent label sequences (as per lexicons like Semlink). In this paper, we eliminate such issue with a framework that jointly models VerbNet and PropBank labels as one sequence. In this setup, we show that enforcing Semlink constraints during decoding constantly improves the overall F1. With special input constructions, our joint model infers VerbNet arguments from given PropBank arguments with over 99 F1. For learning, we propose a constrained marginal model that learns with knowledge defined in Semlink to further benefit from the large amounts of PropBank-only data. On the joint benchmark based on CoNLL05, our models achieve state-of-the-art F1’s, outperforming the prior best in-domain model by 3.5 (VerbNet) and 0.8 (PropBank). For out-of-domain generalization, our models surpass the prior best by 3.4 (VerbNet) and 0.2 (PropBank).

pdf abs
A Survey of Challenges and Methods in the Computational Modeling of Multi-Party Dialog
Ananya Ganesh | Martha Palmer | Katharina Kann
Proceedings of the 5th Workshop on NLP for Conversational AI (NLP4ConvAI 2023)

Advances in conversational AI systems, powered in particular by large language models, have facilitated rapid progress in understanding and generating dialog. Typically, task-oriented or open-domain dialog systems have been designed to work with two-party dialog, i.e., the exchange of utterances between a single user and a dialog system. However, modern dialog systems may be deployed in scenarios such as classrooms or meetings where conversational analysis of multiple speakers is required. This survey will present research around computational modeling of “multi-party dialog”, outlining differences from two-party dialog, challenges and issues in working with multi-party dialog, and methods for representing multi-party dialog. We also provide an overview of dialog datasets created for the study of multi-party dialog, as well as tasks that are of interest in this domain.

Rooted in AMR, Uniform Meaning Representation (UMR) is a graph-based formalism with nodes as concepts and edges as relations between them. When used to represent natural language semantics, UMR maps words in a sentence to concepts in the UMR graph. Multiword expressions (MWEs) pose a particular challenge to UMR annotation because they deviate from the default one-to-one mapping between words and concepts. There are different types of MWEs which require different kinds of annotation that must be specified in guidelines. This paper discusses the specific treatment for each type of MWE in UMR.

pdf abs
Mind the Gap between the Application Track and the Real World
Ananya Ganesh | Jie Cao | E. Margaret Perkoff | Rosy Southwell | Martha Palmer | Katharina Kann
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Recent advances in NLP have led to a rise in inter-disciplinary and application-oriented research. While this demonstrates the growing real-world impact of the field, research papers frequently feature experiments that do not account for the complexities of realistic data and environments. To explore the extent of this gap, we investigate the relationship between the real-world motivations described in NLP papers and the models and evaluation which comprise the proposed solution. We first survey papers from the NLP Applications track from ACL 2020 and EMNLP 2020, asking which papers have differences between their stated motivation and their experimental setting, and if so, mention them. We find that many papers fall short of considering real-world input and output conditions due to adopting simplified modeling or evaluation settings. As a case study, we then empirically show that the performance of an educational dialog understanding system deteriorates when used in a realistic classroom environment.

Schema induction builds a graph representation explaining how events unfold in a scenario. Existing approaches have been based on information retrieval (IR) and information extraction (IE), often with limited human curation. We demonstrate a human-in-the-loop schema induction system powered by GPT-3. We first describe the different modules of our system, including prompting to generate schematic elements, manual edit of those elements, and conversion of those into a schema graph. By qualitatively comparing our system to previous ones, we show that our system not only transfers to new domains more easily than previous approaches, but also reduces efforts of human curation thanks to our interactive interface.

This paper presents detailed mappings between the structures used in Abstract Meaning Representation (AMR) and those used in Uniform Meaning Representation (UMR). These structures include general semantic roles, rolesets, and concepts that are largely shared between AMR and UMR, but with crucial differences. While UMR annotation of new low-resource languages is ongoing, AMR-annotated corpora already exist for many languages, and these AMR corpora are ripe for conversion to UMR format. Rather than focusing on semantic coverage that is new to UMR (which will likely need to be dealt with manually), this paper serves as a resource (with illustrated mappings) for users looking to understand the fine-grained adjustments that have been made to the representation techniques for semantic categoriespresent in both AMR and UMR.

The progress of event extraction research has been hindered by the absence of wide-coverage, large-scale datasets. To make event extraction systems more accessible, we build a general-purpose event detection dataset GLEN, which covers 205K event mentions with 3,465 different types, making it more than 20x larger in ontology than today’s largest event dataset. GLEN is created by utilizing the DWD Overlay, which provides a mapping between Wikidata Qnodes and PropBank rolesets. This enables us to use the abundant existing annotation for PropBank as distant supervision. In addition, we also propose a new multi-stage event detection model specifically designed to handle the large ontology size in GLEN. We show that our model exhibits superior performance compared to a range of baselines including InstructGPT. Finally, we perform error analysis and show that label noise is still the largest challenge for improving performance for this new dataset.

In this paper, we present RESIN-EDITOR, an interactive event graph visualizer and editor designed for analyzing complex events. Our RESIN-EDITOR system allows users to render and freely edit hierarchical event graphs extracted from multimedia and multi-document news clusters with guidance from human-curated event schemas. RESIN-EDITOR’s unique features include hierarchical graph visualization, comprehensive source tracing, and interactive user editing, which significantly outperforms existing Information Extraction (IE) visualization tools in both IE result analysis and general model improvements. In our evaluation of RESIN-EDITOR, we demonstrate ways in which our tool is effective in understanding complex events and enhancing system performances. The source code, a video demonstration, and a live website for RESIN-EDITOR have been made publicly available.

In this paper, we introduce CAMRA (Copilot for AMR Annotatations), a cutting-edge web-based tool designed for constructing Abstract Meaning Representation (AMR) from natural language text. CAMRA offers a novel approach to deep lexical semantics annotation such as AMR, treating AMR annotation akin to coding in programming languages. Leveraging the familiarity of programming paradigms, CAMRA encompasses all essential features of existing AMR editors, including example lookup, while going a step further by integrating Propbank roleset lookup as an autocomplete feature within the tool. Notably, CAMRA incorporates AMR parser models as coding co-pilots, greatly enhancing the efficiency and accuracy of AMR annotators.

pdf abs
CRAPES:Cross-modal Annotation Projection for Visual Semantic Role Labeling
Abhidip Bhattacharyya | Martha Palmer | Christoffer Heckman
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

Automatic image comprehension is an important yet challenging task that includes identifying actions in an image and corresponding action participants. Most current approaches to this task, now termed Grounded Situation Recognition (GSR), start by predicting a verb that describes the action and then predict the nouns that can participate in the action as arguments to the verb. This problem formulation limits each image to a single action even though several actions could be depicted. In contrast, text-based Semantic Role Labeling (SRL) aims to label all actions in a sentence, typically resulting in at least two or three predicate argument structures per sentence. We hypothesize that expanding GSR to follow the more liberal SRL text-based approach to action and participant identification could improve image comprehension results. To test this hypothesis and to preserve generalization capabilities, we use general-purpose vision and language components as a front-end. This paper presents our results, a substantial 28.6 point jump in performance on the SWiG dataset, which confirm our hypothesis. We also discuss the benefits of loosely coupled broad-coverage off-the-shelf components which generalized well to out of domain images, and can decrease the need for manual image semantic role annotation.

pdf abs
Event Semantic Knowledge in Procedural Text Understanding
Ghazaleh Kazeminejad | Martha Palmer
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

The task of entity state tracking aims to automatically analyze procedural texts – texts that describe a step-by-step process (e.g. a baking recipe). Specifically, the goal is to track various states of the entities participating in a given process. Some of the challenges for this NLP task include annotated data scarcity and annotators’ reliance on commonsense knowledge to annotate implicit state information. Zhang et al. (2021) successfully incorporated commonsense entity-centric knowledge from ConceptNet into their BERT-based neural-symbolic architecture. Since English mostly encodes state change information in verbs, we attempted to test whether injecting semantic knowledge of events (retrieved from the state-of-the-art VerbNet parser) into a neural model can also improve the performance on this task. To achieve this, we adapt the methodology introduced by Zhang et al. (2021) for incorporating symbolic entity information from ConceptNet to the incorporation of VerbNet event semantics. We evaluate the performance of our model on the ProPara dataset (Mishra et al., 2018). In addition, we introduce a purely symbolic model for entity state tracking that uses a simple set of case statements, and is informed mostly by linguistic knowledge retrieved from various computational lexical resources. Our approach is inherently domain-agnostic, and our model is explainable and achieves state-of-the-art results on the Recipes dataset (Bosselut et al., 2017).

pdf abs
Leveraging Active Learning to Minimise SRL Annotation Across Corpora
Skatje Myers | Martha Palmer
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

In this paper we investigate the application of active learning to semantic role labeling (SRL) using Bayesian Active Learning by Disagreement (BALD). Our new predicate-focused selection method quickly improves efficiency on three different specialised domain corpora. This is encouraging news for researchers wanting to port SRL to domain specific applications. Interestingly, with the large and diverse \textit{OntoNotes} corpus, the sentence selection approach, that collects a larger number of predicates, taking more time to annotate, fares better than the predicate approach. In this paper, we analyze both the selections made by our two selections methods for the various domains and the differences between these corpora in detail.

With 102,530,067 items currently in its crowd-sourced knowledge base, Wikidata provides NLP practitioners a unique and powerful resource for inference and reasoning over real-world entities. However, because Wikidata is very entity focused, events and actions are often labeled with eventive nouns (e.g., the process of diagnosing a person’s illness is labeled “diagnosis”), and the typical participants in an event are not described or linked to that event concept (e.g., the medical professional or patient). Motivated by a need for an adaptable, comprehensive, domain-flexible ontology for information extraction, including identifying the roles entities are playing in an event, we present a curated subset of Wikidata in which events have been enriched with PropBank roles. To enable richer narrative understanding between events from Wikidata concepts, we have also provided a comprehensive mapping from temporal Qnodes and Pnodes to the Allen Interval Temporal Logic relations.

UMR-Writer is a web-based tool for annotating semantic graphs with the Uniform Meaning Representation (UMR) scheme. UMR is a graph-based semantic representation that can be applied cross-linguistically for deep semantic analysis of texts. In this work, we implemented a new keyboard interface in UMR-Writer 2.0, which is a powerful addition to the original mouse interface, supporting faster annotation for more experienced annotators. The new interface also addresses issues with the original mouse interface. Additionally, we demonstrate an efficient workflow for annotation project management in UMR-Writer 2.0, which has been applied to many projects.

2022

This paper describes the evolution of the PropBank approach to semantic role labeling over the last two decades. During this time the PropBank frame files have been expanded to include non-verbal predicates such as adjectives, prepositions and multi-word expressions. The number of domains, genres and languages that have been PropBanked has also expanded greatly, creating an opportunity for much more challenging and robust testing of the generalization capabilities of PropBank semantic role labeling systems. We also describe the substantial effort that has gone into ensuring the consistency and reliability of the various annotated datasets and resources, to better support the training and evaluation of such systems

pdf abs
Aligning Images and Text with Semantic Role Labels for Fine-Grained Cross-Modal Understanding
Abhidip Bhattacharyya | Cecilia Mauceri | Martha Palmer | Christoffer Heckman
Proceedings of the Thirteenth Language Resources and Evaluation Conference

As vision processing and natural language processing continue to advance, there is increasing interest in multimodal applications, such as image retrieval, caption generation, and human-robot interaction. These tasks require close alignment between the information in the images and text. In this paper, we present a new multimodal dataset that combines state of the art semantic annotation for language with the bounding boxes of corresponding images. This richer multimodal labeling supports cross-modal inference for applications in which such alignment is useful. Our semantic representations, developed in the natural language processing community, abstract away from the surface structure of the sentence, focusing on specific actions and the roles of their participants, a level that is equally relevant to images. We then utilize these representations in the form of semantic role labels in the captions and the images and demonstrate improvements in standard tasks such as image retrieval. The potential contributions of these additional labels is evaluated using a role-aware retrieval system based on graph convolutional and recurrent neural networks. The addition of semantic roles into this system provides a significant increase in capability and greater flexibility for these tasks, and could be extended to state-of-the-art techniques relying on transformers with larger amounts of annotated data.

We introduce RESIN-11, a new schema-guided event extraction&prediction framework that can be applied to a large variety of newsworthy scenarios. The framework consists of two parts: (1) an open-domain end-to-end multimedia multilingual information extraction system with weak-supervision and zero-shot learningbased techniques. (2) schema matching and schema-guided event prediction based on our curated schema library. We build a demo website based on our dockerized system and schema library publicly available for installation (https://github.com/RESIN-KAIROS/RESIN-11). We also include a video demonstrating the system.

Claim detection and verification are crucial for news understanding and have emerged as promising technologies for mitigating misinformation and disinformation in the news. However, most existing work has focused on claim sentence analysis while overlooking additional crucial attributes (e.g., the claimer and the main object associated with the claim).In this work, we present NewsClaims, a new benchmark for attribute-aware claim detection in the news domain. We extend the claim detection problem to include extraction of additional attributes related to each claim and release 889 claims annotated over 143 news articles. NewsClaims aims to benchmark claim detection systems in emerging scenarios, comprising unseen topics with little or no training data. To this end, we see that zero-shot and prompt-based baselines show promising performance on this benchmark, while still considerably behind human performance.

pdf bib abs
Meaning Representations for Natural Languages: Design, Models and Applications
Jeffrey Flanigan | Ishan Jindal | Yunyao Li | Tim O’Gorman | Martha Palmer | Nianwen Xue
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

This tutorial reviews the design of common meaning representations, SoTA models for predicting meaning representations, and the applications of meaning representations in a wide range of downstream NLP tasks and real-world applications. Reporting by a diverse team of NLP researchers from academia and industry with extensive experience in designing, building and using meaning representations, our tutorial has three components: (1) an introduction to common meaning representations, including basic concepts and design challenges; (2) a review of SoTA methods on building models for meaning representations; and (3) an overview of applications of meaning representations in downstream NLP tasks and real-world applications. We will also present qualitative comparisons of common meaning representations and a quantitative study on how their differences impact model performance. Finally, we will share best practices in choosing the right meaning representation for downstream tasks.

2021

pdf
What Would a Teacher Do? Predicting Future Talk Moves
Ananya Ganesh | Martha Palmer | Katharina Kann
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

To combat COVID-19, both clinicians and scientists need to digest the vast amount of relevant biomedical knowledge in literature to understand the disease mechanism and the related biological functions. We have developed a novel and comprehensive knowledge discovery framework, COVID-KG to extract fine-grained multimedia knowledge elements (entities, relations and events) from scientific literature. We then exploit the constructed multimedia knowledge graphs (KGs) for question answering and report generation, using drug repurposing as a case study. Our framework also provides detailed contextual sentences, subfigures, and knowledge subgraphs as evidence. All of the data, KGs, reports.

We present a new information extraction system that can automatically construct temporal event graphs from a collection of news documents from multiple sources, multiple languages (English and Spanish for our experiment), and multiple data modalities (speech, text, image and video). The system advances state-of-the-art from two aspects: (1) extending from sentence-level event extraction to cross-document cross-lingual cross-media event extraction, coreference resolution and temporal event tracking; (2) using human curated event schema library to match and enhance the extraction output. We have made the dockerlized system publicly available for research purpose at GitHub, with a demo video.

pdf abs
AutoAspect: Automatic Annotation of Tense and Aspect for Uniform Meaning Representations
Daniel Chen | Martha Palmer | Meagan Vigus
Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop

We present AutoAspect, a novel, rule-based annotation tool for labeling tense and aspect. The pilot version annotates English data. The aspect labels are designed specifically for Uniform Meaning Representations (UMR), an annotation schema that aims to encode crosslingual semantic information. The annotation tool combines syntactic and semantic cues to assign aspects on a sentence-by-sentence basis, following a sequence of rules that each output a UMR aspect. Identified events proceed through the sequence until they are assigned an aspect. We achieve a recall of 76.17% for identifying UMR events and an accuracy of 62.57% on all identified events, with high precision values for 2 of the aspect labels.

pdf abs
Automatic Entity State Annotation using the VerbNet Semantic Parser
Ghazaleh Kazeminejad | Martha Palmer | Tao Li | Vivek Srikumar
Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop

Tracking entity states is a natural language processing task assumed to require human annotation. In order to reduce the time and expenses associated with annotation, we introduce a new method to automatically extract entity states, including location and existence state of entities, following Dalvi et al. (2018) and Tandon et al. (2020). For this purpose, we rely primarily on the semantic representations generated by the state of the art VerbNet parser (Gung, 2020), and extract the entities (event participants) and their states, based on the semantic predicates of the generated VerbNet semantic representation, which is in propositional logic format. For evaluation, we used ProPara (Dalvi et al., 2018), a reading comprehension dataset which is annotated with entity states in each sentence, and tracks those states in paragraphs of natural human-authored procedural texts. Given the presented limitations of the method, the peculiarities of the ProPara dataset annotations, and that our system, Lexis, makes no use of task-specific training data and relies solely on VerbNet, the results are promising, showcasing the value of lexical resources.

pdf abs
Predicate Representations and Polysemy in VerbNet Semantic Parsing
James Gung | Martha Palmer
Proceedings of the 14th International Conference on Computational Semantics (IWCS)

Despite recent advances in semantic role labeling propelled by pre-trained text encoders like BERT, performance lags behind when applied to predicates observed infrequently during training or to sentences in new domains. In this work, we investigate how role labeling performance on low-frequency predicates and out-of-domain data can be further improved by using VerbNet, a verb lexicon that groups verbs into hierarchical classes based on shared syntactic and semantic behavior and defines semantic representations describing relations between arguments. We find that VerbNet classes provide an effective level of abstraction, improving generalization on low-frequency predicates by allowing them to learn from the training examples of other predicates belonging to the same class. We also find that joint training of VerbNet role labeling and predicate disambiguation of VerbNet classes for polysemous verbs leads to improvements in both tasks, naturally supporting the extraction of VerbNet’s semantic representations.

pdf abs
Tuning Deep Active Learning for Semantic Role Labeling
Skatje Myers | Martha Palmer
Proceedings of the 14th International Conference on Computational Semantics (IWCS)

Active learning has been shown to reduce annotation requirements for numerous natural language processing tasks, including semantic role labeling (SRL). SRL involves labeling argument spans for potentially multiple predicates in a sentence, which makes it challenging to aggregate the numerous decisions into a single score for determining new instances to annotate. In this paper, we apply two ways of aggregating scores across multiple predicates in order to choose query sentences with two methods of estimating model certainty: using the neural network’s outputs and using dropout-based Bayesian Active Learning by Disagreement. We compare these methods with three passive baselines — random sentence selection, random whole-document selection, and selecting sentences with the most predicates — and analyse the effect these strategies have on the learning curve with respect to reducing the number of annotated sentences and predicates to achieve high performance.

The SemLink resource provides mappings between a variety of lexical semantic ontologies, each with their strengths and weaknesses. To take advantage of these differences, the ability to move between resources is essential. This work describes advances made to improve the usability of the SemLink resource: the automatic addition of new instances and mappings, manual corrections, sense-based vectors and collocation information, and architecture built to automatically update the resource when versions of the underlying resources change. These updates improve coverage, provide new tools to leverage the capabilities of these resources, and facilitate seamless updates, ensuring the consistency and applicability of these mappings in the future.

Acquiring training data for natural language processing systems can be expensive and time-consuming. Given a few training examples crafted by experts, large corpora can be mined for thousands of semantically similar examples that provide useful variability to improve model generalization. We present TopGuNN, a fast contextualized k-NN retrieval system that can efficiently index and search over contextual embeddings generated from large corpora. TopGuNN is demonstrated for a training data augmentation use case over the Gigaword corpus. Using approximate k-NN and an efficient architecture, TopGuNN performs queries over an embedding space of 4.63TB (approximately 1.5B embeddings) in less than a day.

pdf abs
Fine-grained Information Extraction from Biomedical Literature based on Knowledge-enriched Abstract Meaning Representation
Zixuan Zhang | Nikolaus Parulian | Heng Ji | Ahmed Elsayed | Skatje Myers | Martha Palmer
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Biomedical Information Extraction from scientific literature presents two unique and non-trivial challenges. First, compared with general natural language texts, sentences from scientific papers usually possess wider contexts between knowledge elements. Moreover, comprehending the fine-grained scientific entities and events urgently requires domain-specific background knowledge. In this paper, we propose a novel biomedical Information Extraction (IE) model to tackle these two challenges and extract scientific entities and events from English research papers. We perform Abstract Meaning Representation (AMR) to compress the wide context to uncover a clear semantic structure for each complex sentence. Besides, we construct the sentence-level knowledge graph from an external knowledge base and use it to enrich the AMR graph to improve the model’s understanding of complex scientific concepts. We use an edge-conditioned graph attention network to encode the knowledge-enriched AMR graph for biomedical IE tasks. Experiments on the GENIA 2011 dataset show that the AMR and external knowledge have contributed 1.8% and 3.0% absolute F-score gains respectively. In order to evaluate the impact of our approach on real-world problems that involve topic-specific fine-grained knowledge elements, we have also created a new ontology and annotated corpus for entity and event extraction for the COVID-19 scientific literature, which can serve as a new benchmark for the biomedical IE community.

pdf abs
A Graphical Interface for Curating Schemas
Piyush Mishra | Akanksha Malhotra | Susan Windisch Brown | Martha Palmer | Ghazaleh Kazeminejad
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

Much past work has focused on extracting information like events, entities, and relations from documents. Very little work has focused on analyzing these results for better model understanding. In this paper, we introduce a curation interface that takes an Information Extraction (IE) system’s output in a pre-defined format and generates a graphical representation of its elements. The interface supports editing while curating schemas for complex events like Improvised Explosive Device (IED) based scenarios. We identify various schemas that either have linear event chains or contain parallel events with complicated temporal ordering. We iteratively update an induced schema to uniquely identify events specific to it, add optional events around them, and prune unnecessary events. The resulting schemas are improved and enriched versions of the machine-induced versions.

2020

pdf abs
Leveraging Non-Specialists for Accurate and Time Efficient AMR Annotation
Mary Martin | Cecilia Mauceri | Martha Palmer | Christoffer Heckman
Proceedings of the LREC 2020 Workshop on "Citizen Linguistics in Language Resource Development"

Abstract Meaning Representations (AMRs), a syntax-free representation of phrase semantics are useful for capturing the meaning of a phrase and reflecting the relationship between concepts that are referred to. However, annotating AMRs are time consuming and expensive. The existing annotation process requires expertly trained workers who have knowledge of an extensive set of guidelines for parsing phrases. In this paper, we propose a cost-saving two-step process for the creation of a corpus of AMR-phrase pairs for spatial referring expressions. The first step uses non-specialists to perform simple annotations that can be leveraged in the second step to accelerate the annotation performed by the experts. We hypothesize that our process will decrease the cost per annotation and improve consistency across annotators. Few corpora of spatial referring expressions exist and the resulting language resource will be valuable for referring expression comprehension and generation modeling.

pdf abs
Spatial AMR: Expanded Spatial Annotation in the Context of a Grounded Minecraft Corpus
Julia Bonn | Martha Palmer | Zheng Cai | Kristin Wright-Bettner
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper presents an expansion to the Abstract Meaning Representation (AMR) annotation schema that captures fine-grained semantically and pragmatically derived spatial information in grounded corpora. We describe a new lexical category conceptualization and set of spatial annotation tools built in the context of a multimodal corpus consisting of 170 3D structure-building dialogues between a human architect and human builder in Minecraft. Minecraft provides a particularly beneficial spatial relation-elicitation environment because it automatically tracks locations and orientations of objects and avatars in the space according to an absolute Cartesian coordinate system. Through a two-step process of sentence-level and document-level annotation designed to capture implicit information, we leverage these coordinates and bearings in the AMRs in combination with spatial framework annotation to ground the spatial language in the dialogues to absolute space.

Spatial Reasoning from language is essential for natural language understanding. Supporting it requires a representation scheme that can capture spatial phenomena encountered in language as well as in images and videos. Existing spatial representations are not sufficient for describing spatial configurations used in complex tasks. This paper extends the capabilities of existing spatial representation languages and increases coverage of the semantic aspects that are needed to ground spatial meaning of natural language text in the world. Our spatial relation language is able to represent a large, comprehensive set of spatial concepts crucial for reasoning and is designed to support composition of static and dynamic spatial configurations. We integrate this language with the Abstract Meaning Representation (AMR) annotation schema and present a corpus annotated by this extended AMR. To exhibit the applicability of our representation scheme, we annotate text taken from diverse datasets and show how we extend the capabilities of existing spatial representation languages with fine-grained decomposition of semantics and blend it seamlessly with AMRs of sentences and discourse representations as a whole.

pdf abs
The Russian PropBank
Sarah Moeller | Irina Wagner | Martha Palmer | Kathryn Conger | Skatje Myers
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper presents a proposition bank for Russian (RuPB), a resource for semantic role labeling (SRL). The motivating goal for this resource is to automatically project semantic role labels from English to Russian. This paper describes frame creation strategies, coverage, and the process of sense disambiguation. It discusses language-specific issues that complicated the process of building the PropBank and how these challenges were exploited as language-internal guidance for consistency and coherence.

We present refinements over existing temporal relation annotations in the Electronic Medical Record clinical narrative. We refined the THYME corpus annotations to more faithfully represent nuanced temporality and nuanced temporal-coreferential relations. The main contributions are in re-defining CONTAINS and OVERLAP relations into CONTAINS, CONTAINS-SUBEVENT, OVERLAP and NOTED-ON. We demonstrate that these refinements lead to substantial gains in learnability for state-of-the-art transformer models as compared to previously reported results on the original THYME corpus. We thus establish a baseline for the automatic extraction of these refined temporal relations. Although our study is done on clinical narrative, we believe it addresses far-reaching challenges that are corpus- and domain- agnostic.

pdf abs
Structured Tuning for Semantic Role Labeling
Tao Li | Parth Anand Jawale | Martha Palmer | Vivek Srikumar
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Recent neural network-driven semantic role labeling (SRL) systems have shown impressive improvements in F1 scores. These improvements are due to expressive input representations, which, at least at the surface, are orthogonal to knowledge-rich constrained decoding mechanisms that helped linear SRL models. Introducing the benefits of structure to inform neural models presents a methodological challenge. In this paper, we present a structured tuning framework to improve models using softened constraints only at training time. Our framework leverages the expressiveness of neural networks and provides supervision with structured loss components. We start with a strong baseline (RoBERTa) to validate the impact of our approach, and show that our framework outperforms the baseline by learning to comply with declarative constraints. Additionally, our experiments with smaller training sizes show that we can achieve consistent improvements under low-resource scenarios.

2019

pdf abs
Linguistic Analysis Improves Neural Metaphor Detection
Kevin Stowe | Sarah Moeller | Laura Michaelis | Martha Palmer
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

In the field of metaphor detection, deep learning systems are the ubiquitous and achieve strong performance on many tasks. However, due to the complicated procedures for manually identifying metaphors, the datasets available are relatively small and fraught with complications. We show that using syntactic features and lexical resources can automatically provide additional high-quality training data for metaphoric language, and this data can cover gaps and inconsistencies in metaphor annotation, improving state-of-the-art word-level metaphor identification. This novel application of automatically improving training data improves classification across numerous tasks, and reconfirms the necessity of high-quality data for deep learning frameworks.

pdf abs
ClearTAC: Verb Tense, Aspect, and Form Classification Using Neural Nets
Skatje Myers | Martha Palmer
Proceedings of the First International Workshop on Designing Meaning Representations

This paper proposes using a Bidirectional LSTM-CRF model in order to identify the tense and aspect of verbs. The information that this classifier outputs can be useful for ordering events and can provide a pre-processing step to improve efficiency of annotating this type of information. This neural network architecture has been successfully employed for other sequential labeling tasks, and we show that it significantly outperforms the rule-based tool TMV-annotator on the Propbank I dataset.

This paper announces the release of a new version of the English lexical resource VerbNet with substantially revised semantic representations designed to facilitate computer planning and reasoning based on human language. We use the transfer of possession and transfer of information event representations to illustrate both the general framework of the representations and the types of nuances the new representations can capture. These representations use a Generative Lexicon-inspired subevent structure to track attributes of event participants across time, highlighting oppositions and temporal and causal relations among the subevents.

pdf abs
Explaining Simple Natural Language Inference
Aikaterini-Lida Kalouli | Annebeth Buis | Livy Real | Martha Palmer | Valeria de Paiva
Proceedings of the 13th Linguistic Annotation Workshop

The vast amount of research introducing new corpora and techniques for semi-automatically annotating corpora shows the important role that datasets play in today’s research, especially in the machine learning community. This rapid development raises concerns about the quality of the datasets created and consequently of the models trained, as recently discussed with respect to the Natural Language Inference (NLI) task. In this work we conduct an annotation experiment based on a small subset of the SICK corpus. The experiment reveals several problems in the annotation guidelines, and various challenges of the NLI task itself. Our quantitative evaluation of the experiment allows us to assign our empirical observations to specific linguistic phenomena and leads us to recommendations for future annotation tasks, for NLI and possibly for other tasks.

pdf abs
Enhancing biomedical word embeddings by retrofitting to verb clusters
Billy Chiu | Simon Baker | Martha Palmer | Anna Korhonen
Proceedings of the 18th BioNLP Workshop and Shared Task

Verbs play a fundamental role in many biomed-ical tasks and applications such as relation and event extraction. We hypothesize that performance on many downstream tasks can be improved by aligning the input pretrained embeddings according to semantic verb classes. In this work, we show that by using semantic clusters for verbs, a large lexicon of verbclasses derived from biomedical literature, weare able to improve the performance of common pretrained embeddings in downstream tasks by retrofitting them to verb classes. We present a simple and computationally efficient approach using a widely-available “off-the-shelf” retrofitting algorithm to align pretrained embeddings according to semantic verb clusters. We achieve state-of-the-art results on text classification and relation extraction tasks.

pdf bib abs
Cross-document coreference: An approach to capturing coreference without context
Kristin Wright-Bettner | Martha Palmer | Guergana Savova | Piet de Groen | Timothy Miller
Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)

This paper discusses a cross-document coreference annotation schema that was developed to further automatic extraction of timelines in the clinical domain. Lexical senses and coreference choices are determined largely by context, but cross-document work requires reasoning across contexts that are not necessarily coherent. We found that an annotation approach that relies less on context-guided annotator intuitions and more on schematic rules was most effective in creating meaningful and consistent cross-document relations.

pdf bib abs
Syntactic composition and selectional preferences in Hindi Light Verb Constructions
Ashwini Vaidya | Martha Palmer
Linguistic Issues in Language Technology, Volume 17, 2019

Previous work on light verb constructions (e.g. chorii kar ‘theft do; steal’) in Hindi describes their syntactic formation via co-predication (Ahmed et al., 2012, Butt, 2014). This implies that both noun and light verb contribute their arguments, and these overlapping argument structures must be composed in the syntax. In this paper, we present a co-predication analysis using Tree-Adjoining Grammar, which models syntactic composition and semantic selectional preferences without transformations (deletion or argument identification). The analysis has two key components (i) an underspecified category for the nominal and (ii) combinatorial constraints on the noun and light verb to specify selectional preferences. The former has the advantage of syntactic composition without argument identification and the latter prevents over-generalization, while recognizing the semantic contribution of both predicates. This work additionally accounts for the agreement facts for the Hindi LVC.

2018

pdf abs
Automatically Extracting Qualia Relations for the Rich Event Ontology
Ghazaleh Kazeminejad | Claire Bonial | Susan Windisch Brown | Martha Palmer
Proceedings of the 27th International Conference on Computational Linguistics

Commonsense, real-world knowledge about the events that entities or “things in the world” are typically involved in, as well as part-whole relationships, is valuable for allowing computational systems to draw everyday inferences about the world. Here, we focus on automatically extracting information about (1) the events that typically bring about certain entities (origins), (2) the events that are the typical functions of entities, and (3) part-whole relationships in entities. These correspond to the agentive, telic and constitutive qualia central to the Generative Lexicon. We describe our motivations and methods for extracting these qualia relations from the Suggested Upper Merged Ontology (SUMO) and show that human annotators overwhelmingly find the information extracted to be reasonable. Because ontologies provide a way of structuring this information and making it accessible to agents and computational systems generally, efforts are underway to incorporate the extracted information to an ontology hub of Natural Language Processing semantic role labeling resources, the Rich Event Ontology.

There are few corpora that endeavor to represent the semantic content of entire documents. We present a corpus that accomplishes one way of capturing document level semantics, by annotating coreference and similar phenomena (bridging and implicit roles) on top of gold Abstract Meaning Representations of sentence-level semantics. We present a new corpus of this annotation, with analysis of its quality, alongside a plausible baseline for comparison. It is hoped that this Multi-Sentence AMR corpus (MS-AMR) may become a feasible method for developing rich representations of document meaning, useful for tasks such as information extraction and question answering.

pdf abs
Leveraging Syntactic Constructions for Metaphor Identification
Kevin Stowe | Martha Palmer
Proceedings of the Workshop on Figurative Language Processing

Identification of metaphoric language in text is critical for generating effective semantic representations for natural language understanding. Computational approaches to metaphor identification have largely relied on heuristic based models or feature-based machine learning, using hand-crafted lexical resources coupled with basic syntactic information. However, recent work has shown the predictive power of syntactic constructions in determining metaphoric source and target domains (Sullivan 2013). Our work intends to explore syntactic constructions and their relation to metaphoric language. We undertake a corpus-based analysis of predicate-argument constructions and their metaphoric properties, and attempt to effectively represent syntactic constructions as features for metaphor processing, both in identifying source and target domains and in distinguishing metaphoric words from non-metaphoric.

pdf abs
Improving Classification of Twitter Behavior During Hurricane Events
Kevin Stowe | Jennings Anderson | Martha Palmer | Leysia Palen | Ken Anderson
Proceedings of the Sixth International Workshop on Natural Language Processing for Social Media

A large amount of social media data is generated during natural disasters, and identifying the relevant portions of this data is critical for researchers attempting to understand human behavior, the effects of information sources, and preparatory actions undertaken during these events. In order to classify human behavior during hazard events, we employ machine learning for two tasks: identifying hurricane related tweets and classifying user evacuation behavior during hurricanes. We show that feature-based and deep learning methods provide different benefits for tweet classification, and ensemble-based methods using linguistic, temporal, and geospatial features can effectively classify user behavior.

When a hazard such as a hurricane threatens, people are forced to make a wide variety of decisions, and the information they receive and produce can influence their own and others’ actions. As social media grows more popular, an increasing number of people are using social media platforms to obtain and share information about approaching threats and discuss their interpretations of the threat and their protective decisions. This work aims to improve understanding of natural disasters through social media and provide an annotation scheme to identify themes in user’s social media behavior and facilitate efforts in supervised machine learning. To that end, this work has three contributions: (1) the creation of an annotation scheme to consistently identify hazard-related themes in Twitter, (2) an overview of agreement rates and difficulties in identifying annotation categories, and (3) a public release of both the dataset and guidelines developed from this scheme.

pdf abs
SemEval 2018 Task 6: Parsing Time Normalizations
Egoitz Laparra | Dongfang Xu | Ahmed Elsayed | Steven Bethard | Martha Palmer
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper presents the outcomes of the Parsing Time Normalization shared task held within SemEval-2018. The aim of the task is to parse time expressions into the compositional semantic graphs of the Semantically Compositional Annotation of Time Expressions (SCATE) schema, which allows the representation of a wider variety of time expressions than previous approaches. Two tracks were included, one to evaluate the parsing of individual components of the produced graphs, in a classic information extraction way, and another one to evaluate the quality of the time intervals resulting from the interpretation of those graphs. Though 40 participants registered for the task, only one team submitted output, achieving 0.55 F1 in Track 1 (parsing) and 0.70 F1 in Track 2 (intervals).

pdf
Integrating Generative Lexicon Event Structures into VerbNet
Susan Windisch Brown | James Pustejovsky | Annie Zaenen | Martha Palmer
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf
The New Propbank: Aligning Propbank with AMR through POS Unification
Tim O’Gorman | Sameer Pradhan | Martha Palmer | Julia Bonn | Katie Conger | James Gung
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

pdf abs
The Rich Event Ontology
Susan Brown | Claire Bonial | Leo Obrst | Martha Palmer
Proceedings of the Events and Stories in the News Workshop

In this paper we describe a new lexical semantic resource, The Rich Event On-tology, which provides an independent conceptual backbone to unify existing semantic role labeling (SRL) schemas and augment them with event-to-event causal and temporal relations. By unifying the FrameNet, VerbNet, Automatic Content Extraction, and Rich Entities, Relations and Events resources, the ontology serves as a shared hub for the disparate annotation schemas and therefore enables the combination of SRL training data into a larger, more diverse corpus. By adding temporal and causal relational information not found in any of the independent resources, the ontology facilitates reasoning on and across documents, revealing relationships between events that come together in temporal and causal chains to build more complex scenarios. We envision the open resource serving as a valuable tool for both moving from the ontology to text to query for event types and scenarios of interest, and for moving from text to the ontology to access interpretations of events using the combined semantic information housed there.

Agents that communicate back and forth with humans to help them execute non-linguistic tasks are a long sought goal of AI. These agents need to translate between utterances and actionable meaning representations that can be interpreted by task-specific problem solvers in a context-dependent manner. They should also be able to learn such actionable interpretations for new predicates on the fly. We define an agent architecture for this scenario and present a series of experiments in the Blocks World domain that illustrate how our architecture supports language learning and problem solving in this domain.

pdf abs
Unsupervised AMR-Dependency Parse Alignment
Wei-Te Chen | Martha Palmer
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

In this paper, we introduce an Abstract Meaning Representation (AMR) to Dependency Parse aligner. Alignment is a preliminary step for AMR parsing, and our aligner improves current AMR parser performance. Our aligner involves several different features, including named entity tags and semantic role labels, and uses Expectation-Maximization training. Results show that our aligner reaches an 87.1% F-Score score with the experimental data, and enhances AMR parsing.

pdf abs
SemEval-2017 Task 12: Clinical TempEval
Steven Bethard | Guergana Savova | Martha Palmer | James Pustejovsky
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

Clinical TempEval 2017 aimed to answer the question: how well do systems trained on annotated timelines for one medical condition (colon cancer) perform in predicting timelines on another medical condition (brain cancer)? Nine sub-tasks were included, covering problems in time expression identification, event expression identification and temporal relation identification. Participant systems were evaluated on clinical and pathology notes from Mayo Clinic cancer patients, annotated with an extension of TimeML for the clinical domain. 11 teams participated in the tasks, with the best systems achieving F1 scores above 0.55 for time expressions, above 0.70 for event expressions, and above 0.40 for temporal relations. Most tasks observed about a 20 point drop over Clinical TempEval 2016, where systems were trained and evaluated on the same domain (colon cancer).

pdf bib
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Martha Palmer | Rebecca Hwa | Sebastian Riedel
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

2016

High accuracy for automated translation and information retrieval calls for linguistic annotations at various language levels. The plethora of informal internet content sparked the demand for porting state-of-art natural language processing (NLP) applications to new social media as well as diverse language adaptation. Effort launched by the BOLT (Broad Operational Language Translation) program at DARPA (Defense Advanced Research Projects Agency) successfully addressed the internet information with enhanced NLP systems. BOLT aims for automated translation and linguistic analysis for informal genres of text and speech in online and in-person communication. As a part of this program, the Linguistic Data Consortium (LDC) developed valuable linguistic resources in support of the training and evaluation of such new technologies. This paper focuses on methodologies, infrastructure, and procedure for developing linguistic annotation at various language levels, including Treebank (TB), word alignment (WA), PropBank (PB), and co-reference (CoRef). Inspired by the OntoNotes approach with adaptations to the tasks to reflect the goals and scope of the BOLT project, this effort has introduced more annotation types of informal and free-style genres in English, Chinese and Egyptian Arabic. The corpus produced is by far the largest multi-lingual, multi-level and multi-genre annotation corpus of informal text and speech.

This paper describes our efforts for the development of a Proposition Bank for Urdu, an Indo-Aryan language. Our primary goal is the labeling of syntactic nodes in the existing Urdu dependency Treebank with specific argument labels. In essence, it involves annotation of predicate argument structures of both simple and complex predicates in the Treebank corpus. We describe the overall process of building the PropBank of Urdu. We discuss various statistics pertaining to the Urdu PropBank and the issues which the annotators encountered while developing the PropBank. We also discuss how these challenges were addressed to successfully expand the PropBank corpus. While reporting the Inter-annotator agreement between the two annotators, we show that the annotators share similar understanding of the annotation guidelines and of the linguistic phenomena present in the language. The present size of this Propbank is around 180,000 tokens which is double-propbanked by the two annotators for simple predicates. Another 100,000 tokens have been annotated for complex predicates of Urdu.

pdf abs
Comprehensive and Consistent PropBank Light Verb Annotation
Claire Bonial | Martha Palmer
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Recent efforts have focused on expanding the annotation coverage of PropBank from verb relations to adjective and noun relations, as well as light verb constructions (e.g., make an offer, take a bath). While each new relation type has presented unique annotation challenges, ensuring consistent and comprehensive annotation of light verb constructions has proved particularly challenging, given that light verb constructions are semi-productive, difficult to define, and there are often borderline cases. This research describes the iterative process of developing PropBank annotation guidelines for light verb constructions, the current guidelines, and a comparison to related resources.

pdf
Leveraging VerbNet to build Corpus-Specific Verb Clusters
Daniel Peterson | Jordan Boyd-Graber | Martha Palmer | Daisuke Kawahara
Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics

pdf bib
Proceedings of the Fourth Workshop on Events
Martha Palmer | Ed Hovy | Teruko Mitamura | Tim O’Gorman
Proceedings of the Fourth Workshop on Events

pdf
Multimodal Use of an Upper-Level Event Ontology
Claire Bonial | David Tahmoush | Susan Windisch Brown | Martha Palmer
Proceedings of the Fourth Workshop on Events

pdf
Richer Event Description: Integrating event coreference with temporal, causal and bridging annotation
Tim O’Gorman | Kristin Wright-Bettner | Martha Palmer
Proceedings of the 2nd Workshop on Computing News Storylines (CNS 2016)

pdf bib
Identifying and Categorizing Disaster-Related Tweets
Kevin Stowe | Michael J. Paul | Martha Palmer | Leysia Palen | Kenneth Anderson
Proceedings of the Fourth International Workshop on Natural Language Processing for Social Media

pdf abs
Linguistic features for Hindi light verb construction identification
Ashwini Vaidya | Sumeet Agarwal | Martha Palmer
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Light verb constructions (LVC) in Hindi are highly productive. If we can distinguish a case such as nirnay lenaa ‘decision take; decide’ from an ordinary verb-argument combination kaagaz lenaa ‘paper take; take (a) paper’,it has been shown to aid NLP applications such as parsing (Begum et al., 2011) and machine translation (Pal et al., 2011). In this paper, we propose an LVC identification system using language specific features for Hindi which shows an improvement over previous work(Begum et al., 2011). To build our system, we carry out a linguistic analysis of Hindi LVCs using Hindi Treebank annotations and propose two new features that are aimed at capturing the diversity of Hindi LVCs in the corpus. We find that our model performs robustly across a diverse range of LVCs and our results underscore the importance of semantic features, which is in keeping with the findings for English. Our error analysis also demonstrates that our classifier can be used to further refine LVC annotations in the Hindi Treebank and make them more consistent across the board.

2015

pdf bib
Proceedings of the 3rd Workshop on EVENTS: Definition, Detection, Coreference, and Representation
Eduard Hovy | Teruko Mitamura | Martha Palmer
Proceedings of the 3rd Workshop on EVENTS: Definition, Detection, Coreference, and Representation

pdf
Improving Chinese-English PropBank Alignment
Shumin Wu | Martha Palmer
Proceedings of the Ninth Workshop on Syntax, Semantics and Structure in Statistical Translation

pdf
A Hierarchy with, of, and for Preposition Supersenses
Nathan Schneider | Vivek Srikumar | Jena D. Hwang | Martha Palmer
Proceedings of the 9th Linguistic Annotation Workshop

pdf bib
Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics
Martha Palmer | Gemma Boleda | Paolo Rosso
Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics

pdf
Identification of Caused Motion Construction
Jena D. Hwang | Martha Palmer
Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics

pdf
Can Selectional Preferences Help Automatic Semantic Role Labeling?
Shumin Wu | Martha Palmer
Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics

2014

pdf abs
PropBank: Semantics of New Predicate Types
Claire Bonial | Julia Bonn | Kathryn Conger | Jena D. Hwang | Martha Palmer
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This research focuses on expanding PropBank, a corpus annotated with predicate argument structures, with new predicate types; namely, noun, adjective and complex predicates, such as Light Verb Constructions. This effort is in part inspired by a sister project to PropBank, the Abstract Meaning Representation project, which also attempts to capture who is doing what to whom in a sentence, but does so in a way that abstracts away from syntactic structures. For example, alternate realizations of a ‘destroying’ event in the form of either the verb ‘destroy’ or the noun ‘destruction’ would receive the same Abstract Meaning Representation. In order for PropBank to reach the same level of coverage and continue to serve as the bedrock for Abstract Meaning Representation, predicate types other than verbs, which have previously gone without annotation, must be annotated. This research describes the challenges therein, including the development of new annotation practices that walk the line between abstracting away from language-particular syntactic facts to explore deeper semantics, and maintaining the connection between semantics and syntactic structures that has proven to be very valuable for PropBank as a corpus of training data for Natural Language Processing applications.

pdf abs
Not an Interlingua, But Close: Comparison of English AMRs to Chinese and Czech
Nianwen Xue | Ondřej Bojar | Jan Hajič | Martha Palmer | Zdeňka Urešová | Xiuhong Zhang
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Abstract Meaning Representations (AMRs) are rooted, directional and labeled graphs that abstract away from morpho-syntactic idiosyncrasies such as word category (verbs and nouns), word order, and function words (determiners, some prepositions). Because these syntactic idiosyncrasies account for many of the cross-lingual differences, it would be interesting to see if this representation can serve, e.g., as a useful, minimally divergent transfer layer in machine translation. To answer this question, we have translated 100 English sentences that have existing AMRs into Chinese and Czech to create AMRs for them. A cross-linguistic comparison of English to Chinese and Czech AMRs reveals both cases where the AMRs for the language pairs align well structurally and cases of linguistic divergence. We found that the level of compatibility of AMR between English and Chinese is higher than between English and Czech. We believe this kind of comparison is beneficial to further refining the annotation standards for each of the three languages and will lead to more compatible annotation guidelines between the languages.

pdf abs
Criteria for Identifying and Annotating Caused Motion Constructions in Corpus Data
Jena D. Hwang | Annie Zaenen | Martha Palmer
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

While natural language processing performance has been improved through the recognition that there is a relationship between the semantics of the verb and the syntactic context in which the verb is realized, sentences where the verb does not conform to the expected syntax-semantic patterning behavior remain problematic. For example, in the sentence The crowed laughed the clown off the stage, a verb of non-verbal communication laugh is used in a caused motion construction and gains a motion entailment that is atypical given its inherent lexical semantics. This paper focuses on our efforts at defining the semantic types and varieties of caused motion constructions (CMCs) through an iterative annotation process and establishing annotation guidelines based on these criteria to aid in the production of a consistent and reliable annotation. The annotation will serve as training and test data for classifiers for CMCs, and the CMC definitions developed throughout this study will be used in extending VerbNet to handle representations of sentences in which a verb is used in a syntactic context that is atypical for its lexical semantics.

pdf abs
Mapping CPA Patterns onto OntoNotes Senses
Octavian Popescu | Martha Palmer | Patrick Hanks
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper we present an alignment experiment between patterns of verb use discovered by Corpus Pattern Analysis (CPA; Hanks 2004, 2008, 2012) and verb senses in OntoNotes (ON; Hovy et al. 2006, Weischedel et al. 2011). We present a probabilistic approach for mapping one resource into the other. Firstly we introduce a basic model, based on conditional probabilities, which determines for any given sentence the best CPA pattern match. On the basis of this model, we propose a joint source channel model (JSCM) that computes the probability of compatibility of semantic types between a verb phrase and a pattern, irrespective of whether the verb phrase is a norm or an exploitation. We evaluate the accuracy of the proposed mapping using cluster similarity metrics based on entropy.

pdf abs
Single Classifier Approach for Verb Sense Disambiguation based on Generalized Features
Daisuke Kawahara | Martha Palmer
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present a supervised method for verb sense disambiguation based on VerbNet. Most previous supervised approaches to verb sense disambiguation create a classifier for each verb that reaches a frequency threshold. These methods, however, have a significant practical problem that they cannot be applied to rare or unseen verbs. In order to overcome this problem, we create a single classifier to be applied to rare or unseen verbs in a new text. This single classifier also exploits generalized semantic features of a verb and its modifiers in order to better deal with rare or unseen verbs. Our experimental results show that the proposed method achieves equivalent performance to per-verb classifiers, which cannot be applied to unseen verbs. Our classifier could be utilized to improve the classifications in lexical resources of verbs, such as VerbNet, in a semi-automatic manner and to possibly extend the coverage of these resources to new verbs.

pdf abs
Focusing Annotation for Semantic Role Labeling
Daniel Peterson | Martha Palmer | Shumin Wu
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Annotation of data is a time-consuming process, but necessary for many state-of-the-art solutions to NLP tasks, including semantic role labeling (SRL). In this paper, we show that language models may be used to select sentences that are more useful to annotate. We simulate a situation where only a portion of the available data can be annotated, and compare language model based selection against a more typical baseline of randomly selected data. The data is ordered using an off-the-shelf language modeling toolkit. We show that the least probable sentences provide dramatic improved system performance over the baseline, especially when only a small portion of the data is annotated. In fact, the lion’s share of the performance can be attained by annotating only 10-20% of the data. This result holds for training a model based on new annotation, as well as when adding domain-specific annotation to a general corpus for domain adaptation.

pdf
A Step-wise Usage-based Method for Inducing Polysemy-aware Verb Classes
Daisuke Kawahara | Daniel W. Peterson | Martha Palmer
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf
The VerbCorner Project: Findings from Phase 1 of crowd-sourcing a semantic decomposition of verbs
Joshua K. Hartshorne | Claire Bonial | Martha Palmer
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

This article discusses the requirements of a formal specification for the annotation of temporal information in clinical narratives. We discuss the implementation and extension of ISO-TimeML for annotating a corpus of clinical notes, known as the THYME corpus. To reflect the information task and the heavily inference-based reasoning demands in the domain, a new annotation guideline has been developed, “the THYME Guidelines to ISO-TimeML (THYME-TimeML)”. To clarify what relations merit annotation, we distinguish between linguistically-derived and inferentially-derived temporal orderings in the text. We also apply a top performing TempEval 2013 system against this new resource to measure the difficulty of adapting systems to the clinical domain. The corpus is available to the community and has been proposed for use in a SemEval 2015 task.

pdf bib
Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)
Oleksandr Kolomiyets | Marie-Francine Moens | Martha Palmer | James Pustejovsky | Steven Bethard
Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)

pdf
An Approach to Take Multi-Word Expressions
Claire Bonial | Meredith Green | Jenette Preciado | Martha Palmer
Proceedings of the 10th Workshop on Multiword Expressions (MWE)

pdf bib
Proceedings of the Second Workshop on EVENTS: Definition, Detection, Coreference, and Representation
Teruko Mitamura | Eduard Hovy | Martha Palmer
Proceedings of the Second Workshop on EVENTS: Definition, Detection, Coreference, and Representation

pdf
Challenges of Adding Causation to Richer Event Descriptions
Rei Ikuta | Will Styler | Mariah Hamang | Tim O’Gorman | Martha Palmer
Proceedings of the Second Workshop on EVENTS: Definition, Detection, Coreference, and Representation

pdf
SemLink+: FrameNet, VerbNet and Event Ontologies
Martha Palmer | Claire Bonial | Diana McCarthy
Proceedings of Frame Semantics in NLP: A Workshop in Honor of Chuck Fillmore (1929-2014)

pdf
Light verb constructions with ‘do’ and ‘be’ in Hindi: A TAG analysis
Ashwini Vaidya | Owen Rambow | Martha Palmer
Proceedings of Workshop on Lexical and Grammatical Resources for Language Processing

pdf
Inducing Example-based Semantic Frames from a Massive Amount of Verb Uses
Daisuke Kawahara | Daniel Peterson | Octavian Popescu | Martha Palmer
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

2013

pdf
Semantic Role Labeling
Martha Palmer | Ivan Titov | Shumin Wu
NAACL HLT 2013 Tutorial Abstracts

pdf
The VerbCorner Project: Toward an Empirically-Based Semantic Decomposition of Verbs
Joshua K. Hartshorne | Claire Bonial | Martha Palmer
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf
Complex Predicates are Multi-Word Expressions
Martha Palmer
Proceedings of the 9th Workshop on Multiword Expressions

pdf
Semantic Roles for Nominal Predicates: Building a Lexical Resource
Ashwini Vaidya | Martha Palmer | Bhuvana Narasimhan
Proceedings of the 9th Workshop on Multiword Expressions

pdf bib
Workshop on Events: Definition, Detection, Coreference, and Representation
Eduard Hovy | Teruko Mitamura | Martha Palmer
Workshop on Events: Definition, Detection, Coreference, and Representation

pdf
Expanding VerbNet with Sketch Engine
Claire Bonial | Orin Hargraves | Martha Palmer
Proceedings of the 6th International Conference on Generative Approaches to the Lexicon (GL2013)

pdf
Renewing and Revising SemLink
Claire Bonial | Kevin Stowe | Martha Palmer
Proceedings of the 2nd Workshop on Linked Data in Linguistics (LDL-2013): Representing and linking lexicons, terminologies and other language data

2012

pdf bib
Question Ranking and Selection in Tutorial Dialogues
Lee Becker | Martha Palmer | Sarel van Vuuren | Wayne Ward
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP

pdf abs
Empty Argument Insertion in the Hindi PropBank
Ashwini Vaidya | Jinho D. Choi | Martha Palmer | Bhuvana Narasimhan
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper examines both linguistic behavior and practical implication of empty argument insertion in the Hindi PropBank. The Hindi PropBank is annotated on the Hindi Dependency Treebank, which contains some empty categories but not the empty arguments of verbs. In this paper, we analyze four kinds of empty arguments, *PRO*, *REL*, *GAP*, *pro*, and suggest effective ways of annotating these arguments. Empty arguments such as *PRO* and *REL* can be inserted deterministically; we present linguistically motivated rules that automatically insert these arguments with high accuracy. On the other hand, it is difficult to find deterministic rules to insert *GAP* and *pro*; for these arguments, we introduce a new annotation scheme that concurrently handles both semantic role labeling and empty category insertion, producing fast and high quality annotation. In addition, we present algorithms for finding antecedents of *REL* and *PRO*, and discuss why finding antecedents for some types of *PRO* is difficult.

pdf abs
Foundations of a Multilayer Annotation Framework for Twitter Communications During Crisis Events
William J. Corvey | Sudha Verma | Sarah Vieweg | Martha Palmer | James H. Martin
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In times of mass emergency, vast amounts of data are generated via computer-mediated communication (CMC) that are difficult to manually collect and organize into a coherent picture. Yet valuable information is broadcast, and can provide useful insight into time- and safety-critical situations if captured and analyzed efficiently and effectively. We describe a natural language processing component of the EPIC (Empowering the Public with Information in Crisis) Project infrastructure, designed to extract linguistic and behavioral information from tweet text to aid in the task of information integration. The system incorporates linguistic annotation, in the form of Named Entity Tagging, as well as behavioral annotations to capture tweets contributing to situational awareness and analyze the information type of the tweet content. We show classification results and describe future integration of these classifiers in the larger EPIC infrastructure.

pdf
Verb Classification using Distributional Similarity in Syntactic and Semantic Structures
Danilo Croce | Alessandro Moschitti | Roberto Basili | Martha Palmer
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf
Fast and Robust Part-of-Speech Tagging Using Dynamic Model Selection
Jinho D. Choi | Martha Palmer
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2011

pdf bib
Good Seed Makes a Good Crop: Accelerating Active Learning Using Language Modeling
Dmitriy Dligach | Martha Palmer
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf
Getting the Most out of Transition-based Dependency Parsing
Jinho D. Choi | Martha Palmer
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf
VerbNet Class Assignment as a WSD Task
Susan Windisch Brown | Dmitriy Dligach | Martha Palmer
Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011)

pdf
DISCUSS: A dialogue move taxonomy layered over semantic representations
Lee Becker | Wayne Ward | Sarel van Vuuren | Martha Palmer
Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011)

pdf
Analysis of the Hindi Proposition Bank using Dependency Structure
Ashwini Vaidya | Jinho Choi | Martha Palmer | Bhuvana Narasimhan
Proceedings of the 5th Linguistic Annotation Workshop

pdf
Reducing the Need for Double Annotation
Dmitriy Dligach | Martha Palmer
Proceedings of the 5th Linguistic Annotation Workshop

pdf
A scaleable automated quality assurance technique for semantic representations and proposition banks
K. Bretonnel Cohen | Lawrence Hunter | Martha Palmer
Proceedings of the 5th Linguistic Annotation Workshop

pdf bib
Going Beyond Shallow Semantics
Martha Palmer
Proceedings of the ACL 2011 Workshop on Relational Models of Semantics

pdf
Transition-based Semantic Role Labeling Using Predicate Argument Clustering
Jinho D. Choi | Martha Palmer
Proceedings of the ACL 2011 Workshop on Relational Models of Semantics

pdf
Semantic Mapping Using Automatic Word Alignment and Semantic Role Labeling
Shumin Wu | Martha Palmer
Proceedings of Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation

pdf bib
CoNLL-2011 Shared Task: Modeling Unrestricted Coreference in OntoNotes
Sameer Pradhan | Lance Ramshaw | Mitchell Marcus | Martha Palmer | Ralph Weischedel | Nianwen Xue
Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task

pdf bib
Statistical Dependency Parsing in Korean: From Corpus Generation To Automatic Parsing
Jinho D. Choi | Martha Palmer
Proceedings of the Second Workshop on Statistical Parsing of Morphologically Rich Languages

2010

pdf
Twitter in Mass Emergency: What NLP Can Contribute
William J. Corvey | Sarah Vieweg | Travis Rood | Martha Palmer
Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics in a World of Social Media

pdf bib
Towards a Domain Independent Semantics: Enhancing Semantic Representation with Construction Grammar
Jena D. Hwang | Rodney D. Nielsen | Martha Palmer
Proceedings of the NAACL HLT Workshop on Extracting and Using Constructions in Computational Linguistics

pdf
To Annotate More Accurately or to Annotate More
Dmitriy Dligach | Rodney Nielsen | Martha Palmer
Proceedings of the Fourth Linguistic Annotation Workshop

pdf
Retrieving Correct Semantic Boundaries in Dependency Structure
Jinho Choi | Martha Palmer
Proceedings of the Fourth Linguistic Annotation Workshop

pdf
An Overview of the CRAFT Concept Annotation Guidelines
Michael Bada | Miriam Eckert | Martha Palmer | Lawrence Hunter
Proceedings of the Fourth Linguistic Annotation Workshop

pdf
The Revised Arabic PropBank
Wajdi Zaghouani | Mona Diab | Aous Mansouri | Sameer Pradhan | Martha Palmer
Proceedings of the Fourth Linguistic Annotation Workshop

pdf
SemEval-2010 Task 10: Linking Events and Their Participants in Discourse
Josef Ruppenhofer | Caroline Sporleder | Roser Morante | Collin Baker | Martha Palmer
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf
Multilingual Propbank Annotation Tools: Cornerstone and Jubilee
Jinho Choi | Claire Bonial | Martha Palmer
Proceedings of the NAACL HLT 2010 Demonstration Session

pdf abs
Propbank Frameset Annotation Guidelines Using a Dedicated Editor, Cornerstone
Jinho D. Choi | Claire Bonial | Martha Palmer
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper gives guidelines of how to create and update Propbank frameset files using a dedicated editor, Cornerstone. Propbank is a corpus in which the arguments of each verb predicate are annotated with their semantic roles in relation to the predicate. Propbank annotation also requires the choice of a sense ID for each predicate. Thus, for each predicate in Propbank, there exists a corresponding frameset file showing the expected predicate argument structure of each sense related to the predicate. Since most Propbank annotations are based on the predicate argument structure defined in the frameset files, it is important to keep the files consistent, simple to read as well as easy to update. The frameset files are written in XML, which can be difficult to edit when using a simple text editor. Therefore, it is helpful to develop a user-friendly editor such as Cornerstone, specifically customized to create and edit frameset files. Cornerstone runs platform independently, is light enough to run as an X11 application and supports multiple languages such as Arabic, Chinese, English, Hindi and Korean.

We are in the process of creating a multi-representational and multi-layered treebank for Hindi/Urdu (Palmer et al., 2009), which has three main layers: dependency structure, predicate-argument structure (PropBank), and phrase structure. This paper discusses an important issue in treebank design which is often neglected: the use of empty categories (ECs). All three levels of representation make use of ECs. We make a high-level distinction between two types of ECs, trace and silent, on the basis of whether they are postulated to mark displacement or not. Each type is further refined into several subtypes based on the underlying linguistic phenomena which the ECs are introduced to handle. This paper discusses the stages at which we add ECs to the Hindi/Urdu treebank and why. We investigate methodically the different types of ECs and their role in our syntactic and semantic representations. We also examine our decisions whether or not to coindex each type of ECs with other elements in the representation.

pdf abs
Propbank Instance Annotation Guidelines Using a Dedicated Editor, Jubilee
Jinho D. Choi | Claire Bonial | Martha Palmer
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper gives guidelines of how to annotate Propbank instances using a dedicated editor, Jubilee. Propbank is a corpus in which the arguments of each verb predicate are annotated with their semantic roles in relation to the predicate. Propbank annotation also requires the choice of a sense ID for each predicate. Jubilee facilitates this annotation process by displaying several resources of syntactic and semantic information simultaneously: the syntactic structure of a sentence is displayed in the main frame, the available senses with their corresponding argument structures are displayed in another frame, all available Propbank arguments are displayed for the annotators choice, and example annotations of each sense of the predicate are available to the annotator for viewing. Easy access to each of these resources allows the annotator to quickly absorb and apply the necessary syntactic and semantic information pertinent to each predicate for consistent and efficient annotation. Jubilee has been successfully adapted to many Propbank projects in several universities. The tool runs platform independently, is light enough to run as an X11 application and supports multiple languages such as Arabic, Chinese, English, Hindi and Korean.

pdf abs
Number or Nuance: Which Factors Restrict Reliable Word Sense Annotation?
Susan Windisch Brown | Travis Rood | Martha Palmer
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This study attempts to pinpoint the factors that restrict reliable word sense annotation, focusing on the influence of the number of senses annotators use and the semantic granularity of those senses. Both of these factors may be possible causes of low interannotator agreement (ITA) when tagging with fine-grained word senses, and, consequently, low WSD system performance (Ng et al., 1999; Snyder & Palmer, 2004; Chklovski & Mihalcea, 2002). If number of senses is the culprit, modifying the task to show fewer senses at a time could improve annotator reliability. However, if overly nuanced distinctions are the problem, then more general, coarse-grained distinctions may be necessary for annotator success and may be all that is needed to supply systems with the types of distinctions that people make. We describe three experiments that explore the role of these factors in annotation performance. Our results indicate that of these two factors, only the granularity of the senses restricts interannotator agreement, with broader senses resulting in higher annotation reliability.

LRs remain expensive to create and thus rare relative to demand across languages and technology types. The accidental re-creation of an LR that already exists is a nearly unforgivable waste of scarce resources that is unfortunately not so easy to avoid. The number of catalogs the HLT researcher must search, with their different formats, make it possible to overlook an existing resource. This paper sketches the sources of this problem and outlines a proposal to rectify along with a new vision of LR cataloging that will to facilitates the documentation and exploitation of a much wider range of LRs than previously considered.

pdf abs
Detecting Cross-lingual Semantic Similarity Using Parallel PropBanks
Shumin Wu | Jinho Choi | Martha Palmer
Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers

This paper suggests a method for detecting cross-lingual semantic similarity using parallel PropBanks. We begin by improving word alignments for verb predicates generated by GIZA++ by using information available in parallel PropBanks. We applied the Kuhn-Munkres method to measure predicate-argument matching and improved verb predicate alignments by an F-score of 12.6%. Using the enhanced word alignments we checked the set of target verbs aligned to a specific source verb for semantic consistency. For a set of English verbs aligned to a Chinese verb, we checked if the English verbs belong to the same semantic class using an existing lexical database, WordNet. For a set of Chinese verbs aligned to an English verb we manually checked semantic similarity between the Chinese verbs within a set. Our results show that the verb sets we generated have a high correlation with semantic classes. This could potentially lead to an automatic technique for generating semantic classes for verbs.

2009

pdf
Using Language Modeling to Select Useful Annotation Data
Dmitriy Dligach | Martha Palmer
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Student Research Workshop and Doctoral Consortium

pdf
Using Parallel Propbanks to enhance Word-alignments
Jinho Choi | Martha Palmer | Nianwen Xue
Proceedings of the Third Linguistic Annotation Workshop (LAW III)

pdf bib
Knowing a word (sense) by its company
Martha Palmer
Proceedings of the Eight International Conference on Computational Semantics

2008

pdf
Novel Semantic Features for Verb Sense Disambiguation
Dmitriy Dligach | Martha Palmer
Proceedings of ACL-08: HLT, Short Papers

pdf
Extracting a Representation from Text for Semantic Analysis
Rodney D. Nielsen | Wayne Ward | James H. Martin | Martha Palmer
Proceedings of ACL-08: HLT, Short Papers

pdf abs
Annotating Students’ Understanding of Science Concepts
Rodney D. Nielsen | Wayne Ward | James Martin | Martha Palmer
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper summarizes the annotation of fine-grained entailment relationships in the context of student answers to science assessment questions. We annotated a corpus of 15,357 answer pairs with 145,911 fine-grained entailment relationships. We provide the rationale for such fine-grained analysis and discuss its perceived benefits to an Intelligent Tutoring System. The corpus also has potential applications in other areas, such as question answering and multi-document summarization. Annotators achieved 86.2% inter-annotator agreement (Kappa=0.728, corresponding to substantial agreement) annotating the fine-grained facets of reference answers with regard to understanding expressed in student answers and labeling from one of five possible detailed relationship categories. The corpus described in this paper, which is the only one providing such detailed entailment annotations, is available as a public resource for the research community. The corpus is expected to enable application development, not only for intelligent tutoring systems, but also for general textual entailment applications, that is currently not practical.

In this paper, we present the details of creating a pilot Arabic proposition bank (Propbank). Propbanks exist for both English and Chinese. However the morphological and syntactic expression of linguistic phenomena in Arabic yields a very different type of process in creating an Arabic propbank. Hence, we highlight those characteristics of Arabic that make creating a propbank for the language a different challenge compared to the creation of an English Propbank.We believe that many of the lessons learned in dealing with Arabic could generalise to other languages that exhibit equally rich morphology and relatively free word order.

pdf bib
Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics
Martha Palmer | Chris Brew | Fei Xia
Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics

pdf bib
Invited Talk: The Relevance of a Cognitive Model of the Mental Lexicon to Automatic Word Sense Disambiguation
Martha Palmer | Susan Brown
Coling 2008: Proceedings of the workshop on Human Judgements in Computational Linguistics

2007

pdf
SemEval-2007 Task-17: English Lexical Sample, SRL and All Words
Sameer Pradhan | Edward Loper | Dmitriy Dligach | Martha Palmer
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

pdf
Can Semantic Roles Generalize Across Genres?
Szu-ting Yi | Edward Loper | Martha Palmer
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

2006

pdf abs
Better Learning and Decoding for Syntax Based SMT Using PSDIG
Yuan Ding | Martha Palmer
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers

As an approach to syntax based statistical machine translation (SMT), Probabilistic Synchronous Dependency Insertion Grammars (PSDIG), introduced in (Ding and Palmer, 2005), are a version of synchronous grammars defined on dependency trees. In this paper we discuss better learning and decoding algorithms for a PSDIG MT system. We introduce two new grammar learners: (1) an exhaustive learner combining different heuristics, (2) an n-gram based grammar learner. Combining the grammar rules learned from the two learners improved the performance. We introduce a better decoding algorithm which incorporates a tri-gram language model. According to the Bleu metric, the PSDIG MT system performance is significantly better than IBM Model 4, while on par with the state-of-the-art phrase based system Pharaoh (Koehn, 2004). The improved integration of syntax on both source and target languages opens door to more sophisticated SMT processes.

pdf abs
Extending VerbNet with Novel Verb Classes
Karin Kipper | Anna Korhonen | Neville Ryant | Martha Palmer
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Lexical classifications have proved useful in supporting various natural language processing (NLP) tasks. The largest verb classification for English is Levin's (1993) work which defined groupings of verbs based on syntactic properties. VerbNet - the largest computational verb lexicon currently available for English - provides detailed syntactic-semantic descriptions of Levin classes. While the classes included are extensive enough for some NLP use, they are not comprehensive. Korhonen and Briscoe (2004) have proposed a significant extension of Levin's classification which incorporates 57 novel classes for verbs not covered (comprehensively) by Levin. This paper describes the integration of these classes into VerbNet. The result is the most extensive Levin-style classification for English verbs which can be highly useful for practical applications.

pdf
Aligning Features with Sense Distinction Dimensions
Nianwen Xue | Jinying Chen | Martha Palmer
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf
An Empirical Study of the Behavior of Active Learning for Word Sense Disambiguation
Jinying Chen | Andrew Schein | Lyle Ungar | Martha Palmer
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

pdf
OntoNotes: The 90% Solution
Eduard Hovy | Mitchell Marcus | Martha Palmer | Lance Ramshaw | Ralph Weischedel
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers

Structural divergence presents a challenge to the use of syntax in statistical machine translation. We address this problem with a new algorithm for alignment of loosely matched non-isomorphic dependency trees. The algorithm selectively relaxes the constraints of the two tree structures while keeping computational complexity polynomial in the length of the sentences. Experimentation with a large Chinese-English corpus shows an improvement in alignment results over the unstructured models of (Brown et al., 1993).

pdf
Annotating the Propositions in the Penn Chinese Treebank
Nianwen Xue | Martha Palmer
Proceedings of the Second SIGHAN Workshop on Chinese Language Processing

2002

pdf
Combining Contextual Features for Word Sense Disambiguation
Hoa Trang Dang | Martha Palmer
Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions

pdf
Development and Evaluation of a Korean Treebank and its Application to NLP
Chung-hye Han | Na-Rae Han | Eon-Suk Ko | Martha Palmer
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
Standards & best practice for multilingual computational lexicons: ISLE MILE and more”
Nicoletta Calzolari | Ralph Grishman | Martha Palmer
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
From TreeBank to PropBank
Paul Kingsbury | Martha Palmer
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
The Necessity of Parsing for Predicate Argument Recognition
Daniel Gildea | Martha Palmer
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

pdf
Simple Features for Chinese Word Sense Disambiguation
Hoa Trang Dang | Ching-yi Chia | Martha Palmer | Fu-Dong Chiou
COLING 2002: The 19th International Conference on Computational Linguistics

pdf
Building a Large-Scale Annotated Chinese Corpus
Nianwen Xue | Fu-Dong Chiou | Martha Palmer
COLING 2002: The 19th International Conference on Computational Linguistics

2001

pdf
Penn Korean Treebank : Development and Evaluation
Chung-hye Han | Na-Rae Han | Eon-Suk Ko | Martha Palmer | Heejong Yi
Proceedings of the 16th Pacific Asia Conference on Language, Information and Computation

pdf
Automatic Predicate Argument Analysis of the Penn TreeBank
Martha Palmer | Joseph Rosenzweig | Scott Cotton
Proceedings of the First International Conference on Human Language Technology Research

pdf
Converting Dependency Structures to Phrase Structures
Fei Xia | Martha Palmer
Proceedings of the First International Conference on Human Language Technology Research

pdf
Facilitating Treebank Annotation Using a Statistical Parser
Fu-Dong Chiou | David Chiang | Martha Palmer
Proceedings of the First International Conference on Human Language Technology Research

pdf
English Tasks: All-Words and Verb Lexical Sample
Martha Palmer | Christiane Fellbaum | Scott Cotton | Lauren Delfs | Hoa Trang Dang
Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems

2000

pdf bib
Representations of Actions as an Interlingua
Karin Christine Kipper | Martha Palmer
NAACL-ANLP 2000 Workshop: Applied Interlinguas: Practical Applications of Interlingual Approaches to NLP

pdf
Comparing Lexicalized Treebank Grammars Extracted from Chinese, Korean, and English Corpora
Fei Xia | Chunghye Han | Martha Palmer | Aravind Joshi
Second Chinese Language Processing Workshop

pdf
A Uniform Method of Grammar Extraction and Its Applications
Fei Xia | Martha Palmer | Aravind Joshi
2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

pdf
Building a class-based verb lexicon using TAGs
Karin Kipper | Hoa Trang Dang | William Schuler | Martha Palmer
Proceedings of the Fifth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+5)

pdf
Lexicalized grammar and the description of motion events
Matthew Stone | Tonia Bleam | Christine Doran | Martha Palmer
Proceedings of the Fifth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+5)

pdf
Comparing and integrating Tree Adjoining Grammars
Fei Xia | Martha Palmer
Proceedings of the Fifth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+5)

pdf abs
Handling structural divergences and recovering dropped arguments in a Korean/English machine translation system
Chung-hye Han | Benoit Lavoie | Martha Palmer | Owen Rambow | Richard Kittredge | Tanya Korelsky | Nari Kim | Myunghee Kim
Proceedings of the Fourth Conference of the Association for Machine Translation in the Americas: Technical Papers

This paper describes an approach for handling structural divergences and recovering dropped arguments in an implemented Korean to English machine translation system. The approach relies on canonical predicate-argument structures (or dependency structures), which provide a suitable pivot representation for the handling of structural divergences and the recovery of dropped arguments. It can also be converted to and from the interface representations of many off-the-shelf parsers and generators.

pdf abs
A machine translation system from English to American Sign Language
Liwei Zhao | Karin Kipper | William Schuler | Christian Vogler | Norman Badler | Martha Palmer
Proceedings of the Fourth Conference of the Association for Machine Translation in the Americas: Technical Papers

Research in computational linguistics, computer graphics and autonomous agents has led to the development of increasingly sophisticated communicative agents over the past few years, bringing new perspective to machine translation research. The engineering of language- based smooth, expressive, natural-looking human gestures can give us useful insights into the design principles that have evolved in natural communication between people. In this paper we prototype a machine translation system from English to American Sign Language (ASL), taking into account not only linguistic but also visual and spatial information associated with ASL signs.

pdf
Integrating compositional semantics into a verb lexicon
Hoa Trang Dang | Karin Kipper | Martha Palmer
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

pdf
Semantic Tagging for the Penn Treebank
Martha Palmer | Hoa Trang Dang | Joseph Rosenzweig
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

1998

pdf
Investigating Regular Sense Extensions based on Intersective Levin Classes
Hoa Trang Dang | Karin Kipper | Martha Palmer | Joseph Rosenzweig
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

pdf
Investigating regular sense extensions based on intersective Levin classes
Hoa Trang Dang | Karin Kipper | Martha Palmer | Joseph Rosenzweig
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

bib
Proceedings of the Pilot SENSEVAL
Adam Kilgarriff | Martha Palmer
Proceedings of the Pilot SENSEVAL

pdf abs
Rapid prototyping of domain-apecific machine translation systems
Martha Palmer | Owen Rambow | Alexis Nasr
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers

This paper reports on an experiment in assembling a domain-specific machine translation prototype system from off-the-shelf components. The design goals of this experiment were to reuse existing components, to use machine-learning techniques for parser specialization and for transfer lexicon extraction, and to use an expressive, lexicalized formalism for the transfer component.

pdf
Motion verbs and semantic features in TAG
Tonia Bleam | Martha Palmer | K. Vijay-Shanker
Proceedings of the Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4)

pdf
Consistent grammar development using partial-tree descriptions for Lexicalized Tree-Adjoining Grammars
Fei Xia | Martha Palmer | K. Vijay-Shanker | Joseph Rosenzweig
Proceedings of the Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4)