James Allen

Rochester

Also published as: James F. Allen

Other people with similar names: James Allan (UMass Amherst)


2023

In this paper, we investigate whether symbolic semantic representations, extracted from deep semantic parsers, can help reasoning over the states of involved entities in a procedural text. We consider a deep semantic parser (TRIPS) and semantic role labeling as two sources of semantic parsing knowledge. First, we propose PROPOLIS, a symbolic parsing-based procedural reasoning framework. Second, we integrate semantic parsing information into state-of-the-art neural models to conduct procedural reasoning. Our experiments indicate that explicitly incorporating such semantic knowledge improves procedural understanding. This paper presents new metrics for evaluating procedural reasoning tasks that clarify the challenges and identify differences among neural, symbolic, and integrated models.

2020

Progress on deep language understanding is inhibited by the lack of a broad coverage lexicon that connects linguistic behavior to ontological concepts and axioms. We have developed COLLIE-V, a deep lexical resource for verbs, with the coverage of WordNet and syntactic and semantic details that meet or exceed existing resources. Bootstrapping from a hand-built lexicon and ontology, new ontological concepts and lexical entries, together with semantic role preferences and entailment axioms, are automatically derived by combining multiple constraints from parsing dictionary definitions and examples. We evaluated the accuracy of the technique along a number of different dimensions and were able to obtain high accuracy in deriving new concepts and lexical entries. COLLIE-V is publicly available.

2018

The general problem of finding satisfying solutions to constraint-based underspecified representations of quantifier scope is NP-complete. Existing frameworks, including Dominance Graphs, Minimal Recursion Semantics, and Hole Semantics, have struggled to balance expressivity and tractability in order to cover real natural language sentences with efficient algorithms. We address this trade-off with a general principle of coherence, which requires that every variable introduced in the domain of discourse must contribute to the overall semantics of the sentence. We show that every underspecified representation meeting this criterion can be efficiently processed, and that our set of representations subsumes all previously identified tractable sets.
The Story Cloze Test (SCT) is a recent framework for evaluating story comprehension and script learning. There have been a variety of models tackling the SCT so far. Although the original goal behind the SCT was to require systems to perform deep language understanding and commonsense reasoning for successful narrative understanding, some recent models could perform significantly better than the initial baselines by leveraging human-authorship biases discovered in the SCT dataset. In order to shed some light on this issue, we have performed various data analyses and analyzed a variety of top-performing models presented for this task. Given the statistics we have aggregated, we have designed a new crowdsourcing scheme that creates a new SCT dataset, which overcomes some of the biases. We benchmark a few models on the new dataset and show that the top-performing model on the original SCT dataset fails to keep up its performance. Our findings further signify the importance of benchmarking NLP systems on various evolving test sets.
While there have been many proposals for theories of semantic roles over the years, these models are mostly justified by intuition, and the only evaluation method has been inter-annotator agreement. We explore three different ideas for providing more rigorous theories of semantic roles. These ideas give rise to more objective criteria for designing role sets, and lend themselves to some experimental evaluation. We illustrate the discussion by examining the semantic roles in TRIPS.
We demonstrate a system for understanding natural language utterances for structure description and placement in a situated blocks world context. By relying on a rich, domain-specific adaptation of a generic ontology and a logical form structure produced by a semantic parser, we obviate the need for an intermediate, domain-specific representation and can produce a reasoner that grounds and reasons over concepts and constraints with real-valued data. This linguistic base enables more flexibility in interpreting natural language expressions invoking intrinsic concepts and features of structures and space. We demonstrate some of the capabilities of a system grounded in deep language understanding and present initial results in a structure learning task.
We present a modular, end-to-end dialogue system for a situated agent to address a multimodal, natural language dialogue task in which the agent learns complex representations of block structure classes through assertions, demonstrations, and questioning. The concept to learn is provided to the user through a set of positive and negative visual examples, from which the user determines the underlying constraints to be provided to the system in natural language. The system in turn asks questions about demonstrated examples and simulates new examples to check its knowledge and verify the user’s description is complete. We find that this task is non-trivial for users and generates natural language that is varied yet understood by our deep language understanding architecture.
The bulk of current research in dialogue systems is focused on fairly simple task models, primarily state-based. Progress on developing dialogue systems for more complex tasks has been limited by the lack of generic toolkits to build from. In this paper we report on our development from the ground up of a new dialogue model based on collaborative problem solving (CPS). We implemented the model in a dialogue system shell (Cogent) that allows developers to plug in problem-solving agents to create dialogue systems in new domains. The Cogent shell has now been used by several independent teams of researchers to develop dialogue systems in different domains, with varied lexicons and interaction styles, each with its own problem-solving back-end. We believe this to be the first practical demonstration of the feasibility of a CPS-based dialogue system shell.

2017

Understanding common entities and their attributes is a primary requirement for any system that comprehends natural language. In order to enable learning about common entities, we introduce a novel machine comprehension task, GuessTwo: given a short paragraph comparing different aspects of two real-world semantically-similar entities, a system should guess what those entities are. Accomplishing this task requires deep language understanding which enables inference, connecting each comparison paragraph to different levels of knowledge about world entities and their attributes. So far we have crowdsourced a dataset of more than 14K comparison paragraphs comparing entities from a variety of categories such as fruits and animals. We have designed two schemes for evaluation: open-ended and binary-choice prediction. For benchmarking further progress in the task, we have collected a set of paragraphs as the test set on which humans can accomplish the task with an accuracy of 94.2% on open-ended prediction. We have implemented various models for tackling the task, ranging from semantic-driven to neural models. The semantic-driven approach outperforms the neural models; however, the results indicate that the task is very challenging across the models.
The LSDSem’17 shared task is the Story Cloze Test, a new evaluation for story understanding and script learning. This test provides a system with a four-sentence story and two possible endings, and the system must choose the correct ending to the story. Successful narrative understanding (getting closer to human performance of 100%) requires systems to link various levels of semantics to commonsense knowledge. A total of eight systems participated in the shared task, with a variety of approaches.
We are developing a broad-coverage deep semantic lexicon for a system that parses sentences into a logical form expressed in a rich ontology that supports reasoning. In this paper we look at verb-particle constructions (VPCs), and the extent to which they can be treated compositionally vs idiomatically. First we distinguish between the different types of VPCs based on their compositionality and then present a set of heuristics for classifying specific instances as compositional or not. We then identify a small set of general sense classes for particles when used compositionally and discuss the resulting lexical representations that are being added to the lexicon. By treating VPCs as compositional whenever possible, we attain broad coverage in a compact way, and also enable interpretations of novel VPC usages not explicitly present in the lexicon.

2016

2015

2014

2013

2012

Annotating natural language sentences with quantifier scoping has proved to be very hard. In order to overcome the challenge, previous work on building scope-annotated corpora has focused on sentences with two explicitly quantified noun phrases (NPs). Furthermore, it does not address the annotation of scopal operators or complex NPs such as plurals and definites. We present the first annotation scheme for quantifier scope disambiguation where there is no restriction on the type or the number of scope-bearing elements in the sentence. We discuss some of the most prominent complex scope phenomena encountered in annotating the corpus, such as plurality and type-token distinction, and present mechanisms to handle those phenomena.

2011

2010

TimeBank (Pustejovsky et al., 2003a), a reference for TimeML-compliant (Pustejovsky et al., 2003b) annotation, is a widely used temporally annotated corpus in the community. It captures time expressions, events, and relations between events and between events and temporal expressions, but there is room for improvement in this hand-annotated corpus. This work is one such effort to extend the TimeBank corpus. Our first goal is to suggest missing TimeBank events and temporal expressions, i.e., events and temporal expressions that were missed by TimeBank annotators. Along with that, this paper also suggests some additions to the TimeML language by adding new event features (ontology type), some more SLINKs, and also relations between events and their arguments, which we call RLINK (relation link). With our new suggestions we present the TRIOS-TimeBank corpus, an extended TimeBank corpus. We conclude by outlining our future work to clean the TimeBank corpus even more and to automatically generate a larger temporally annotated corpus for the community.

2009

2008

We describe a new multimodal corpus currently under development. The corpus consists of videos of task-oriented dialogues that are annotated for speakers’ verbal requests and domain action executions. This resource provides data for new research on language production and comprehension. The corpus can be used to study speakers’ decisions as to how to structure their utterances given the complexity of the message they are trying to convey.

2007

2006

2005

2004

2002

2000

1999

1997

1996

1994

1993

1992

1991

1990

1989

1984

1982

1981

1979

1978