Or Biran


2020

GLUCOSE: GeneraLized and COntextualized Story Explanations
Nasrin Mostafazadeh | Aditya Kalyanpur | Lori Moon | David Buchanan | Lauren Berkowitz | Or Biran | Jennifer Chu-Carroll
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

When humans read or listen, they make implicit commonsense inferences that frame their understanding of what happened and why. As a step toward AI systems that can build similar mental models, we introduce GLUCOSE, a large-scale dataset of implicit commonsense causal knowledge, encoded as causal mini-theories about the world, each grounded in a narrative context. To construct GLUCOSE, we drew on cognitive psychology to identify ten dimensions of causal explanation, focusing on events, states, motivations, and emotions. Each GLUCOSE entry includes a story-specific causal statement paired with an inference rule generalized from the statement. This paper details two concrete contributions. First, we present our platform for effectively crowdsourcing GLUCOSE data at scale, which uses semi-structured templates to elicit causal explanations. Using this platform, we collected a total of ~670K specific statements and general rules that capture implicit commonsense knowledge about everyday situations. Second, we show that existing knowledge resources and pretrained language models do not include or readily predict GLUCOSE’s rich inferential content. However, when state-of-the-art neural models are trained on this knowledge, they can start to make commonsense inferences on unseen stories that match humans’ mental models.

2017

Domain-Adaptable Hybrid Generation of RDF Entity Descriptions
Or Biran | Kathleen McKeown
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

RDF ontologies provide structured data on entities in many domains and continue to grow in size and diversity. While they can be useful as a starting point for generating descriptions of entities, they often miss important information about an entity that cannot be captured as simple relations. In addition, generic approaches to generation from RDF cannot capture the unique style and content of specific domains. We describe a framework for hybrid generation of entity descriptions, which combines generation from RDF data with text extracted from a corpus, and extracts unique aspects of the domain from the corpus to create domain-specific generation systems. We show that each component of our approach significantly increases the satisfaction of readers with the text across multiple applications and domains.

MainiwayAI at IJCNLP-2017 Task 2: Ensembles of Deep Architectures for Valence-Arousal Prediction
Yassine Benajiba | Jin Sun | Yong Zhang | Zhiliang Weng | Or Biran
Proceedings of the IJCNLP 2017, Shared Tasks

This paper introduces Mainiway AI Labs' submitted system for the IJCNLP 2017 shared task on Dimensional Sentiment Analysis of Chinese Phrases (DSAP), and related experiments. Our approach consists of deep neural networks with various architectures, and our best system is a voted ensemble of networks. We achieve a Mean Absolute Error of 0.64 in valence prediction and 0.68 in arousal prediction on the test set, both placing us as the 5th-ranked team in the competition.

The Sentimental Value of Chinese Sub-Character Components
Yassine Benajiba | Or Biran | Zhiliang Weng | Yong Zhang | Jin Sun
Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing

Sub-character components of Chinese characters carry important semantic information, and recent studies have shown that utilizing this information can improve performance on core semantic tasks. In this paper, we hypothesize that in addition to semantic information, sub-character components may also carry emotional information, and that utilizing it should improve performance on sentiment analysis tasks. We conduct a series of experiments on four Chinese sentiment data sets and show that we can significantly improve the performance in various tasks over that of a character-level embeddings baseline. We then focus on qualitatively assessing multiple examples and trying to explain how the sub-character components affect the results in each case.

2016

Mining Paraphrasal Typed Templates from a Plain Text Corpus
Or Biran | Terra Blevins | Kathleen McKeown
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

An Entity-Focused Approach to Generating Company Descriptions
Gavin Saldanha | Or Biran | Kathleen McKeown | Alfio Gliozzo
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2015

Discourse Planning with an N-gram Model of Relations
Or Biran | Kathleen McKeown
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

PDTB Discourse Parsing as a Tagging Task: The Two Taggers Approach
Or Biran | Kathleen McKeown
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2013

Semantic Technologies in IBM Watson
Alfio Gliozzo | Or Biran | Siddharth Patwardhan | Kathleen McKeown
Proceedings of the Fourth Workshop on Teaching NLP and CL

Classifying Taxonomic Relations between Pairs of Wikipedia Articles
Or Biran | Kathleen McKeown
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Aggregated Word Pair Features for Implicit Discourse Relation Disambiguation
Or Biran | Kathleen McKeown
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Detecting Influencers in Written Online Conversations
Or Biran | Sara Rosenthal | Jacob Andreas | Kathleen McKeown | Owen Rambow
Proceedings of the Second Workshop on Language in Social Media

2011

Putting it Simply: a Context-Aware Approach to Lexical Simplification
Or Biran | Samuel Brody | Noémie Elhadad
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies