Regina Barzilay


2023

Predictive Chemistry Augmented with Text Retrieval
Yujie Qian | Zhening Li | Zhengkai Tu | Connor Coley | Regina Barzilay
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

This paper focuses on using natural language descriptions to enhance predictive models in the chemistry field. Conventionally, chemoinformatics models are trained with extensive structured data manually extracted from the literature. In this paper, we introduce TextReact, a novel method that directly augments predictive chemistry with texts retrieved from the literature. TextReact retrieves text descriptions relevant for a given chemical reaction, and then aligns them with the molecular representation of the reaction. This alignment is enhanced via an auxiliary masked LM objective incorporated in the predictor training. We empirically validate the framework on two chemistry tasks: reaction condition recommendation and one-step retrosynthesis. By leveraging text retrieval, TextReact significantly outperforms state-of-the-art chemoinformatics models trained solely on molecular data.

2021

Consistent Accelerated Inference via Confident Adaptive Transformers
Tal Schuster | Adam Fisch | Tommi Jaakkola | Regina Barzilay
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

We develop a novel approach for confidently accelerating inference in the large and expensive multilayer Transformers that are now ubiquitous in natural language processing (NLP). Amortized or approximate computational methods increase efficiency, but can come with unpredictable performance costs. In this work, we present CATs – Confident Adaptive Transformers – in which we simultaneously increase computational efficiency, while guaranteeing a specifiable degree of consistency with the original model with high confidence. Our method trains additional prediction heads on top of intermediate layers, and dynamically decides when to stop allocating computational effort to each input using a meta consistency classifier. To calibrate our early prediction stopping rule, we formulate a unique extension of conformal prediction. We demonstrate the effectiveness of this approach on four classification and regression tasks.

Get Your Vitamin C! Robust Fact Verification with Contrastive Evidence
Tal Schuster | Adam Fisch | Regina Barzilay
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Typical fact verification models use retrieved written evidence to verify claims. Evidence sources, however, often change over time as more information is gathered and revised. In order to adapt, models must be sensitive to subtle differences in supporting evidence. We present VitaminC, a benchmark infused with challenging cases that require fact verification models to discern and adjust to slight factual changes. We collect over 100,000 Wikipedia revisions that modify an underlying fact, and leverage these revisions, together with additional synthetically constructed ones, to create a total of over 400,000 claim-evidence pairs. Unlike previous resources, the examples in VitaminC are contrastive, i.e., they contain evidence pairs that are nearly identical in language and content, with the exception that one supports a given claim while the other does not. We show that training using this design increases robustness—improving accuracy by 10% on adversarial fact verification and 6% on adversarial natural language inference (NLI). Moreover, the structure of VitaminC leads us to define additional tasks for fact-checking resources: tagging relevant words in the evidence for verifying the claim, identifying factual revisions, and providing automatic edits via factually consistent text generation.

Nutri-bullets Hybrid: Consensual Multi-document Summarization
Darsh Shah | Lili Yu | Tao Lei | Regina Barzilay
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

We present a method for generating comparative summaries that highlight similarities and contradictions in input documents. The key challenge in creating such summaries is the lack of large parallel training data required for training typical summarization systems. To this end, we introduce a hybrid generation approach inspired by traditional concept-to-text systems. To enable accurate comparison between different sources, the model first learns to extract pertinent relations from input documents. The content planning component uses deterministic operators to aggregate these relations after identifying a subset for inclusion into a summary. The surface realization component lexicalizes this information using a text-infilling language model. By separately modeling content selection and realization, we can effectively train them with limited annotations. We implemented and tested the model in the domain of nutrition and health – rife with inconsistencies. Compared to conventional methods, our framework leads to more faithful, relevant and aggregation-sensitive summarization – while being equally fluent.

Deciphering Undersegmented Ancient Scripts Using Phonetic Prior
Jiaming Luo | Frederik Hartmann | Enrico Santus | Regina Barzilay | Yuan Cao
Transactions of the Association for Computational Linguistics, Volume 9

Most undeciphered lost languages exhibit two characteristics that pose significant decipherment challenges: (1) the scripts are not fully segmented into words; (2) the closest known language is not determined. We propose a decipherment model that handles both of these challenges by building on rich linguistic constraints reflecting consistent patterns in historical sound change. We capture the natural phonological geometry by learning character embeddings based on the International Phonetic Alphabet (IPA). The resulting generative framework jointly models word segmentation and cognate alignment, informed by phonological constraints. We evaluate the model on both deciphered languages (Gothic, Ugaritic) and an undeciphered one (Iberian). The experiments show that incorporating phonetic geometry leads to clear and consistent gains. Additionally, we propose a measure for language closeness which correctly identifies related languages for Gothic and Ugaritic. For Iberian, the method does not show strong evidence supporting Basque as a related language, concurring with the position favored by current scholarship.

2020

Blank Language Models
Tianxiao Shen | Victor Quach | Regina Barzilay | Tommi Jaakkola
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We propose Blank Language Model (BLM), a model that generates sequences by dynamically creating and filling in blanks. The blanks control which part of the sequence to expand, making BLM ideal for a variety of text editing and rewriting tasks. The model can start from a single blank or partially completed text with blanks at specified locations. It iteratively determines which word to place in a blank and whether to insert new blanks, and stops generating when no blanks are left to fill. BLM can be efficiently trained using a lower bound of the marginal data likelihood. On the task of filling missing text snippets, BLM significantly outperforms all other baselines in terms of both accuracy and fluency. Experiments on style transfer and damaged ancient text restoration demonstrate the potential of this framework for a wide range of applications.

CapWAP: Image Captioning with a Purpose
Adam Fisch | Kenton Lee | Ming-Wei Chang | Jonathan Clark | Regina Barzilay
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

The traditional image captioning task uses generic reference captions to provide textual information about images. Different user populations, however, will care about different visual aspects of images. In this paper, we propose a new task, Captioning with A Purpose (CapWAP). Our goal is to develop systems that can be tailored to be useful for the information needs of an intended population, rather than merely provide generic information about an image. In this task, we use question-answer (QA) pairs—a natural expression of information need—from users, instead of reference captions, for both training and post-inference evaluation. We show that it is possible to use reinforcement learning to directly optimize for the intended information need, by rewarding outputs that allow a question answering model to provide correct answers to sampled user questions. We convert several visual question answering datasets into CapWAP datasets, and demonstrate that under a variety of scenarios our purposeful captioning system learns to anticipate and fulfill specific information needs better than its generic counterparts, as measured by QA performance on user questions from unseen images, when using the caption alone as context.

The Limitations of Stylometry for Detecting Machine-Generated Fake News
Tal Schuster | Roei Schuster | Darsh J. Shah | Regina Barzilay
Computational Linguistics, Volume 46, Issue 2 - June 2020

Recent developments in neural language models (LMs) have raised concerns about their potential misuse for automatically spreading misinformation. In light of these concerns, several studies have proposed to detect machine-generated fake news by capturing their stylistic differences from human-written text. These approaches, broadly termed stylometry, have found success in source attribution and misinformation detection in human-written texts. However, in this work, we show that stylometry is limited against machine-generated misinformation. Whereas humans speak differently when trying to deceive, LMs generate stylistically consistent text, regardless of underlying motive. Thus, though stylometry can successfully prevent impersonation by identifying text provenance, it fails to distinguish legitimate LM applications from those that introduce false information. We create two benchmarks demonstrating the stylistic similarity between malicious and legitimate uses of LMs, utilized in auto-completion and editing-assistance settings. Our findings highlight the need for non-stylometry approaches in detecting machine-generated misinformation, and open up the discussion on the desired evaluation benchmarks.

2019

Neural Decipherment via Minimum-Cost Flow: From Ugaritic to Linear B
Jiaming Luo | Yuan Cao | Regina Barzilay
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In this paper we propose a novel neural approach for automatic decipherment of lost languages. To compensate for the lack of a strong supervision signal, our model design is informed by patterns in language change documented in historical linguistics. The model utilizes an expressive sequence-to-sequence model to capture character-level correspondences between cognates. To effectively train the model in an unsupervised manner, we innovate the training procedure by formalizing it as a minimum-cost flow problem. When applied to decipherment of Ugaritic, we achieve a 5% absolute improvement over state-of-the-art results. We also report the first automatic results in deciphering Linear B, a syllabic language related to ancient Greek, where our model correctly translates 67.3% of cognates.

GraphIE: A Graph-Based Framework for Information Extraction
Yujie Qian | Enrico Santus | Zhijing Jin | Jiang Guo | Regina Barzilay
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Most modern Information Extraction (IE) systems are implemented as sequential taggers and only model local dependencies. Non-local and non-sequential context is, however, a valuable source of information to improve predictions. In this paper, we introduce GraphIE, a framework that operates over a graph representing a broad set of dependencies between textual units (i.e. words or sentences). The algorithm propagates information between connected nodes through graph convolutions, generating a richer representation that can be exploited to improve word-level predictions. Evaluation on three different tasks — namely textual, social media and visual information extraction — shows that GraphIE consistently outperforms the state-of-the-art sequence tagging model by a significant margin.

Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing
Tal Schuster | Ori Ram | Regina Barzilay | Amir Globerson
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We introduce a novel method for multilingual transfer that utilizes deep contextual embeddings, pretrained in an unsupervised fashion. While contextual embeddings have been shown to yield richer representations of meaning compared to their static counterparts, aligning them poses a challenge due to their dynamic nature. To this end, we construct context-independent variants of the original monolingual spaces and utilize their mapping to derive an alignment for the context-dependent spaces. This mapping readily supports processing of a target language, improving transfer by context-aware embeddings. Our experimental results demonstrate the effectiveness of this approach for zero-shot and few-shot learning of dependency parsing. Specifically, our method consistently outperforms the previous state-of-the-art on 6 tested languages, yielding an improvement of 6.8 LAS points on average.

Inferring Which Medical Treatments Work from Reports of Clinical Trials
Eric Lehman | Jay DeYoung | Regina Barzilay | Byron C. Wallace
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

How do we know if a particular medical treatment actually works? Ideally one would consult all available evidence from relevant clinical trials. Unfortunately, such results are primarily disseminated in natural language scientific articles, imposing substantial burden on those trying to make sense of them. In this paper, we present a new task and corpus for making this unstructured published scientific evidence actionable. The task entails inferring reported findings from a full-text article describing randomized controlled trials (RCT) with respect to a given intervention, comparator, and outcome of interest, e.g., inferring if a given article provides evidence supporting the use of aspirin to reduce risk of stroke, as compared to placebo. We present a new corpus for this task comprising 10,000+ prompts coupled with full-text articles describing RCTs. Results using a suite of baseline models — ranging from heuristic (rule-based) approaches to attentive neural architectures — demonstrate the difficulty of the task, which we believe largely owes to the lengthy, technical input texts. To facilitate further work on this important, challenging problem we make the corpus, documentation, a website and leaderboard, and all source code for baselines and evaluation publicly available.

Towards Debiasing Fact Verification Models
Tal Schuster | Darsh Shah | Yun Jie Serene Yeo | Daniel Roberto Filizzola Ortiz | Enrico Santus | Regina Barzilay
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Fact verification requires validating a claim in the context of evidence. We show, however, that in the popular FEVER dataset this might not necessarily be the case. Claim-only classifiers perform competitively with top evidence-aware models. In this paper, we investigate the cause of this phenomenon, identifying strong cues for predicting labels solely based on the claim, without considering any evidence. We create an evaluation set that avoids those idiosyncrasies. The performance of FEVER-trained models significantly drops when evaluated on this test set. Therefore, we introduce a regularization method which alleviates the effect of bias in the training data, obtaining improvements on the newly created test set. This work is a step towards a more sound evaluation of reasoning capabilities in fact verification models.

Working Hard or Hardly Working: Challenges of Integrating Typology into Neural Dependency Parsers
Adam Fisch | Jiang Guo | Regina Barzilay
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

This paper explores the task of leveraging typology in the context of cross-lingual dependency parsing. While this linguistic information has shown great promise in pre-neural parsing, results for neural architectures have been mixed. The aim of our investigation is to better understand this state of the art. Our main findings are as follows: 1) The benefit of typological information is derived from coarsely grouping languages into syntactically homogeneous clusters rather than from learning to leverage variations along individual typological dimensions in a compositional manner; 2) Typology consistent with the actual corpus statistics yields better transfer performance; 3) Typological similarity is only a rough proxy of cross-lingual transferability with respect to parsing.

2018

Representation Learning for Grounded Spatial Reasoning
Michael Janner | Karthik Narasimhan | Regina Barzilay
Transactions of the Association for Computational Linguistics, Volume 6

The interpretation of spatial references is highly contextual, requiring joint inference over both language and the environment. We consider the task of spatial reasoning in a simulated environment, where an agent can act and receive rewards. The proposed model learns a representation of the world steered by instruction text. This design allows for precise alignment of local neighborhoods with corresponding verbalizations, while also handling global references in the instructions. We train our model with reinforcement learning using a variant of generalized value iteration. The model outperforms state-of-the-art approaches on several metrics, yielding a 45% reduction in goal localization error.

Deriving Machine Attention from Human Rationales
Yujia Bao | Shiyu Chang | Mo Yu | Regina Barzilay
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Attention-based models are successful when trained on large amounts of data. In this paper, we demonstrate that even in the low-resource scenario, attention can be learned effectively. To this end, we start with discrete human-annotated rationales and map them into continuous attention. Our central hypothesis is that this mapping is general across domains, and thus can be transferred from resource-rich domains to low-resource ones. Our model jointly learns a domain-invariant representation and induces the desired mapping between rationales and attention. Our empirical results validate this hypothesis and show that our approach delivers significant gains over state-of-the-art baselines, yielding over 15% average error reduction on benchmark datasets.

Multi-Source Domain Adaptation with Mixture of Experts
Jiang Guo | Darsh Shah | Regina Barzilay
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We propose a mixture-of-experts approach for unsupervised domain adaptation from multiple sources. The key idea is to explicitly capture the relationship between a target example and different source domains. This relationship, expressed by a point-to-set metric, determines how to combine predictors trained on various domains. The metric is learned in an unsupervised fashion using meta-training. Experimental results on sentiment analysis and part-of-speech tagging demonstrate that our approach consistently outperforms multiple baselines and can robustly handle negative transfer.

2017

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Regina Barzilay | Min-Yen Kan
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Regina Barzilay | Min-Yen Kan
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Unsupervised Learning of Morphological Forests
Jiaming Luo | Karthik Narasimhan | Regina Barzilay
Transactions of the Association for Computational Linguistics, Volume 5

This paper focuses on unsupervised modeling of morphological families, collectively comprising a forest over the language vocabulary. This formulation enables us to capture edge-wise properties reflecting single-step morphological derivations, along with global distributional properties of the entire forest. These global properties constrain the size of the affix set and encourage formation of tight morphological families. The resulting objective is solved using Integer Linear Programming (ILP) paired with contrastive estimation. We train the model by alternating between optimizing the local log-linear model and the global ILP objective. We evaluate our system on three tasks: root detection, clustering of morphological families, and segmentation. Our experiments demonstrate that our model yields consistent gains in all three tasks compared with the best published results.

Aspect-augmented Adversarial Networks for Domain Adaptation
Yuan Zhang | Regina Barzilay | Tommi Jaakkola
Transactions of the Association for Computational Linguistics, Volume 5

We introduce a neural method for transfer learning between two (source and target) classification tasks or aspects over the same domain. Rather than training on target labels, we use a few keywords pertaining to source and target aspects indicating sentence relevance instead of document class labels. Documents are encoded by learning to embed and softly select relevant sentences in an aspect-dependent manner. A shared classifier is trained on the source encoded documents and labels, and applied to target encoded documents. We ensure transfer through aspect-adversarial training so that encoded documents are, as sets, aspect-invariant. Experimental results demonstrate that our approach outperforms different baselines and model variants on two datasets, yielding an improvement of 27% on a pathology dataset and 5% on a review dataset.

2016

Making Dependency Labeling Simple, Fast and Accurate
Tianxiao Shen | Tao Lei | Regina Barzilay
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Semi-supervised Question Retrieval with Gated Convolutions
Tao Lei | Hrishikesh Joshi | Regina Barzilay | Tommi Jaakkola | Kateryna Tymoshenko | Alessandro Moschitti | Lluís Màrquez
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Ten Pairs to Tag – Multilingual POS Tagging via Coarse Mapping between Embeddings
Yuan Zhang | David Gaddy | Regina Barzilay | Tommi Jaakkola
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Rationalizing Neural Predictions
Tao Lei | Regina Barzilay | Tommi Jaakkola
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge
Nicholas Locascio | Karthik Narasimhan | Eduardo DeLeon | Nate Kushman | Regina Barzilay
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Learning to refine text based recommendations
Youyang Gu | Tao Lei | Regina Barzilay | Tommi Jaakkola
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning
Karthik Narasimhan | Adam Yala | Regina Barzilay
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2015

Language Understanding for Text-based Games using Deep Reinforcement Learning
Karthik Narasimhan | Tejas Kulkarni | Regina Barzilay
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Molding CNNs for text: non-linear, non-consecutive convolutions
Tao Lei | Regina Barzilay | Tommi Jaakkola
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Hierarchical Low-Rank Tensors for Multilingual Transfer Parsing
Yuan Zhang | Regina Barzilay
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Machine Comprehension with Discourse Relations
Karthik Narasimhan | Regina Barzilay
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

An Unsupervised Method for Uncovering Morphological Chains
Karthik Narasimhan | Regina Barzilay | Tommi Jaakkola
Transactions of the Association for Computational Linguistics, Volume 3

Most state-of-the-art systems today produce morphological analysis based only on orthographic patterns. In contrast, we propose a model for unsupervised morphological analysis that integrates orthographic and semantic views of words. We model word formation in terms of morphological chains, from base words to the observed words, breaking the chains into parent-child relations. We use log-linear models with morpheme and word-level features to predict possible parents, including their modifications, for each word. The limited set of candidate parents for each word renders contrastive estimation feasible. Our model consistently matches or outperforms five state-of-the-art systems on Arabic, English and Turkish.

Randomized Greedy Inference for Joint Segmentation, POS Tagging and Dependency Parsing
Yuan Zhang | Chengtao Li | Regina Barzilay | Kareem Darwish
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

High-Order Low-Rank Tensors for Semantic Role Labeling
Tao Lei | Yuan Zhang | Lluís Màrquez | Alessandro Moschitti | Regina Barzilay
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

Steps to Excellence: Simple Inference with Refined Scoring of Dependency Trees
Yuan Zhang | Tao Lei | Regina Barzilay | Tommi Jaakkola | Amir Globerson
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Learning to Automatically Solve Algebra Word Problems
Nate Kushman | Yoav Artzi | Luke Zettlemoyer | Regina Barzilay
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Low-Rank Tensors for Scoring Dependency Structures
Tao Lei | Yu Xin | Yuan Zhang | Regina Barzilay | Tommi Jaakkola
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Morphological Segmentation for Keyword Spotting
Karthik Narasimhan | Damianos Karakos | Richard Schwartz | Stavros Tsakalidis | Regina Barzilay
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Greed is Good if Randomized: New Inference for Dependency Parsing
Yuan Zhang | Tao Lei | Regina Barzilay | Tommi Jaakkola
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Exploring Compositional Architectures and Word Vector Representations for Prepositional Phrase Attachment
Yonatan Belinkov | Tao Lei | Regina Barzilay | Amir Globerson
Transactions of the Association for Computational Linguistics, Volume 2

Prepositional phrase (PP) attachment disambiguation is a known challenge in syntactic parsing. The lexical sparsity associated with PP attachments motivates research in word representations that can capture pertinent syntactic and semantic features of the word. One promising solution is to use word vectors induced from large amounts of raw text. However, state-of-the-art systems that employ such representations yield modest gains in PP attachment accuracy. In this paper, we show that word vector representations can yield significant PP attachment performance gains. This is achieved via a non-linear architecture that is discriminatively trained to maximize PP attachment accuracy. The architecture is initialized with word vectors trained from unlabeled data, and relearns those to maximize attachment accuracy. We obtain additional performance gains with alternative representations such as dependency-based word vectors. When tested on both English and Arabic datasets, our method outperforms both a strong SVM classifier and state-of-the-art parsers. For instance, we achieve 82.6% PP attachment accuracy on Arabic, while the Turbo and Charniak self-trained parsers obtain 76.7% and 80.8% respectively.

2013

Using Semantic Unification to Generate Regular Expressions from Natural Language
Nate Kushman | Regina Barzilay
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Transfer Learning for Constituency-Based Grammars
Yuan Zhang | Regina Barzilay | Amir Globerson
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

From Natural Language Specifications to Program Input Parsers
Tao Lei | Fan Long | Regina Barzilay | Martin Rinard
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

Learning to Behave by Reading
Regina Barzilay
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

Multi-Event Extraction Guided by Global Constraints
Roi Reichart | Regina Barzilay
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Proceedings of the Second Workshop on Semantic Interpretation in an Actionable Context
Dan Goldwasser | Regina Barzilay | Dan Roth
Proceedings of the Second Workshop on Semantic Interpretation in an Actionable Context

Learning High-Level Planning from Text
S.R.K. Branavan | Nate Kushman | Tao Lei | Regina Barzilay
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Selective Sharing for Multilingual Dependency Parsing
Tahira Naseem | Regina Barzilay | Amir Globerson
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Unsupervised Morphology Rivals Supervised Morphology for Arabic MT
David Stallard | Jacob Devlin | Michael Kayser | Yoong Keok Lee | Regina Barzilay
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Learning to Map into a Universal POS Tagset
Yuan Zhang | Roi Reichart | Regina Barzilay | Amir Globerson
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

pdf
Learning to Win by Reading Manuals in a Monte-Carlo Framework
S.R.K. Branavan | David Silver | Regina Barzilay
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf
Content Models with Attitude
Christina Sauper | Aria Haghighi | Regina Barzilay
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf
Event Discovery in Social Media Feeds
Edward Benson | Aria Haghighi | Regina Barzilay
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf
In-domain Relation Discovery with Meta-constraints via Posterior Regularization
Harr Chen | Edward Benson | Tahira Naseem | Regina Barzilay
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing
Regina Barzilay | Mark Johnson
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf bib
Modeling Syntactic Context Improves Morphological Segmentation
Yoong Keok Lee | Aria Haghighi | Regina Barzilay
Proceedings of the Fifteenth Conference on Computational Natural Language Learning

2010

pdf
A Statistical Model for Lost Language Decipherment
Benjamin Snyder | Regina Barzilay | Kevin Knight
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf
Reading between the Lines: Learning to Map High-Level Instructions to Commands
S.R.K. Branavan | Luke Zettlemoyer | Regina Barzilay
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf
Incorporating Content Structure into Text Analysis Applications
Christina Sauper | Aria Haghighi | Regina Barzilay
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

pdf
Simple Type-Level Unsupervised POS Tagging
Yoong Keok Lee | Aria Haghighi | Regina Barzilay
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

pdf
Using Universal Linguistic Knowledge to Guide Grammar Induction
Tahira Naseem | Harr Chen | Regina Barzilay | Mark Johnson
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2009

pdf
Probabilistic Approaches for Modeling Text Structure and their Application to Text-to-Text Generation (Invited Talk)
Regina Barzilay
Proceedings of the 12th European Workshop on Natural Language Generation (ENLG 2009)

pdf
Adding More Languages Improves Unsupervised Multilingual Part-of-Speech Tagging: a Bayesian Non-Parametric Approach
Benjamin Snyder | Tahira Naseem | Jacob Eisenstein | Regina Barzilay
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf
Global Models of Document Structure using Latent Permutations
Harr Chen | S.R.K. Branavan | Regina Barzilay | David R. Karger
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf
Unsupervised Multilingual Grammar Induction
Benjamin Snyder | Tahira Naseem | Regina Barzilay
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf
Reinforcement Learning for Mapping Instructions to Actions
S.R.K. Branavan | Harr Chen | Luke Zettlemoyer | Regina Barzilay
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf
Automatically Generating Wikipedia Articles: A Structure-Aware Approach
Christina Sauper | Regina Barzilay
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

pdf bib
Modeling Local Coherence: An Entity-Based Approach
Regina Barzilay | Mirella Lapata
Computational Linguistics, Volume 34, Number 1, March 2008

pdf
Bayesian Unsupervised Topic Segmentation
Jacob Eisenstein | Regina Barzilay
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf
Unsupervised Multilingual Learning for POS Tagging
Benjamin Snyder | Tahira Naseem | Jacob Eisenstein | Regina Barzilay
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf
Learning Document-Level Semantic Properties from Free-Text Annotations
S.R.K. Branavan | Harr Chen | Jacob Eisenstein | Regina Barzilay
Proceedings of ACL-08: HLT

pdf
Unsupervised Multilingual Learning for Morphological Segmentation
Benjamin Snyder | Regina Barzilay
Proceedings of ACL-08: HLT

pdf
Gestural Cohesion for Topic Segmentation
Jacob Eisenstein | Regina Barzilay | Randall Davis
Proceedings of ACL-08: HLT

2007

pdf
Incremental Text Structuring with Online Hierarchical Ranking
Erdong Chen | Benjamin Snyder | Regina Barzilay
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

pdf
Making Sense of Sound: Unsupervised Topic Segmentation over Acoustic Input
Igor Malioutov | Alex Park | Regina Barzilay | James Glass
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

pdf
Generating a Table-of-Contents
S.R.K. Branavan | Pawan Deshpande | Regina Barzilay
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

pdf
Multiple Aspect Ranking Using the Good Grief Algorithm
Benjamin Snyder | Regina Barzilay
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

pdf
Randomized Decoding for Selection-and-Ordering Problems
Pawan Deshpande | Regina Barzilay | David Karger
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

2006

pdf
Aggregation via Set Partitioning for Natural Language Generation
Regina Barzilay | Mirella Lapata
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

pdf
Paraphrasing for Automatic Evaluation
David Kauchak | Regina Barzilay
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

pdf
Minimum Cut Model for Spoken Lecture Segmentation
Igor Malioutov | Regina Barzilay
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf
Inducing Temporal Graphs
Philip Bramsen | Pawan Deshpande | Yoong Keok Lee | Regina Barzilay
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

2005

pdf bib
Sentence Fusion for Multidocument News Summarization
Regina Barzilay | Kathleen R. McKeown
Computational Linguistics, Volume 31, Number 3, September 2005

pdf
Collective Content Selection for Concept-to-Text Generation
Regina Barzilay | Mirella Lapata
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

pdf
Modeling Local Coherence: An Entity-Based Approach
Regina Barzilay | Mirella Lapata
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

pdf
Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization
Regina Barzilay | Lillian Lee
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004

2003

pdf
Sentence Alignment for Monolingual Comparable Corpora
Regina Barzilay | Noemie Elhadad
Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing

pdf
Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment
Regina Barzilay | Lillian Lee
Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics

pdf
Columbia’s Newsblaster: New Features and Future Directions
Kathleen McKeown | Regina Barzilay | John Chen | David Elson | David Evans | Judith Klavans | Ani Nenkova | Barry Schiffman | Sergey Sigelman
Companion Volume of the Proceedings of HLT-NAACL 2003 - Demonstrations

2002

pdf
Bootstrapping Lexical Choice via Multiple-Sequence Alignment
Regina Barzilay | Lillian Lee
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

2001

pdf
Extracting Paraphrases from a Parallel Corpus
Regina Barzilay | Kathleen R. McKeown
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics

pdf
Sentence Ordering in Multidocument Summarization
Regina Barzilay | Noemie Elhadad | Kathleen R. McKeown
Proceedings of the First International Conference on Human Language Technology Research

1999

pdf
Information Fusion in the Context of Multi-Document Summarization
Regina Barzilay | Kathleen R. McKeown | Michael Elhadad
Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics

1998

pdf
A New Approach to Expert System Explanations
Regina Barzilay | Daryl McCullough | Owen Rambow | Jonathan DeCristofaro | Tanya Korelsky | Benoit Lavoie
Natural Language Generation

1997

pdf
Using Lexical Chains for Text Summarization
Regina Barzilay | Michael Elhadad
Intelligent Scalable Text Summarization
