Annie Louis


2022

Source-summary Entity Aggregation in Abstractive Summarization
José Ángel González | Annie Louis | Jackie Chi Kit Cheung
Proceedings of the 29th International Conference on Computational Linguistics

In a text, entities mentioned earlier can be referred to in later discourse by a more general description. For example, Celine Dion and Justin Bieber can be referred to by Canadian singers or celebrities. In this work, we study this phenomenon in the context of summarization, where entities from a source text are generalized in the summary. We call such instances source-summary entity aggregations. We categorize these aggregations into two types and analyze them in the CNN/DailyMail corpus, showing that they are reasonably frequent. We then examine how well three state-of-the-art summarization systems can generate such aggregations within summaries. We also develop techniques to encourage them to generate more aggregations. Our results show that there is significant room for improvement in producing semantically correct aggregations.

2021

Proceedings of the 2nd Workshop on Computational Approaches to Discourse
Chloé Braud | Christian Hardmeier | Junyi Jessy Li | Annie Louis | Michael Strube | Amir Zeldes
Proceedings of the 2nd Workshop on Computational Approaches to Discourse

2020

Proceedings of the First Workshop on Computational Approaches to Discourse
Chloé Braud | Christian Hardmeier | Junyi Jessy Li | Annie Louis | Michael Strube
Proceedings of the First Workshop on Computational Approaches to Discourse

“I’d rather just go to bed”: Understanding Indirect Answers
Annie Louis | Dan Roth | Filip Radlinski
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We revisit a pragmatic inference problem in dialog: Understanding indirect responses to questions. Humans can interpret ‘I’m starving.’ in response to ‘Hungry?’, even without direct cue words such as ‘yes’ and ‘no’. In dialog systems, allowing natural responses rather than closed vocabularies would be similarly beneficial. However, today’s systems are only as sensitive to these pragmatic moves as their language model allows. We create and release the first large-scale English language corpus ‘Circa’ with 34,268 (polar question, indirect answer) pairs to enable progress on this task. The data was collected via elaborate crowdsourcing, and contains utterances with yes/no meaning, as well as uncertain, middle-ground, and conditional responses. We also present BERT-based neural models to predict such categories for a question-answer pair. We find that while transfer learning from entailment works reasonably, performance is not yet sufficient for robust dialog. Our models reach 82-88% accuracy for a 4-class distinction, and 74-85% for 6 classes.

TESA: A Task in Entity Semantic Aggregation for Abstractive Summarization
Clément Jumel | Annie Louis | Jackie Chi Kit Cheung
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Human-written texts contain frequent generalizations and semantic aggregation of content. In a document, they may refer to a pair of named entities such as ‘London’ and ‘Paris’ with different expressions: “the major cities”, “the capital cities” and “two European cities”. Yet generation systems, especially abstractive summarization systems, have so far focused heavily on paraphrasing and simplifying the source content, to the exclusion of such semantic abstraction capabilities. In this paper, we present a new dataset and task aimed at the semantic aggregation of entities. TESA contains a dataset of 5.3K crowd-sourced entity aggregations of Person, Organization, and Location named entities. The aggregations are document-appropriate, meaning that they are produced by annotators to match the situational context of a given news article from the New York Times. We then build baseline models for generating aggregations given a tuple of entities and document context. We fine-tune an encoder-decoder language model on TESA and compare it with simpler classification methods based on linguistically informed features. Our quantitative and qualitative evaluations show reasonable performance in making a choice from a given list of expressions, but free-form expressions are understandably harder to generate and evaluate.

2019

Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)
Waleed Ammar | Annie Louis | Nasrin Mostafazadeh
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)

Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses
Matt Grenander | Yue Dong | Jackie Chi Kit Cheung | Annie Louis
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Sentence position is a strong feature for news summarization, since the lead often (but not always) summarizes the key points of the article. In this paper, we show that recent neural systems excessively exploit this trend, which, although powerful for many inputs, is also detrimental when summarizing documents where important content should be extracted from later parts of the article. We propose two techniques to make systems sensitive to the importance of content in different parts of the article. The first technique employs ‘unbiased’ data, i.e., randomly shuffled sentences of the source document, to pretrain the model. The second technique uses an auxiliary ROUGE-based loss that encourages the model to distribute importance scores throughout a document by mimicking sentence-level ROUGE scores on the training data. We show that these techniques significantly improve the performance of a competitive reinforcement learning based extractive system, with the auxiliary loss being more powerful than pretraining.

2018

Getting to “Hearer-old”: Charting Referring Expressions Across Time
Ieva Staliūnaitė | Hannah Rohde | Bonnie Webber | Annie Louis
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

When a reader is first introduced to an entity, its referring expression must describe the entity. For entities that are widely known, a single word or phrase often suffices. This paper presents the first study of how expressions that refer to the same entity develop over time. We track thousands of person and organization entities over 20 years of the New York Times (NYT). As entities move from hearer-new (first introduction to the NYT audience) to hearer-old (common knowledge) status, we show empirically that the referring expressions along this trajectory depend on the type of the entity, and exhibit linguistic properties related to becoming common knowledge (e.g., shorter length, less use of appositives, more definiteness). These properties can also be used to build a model to predict how long it will take for an entity to reach hearer-old status. Our results reach 10-30% absolute improvement over a majority-class baseline.

Deep Dungeons and Dragons: Learning Character-Action Interactions from Role-Playing Game Transcripts
Annie Louis | Charles Sutton
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

An essential aspect of understanding narratives is to grasp the interaction between characters in a story and the actions they take. We examine whether computational models can capture this interaction when both character attributes and actions are expressed as complex natural language descriptions. We propose role-playing games as a testbed for this problem, and introduce a large corpus of game transcripts collected from online discussion forums. Using neural language models which combine character and action descriptions from these stories, we show that we can learn the latent ties. Action sequences are better predicted when the character performing the action is also taken into account, and vice versa for character attributes.

2017

Proceedings of the 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics
Michael Roth | Nasrin Mostafazadeh | Nathanael Chambers | Annie Louis
Proceedings of the 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics

LSDSem 2017 Shared Task: The Story Cloze Test
Nasrin Mostafazadeh | Michael Roth | Annie Louis | Nathanael Chambers | James Allen
Proceedings of the 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics

The LSDSem’17 shared task is the Story Cloze Test, a new evaluation for story understanding and script learning. This test provides a system with a four-sentence story and two possible endings, and the system must choose the correct ending to the story. Successful narrative understanding (getting closer to human performance of 100%) requires systems to link various levels of semantics to commonsense knowledge. A total of eight systems participated in the shared task, with a variety of approaches.

Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue
Kristiina Jokinen | Manfred Stede | David DeVault | Annie Louis
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue

Exploring Substitutability through Discourse Adverbials and Multiple Judgments
Hannah Rohde | Anna Dickinson | Nathan Schneider | Annie Louis | Bonnie Webber
IWCS 2017 - 12th International Conference on Computational Semantics - Long papers

2016

Book Reviews: Natural Language Processing for Social Media by Atefeh Farzindar and Diana Inkpen
Annie Louis
Computational Linguistics, Volume 42, Issue 4 - December 2016

Filling in the Blanks in Understanding Discourse Adverbials: Consistency, Conflict, and Context-Dependence in a Crowdsourced Elicitation Task
Hannah Rohde | Anna Dickinson | Nathan Schneider | Christopher N. L. Clark | Annie Louis | Bonnie Webber
Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016)

Proceedings of the Workshop on Uphill Battles in Language Processing: Scaling Early Achievements to Robust Methods
Annie Louis | Michael Roth | Bonnie Webber | Michael White | Luke Zettlemoyer
Proceedings of the Workshop on Uphill Battles in Language Processing: Scaling Early Achievements to Robust Methods

2015

Which Step Do I Take First? Troubleshooting with Bayesian Models
Annie Louis | Mirella Lapata
Transactions of the Association for Computational Linguistics, Volume 3

Online discussion forums and community question-answering websites provide one of the primary avenues for online users to share information. In this paper, we propose text mining techniques which help users navigate troubleshooting-oriented data such as questions asked on forums and their suggested solutions. We introduce Bayesian generative models of the troubleshooting data and apply them to two interrelated tasks: (a) predicting the complexity of the solutions (e.g., plugging a keyboard into the computer is easier than installing a special driver) and (b) presenting them in a ranked order from least to most complex. Experimental results show that our models are on par with human performance on these tasks, while outperforming baselines based on solution length or readability.

Conversation Trees: A Grammar Model for Topic Structure in Forums
Annie Louis | Shay B. Cohen
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Proceedings of the First Workshop on Linking Computational Models of Lexical, Sentential and Discourse-level Semantics
Michael Roth | Annie Louis | Bonnie Webber | Tim Baldwin
Proceedings of the First Workshop on Linking Computational Models of Lexical, Sentential and Discourse-level Semantics

Recovering discourse relations: Varying influence of discourse adverbials
Hannah Rohde | Anna Dickinson | Chris Clark | Annie Louis | Bonnie Webber
Proceedings of the First Workshop on Linking Computational Models of Lexical, Sentential and Discourse-level Semantics

2014

A Bayesian Method to Incorporate Background Knowledge during Automatic Text Summarization
Annie Louis
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Proceedings of the ACL 2014 Student Research Workshop
Ekaterina Kochmar | Annie Louis | Svitlana Volkova | Jordan Boyd-Graber | Bill Byrne
Proceedings of the ACL 2014 Student Research Workshop

Structured and Unstructured Cache Models for SMT Domain Adaptation
Annie Louis | Bonnie Webber
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

Verbose, Laconic or Just Right: A Simple Computational Model of Content Appropriateness under Length Constraints
Annie Louis | Ani Nenkova
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

2013

Proceedings of the 2013 NAACL HLT Student Research Workshop
Annie Louis | Richard Socher | Julia Hockenmaier | Eric K. Ringger
Proceedings of the 2013 NAACL HLT Student Research Workshop

Automatically Assessing Machine Summary Content Without a Gold Standard
Annie Louis | Ani Nenkova
Computational Linguistics, Volume 39, Issue 2 - June 2013

What Makes Writing Great? First Experiments on Article Quality Prediction in the Science Journalism Domain
Annie Louis | Ani Nenkova
Transactions of the Association for Computational Linguistics, Volume 1

Great writing is rare and highly admired. Readers seek out articles that are beautifully written, informative and entertaining. Yet information-access technologies lack capabilities for predicting article quality at this level. In this paper we present first experiments on article quality prediction in the science journalism domain. We introduce a corpus of great pieces of science journalism, along with typical articles from the genre. We implement features to capture aspects of great writing, including surprising, visual and emotional content, as well as general features related to discourse organization and sentence structure. We show that the distinction between great and typical articles can be detected fairly accurately, and that the entire spectrum of our features contributes to the distinction.

2012

A corpus of general and specific sentences from news
Annie Louis | Ani Nenkova
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present a corpus of sentences from news articles that are annotated as general or specific. We employed annotators on Amazon Mechanical Turk to mark sentences from three kinds of news articles: reports on events, finance news and science journalism. We introduce the resulting corpus, with focus on annotator agreement, proportion of general/specific sentences in the articles and results for automatic classification of the two sentence types.

A Coherence Model Based on Syntactic Patterns
Annie Louis | Ani Nenkova
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Automatic Metrics for Genre-specific Text Quality
Annie Louis
Proceedings of the NAACL HLT 2012 Student Research Workshop

Summarization of Business-Related Tweets: A Concept-Based Approach
Annie Louis | Todd Newman
Proceedings of COLING 2012: Posters

2011

Automatic identification of general and specific sentences by leveraging discourse annotations
Annie Louis | Ani Nenkova
Proceedings of 5th International Joint Conference on Natural Language Processing

Text Specificity and Impact on Quality of News Summaries
Annie Louis | Ani Nenkova
Proceedings of the Workshop on Monolingual Text-To-Text Generation

2010

Automatic Evaluation of Linguistic Quality in Multi-Document Summarization
Emily Pitler | Annie Louis | Ani Nenkova
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Off-topic essay detection using short prompt texts
Annie Louis | Derrick Higgins
Proceedings of the NAACL HLT 2010 Fifth Workshop on Innovative Use of NLP for Building Educational Applications

Using entity features to classify implicit discourse relations
Annie Louis | Aravind Joshi | Rashmi Prasad | Ani Nenkova
Proceedings of the SIGDIAL 2010 Conference

Discourse indicators for content selection in summarization
Annie Louis | Aravind Joshi | Ani Nenkova
Proceedings of the SIGDIAL 2010 Conference

Creating Local Coherence: An Empirical Assessment
Annie Louis | Ani Nenkova
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2009

Automatically Evaluating Content Selection in Summarization without Human Models
Annie Louis | Ani Nenkova
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Performance Confidence Estimation for Automatic Summarization
Annie Louis | Ani Nenkova
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

Automatic sense prediction for implicit discourse relations in text
Emily Pitler | Annie Louis | Ani Nenkova
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

Can You Summarize This? Identifying Correlates of Input Difficulty for Multi-Document Summarization
Ani Nenkova | Annie Louis
Proceedings of ACL-08: HLT