Eleni Miltsakaki


A Feasibility Study of Answer-Agnostic Question Generation for Education
Liam Dugan | Eleni Miltsakaki | Shriyash Upadhyay | Etan Ginsberg | Hannah Gonzalez | DaHyeon Choi | Chuning Yuan | Chris Callison-Burch
Findings of the Association for Computational Linguistics: ACL 2022

We conduct a feasibility study into the applicability of answer-agnostic question generation models to textbook passages. We show that a significant portion of errors in such systems arise from asking irrelevant or un-interpretable questions and that such errors can be ameliorated by providing summarized input. We find that giving these models human-written summaries instead of the original text results in a significant increase in acceptability of generated questions (33% → 83%) as determined by expert annotators. We also find that, in the absence of human-written summaries, automatic summarization can serve as a good middle ground.


Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification
Reno Kriz | João Sedoc | Marianna Apidianaki | Carolina Zheng | Gaurav Kumar | Eleni Miltsakaki | Chris Callison-Burch
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Sentence simplification is the task of rewriting texts so they are easier to understand. Recent research has applied sequence-to-sequence (Seq2Seq) models to this task, focusing largely on training-time improvements via reinforcement learning and memory augmentation. One of the main problems with applying generic Seq2Seq models for simplification is that these models tend to copy directly from the original sentence, resulting in outputs that are relatively long and complex. We aim to alleviate this issue through the use of two main techniques. First, we incorporate content word complexities, as predicted with a leveled word complexity model, into our loss function during training. Second, we generate a large set of diverse candidate simplifications at test time, and rerank these to promote fluency, adequacy, and simplicity. Here, we measure simplicity through a novel sentence complexity model. These extensions allow our models to perform competitively with state-of-the-art systems while generating simpler sentences. We report standard automatic and human evaluation metrics.


Simplification Using Paraphrases and Context-Based Lexical Substitution
Reno Kriz | Eleni Miltsakaki | Marianna Apidianaki | Chris Callison-Burch
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Lexical simplification involves identifying complex words or phrases that need to be simplified, and recommending simpler meaning-preserving substitutes that can be more easily understood. We propose a complex word identification (CWI) model that exploits both lexical and contextual features, and a simplification mechanism which relies on a word-embedding lexical substitution model to replace the detected complex words with simpler paraphrases. We compare our CWI and lexical simplification models to several baselines, and evaluate the performance of our simplification system against human judgments. The results show that our models are able to detect complex words with higher accuracy than other commonly used methods, and propose good simplification substitutes in context. They also highlight the limited contribution of context features for CWI, which nonetheless improve simplification compared to context-unaware models.


Do NLP and machine learning improve traditional readability formulas?
Thomas François | Eleni Miltsakaki
Proceedings of the First Workshop on Predicting and Improving Text Readability for target reader populations


Antelogue: Pronoun Resolution for Text and Dialogue
Eleni Miltsakaki
Coling 2010: Demonstrations

Corpus-based Semantics of Concession: Where do Expectations Come from?
Livio Robaldo | Eleni Miltsakaki | Alessia Bianchini
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper, we discuss our analysis and resulting new annotations of Penn Discourse Treebank (PDTB) data tagged as Concession. Concession arises whenever one of the two arguments creates an expectation and the other denies it. In natural languages, typical discourse connectives conveying Concession are 'but', 'although', 'nevertheless', etc. Extending previous theoretical accounts, our corpus analysis reveals that concessive interpretations are due to different sources of expectation, each giving rise to critical inferences about the relationship of the involved eventualities. We identify four different sources of expectation: Causality, Implication, Correlation, and Implicature. The reliability of these categories is supported by a high inter-annotator agreement score, computed over a sample of one thousand tokens of explicit connectives annotated as Concession in PDTB. Following the earlier work of Hobbs (1998) and Davidson's (1967) notion of reification, we extend the logical account of Concession originally proposed in Robaldo et al. (2008) to provide refined formal descriptions for the first three of these sources of expectation in Concessive relations.


Matching Readers’ Preferences and Reading Skills with Appropriate Web Texts
Eleni Miltsakaki
Proceedings of the Demonstrations Session at EACL 2009


The Penn Discourse TreeBank 2.0.
Rashmi Prasad | Nikhil Dinesh | Alan Lee | Eleni Miltsakaki | Livio Robaldo | Aravind Joshi | Bonnie Webber
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We present the second version of the Penn Discourse Treebank, PDTB-2.0, describing its lexically-grounded annotations of discourse relations and their two abstract object arguments over the 1-million-word Wall Street Journal corpus. We describe all aspects of the annotation, including (a) the argument structure of discourse relations, (b) the sense annotation of the relations, and (c) the attribution of discourse relations and each of their arguments. We list the differences between PDTB-1.0 and PDTB-2.0. We present representative statistics for several aspects of the annotation in the corpus.

Real Time Web Text Classification and Analysis of Reading Difficulty
Eleni Miltsakaki | Audrey Troutt
Proceedings of the Third Workshop on Innovative Use of NLP for Building Educational Applications

Refining the Meaning of Sense Labels in PDTB: “Concession”
Livio Robaldo | Eleni Miltsakaki | Jerry R. Hobbs
Semantics in Text Processing. STEP 2008 Conference Proceedings


Attribution and the (Non-)Alignment of Syntactic and Discourse Arguments of Connectives
Nikhil Dinesh | Alan Lee | Eleni Miltsakaki | Rashmi Prasad | Aravind Joshi | Bonnie Webber
Proceedings of the Workshop on Frontiers in Corpus Annotations II: Pie in the Sky


The Penn Discourse Treebank
Eleni Miltsakaki | Rashmi Prasad | Aravind Joshi | Bonnie Webber
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Annotation and Data Mining of the Penn Discourse TreeBank
Rashmi Prasad | Eleni Miltsakaki | Aravind Joshi | Bonnie Webber
Proceedings of the Workshop on Discourse Annotation

Annotating Discourse Connectives and Their Arguments
Eleni Miltsakaki | Aravind Joshi | Rashmi Prasad | Bonnie Webber
Proceedings of the Workshop Frontiers in Corpus Annotation at HLT-NAACL 2004


Anaphoric arguments of discourse connectives: Semantic properties of antecedents versus non-antecedents
Eleni Miltsakaki | Cassandre Creswell | Katherine Forbes | Aravind Joshi | Bonnie Webber
Proceedings of the 2003 EACL Workshop on The Computational Treatment of Anaphora


Toward an Aposynthesis of Topic Continuity and Intrasentential Anaphora
Eleni Miltsakaki
Computational Linguistics, Volume 28, Number 3, September 2002


The Role of Centering Theory’s Rough-Shift in the Teaching and Evaluation of Writing Skills
Eleni Miltsakaki | Karen Kukich
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics