Alexis Palmer


2023

Proceedings of the Sixth Workshop on the Use of Computational Methods in the Study of Endangered Languages
Atticus Harrigan | Aditi Chaudhary | Shruti Rijhwani | Sarah Moeller | Antti Arppe | Alexis Palmer | Ryan Henke | Daisy Rosenblum
Proceedings of the Sixth Workshop on the Use of Computational Methods in the Study of Endangered Languages

Mapping AMR to UMR: Resources for Adapting Existing Corpora for Cross-Lingual Compatibility
Julia Bonn | Skatje Myers | Jens E. L. Van Gysel | Lukas Denk | Meagan Vigus | Jin Zhao | Andrew Cowell | William Croft | Jan Hajič | James H. Martin | Alexis Palmer | Martha Palmer | James Pustejovsky | Zdenka Urešová | Rosa Vallejos | Nianwen Xue
Proceedings of the 21st International Workshop on Treebanks and Linguistic Theories (TLT, GURT/SyntaxFest 2023)

This paper presents detailed mappings between the structures used in Abstract Meaning Representation (AMR) and those used in Uniform Meaning Representation (UMR). These structures include general semantic roles, rolesets, and concepts that are largely shared between AMR and UMR, but with crucial differences. While UMR annotation of new low-resource languages is ongoing, AMR-annotated corpora already exist for many languages, and these AMR corpora are ripe for conversion to UMR format. Rather than focusing on semantic coverage that is new to UMR (which will likely need to be dealt with manually), this paper serves as a resource (with illustrated mappings) for users looking to understand the fine-grained adjustments that have been made to the representation techniques for semantic categories present in both AMR and UMR.

2022

Proceedings of the Fifth Workshop on the Use of Computational Methods in the Study of Endangered Languages
Sarah Moeller | Antonios Anastasopoulos | Antti Arppe | Aditi Chaudhary | Atticus Harrigan | Josh Holden | Jordan Lachler | Alexis Palmer | Shruti Rijhwani | Lane Schwartz
Proceedings of the Fifth Workshop on the Use of Computational Methods in the Study of Endangered Languages

AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages
Abteen Ebrahimi | Manuel Mager | Arturo Oncevay | Vishrav Chaudhary | Luis Chiruzzo | Angela Fan | John Ortega | Ricardo Ramos | Annette Rios | Ivan Vladimir Meza Ruiz | Gustavo Giménez-Lugo | Elisabeth Mager | Graham Neubig | Alexis Palmer | Rolando Coto-Solano | Thang Vu | Katharina Kann
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Pretrained multilingual models are able to perform cross-lingual transfer in a zero-shot setting, even for languages unseen during pretraining. However, prior work evaluating performance on unseen languages has largely been limited to low-level, syntactic tasks, and it remains unclear if zero-shot learning of high-level, semantic tasks is possible for unseen languages. To explore this question, we present AmericasNLI, an extension of XNLI (Conneau et al., 2018) to 10 Indigenous languages of the Americas. We conduct experiments with XLM-R, testing multiple zero-shot and translation-based approaches. Additionally, we explore model adaptation via continued pretraining and provide an analysis of the dataset by considering hypothesis-only models. We find that XLM-R’s zero-shot performance is poor for all 10 languages, with an average performance of 38.48%. Continued pretraining offers improvements, with an average accuracy of 43.85%. Surprisingly, training on poorly translated data by far outperforms all other methods with an accuracy of 49.12%.

Machine Translation Between High-resource Languages in a Language Documentation Setting
Katharina Kann | Abteen Ebrahimi | Kristine Stenzel | Alexis Palmer
Proceedings of the first workshop on NLP applications to field linguistics

Language documentation encompasses translation, typically into the dominant high-resource language in the region where the target language is spoken. To make data accessible to a broader audience, additional translation into other high-resource languages might be needed. Working within a project documenting Kotiria, we explore the extent to which state-of-the-art machine translation (MT) systems can support this second translation – in our case from Portuguese to English. This translation task is challenging for multiple reasons: (1) the data is out-of-domain with respect to the MT system’s training data, (2) much of the data is conversational, (3) existing translations include non-standard and uncommon expressions, often reflecting properties of the documented language, and (4) the data includes borrowings from other regional languages. Despite these challenges, existing MT systems perform at a usable level, though there is still room for improvement. We then conduct a qualitative analysis and suggest ways to improve MT between high-resource languages in a language documentation setting.

Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Guy Emerson | Natalie Schluter | Gabriel Stanovsky | Ritesh Kumar | Alexis Palmer | Nathan Schneider | Siddharth Singh | Shyam Ratan
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

Contrast Sets for Stativity of English Verbs in Context
Daniel Chen | Alexis Palmer
Proceedings of the 29th International Conference on Computational Linguistics

For the task of classifying verbs in context as dynamic or stative, current models approach human performance, but only for particular data sets. To better understand the performance of such models, and how well they are able to generalize beyond particular test sets, we apply the contrast set (Gardner et al., 2020) methodology to stativity classification. We create nearly 300 contrastive pairs by perturbing test set instances just enough to change their labels from one class to the other, while preserving coherence, meaning, and well-formedness. Contrastive evaluation shows that a model with near-human performance on an in-distribution test set degrades substantially when applied to transformed examples, showing that the stative vs. dynamic classification task is more complex than the model performance might otherwise suggest. Code and data are freely available.

2021

Orthographic vs. Semantic Representations for Unsupervised Morphological Paradigm Clustering
E. Margaret Perkoff | Josh Daniels | Alexis Palmer
Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

This paper presents two different systems for unsupervised clustering of morphological paradigms, in the context of the SIGMORPHON 2021 Shared Task 2. The goal of this task is to correctly cluster words in a given language by their inflectional paradigm, without any previous knowledge of the language and without supervision from labeled data of any sort. The words in a single morphological paradigm are different inflectional variants of an underlying lemma, meaning that the words share a common core meaning. They also - usually - show a high degree of orthographical similarity. Following these intuitions, we investigate KMeans clustering using two different types of word representations: one focusing on orthographical similarity and the other focusing on semantic similarity. Additionally, we discuss the merits of randomly initialized centroids versus pre-defined centroids for clustering. Pre-defined centroids are identified based on either a standard longest common substring algorithm or a connected graph method built off of longest common substring. For all development languages, the character-based embeddings perform similarly to the baseline, and the semantic embeddings perform well below the baseline. Analysis of the systems’ errors suggests that clustering based on orthographic representations is suitable for a wide range of morphological mechanisms, particularly as part of a larger system.
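
The abstract above mentions a standard longest common substring algorithm as one way to find pre-defined centroids. As a rough illustration only, here is the textbook dynamic-programming version in Python. It is not taken from the paper, and the function name `longest_common_substring` and the tie-breaking rule (the first match found in the first word wins) are assumptions made for this sketch.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous substring shared by a and b.

    Textbook O(len(a) * len(b)) dynamic programme; on ties the match that
    ends earliest in `a` is kept (an assumption of this sketch).
    """
    best_len = 0
    best_end = 0  # exclusive end index of the best match inside `a`
    prev = [0] * (len(b) + 1)  # match lengths for the previous row of `a`
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    print(longest_common_substring("walking", "walked"))
```

A shared substring such as the one for "walking" and "walked" could then serve as a stem-like seed for a cluster, which is presumably the intuition behind using it for centroid initialization.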

Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas
Manuel Mager | Arturo Oncevay | Annette Rios | Ivan Vladimir Meza Ruiz | Alexis Palmer | Graham Neubig | Katharina Kann
Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas

Findings of the AmericasNLP 2021 Shared Task on Open Machine Translation for Indigenous Languages of the Americas
Manuel Mager | Arturo Oncevay | Abteen Ebrahimi | John Ortega | Annette Rios | Angela Fan | Ximena Gutierrez-Vasques | Luis Chiruzzo | Gustavo Giménez-Lugo | Ricardo Ramos | Ivan Vladimir Meza Ruiz | Rolando Coto-Solano | Alexis Palmer | Elisabeth Mager-Hois | Vishrav Chaudhary | Graham Neubig | Ngoc Thang Vu | Katharina Kann
Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas

This paper presents the results of the 2021 Shared Task on Open Machine Translation for Indigenous Languages of the Americas. The shared task featured two independent tracks, and participants submitted machine translation systems for up to 10 indigenous languages. Overall, 8 teams participated with a total of 214 submissions. We provided training sets consisting of data collected from various sources, as well as manually translated sentences for the development and test sets. An official baseline trained on this data was also provided. Team submissions featured a variety of architectures, including both statistical and neural models, and for the majority of languages, many teams were able to considerably improve over the baseline. The best performing systems achieved 12.97 ChrF higher than baseline, when averaged across languages.

Proceedings of the 4th Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers)
Antti Arppe | Jeff Good | Atticus Harrigan | Mans Hulden | Jordan Lachler | Sarah Moeller | Alexis Palmer | Miikka Silfverberg | Lane Schwartz
Proceedings of the 4th Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers)

Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Alexis Palmer | Nathan Schneider | Natalie Schluter | Guy Emerson | Aurelie Herbelot | Xiaodan Zhu
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

2020

It’s not a Non-Issue: Negation as a Source of Error in Machine Translation
Md Mosharaf Hossain | Antonios Anastasopoulos | Eduardo Blanco | Alexis Palmer
Findings of the Association for Computational Linguistics: EMNLP 2020

As machine translation (MT) systems progress at a rapid pace, questions of their adequacy linger. In this study we focus on negation, a universal, core property of human language that significantly affects the semantics of an utterance. We investigate whether translating negation is an issue for modern MT systems using 17 translation directions as a test bed. Through thorough analysis, we find that indeed the presence of negation can significantly impact downstream quality, in some cases resulting in quality reductions of more than 60%. We also provide a linguistically motivated analysis that directly explains the majority of our findings. We release our annotations and code to replicate our analysis here: https://github.com/mosharafhossain/negation-mt.

A Summary of the First Workshop on Language Technology for Language Documentation and Revitalization
Graham Neubig | Shruti Rijhwani | Alexis Palmer | Jordan MacKenzie | Hilaria Cruz | Xinjian Li | Matthew Lee | Aditi Chaudhary | Luke Gessler | Steven Abney | Shirley Anugrah Hayati | Antonios Anastasopoulos | Olga Zamaraeva | Emily Prud’hommeaux | Jennette Child | Sara Child | Rebecca Knowles | Sarah Moeller | Jeffrey Micher | Yiyuan Li | Sydney Zink | Mengzhou Xia | Roshan S Sharma | Patrick Littell
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)

Despite recent advances in natural language processing and other language technology, the application of such technology to language documentation and conservation has been limited. In August 2019, a workshop was held at Carnegie Mellon University in Pittsburgh, PA, USA to attempt to bring together language community members, documentary linguists, and technologists to discuss how to bridge this gap and create prototypes of novel and practical language revitalization technologies. The workshop focused on developing technologies to aid language documentation and revitalization in four areas: 1) spoken language (speech transcription, phone to orthography decoding, text-to-speech and text-speech forced alignment), 2) dictionary extraction and management, 3) search tools for corpora, and 4) social media (language learning bots and social media analysis). This paper reports the results of this workshop, including issues discussed, and various conceived and implemented technologies for nine languages: Arapaho, Cayuga, Inuktitut, Irish Gaelic, Kidaw’ida, Kwak’wala, Ojibwe, San Juan Quiahije Chatino, and Seneca.

Proceedings of the Fourteenth Workshop on Semantic Evaluation
Aurelie Herbelot | Xiaodan Zhu | Alexis Palmer | Nathan Schneider | Jonathan May | Ekaterina Shutova
Proceedings of the Fourteenth Workshop on Semantic Evaluation

UNTLing at SemEval-2020 Task 11: Detection of Propaganda Techniques in English News Articles
Maia Petee | Alexis Palmer
Proceedings of the Fourteenth Workshop on Semantic Evaluation

Our system for the PropEval task explores the ability of semantic features to detect and label propagandistic rhetorical techniques in English news articles. For Subtask 2, labeling identified propagandistic fragments with one of fourteen technique labels, our system attains a micro-averaged F1 of 0.40; in this paper, we take a detailed look at the fourteen labels and how well our semantically-focused model detects each of them. We also propose strategies to fill the gaps.

UNT Linguistics at SemEval-2020 Task 12: Linear SVC with Pre-trained Word Embeddings as Document Vectors and Targeted Linguistic Features
Jared Fromknecht | Alexis Palmer
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper outlines our approach to Tasks A & B for the English Language track of SemEval-2020 Task 12: OffensEval 2: Multilingual Offensive Language Identification in Social Media. We use a Linear SVM with document vectors computed from pre-trained word embeddings, and we explore the effectiveness of lexical, part of speech, dependency, and named entity (NE) features. We manually annotate a subset of the training data, which we use for error analysis and to tune a threshold for mapping training confidence values to labels. While document vectors are consistently the most informative features for both tasks, testing on the development set suggests that dependency features are an effective addition for Task A, and NE features for Task B.

Predicting the Focus of Negation: Model and Error Analysis
Md Mosharaf Hossain | Kathleen Hamilton | Alexis Palmer | Eduardo Blanco
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The focus of a negation is the set of tokens intended to be negated, and a key component for revealing affirmative alternatives to negated utterances. In this paper, we experiment with neural networks to predict the focus of negation. Our main novelty is leveraging a scope detector to introduce the scope of negation as an additional input to the network. Experimental results show that doing so obtains the best results to date. Additionally, we perform a detailed error analysis providing insights into the main error categories, and analyze errors depending on whether the model takes into account scope and context information.

WikiPossessions: Possession Timeline Generation as an Evaluation Benchmark for Machine Reading Comprehension of Long Texts
Dhivya Chinnappa | Alexis Palmer | Eduardo Blanco
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper presents WikiPossessions, a new benchmark corpus for the task of temporally-oriented possession (TOP), or tracking objects as they change hands over time. We annotate Wikipedia articles for 90 different well-known artifacts (paintings, diamonds, and archaeological artifacts), producing 799 artifact-possessor relations with associated attributes. For each article, we also produce a full possession timeline. The full version of the task combines straightforward entity-relation extraction with complex temporal reasoning, as well as verification of textual support for the relevant types of knowledge. Specifically, to complete the full TOP task for a given article, a system must do the following: a) identify possessors; b) anchor possessors to times/events; c) identify temporal relations between each temporal anchor and the possession relation it corresponds to; d) assign certainty scores to each possessor and each temporal relation; and e) assemble individual possession events into a global possession timeline. In addition to the corpus, we release evaluation scripts and a baseline model for the task.

2019

Sigmorphon 2019 Task 2 system description paper: Morphological analysis in context for many languages, with supervision from only a few
Brad Aiken | Jared Kelly | Alexis Palmer | Suleyman Olcay Polat | Taraka Rama | Rodney Nielsen
Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology

This paper presents the UNT HiLT+Ling system for the Sigmorphon 2019 shared Task 2: Morphological Analysis and Lemmatization in Context. Our core approach focuses on the morphological tagging task; part-of-speech tagging and lemmatization are treated as secondary tasks. Given the highly multilingual nature of the task, we propose an approach which makes minimal use of the supplied training data, in order to be extensible to languages without labeled training data for the morphological inflection task. Specifically, we use a parallel Bible corpus to align contextual embeddings at the verse level. The aligned verses are used to build cross-language translation matrices, which in turn are used to map between embedding spaces for the various languages. Finally, we use sets of inflected forms, primarily from a high-resource language, to induce vector representations for individual UniMorph tags. Morphological analysis is performed by matching vector representations to embeddings for individual tokens. While our system results are dramatically below the average system submitted for the shared task evaluation campaign, our method is (we suspect) unique in its minimal reliance on labeled training data.

Proceedings of the 3rd Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers)
Antti Arppe | Jeff Good | Mans Hulden | Jordan Lachler | Alexis Palmer | Lane Schwartz | Miikka Silfverberg
Proceedings of the 3rd Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers)

A Corpus of Negations and their Underlying Positive Interpretations
Zahra Sarabi | Erin Killian | Eduardo Blanco | Alexis Palmer
Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)

Negation often conveys implicit positive meaning. In this paper, we present a corpus of negations and their underlying positive interpretations. We work with negations from Simple Wikipedia, automatically generate potential positive interpretations, and then collect manual annotations that effectively rewrite the negation in positive terms. This procedure yields positive interpretations for approximately 77% of negations, and the final corpus includes over 5,700 negations and over 5,900 positive interpretations. We also present baseline results using seq2seq neural models.

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
Preslav Nakov | Alexis Palmer
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

2018

Classifying Semantic Clause Types With Recurrent Neural Networks: Analysis of Attention, Context & Genre Characteristics
Maria Becker | Michael Staniek | Vivi Nastase | Alexis Palmer | Anette Frank
Traitement Automatique des Langues 2018 Volume 59 Numéro 2

Determining Event Durations: Models and Error Analysis
Alakananda Vempala | Eduardo Blanco | Alexis Palmer
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

This paper presents models to predict event durations. We introduce aspectual features that capture deeper linguistic information than previous work, and experiment with neural networks. Our analysis shows that tense, aspect and temporal structure of the clause provide useful clues, and that an LSTM ensemble captures relevant context around the event.

2017

Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages
Antti Arppe | Jeff Good | Mans Hulden | Jordan Lachler | Alexis Palmer | Lane Schwartz
Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages

Illegal is not a Noun: Linguistic Form for Detection of Pejorative Nominalizations
Alexis Palmer | Melissa Robinson | Kristy K. Phillips
Proceedings of the First Workshop on Abusive Language Online

This paper focuses on a particular type of abusive language, targeting expressions in which typically neutral adjectives take on pejorative meaning when used as nouns - compare ‘gay people’ to ‘the gays’. We first collect and analyze a corpus of hand-curated, expert-annotated pejorative nominalizations for four target adjectives: female, gay, illegal, and poor. We then collect a second corpus of automatically-extracted and POS-tagged, crowd-annotated tweets. For both corpora, we find support for the hypothesis that some adjectives, when nominalized, take on negative meaning. The targeted constructions are non-standard yet widely-used, and part-of-speech taggers mistag some nominal forms as adjectives. We implement a tool called NomCatcher to correct these mistaggings, and find that the same tool is effective for identifying new adjectives subject to transformation via nominalization into abusive language.

Modeling Communicative Purpose with Functional Style: Corpus and Features for German Genre and Register Analysis
Thomas Haider | Alexis Palmer
Proceedings of the Workshop on Stylistic Variation

While there is wide acknowledgement in NLP of the utility of document characterization by genre, it is quite difficult to determine a definitive set of features or even a comprehensive list of genres. This paper addresses both issues. First, with prototype semantics, we develop a hierarchical taxonomy of discourse functions. We implement the taxonomy by developing a new text genre corpus of contemporary German to perform a text based comparative register analysis. Second, we extract a host of style features, both deep and shallow, aiming beyond linguistically motivated features at situational correlates in texts. The feature sets are used for supervised text genre classification, on which our models achieve high accuracy. The combination of the corpus typology and feature sets allows us to characterize types of communicative purpose in a comparative setup, by qualitative interpretation of style feature loadings of a regularized discriminant analysis. Finally, to determine the dependence of genre on topics (which are arguably the distinguishing factor of sub-genre), we compare and combine our style models with Latent Dirichlet Allocation features across different corpus settings with unstable topics.

Classifying Semantic Clause Types: Modeling Context and Genre Characteristics with Recurrent Neural Networks and Attention
Maria Becker | Michael Staniek | Vivi Nastase | Alexis Palmer | Anette Frank
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)

Detecting aspectual properties of clauses in the form of situation entity types has been shown to depend on a combination of syntactic-semantic and contextual features. We explore this task in a deep-learning framework, where tuned word representations capture lexical, syntactic and semantic features. We introduce an attention mechanism that pinpoints relevant context not only for the current instance, but also for the larger context. Apart from implicitly capturing task relevant features, the advantage of our neural model is that it avoids the need to reproduce linguistic features for other languages and is thus more easily transferable. We present experiments for English and German that achieve competitive performance. We present a novel take on modeling and exploiting genre information and showcase the adaptation of our system from one language to another.

2016

Situation entity types: automatic classification of clause-level aspect
Annemarie Friedrich | Alexis Palmer | Manfred Pinkal
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Modal Sense Classification At Large: Paraphrase-Driven Sense Projection, Semantically Enriched Classification Models and Cross-Genre Evaluations
Ana Marasović | Mengfei Zhou | Alexis Palmer | Anette Frank
Linguistic Issues in Language Technology, Volume 14, 2016 - Modality: Logic, Semantics, Annotation, and Machine Learning

Modal verbs have different interpretations depending on their context. Their sense categories – epistemic, deontic and dynamic – provide important dimensions of meaning for the interpretation of discourse. Previous work on modal sense classification achieved relatively high performance using shallow lexical and syntactic features drawn from small-size annotated corpora. Due to the restricted empirical basis, it is difficult to assess the particular difficulties of modal sense classification and the generalization capacity of the proposed models. In this work we create large-scale, high-quality annotated corpora for modal sense classification using an automatic paraphrase-driven projection approach. Using the acquired corpora, we investigate the modal sense classification task from different perspectives.

Investigating Active Learning for Short-Answer Scoring
Andrea Horbach | Alexis Palmer
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

Predicting the Direction of Derivation in English Conversion
Max Kisselew | Laura Rimell | Alexis Palmer | Sebastian Padó
Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

Argumentative texts and clause types
Maria Becker | Alexis Palmer | Anette Frank
Proceedings of the Third Workshop on Argument Mining (ArgMining2016)

2015

Obtaining a Better Understanding of Distributional Models of German Derivational Morphology
Max Kisselew | Sebastian Padó | Alexis Palmer | Jan Šnajder
Proceedings of the 11th International Conference on Computational Semantics

Annotating genericity: a survey, a scheme, and a corpus
Annemarie Friedrich | Alexis Palmer | Melissa Peate Sørensen | Manfred Pinkal
Proceedings of the 9th Linguistic Annotation Workshop

Using Shallow Syntactic Features to Measure Influences of L1 and Proficiency Level in EFL Writings
Andrea Horbach | Jonathan Poitz | Alexis Palmer
Proceedings of the fourth workshop on NLP for computer-assisted language learning

Linking discourse modes and situation entity types in a cross-linguistic corpus study
Kleio-Isidora Mavridou | Annemarie Friedrich | Melissa Peate Sørensen | Alexis Palmer | Manfred Pinkal
Proceedings of the First Workshop on Linking Computational Models of Lexical, Sentential and Discourse-level Semantics

Semantically Enriched Models for Modal Sense Classification
Mengfei Zhou | Anette Frank | Annemarie Friedrich | Alexis Palmer
Proceedings of the First Workshop on Linking Computational Models of Lexical, Sentential and Discourse-level Semantics

2014

Automatic prediction of aspectual class of verbs in context
Annemarie Friedrich | Alexis Palmer
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

lex4all: A language-independent tool for building and evaluating pronunciation lexicons for small-vocabulary speech recognition
Anjana Vakil | Max Paulus | Alexis Palmer | Michaela Regneri
Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations

SeedLing: Building and Using a Seed corpus for the Human Language Project
Guy Emerson | Liling Tan | Susanne Fertmann | Alexis Palmer | Michaela Regneri
Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages

Short-Term Projects, Long-Term Benefits: Four Student NLP Projects for Low-Resource Languages
Alexis Palmer | Michaela Regneri
Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages

Paraphrase Detection for Short Answer Scoring
Nikolina Koleva | Andrea Horbach | Alexis Palmer | Simon Ostermann | Manfred Pinkal
Proceedings of the third workshop on NLP for computer-assisted language learning

Situation Entity Annotation
Annemarie Friedrich | Alexis Palmer
Proceedings of LAW VIII - The 8th Linguistic Annotation Workshop

LQVSumm: A Corpus of Linguistic Quality Violations in Multi-Document Summarization
Annemarie Friedrich | Marina Valeeva | Alexis Palmer
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present LQVSumm, a corpus of about 2000 automatically created extractive multi-document summaries from the TAC 2011 shared task on Guided Summarization, which we annotated with several types of linguistic quality violations. Examples for such violations include pronouns that lack antecedents or ungrammatical clauses. We give details on the annotation scheme and show that inter-annotator agreement is good given the open-ended nature of the task. The annotated summaries have previously been scored for Readability on a numeric scale by human annotators in the context of the TAC challenge; we show that the number of instances of violations of linguistic quality of a summary correlates with these intuitively assigned numeric scores. On a system-level, the average number of violations marked in a system’s summaries achieves higher correlation with the Readability scores than current supervised state-of-the-art methods for assigning a single readability score to a summary. It is our hope that our corpus facilitates the development of methods that not only judge the linguistic quality of automatically generated summaries as a whole, but which also allow for detecting, labeling, and fixing particular violations in a text.

Finding a Tradeoff between Accuracy and Rater’s Workload in Grading Clustered Short Answers
Andrea Horbach | Alexis Palmer | Magdalena Wolska
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper we investigate the potential of answer clustering for semi-automatic scoring of short answer questions for German as a foreign language. We use surface features like word and character n-grams to cluster answers to listening comprehension exercises per question and simulate having human graders only label one answer per cluster and then propagating this label to all other members of the cluster. We investigate various ways to select this single item to be labeled and find that choosing the item closest to the centroid of a cluster leads to improved (simulated) grading accuracy over random item selection. Averaged over all questions, we can reduce a teacher’s workload to labeling only 40% of all different answers for a question, while still maintaining a grading accuracy of more than 85%.
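
The selection step described in this abstract (label only the answer closest to the cluster centroid, then propagate that label) can be sketched as follows. This is a minimal illustration and not the authors' implementation. Several things are assumptions made for the sketch: character trigram counts as the only feature, cosine similarity as the closeness measure, clusters that are already given, and all function names (`char_ngrams`, `pick_representative`, `propagate`).

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Bag of character n-grams for one answer (lowercased, space-padded)."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def centroid(vectors):
    """Mean of sparse count vectors, returned as a plain dict."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {key: count / len(vectors) for key, count in total.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pick_representative(answers):
    """Index of the answer most similar to the cluster centroid."""
    vectors = [char_ngrams(a) for a in answers]
    center = centroid(vectors)
    scores = [cosine(vec, center) for vec in vectors]
    return max(range(len(answers)), key=scores.__getitem__)


def propagate(answers, grade_fn):
    """Grade one representative answer and copy its label to the whole cluster."""
    idx = pick_representative(answers)
    label = grade_fn(answers[idx])  # the only call to the (human) grader
    return idx, [label] * len(answers)


if __name__ == "__main__":
    cluster = [
        "der Mann geht nach Hause",
        "der Mann geht nach Hause",
        "Mann geht Hause",
    ]
    print(propagate(cluster, lambda answer: "correct"))
```

In a simulation like the one the abstract describes, `grade_fn` would look up the gold label of the chosen answer, and accuracy would be measured by comparing the propagated labels against the gold labels of the other cluster members.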

2013

Using the text to evaluate short answers for reading comprehension exercises
Andrea Horbach | Alexis Palmer | Manfred Pinkal
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity

2012

Visualising Typological Relationships: Plotting WALS with Heat Maps
Richard Littauer | Rory Turnbull | Alexis Palmer
Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH

2011

Enhancing Active Learning for Semantic Role Labeling via Compressed Dependency Trees
Chenhua Chen | Alexis Palmer | Caroline Sporleder
Proceedings of 5th International Joint Conference on Natural Language Processing

Robust Semantic Analysis for Unseen Data in FrameNet
Alexis Palmer | Afra Alishahi | Caroline Sporleder
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

2010

Bringing Active Learning to Life
Ines Rehbein | Josef Ruppenhofer | Alexis Palmer
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Evaluating FrameNet-style semantic parsing: the role of coverage gaps in FrameNet
Alexis Palmer | Caroline Sporleder
Coling 2010: Posters

2009

How well does active learning actually work? Time-based evaluation of cost-reduction strategies for language documentation.
Jason Baldridge | Alexis Palmer
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Evaluating Automation Strategies in Language Documentation
Alexis Palmer | Taesun Moon | Jason Baldridge
Proceedings of the NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing

2007

IGT-XML: An XML Format for Interlinearized Glossed Text
Alexis Palmer | Katrin Erk
Proceedings of the Linguistic Annotation Workshop

A Sequencing Model for Situation Entity Classification
Alexis Palmer | Elias Ponvert | Jason Baldridge | Carlota Smith
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2004

Utilization of Multiple Language Resources for Robust Grammar-Based Tense and Aspect Classification
Alexis Palmer | Jonas Kuhn | Carlota Smith
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

This paper reports on an ongoing project that uses varied language resources and advanced NLP tools for a linguistic classification task in discourse semantics. The system we present is designed to assign a "situation entity" class label to each predicator in English text. The project goal is to achieve the best-possible identification of situation entities in naturally-occurring written texts by implementing a robust system that will deal with real corpus material, rather than just with constructed textbook examples of discourse. In this paper we focus on the combination of multiple information sources, which we see as being vital for a robust classification system. We use a deep syntactic grammar of English to identify morphological, syntactic, and discourse clues, and we use various lexical databases for fine-grained semantic properties of the predicators. Experiments performed to date show that enhancing the output of the grammar with information from lexical resources improves recall but lowers precision in the situation entity classification task.