Rodney Nielsen

Also published as: Rodney D. Nielsen

2019

Sigmorphon 2019 Task 2 system description paper: Morphological analysis in context for many languages, with supervision from only a few
Brad Aiken | Jared Kelly | Alexis Palmer | Suleyman Olcay Polat | Taraka Rama | Rodney Nielsen
Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology

This paper presents the UNT HiLT+Ling system for the Sigmorphon 2019 shared Task 2: Morphological Analysis and Lemmatization in Context. Our core approach focuses on the morphological tagging task; part-of-speech tagging and lemmatization are treated as secondary tasks. Given the highly multilingual nature of the task, we propose an approach which makes minimal use of the supplied training data, in order to be extensible to languages without labeled training data for the morphological inflection task. Specifically, we use a parallel Bible corpus to align contextual embeddings at the verse level. The aligned verses are used to build cross-language translation matrices, which in turn are used to map between embedding spaces for the various languages. Finally, we use sets of inflected forms, primarily from a high-resource language, to induce vector representations for individual UniMorph tags. Morphological analysis is performed by matching vector representations to embeddings for individual tokens. While our system results are dramatically below the average system submitted for the shared task evaluation campaign, our method is (we suspect) unique in its minimal reliance on labeled training data.

2018

pdf bib

A Corpus of Metaphor Novelty Scores for Syntactically-Related Word Pairs
Natalie Parde | Rodney Nielsen
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib

Annotating Educational Questions for Student Response Analysis
Andreea Godea | Rodney Nielsen
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib

Annotating Reflections for Health Behavior Change Therapy
Nishitha Guntakandla | Rodney Nielsen
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib abs

Detecting Sarcasm is Extremely Easy ;-)
Natalie Parde | Rodney Nielsen
Proceedings of the Workshop on Computational Semantics beyond Events and Roles

Detecting sarcasm in text is a particularly challenging problem in computational semantics, and its solution may vary across different types of text. We analyze the performance of a domain-general sarcasm detection system on datasets from two very different domains: Twitter, and Amazon product reviews. We categorize the errors that we identify with each, and make recommendations for addressing these issues in NLP systems in the future.

pdf bib abs

Automatically Generating Questions about Novel Metaphors in Literature
Natalie Parde | Rodney Nielsen
Proceedings of the 11th International Conference on Natural Language Generation

The automatic generation of stimulating questions is crucial to the development of intelligent cognitive exercise applications. We developed an approach that generates appropriate Questioning the Author queries based on novel metaphors in diverse syntactic relations in literature. We show that the generated questions are comparable to human-generated questions in terms of naturalness, sensibility, and depth, and score slightly higher than human-generated questions in terms of clarity. We also show that questions generated about novel metaphors are rated as cognitively deeper than questions generated about non- or conventional metaphors, providing evidence that metaphor novelty can be leveraged to promote cognitive exercise.

2017

pdf bib abs

Finding Patterns in Noisy Crowds: Regression-based Annotation Aggregation for Crowdsourced Data
Natalie Parde | Rodney Nielsen
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Crowdsourcing offers a convenient means of obtaining labeled data quickly and inexpensively. However, crowdsourced labels are often noisier than expert-annotated data, making it difficult to aggregate them meaningfully. We present an aggregation approach that learns a regression model from crowdsourced annotations to predict aggregated labels for instances that have no expert adjudications. The predicted labels achieve a correlation of 0.594 with expert labels on our data, outperforming the best alternative aggregation method by 11.9%. Our approach also outperforms the alternatives on third-party datasets.

2016

pdf bib abs

Dialogue Act Classification in Domain-Independent Conversations Using a Deep Recurrent Neural Network
Hamed Khanpour | Nishitha Guntakandla | Rodney Nielsen
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

In this study, we applied a deep LSTM structure to classify dialogue acts (DAs) in open-domain conversations. We found that the word embeddings parameters, dropout regularization, decay rate and number of layers are the parameters that have the largest effect on the final system accuracy. Using the findings of these experiments, we trained a deep LSTM network that outperforms the state-of-the-art on the Switchboard corpus by 3.11%, and MRDA by 2.2%.

pdf bib abs

Automatic Generation and Classification of Minimal Meaningful Propositions in Educational Systems
Andreea Godea | Florin Bulgarov | Rodney Nielsen
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Truly effective and practical educational systems will only be achievable when they have the ability to fully recognize deep relationships between a learner’s interpretation of a subject and the desired conceptual understanding. In this paper, we take important steps in this direction by introducing a new representation of sentences – Minimal Meaningful Propositions (MMPs), which will allow us to significantly improve the mapping between a learner’s answer and the ideal response. Using this technique, we make significant progress towards highly scalable and domain independent educational systems, that will be able to operate without human intervention. Even though this is a new task, we show very good results both for the extraction of MMPs and for classification with respect to their importance.

This paper summarizes the annotation of fine-grained entailment relationships in the context of student answers to science assessment questions. We annotated a corpus of 15,357 answer pairs with 145,911 fine-grained entailment relationships. We provide the rationale for such fine-grained analysis and discuss its perceived benefits to an Intelligent Tutoring System. The corpus also has potential applications in other areas, such as question answering and multi-document summarization. Annotators achieved 86.2% inter-annotator agreement (Kappa=0.728, corresponding to substantial agreement) annotating the fine-grained facets of reference answers with regard to understanding expressed in student answers and labeling from one of five possible detailed relationship categories. The corpus described in this paper, which is the only one providing such detailed entailment annotations, is available as a public resource for the research community. The corpus is expected to enable application development, not only for intelligent tutoring systems, but also for general textual entailment applications, that is currently not practical.

pdf bib

Extracting a Representation from Text for Semantic Analysis
Rodney D. Nielsen | Wayne Ward | James H. Martin | Martha Palmer
Proceedings of ACL-08: HLT, Short Papers

pdf bib

Classification Errors in a Domain-Independent Assessment System
Rodney D. Nielsen | Wayne Ward | James H. Martin
Proceedings of the Third Workshop on Innovative Use of NLP for Building Educational Applications