Farah Nadeem


2019

Automated Essay Scoring with Discourse-Aware Neural Models
Farah Nadeem | Huy Nguyen | Yang Liu | Mari Ostendorf
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

Automated essay scoring systems typically rely on hand-crafted features to predict essay quality, but such systems are limited by the cost of feature engineering. Neural networks offer an alternative to feature engineering, but they typically require more annotated data. This paper explores network structures, contextualized embeddings and pre-training strategies aimed at capturing discourse characteristics of essays. Experiments on three essay scoring tasks show benefits from all three strategies in different combinations, with simpler architectures being more effective when less training data is available.

Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
Sudipta Kar | Farah Nadeem | Laura Burdick | Greg Durrett | Na-Rae Han
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

2018

Estimating Linguistic Complexity for Science Texts
Farah Nadeem | Mari Ostendorf
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

Evaluation of text difficulty is important both for downstream tasks like text simplification, and for supporting educators in classrooms. Existing work on automated text complexity analysis uses linear models with engineered knowledge-driven features as inputs. While this offers interpretability, these models have lower accuracy for shorter texts. Traditional readability metrics have the additional drawback of not generalizing to informational texts such as science. We propose a neural approach, training on science and other informational texts, to mitigate both problems. Our results show that neural methods outperform knowledge-based linear models for short texts, and have the capacity to generalize to genres not present in the training data.

2017

Language Based Mapping of Science Assessment Items to Skills
Farah Nadeem | Mari Ostendorf
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications

Knowledge of the association between assessment questions and the skills required to solve them is necessary for analysis of student learning. This association, often represented as a Q-matrix, is either hand-labeled by domain experts or learned as latent variables given a large student response data set. As a means of automating the match to formal standards, this paper uses neural text classification methods, leveraging the language in the standards documents to identify online text for a proxy training task. Experiments involve identifying the topic and crosscutting concepts of middle school science questions leveraging multi-task training. Results show that it is possible to automatically build a Q-matrix without student response data and using a modest number of hand-labeled questions.