Erik Andersen


2020

We present the first Universal Dependencies treebank for Hittite. This paper expands on earlier efforts at Hittite corpus creation (Molina and Molin, 2016; Molina, 2016) and discussions of annotation guidelines for Hittite within the UD framework (Inglese, 2015; Inglese et al., 2018). We build on the expertise of the above works to create a small corpus that we hope will serve as a stepping stone to more expansive UD treebanking for Hittite.

2016

Language students are most engaged while reading texts at an appropriate difficulty level. However, existing methods of evaluating text difficulty focus mainly on vocabulary and do not prioritize grammatical features; hence, they do not work well for language learners with limited knowledge of grammar. In this paper, we introduce grammatical templates, the expert-identified units of grammar that students learn in class, as an important feature of text difficulty evaluation. Experimental classification results show that grammatical template features significantly improve text difficulty prediction accuracy over baseline readability features by 7.4%. Moreover, we build a simple and human-understandable text difficulty evaluation approach with 87.7% accuracy, using only 5 grammatical template features.