Amy LaCross


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2010

pdf bib
How Specialized are Specialized Corpora? Behavioral Evaluation of Corpus Representativeness for Maltese.
Jerid Francom | Amy LaCross | Adam Ussishkin
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper we bring to light a novel intersection between corpus linguistics and behavioral data that can be employed as an evaluation metric for resources for low-density languages, drawing on well-established psycholinguistic factors. Using the low-density language Maltese as a test case, we highlight the challenges that face researchers developing resources for languages with sparsely available data and identify a key empirical link between corpus and psycholinguistic research as a tool to evaluate corpus resources. Specifically, we compare two robust variables identified in the psycholinguistic literature: word frequency (as measured in a corpus) and word familiarity (as measured in a rating task). We then apply statistical methods to evaluate the extent to which familiarity ratings predict corpus frequency for verbs in the Maltese corpus from three angles: 1) token frequency, 2) frequency distributions and 3) morpho-syntactic type (binyan). This research provides a multidisciplinary approach to corpus development and evaluation, in particular for less-resourced languages that lack a wide access to diverse language data.