Sunghye Cho


2022

Reflections on 30 Years of Language Resource Development and Sharing
Christopher Cieri | Mark Liberman | Sunghye Cho | Stephanie Strassel | James Fiumara | Jonathan Wright
Proceedings of the Thirteenth Language Resources and Evaluation Conference

The Linguistic Data Consortium was founded in 1992 to solve the problem that limitations in access to shareable data were impeding progress in Human Language Technology research and development. At the time, DARPA had adopted the common task research management paradigm to impose additional rigor on their programs by also providing shared objectives, data and evaluation methods. Early successes underscored the promise of this paradigm but also the need for a standing infrastructure to host and distribute the shared data. During LDC’s initial five-year grant, it became clear that the demand for linguistic data could not easily be met by the existing providers and that a dedicated data center could add capacity first for data collection and shortly thereafter for annotation. The expanding purview required expansions of LDC’s technical infrastructure including systems support and software development. An open question for the center would be its role in other kinds of research beyond data development. Over its 30-year history, LDC has performed multiple roles ranging from neutral, independent data provider to multisite programs, to creator of exploratory data in tight collaboration with system developers, to research group focused on data-intensive investigations.

Identifying stable speech-language markers of autism in children: Preliminary evidence from a longitudinal telephony-based study
Sunghye Cho | Riccardo Fusaroli | Maggie Rose Pelella | Kimberly Tena | Azia Knox | Aili Hauptmann | Maxine Covello | Alison Russell | Judith Miller | Alison Hulink | Jennifer Uzokwe | Kevin Walker | James Fiumara | Juhi Pandey | Christopher Chatham | Christopher Cieri | Robert Schultz | Mark Liberman | Julia Parish-Morris
Proceedings of the Eighth Workshop on Computational Linguistics and Clinical Psychology

This study examined differences in linguistic features produced by autistic and neurotypical (NT) children during brief picture descriptions, and assessed feature stability over time. Weekly speech samples from well-characterized participants were collected using a telephony system designed to improve access for geographically isolated and historically marginalized communities. Results showed stable group differences in certain acoustic features, some of which may serve as key outcome measures in future treatment studies. These results highlight the importance of eliciting semi-structured speech samples in a variety of contexts over time, and add to a growing body of research showing that fine-grained naturalistic communication features hold promise for intervention research.