Shi Yu


2022

Speech Aerodynamics Database, Tools and Visualisation
Shi Yu | Clara Ponchard | Roland Trouville | Sergio Hassid | Didier Demolin
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Aerodynamic processes underlie the characteristics of the acoustic signal of speech sounds. The aerodynamics of speech give insight into the acoustic outcome and help explain the mechanisms of speech production. This database was designed during an ARC project, “Dynamique des systèmes phonologiques”, in which the study of aerodynamic constraints on speech production was an important target. Data were recorded between 1996 and 1999 at the Erasmus Hospital (Hôpital Erasme) of Université Libre de Bruxelles, Belgium, and constitute one of the few datasets available on direct measurement of subglottal pressure and other aerodynamic parameters. The goal was to obtain a substantial amount of data with simultaneous recording, in various contexts, of the speech acoustic signal, subglottal pressure (Ps), intraoral pressure (Po), oral airflow (Qo) and nasal airflow (Qn). This database contains recordings of 2 English, 1 Amharic, and 7 French speakers and is provided with data conversion and visualisation tools. Another aim of this project was to obtain reference values of the aerodynamics of speech production for female and male speakers uttering different types of segments and sentences in French.

MIC: A Multi-task Interactive Curation Tool
Shi Yu | Mingfeng Yang | Jerrod Parker | Stephen Brock
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

This paper introduces MIC, a Multi-task Interactive Curation tool, a human-machine collaborative curation tool for multiple NLP tasks. The tool aims to borrow recent advances from the literature to solve pain points in real NLP tasks. First, it supports multiple projects with multiple users, which enables collaborative annotation. Second, MIC allows easy integration of pre-trained models, rules, and dictionaries to auto-label the text and speed up the labeling process. Third, MIC supports annotation at different scales (span of characters and words, tokens and lines, or document) and of different types (free text, sentence labels, entity labels, and relationship triplets) with easy GUI operations.

2021

Named Entity Recognition through Deep Representation Learning and Weak Supervision
Jerrod Parker | Shi Yu
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2018

Sign Languages and the Online World Online Dictionaries & Lexicostatistics
Shi Yu | Carlo Geraci | Natasha Abner
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)