2022
Calibrating Zero-shot Cross-lingual (Un-)structured Predictions
Zhengping Jiang | Anqi Liu | Benjamin Van Durme
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
We investigate model calibration in the setting of zero-shot cross-lingual transfer with large-scale pre-trained language models. The level of model calibration is an important metric for evaluating the trustworthiness of predictive models, and calibration is essential when natural language models are deployed in critical tasks. We study different post-training calibration methods in structured and unstructured prediction tasks. We find that models trained with data from the source language become less calibrated when applied to the target language, and that calibration errors increase with intrinsic task difficulty and relative sparsity of training data. Moreover, we observe a potential connection between the level of calibration error and an earlier proposed measure of the distance from English to other languages. Finally, our comparison demonstrates that, among the methods compared, Temperature Scaling (TS) generalizes well to distant languages, but fails to calibrate the more complex confidence estimates of structured predictions, where more expressive alternatives like Gaussian Process Calibration perform better.
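As a reference point for the simplest method compared in this paper, here is a minimal sketch of post-hoc Temperature Scaling in PyTorch. It assumes you already have validation-set logits and gold labels from a trained classifier; all names are illustrative, and this is not the paper's implementation.

```python
# Minimal sketch of post-hoc Temperature Scaling (TS). Assumes held-out
# validation logits (N, C) and gold labels (N,) from a trained classifier.
import torch
import torch.nn.functional as F

def fit_temperature(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Find the scalar T > 0 minimizing NLL of softmax(logits / T)."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
    optimizer = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)

    def closure():
        optimizer.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return log_t.exp().item()

# Usage: rescale test-time logits before taking softmax confidences.
# T = fit_temperature(dev_logits, dev_labels)
# calibrated_probs = F.softmax(test_logits / T, dim=-1)
```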
2021
Segmenting Subtitles for Correcting ASR Segmentation Errors
David Wan | Chris Kedzie | Faisal Ladhak | Elsbeth Turcan | Petra Galuscakova | Elena Zotkina | Zhengping Jiang | Peter Bell | Kathleen McKeown
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Typical ASR systems segment the input audio into utterances using purely acoustic information, which may not resemble the sentence-like units that are expected by conventional machine translation (MT) systems for Spoken Language Translation. In this work, we propose a model for correcting the acoustic segmentation of ASR models for low-resource languages to improve performance on downstream tasks. We propose the use of subtitles as a proxy dataset for correcting ASR acoustic segmentation, creating synthetic acoustic utterances by modeling common error modes. We train a neural tagging model for correcting ASR acoustic segmentation and show that it improves downstream performance on MT and audio-document cross-language information retrieval (CLIR).
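The following is a hedged sketch of the proxy-dataset idea: concatenate subtitle sentences into an unsegmented token stream, then resegment it at random points to imitate acoustic segmentation, keeping per-token boundary tags for a neural tagger. The random chunking below is a crude stand-in for the paper's modeled error modes, not its actual procedure.

```python
# Illustrative sketch: turn subtitles into synthetic "ASR-like" utterances
# with gold sentence-boundary tags for training a segmentation tagger.
import random

def make_example(sentences, avg_len=12):
    tokens, tags = [], []
    for sent in sentences:
        words = sent.split()
        tokens.extend(words)
        tags.extend(["I"] * (len(words) - 1) + ["E"])  # "E" marks a true sentence end
    # Simulate acoustic segmentation: cut the stream into random-length chunks.
    chunks, i = [], 0
    while i < len(tokens):
        j = min(len(tokens), i + max(1, int(random.expovariate(1 / avg_len))))
        chunks.append((tokens[i:j], tags[i:j]))
        i = j
    return chunks  # each chunk: (ASR-like utterance, gold boundary tags)
```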
Automatic Detection and Prediction of Psychiatric Hospitalizations From Social Media Posts
Zhengping Jiang | Jonathan Zomick | Sarah Ita Levitan | Mark Serper | Julia Hirschberg
Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access
We address the problem of predicting psychiatric hospitalizations using linguistic features drawn from social media posts. We formulate this novel task and develop an approach to automatically extract time spans of self-reported psychiatric hospitalizations. Using this dataset, we build predictive models of psychiatric hospitalization, comparing feature sets, user-level vs. post-level classification, and model performance across varying time windows of posts. Our best model achieves an F1 of .718 using 7 days of posts. Our results suggest that this is a useful framework for collecting hospitalization data, and that social media data can be leveraged to predict acute psychiatric crises before they occur, potentially saving lives and improving outcomes for individuals with mental illness.
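A hypothetical sketch of the varying-time-window setup described above: gather a user's posts from the N days preceding a self-reported hospitalization onset so a classifier can be trained on that window. The field names and the notion of an onset timestamp are assumptions for illustration, not the paper's data schema.

```python
# Hypothetical windowing of posts before a hospitalization onset.
from datetime import datetime, timedelta

def posts_in_window(posts, onset: datetime, days: int):
    """posts: list of dicts with 'time' (datetime) and 'text' keys (assumed)."""
    start = onset - timedelta(days=days)
    return [p["text"] for p in posts if start <= p["time"] < onset]

# e.g. text for the best-performing 7-day window:
# window_texts = posts_in_window(user_posts, onset_date, days=7)
```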
2020
Uncertain Natural Language Inference
Tongfei Chen | Zhengping Jiang | Adam Poliak | Keisuke Sakaguchi | Benjamin Van Durme
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
We introduce Uncertain Natural Language Inference (UNLI), a refinement of Natural Language Inference (NLI) that shifts away from categorical labels, targeting instead the direct prediction of subjective probability assessments. We demonstrate the feasibility of collecting annotations for UNLI by relabeling a portion of the SNLI dataset under a probabilistic scale, where even items with the same categorical label differ in how likely people judge them to be true given a premise. We describe a direct scalar regression modeling approach, and find that existing categorically-labeled NLI data can be used in pre-training. Our best models correlate well with humans, demonstrating that models are capable of more subtle inferences than the categorical bin assignment employed in current NLI tasks.
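A minimal sketch of the direct scalar regression idea: put a regression head on a sentence-pair encoder and squash the output to [0, 1] to predict subjective probability. `Encoder` here is a placeholder for any premise/hypothesis pair encoder; this illustrates the approach, not the paper's model.

```python
# Sketch of a scalar-regression UNLI model over an assumed pair encoder.
import torch
import torch.nn as nn

class UNLIRegressor(nn.Module):
    def __init__(self, encoder: nn.Module, hidden_size: int):
        super().__init__()
        self.encoder = encoder               # maps a batch of pairs to pooled vectors
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, pair_inputs):
        pooled = self.encoder(pair_inputs)                    # (batch, hidden_size)
        return torch.sigmoid(self.head(pooled)).squeeze(-1)  # (batch,) in [0, 1]

# Trained with a regression loss (e.g. nn.MSELoss()) against human probability
# judgments; categorical NLI data can be used to pre-train the encoder.
```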
Detection of Mental Health from Reddit via Deep Contextualized Representations
Zhengping Jiang | Sarah Ita Levitan | Jonathan Zomick | Julia Hirschberg
Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis
We address the problem of automatic detection of psychiatric disorders from the linguistic content of social media posts. We build a large-scale dataset of Reddit posts from users with eight disorders and a control user group. We extract and analyze linguistic characteristics of posts and identify differences between diagnostic groups. We build strong classification models based on deep contextualized word representations and show that they outperform previously applied statistical models with simple linguistic features by large margins. We compare user-level and post-level classification performance, as well as an ensembled multiclass model.
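To illustrate the user-level vs. post-level distinction, here is a small sketch that scores each post with a trained classifier and then aggregates per user. The averaging rule and threshold are assumptions for illustration, not the paper's method.

```python
# Illustrative aggregation of post-level probabilities into user-level labels.
from collections import defaultdict
from statistics import mean

def user_level_predictions(post_probs, threshold=0.5):
    """post_probs: iterable of (user_id, probability) pairs."""
    by_user = defaultdict(list)
    for user, prob in post_probs:
        by_user[user].append(prob)
    # Label a user positive if their mean post probability clears the threshold.
    return {user: mean(probs) >= threshold for user, probs in by_user.items()}
```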
Subtitles to Segmentation: Improving Low-Resource Speech-to-Text Translation Pipelines
David Wan | Zhengping Jiang | Chris Kedzie | Elsbeth Turcan | Peter Bell | Kathy McKeown
Proceedings of the workshop on Cross-Language Search and Summarization of Text and Speech (CLSSTS2020)
In this work, we focus on improving ASR output segmentation in the context of low-resource language speech-to-text translation. ASR output segmentation is crucial, as ASR systems segment the input audio using purely acoustic information and are not guaranteed to output sentence-like segments. Since most MT systems expect sentences as input, feeding in longer unsegmented passages can lead to sub-optimal performance. We explore the feasibility of using datasets of subtitles from TV shows and movies to train better ASR segmentation models. We further incorporate part-of-speech (POS) tag and dependency label information (derived from the unsegmented ASR outputs) into our segmentation model. We show that this noisy syntactic information can improve model accuracy. We evaluate our models intrinsically on segmentation quality and extrinsically on downstream MT performance, as well as downstream tasks including cross-lingual information retrieval (CLIR) tasks and human relevance assessments. Our model shows improved performance on downstream tasks for Lithuanian and Bulgarian.
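A sketch of one plausible way to fold the noisy syntactic information into the tagger: embed each token's predicted POS tag and dependency label (derived from the unsegmented ASR output) and concatenate them with the word embedding before the encoder. The layer below is illustrative; dimensions and names are assumptions, not the paper's architecture.

```python
# Sketch: concatenating POS and dependency-label embeddings with word embeddings.
import torch
import torch.nn as nn

class SyntaxAugmentedEmbedder(nn.Module):
    def __init__(self, n_words, n_pos, n_dep, d_word=300, d_syn=32):
        super().__init__()
        self.word = nn.Embedding(n_words, d_word)
        self.pos = nn.Embedding(n_pos, d_syn)
        self.dep = nn.Embedding(n_dep, d_syn)

    def forward(self, word_ids, pos_ids, dep_ids):
        # (batch, seq, d_word + 2 * d_syn), fed to the segmentation tagger
        return torch.cat(
            [self.word(word_ids), self.pos(pos_ids), self.dep(dep_ids)], dim=-1
        )
```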
2018
CSReader at SemEval-2018 Task 11: Multiple Choice Question Answering as Textual Entailment
Zhengping Jiang | Qi Sun
Proceedings of the 12th International Workshop on Semantic Evaluation
In this paper we present an end-to-end machine reading comprehension system that solves multiple choice questions from a textual entailment perspective. Since some of the knowledge required is not explicitly mentioned in the text, we exploit commonsense knowledge by using pretrained word embeddings during contextual embedding and by dynamically generating a weighted representation of related script knowledge. The model ensembles two kinds of prediction structures, and the final accuracy of our system is 10 percent higher than the naive baseline.
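A hedged sketch of the entailment framing: score each answer choice as a hypothesis against the passage and question, and pick the highest-scoring choice. `entail_score` is a placeholder for any sentence-pair entailment scorer, not the paper's model.

```python
# Multiple-choice QA as entailment: choose the choice the scorer most entails.
def answer(passage: str, question: str, choices: list[str], entail_score) -> int:
    premise = f"{passage} {question}"
    scores = [entail_score(premise, choice) for choice in choices]
    return max(range(len(choices)), key=scores.__getitem__)  # index of best choice
```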
2006
Semantic Role Labeling of NomBank: A Maximum Entropy Approach
Zheng Ping Jiang | Hwee Tou Ng
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing