Xinyi Liu


2024

MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms
Patrick Lee | Alain Chirino Trujillo | Diana Cuevas Plancarte | Olumide Ojo | Xinyi Liu | Iyanuoluwa Shode | Yuan Zhao | Anna Feldman | Jing Peng
Findings of the Association for Computational Linguistics: EACL 2024

Euphemisms are found across the world's languages, making them a universal linguistic phenomenon. As such, euphemistic data may have useful properties for computational tasks across languages. In this study, we explore this premise by training a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. Consistent with related work, we demonstrate that zero-shot transfer across languages takes place. We also show cases where multilingual models perform better on the task than monolingual models by a statistically significant margin, indicating that multilingual data gives models additional opportunities to learn cross-lingual computational properties of euphemisms. In a follow-up analysis, we focus on universal euphemistic "categories" such as death and bodily functions, among others. To better understand the nature of the cross-lingual transfer, we test whether cross-lingual data from the same domain matters more than within-language data from other domains.
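
The disambiguation setup described above can be illustrated with a short sketch: fine-tune XLM-RoBERTa as a binary classifier that labels a sentence containing a PET as euphemistic or literal. The snippet below is a minimal illustration using Hugging Face Transformers; the toy sentences, label convention, and hyperparameters are my own assumptions, not the authors' exact configuration.

```python
# Minimal sketch: fine-tune XLM-RoBERTa to label a sentence containing a
# potentially euphemistic term (PET) as euphemistic (1) or literal (0).
# The toy examples and hyperparameters below are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2
)

# Toy training pairs for the PET "let go": (sentence, label).
examples = [
    ("The company let him go after the merger.", 1),       # euphemism for "fired"
    ("She let go of the rope and fell into the water.", 0),  # literal use
]

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for sentence, label in examples:
    batch = tokenizer(sentence, return_tensors="pt", truncation=True)
    loss = model(**batch, labels=torch.tensor([label])).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Inference on an unseen sentence containing another PET.
model.eval()
with torch.no_grad():
    logits = model(**tokenizer("She is between jobs right now.",
                               return_tensors="pt")).logits
print("euphemistic" if logits.argmax(-1).item() == 1 else "literal")
```

Because XLM-RoBERTa shares one multilingual vocabulary and encoder, the same classifier head can be trained on PET examples from several languages at once, which is what enables the zero-shot and multilingual comparisons the abstract describes.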

JN666 at SemEval-2024 Task 7: NumEval: Numeral-Aware Language Understanding and Generation
Xinyi Liu | Xintong Liu | Hengyang Lu
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper was submitted for SemEval-2024 Task 7: Enhancing the Model's Understanding and Generation of Numerical Values. The task uses the NQuAD dataset, which requires selecting the most suitable number from four numerical options to fill a blank in a news article based on the context. Building on the BertForMultipleChoice model, we proposed two new models, MC BERT and SSC BERT, and improved their numerical understanding by pre-training them on numerical comparison tasks. Our best-performing model achieved an accuracy of 79.40%, which is 9.45% higher than that of NEMo.
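
The multiple-choice setup can be sketched as follows: each of the four numerical options is paired with the article context, and BertForMultipleChoice scores the four (context, option) pairs jointly. The snippet below follows the standard Hugging Face usage of BertForMultipleChoice; the toy article and options are hypothetical, and it omits the authors' MC BERT / SSC BERT modifications and their numerical-comparison pre-training stage.

```python
# Minimal sketch of the NQuAD-style setup: score four numerical options
# against a news-article context with BertForMultipleChoice. The article,
# blank convention, and options below are illustrative assumptions.
import torch
from transformers import AutoTokenizer, BertForMultipleChoice

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForMultipleChoice.from_pretrained("bert-base-uncased")

article = "The company reported revenue of ____ million dollars, up 12% year on year."
options = ["12", "48", "120", "480"]

# Encode one (context, option) pair per choice; BertForMultipleChoice
# expects inputs of shape (batch_size, num_choices, seq_len).
encoding = tokenizer(
    [article] * len(options),
    options,
    return_tensors="pt",
    padding=True,
    truncation=True,
)
inputs = {k: v.unsqueeze(0) for k, v in encoding.items()}

model.eval()
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, 4)

print("predicted option:", options[logits.argmax(-1).item()])
```

In the paper's pipeline, pre-training on numerical comparison tasks would precede this fine-tuning stage, giving the encoder a stronger sense of numerical magnitude before it learns to pick among candidate numbers.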