Zixin Tang


2025

Using Contextually Aligned Online Reviews to Measure LLMs’ Performance Disparities Across Language Varieties
Zixin Tang | Chieh-Yang Huang | Tsung-che Li | Ho Yin Sam Ng | Hen-Hsen Huang | Ting-Hao Kenneth Huang
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)

A language can have different varieties. These varieties can affect the performance of natural language processing (NLP) models, including large language models (LLMs), which are often trained on data from widely spoken varieties. This paper introduces a novel and cost-effective approach to benchmarking model performance across language varieties. We argue that international online review platforms, such as Booking.com, can serve as effective data sources for constructing datasets that capture comments in different language varieties from similar real-world scenarios, such as reviews for the same hotel, with the same rating, written in the same language (e.g., Mandarin Chinese) but in different varieties (e.g., Taiwan Mandarin, Mainland Mandarin). As a proof of concept, we constructed a contextually aligned dataset comprising reviews in Taiwan Mandarin and Mainland Mandarin and tested six LLMs on a sentiment analysis task. Our results show that LLMs consistently underperform on Taiwan Mandarin.
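To make the setup concrete, here is a minimal sketch of the benchmarking loop, not the paper's released code: the model name, prompt wording, and review texts are illustrative assumptions, and the 1-10 scale mirrors Booking.com's rating scale.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def predict_rating(review: str) -> int:
    """Ask an LLM to rate a hotel review on a 1-10 scale."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; the paper evaluates six LLMs
        messages=[{
            "role": "user",
            "content": ("Rate the sentiment of this hotel review from 1 "
                        "(worst) to 10 (best). Reply with the number only.\n\n"
                        + review),
        }],
    )
    return int(response.choices[0].message.content.strip())

# A contextually aligned pair: same hotel, same true rating, same language,
# different varieties. Review texts are elided placeholders here.
aligned_pair = [
    {"variety": "Taiwan Mandarin",   "review": "...", "true_rating": 8},
    {"variety": "Mainland Mandarin", "review": "...", "true_rating": 8},
]
for item in aligned_pair:
    error = abs(predict_rating(item["review"]) - item["true_rating"])
    print(item["variety"], "absolute error:", error)

Averaging such per-review errors within each variety would give one simple measure of the performance disparity the abstract describes.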

2024

Learning to Write Rationally: How Information Is Distributed in Non-native Speakers’ Essays
Zixin Tang | Janet G. van Hell
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

People tend to distribute information evenly in language production for better and clearer communication. In this study, we compared essays written by second language (L2) learners with various native language (L1) backgrounds to investigate how they distribute information in their non-native L2 production. Analyses of surprisal and constancy of entropy rate indicated that writers with higher L2 proficiency can reduce the expected uncertainty of language production while still conveying informative content. However, the uniformity of information distribution showed less variability among different groups of L2 speakers, suggesting that this feature may be universal in L2 essay writing and less affected by L2 writers’ variability in L1 background and L2 proficiency.
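As a rough illustration of the measures involved (not the authors' analysis code), the sketch below computes per-token surprisal under an off-the-shelf causal language model via Hugging Face Transformers. Mean surprisal approximates the expected uncertainty of a text, and the variance of surprisal is one simple proxy for how (non-)uniformly information is distributed.

import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def surprisal_profile(text: str) -> torch.Tensor:
    """Return per-token surprisal (in bits) for `text` under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Logits at position t predict token t+1, hence the shift below.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    token_lp = log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return -token_lp.squeeze(0) / math.log(2)  # nats -> bits

s = surprisal_profile("The experiment was repeated with a larger sample.")
print(f"mean surprisal: {s.mean():.2f} bits; variance: {s.var():.2f}")

Comparing these statistics across essays grouped by L1 background and L2 proficiency is the kind of contrast the abstract reports.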

2021

Are BERTs Sensitive to Native Interference in L2 Production?
Zixin Tang | Prasenjit Mitra | David Reitter
Proceedings of the Second Workshop on Insights from Negative Results in NLP

Using the essay sections of the International Corpus Network of Asian Learners of English (ICNALE) and the TOEFL11 corpus, we fine-tuned BERT-based neural language models to predict English learners’ native languages. Results showed that neural models can learn to represent and detect such native-language influence, but that multilingually trained models have no advantage in doing so.
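A minimal sketch of this kind of probe is below. It is not the paper's experimental code: it assumes hypothetical CSV files with an "essay" text column and an integer "label" column encoding the L1, and the hyperparameters are placeholders.

from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

NUM_L1_CLASSES = 11  # TOEFL11 covers eleven native languages

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=NUM_L1_CLASSES)

# Hypothetical files: one essay per row, "essay" text and integer "label" (L1).
dataset = load_dataset("csv", data_files={"train": "essays_train.csv",
                                          "test": "essays_test.csv"})

def tokenize(batch):
    return tokenizer(batch["essay"], truncation=True, max_length=512)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="l1-probe", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
print(trainer.evaluate())

Swapping "bert-base-uncased" for a multilingually trained checkpoint such as "bert-base-multilingual-cased" and comparing accuracy is the kind of contrast the abstract reports.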