2025
Quantifying Semantic Emergence in Language Models
Hang Chen
|
Xinyu Yang
|
Jiaying Zhu
|
Wenya Wang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) are widely recognized for their exceptional capacity to capture semantic meaning. Yet, there remains no established metric to quantify this capability. In this work, we introduce a quantitative metric, Information Emergence (IE), designed to measure LLMs’ ability to extract semantics from input tokens. We formalize “semantics” as the meaningful information abstracted from a sequence of tokens and quantify it by comparing the entropy reduction observed for a sequence of tokens (macro-level) with that of individual tokens (micro-level). To achieve this, we design a lightweight estimator that computes the mutual information at each transformer layer and is agnostic to different tasks and language model architectures. We apply IE in both synthetic in-context learning (ICL) scenarios and natural sentence contexts. Experiments reveal informative patterns about semantics: while some of these patterns confirm conventional linguistic knowledge, others are relatively unexpected and may provide new insights.
Debiasing the Fine-Grained Classification Task in LLMs with Bias-Aware PEFT
Daiying Zhao
|
Xinyu Yang
|
Hang Chen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Fine-grained classification via LLMs is susceptible to more complex label biases than traditional classification tasks. Existing bias mitigation strategies, such as retraining, post-hoc adjustment, and parameter-efficient fine-tuning (PEFT), are primarily effective for simple classification biases, such as stereotypes, but fail to adequately address prediction-propensity and discriminative-ability biases. In this paper, we analyze these two bias phenomena and observe their progressive accumulation from intermediate to deeper layers within LLMs. To mitigate this issue, we propose a bias-aware optimization framework that combines two distinct label balance constraints with a PEFT strategy targeting an intermediate layer. Our approach adjusts less than 1% of the model’s parameters while effectively curbing bias amplification in deeper layers. Extensive experiments across 12 datasets and 5 LLMs demonstrate that our method consistently outperforms or matches full-parameter fine-tuning and LoRA, achieving superior results with lower perplexity.
2023
How to Enhance Causal Discrimination of Utterances: A Case on Affective Reasoning
Hang Chen
|
Xinyu Yang
|
Jing Luo
|
Wenjing Zhu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Our investigation into the Affective Reasoning in Conversation (ARC) task highlights the challenge of causal discrimination. Almost all existing models, including large language models (LLMs), excel at capturing semantic correlations within utterance embeddings but fall short in determining specific causal relationships. To overcome this limitation, we propose incorporating i.i.d. noise terms into the conversation process, thereby constructing a structural causal model (SCM) that explores how distinct causal relationships among fitted embeddings can be discerned through independence conditions. To facilitate implementation with deep learning, we introduce cogn frameworks to handle unstructured conversation data and employ an autoencoder architecture to treat the unobservable noise as learnable “implicit causes.” Moreover, we curate a synthetic dataset that includes i.i.d. noise. Through comprehensive experiments, we validate the effectiveness and interpretability of our approach. Our code is available at https://github.com/Zodiark-ch/mater-of-our-EMNLP2023-paper.
2019
What does the language of foods say about us?
Hoang Van
|
Ahmad Musa
|
Hang Chen
|
Stephen Kobourov
|
Mihai Surdeanu
Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)
In this work we investigate the signal contained in the language of food on social media. We experiment with a dataset of 24 million food-related tweets, and make several observations. First, the language of food has predictive power. We are able to predict if states in the United States (US) are above the median rates for type 2 diabetes mellitus (T2DM), income, poverty, and education – outperforming previous work by 4–18%. Second, we investigate the effect of socioeconomic factors (income, poverty, and education) on predicting state-level T2DM rates. Socioeconomic factors do improve T2DM prediction, with the greatest improvement coming from poverty information (6%), but, importantly, the language of food adds distinct information that is not captured by socioeconomics. Third, we analyze how the language of food has changed over a five-year period (2013 – 2017), which is indicative of the shift in eating habits in the US during that period. We find several food trends, and that the language of food is used differently by different groups such as different genders. Last, we provide an online visualization tool for real-time queries and semantic analysis.