Xiaoyue Wang (王笑月)

Xiaoyue Wang

Also published as: 笑月王

2025

pdf bib abs
Beyond Generation: Leveraging LLM Creativity to Overcome Label Bias in Classification
Xiaoyue Wang | Xin Liu
Findings of the Association for Computational Linguistics: ACL 2025

Large Language Models (LLMs) exhibit impressive capabilities in In-Context Learning (ICL) but are prone to label bias—an undesirable tendency to favor certain answers. Existing calibration methods mitigate bias by leveraging in-domain data, yet such data is often unavailable in real-world scenarios. To address this limitation, we propose SDC (Synthetic Data Calibration), a simple-yet-effective approach that generates synthetic in-domain data from a few in-context demonstrations and utilizes it for calibration. By approximating the benefits of real in-domain data, SDC effectively reduces label bias without requiring access to actual domain-specific inputs. Experimental evaluations on 279 classification and multiple-choice tasks from the Super-NaturalInstructions benchmark. The results show that SDC significantly reduces label bias, achieving an average Bias Score reduction of 57.5%, and outperforming all competitive baselines. Moreover, when combined with Leave-One-Out Calibration (LOOC), further improves performance, underscoring its effectiveness and generalizability in enhancing the reliability of LLMs.

2024

pdf bib abs
IR2: Information Regularization for Information Retrieval
Jianyou Wang | Kaicheng Wang | Xiaoyue Wang | Weili Cao | Ramamohan Paturi | Leon Bergen
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Effective information retrieval (IR) in settings with limited training data, particularly for complex queries, remains a challenging task. This paper introduces IR2, Information Regularization for Information Retrieval, a technique for reducing overfitting during synthetic data generation. This approach, representing a novel application of regularization techniques in synthetic data creation for IR, is tested on three recent IR tasks characterized by complex queries: DORIS-MAE, ArguAna, and WhatsThatBook. Experimental results indicate that our regularization techniques not only outperform previous synthetic query generation methods on the tasks considered but also reduce cost by up to 50%. Furthermore, this paper categorizes and explores three regularization methods at different stages of the query synthesis pipeline—input, prompt, and output—each offering varying degrees of performance improvement compared to models where no regularization is applied. This provides a systematic approach for optimizing synthetic data generation in data-limited, complex-query IR scenarios. All code, prompts and synthetic data are available at https://github.com/Info-Regularization/Information-Regularization.

2023

pdf bib abs
IBADR: an Iterative Bias-Aware Dataset Refinement Framework for Debiasing NLU models
Xiaoyue Wang | Xin Liu | Lijie Wang | Yaoxiang Wang | Jinsong Su | Hua Wu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

As commonly-used methods for debiasing natural language understanding (NLU) models, dataset refinement approaches heavily rely on manual data analysis, and thus maybe unable to cover all the potential biased features. In this paper, we propose IBADR, an Iterative Bias-Aware Dataset Refinement framework, which debiases NLU models without predefining biased features. We maintain an iteratively expanded sample pool. Specifically, at each iteration, we first train a shallow model to quantify the bias degree of samples in the pool. Then, we pair each sample with a bias indicator representing its bias degree, and use these extended samples to train a sample generator. In this way, this generator can effectively learn the correspondence relationship between bias indicators and samples. Furthermore, we employ the generator to produce pseudo samples with fewer biased features by feeding specific bias indicators. Finally, we incorporate the generated pseudo samples into the pool. Experimental results and in-depth analyses on two NLU tasks show that IBADR not only significantly outperforms existing dataset refinement approaches, achieving SOTA, but also is compatible with model-centric methods.

2022

Simile recognition involves two subtasks: simile sentence classification that discriminates whether a sentence contains simile, and simile component extraction that locates the corresponding objects (i.e., tenors and vehicles).Recent work ignores features other than surface strings and suffers from the data hunger issue.We explore expressive features for this task to help achieve more effective data utilization.In particular, we study two types of features: 1) input-side features that include POS tags, dependency trees and word definitions, and 2) decoding features that capture the interdependence among various decoding decisions.We further construct a model named HGSR, which merges the input-side features as a heterogeneous graph and leverages decoding features via distillation.Experiments show that HGSR significantly outperforms the current state-of-the-art systems and carefully designed baselines, verifying the effectiveness of introduced features. We will release our code upon paper acceptance.

2021

2020

pdf bib abs
多模块联合的阅读理解候选句抽取(Evidence sentence extraction for reading comprehension based on multi-module)
Yu Ji (吉宇) | Xiaoyue Wang (王笑月) | Ru Li (李茹) | Shaoru Guo (郭少茹) | Yong Guan (关勇)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

机器阅读理解作为自然语言理解的关键任务,受到国内外学者广泛关注。针对多项选择型阅读理解中无线索标注且涉及多步推理致使候选句抽取困难的问题,本文提出一种基于多模块联合的候选句抽取模型。首先采用部分标注数据微调预训练模型;其次通过TF-IDF递归式抽取多跳推理问题中的候选句;最后结合无监督方式进一步筛选模型预测结果降低冗余性。本文在高考语文选择题及RACE数据集上进行验证,在候选句抽取中,本文方法相比于最优基线模型F1值提升3.44%,在下游答题任务中采用候选句作为模型输入较全文输入时准确率分别提高3.68%和3.6%,上述结果证实本文所提方法有效性。