Zixuan Wang


2025

Cross-Document Cross-Lingual NLI via RST-Enhanced Graph Fusion and Interpretability Prediction
Mengying Yuan | WenHao Wang | Zixuan Wang | Yujie Huang | Kangli Wei | Fei Li | Chong Teng | Donghong Ji
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Natural Language Inference (NLI) is a fundamental task in natural language processing. While NLI has developed many sub-directions such as sentence-level NLI, document-level NLI and cross-lingual NLI, Cross-Document Cross-Lingual NLI (CDCL-NLI) remains largely unexplored. In this paper, we propose a novel paradigm: CDCL-NLI, which extends traditional NLI capabilities to multi-document, multilingual scenarios. To support this task, we construct a high-quality CDCL-NLI dataset including 25,410 instances and spanning 26 languages. To address the limitations of previous methods on the CDCL-NLI task, we further propose an innovative method that integrates RST-enhanced graph fusion with interpretability-aware prediction. Our approach leverages RST (Rhetorical Structure Theory) within heterogeneous graph neural networks for cross-document context modeling, and employs a structure-aware semantic alignment based on lexical chains for cross-lingual understanding. For NLI interpretability, we develop an EDU (Elementary Discourse Unit)-level attribution framework that produces extractive explanations. Extensive experiments demonstrate our approach’s superior performance, achieving significant improvements over both conventional NLI models and large language models. Our work sheds light on the study of NLI and will bring research interest in cross-document cross-lingual context understanding, hallucination elimination and interpretability inference. Our code and dataset are available at CDCL-NLI-link.

FinEval-KR: A Financial Domain Evaluation Framework for Large Language Models’ Knowledge and Reasoning
Shaoyu Dou | Yutian Shen | Mofan Chen | Zixuan Wang | Jiajie Xu | Qi Guo | Kailai Shao | Chao Chen | Haixiang Hu | Haibo Shi | Min Min | Liwen Zhang
Proceedings of The 10th Workshop on Financial Technology and Natural Language Processing

Cross-Document Cross-Lingual NLI via RST-Enhanced Graph Fusion and Interpretability Prediction
Mengying Yuan | WenHao Wang | Zixuan Wang | Yujie Huang | Kangli Wei | Fei Li | Chong Teng | Donghong Ji
Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025)

Natural Language Inference (NLI) is a fundamental task in natural language processing. While NLI has developed many sub-directions such as sentence-level NLI, document-level NLI and cross-lingual NLI, Cross-Document Cross-Lingual NLI (CDCL-NLI) remains largely unexplored. In this paper, we propose a novel paradigm: CDCL-NLI, which extends traditional NLI capabilities to multi-document, multilingual scenarios. To support this task, we construct a high-quality CDCL-NLI dataset including 25,410 instances and spanning 26 languages. To address the limitations of previous methods on the CDCL-NLI task, we further propose an innovative method that integrates RST-enhanced graph fusion with interpretability-aware prediction. Our approach leverages RST (Rhetorical Structure Theory) within heterogeneous graph neural networks for cross-document context modeling, and employs a structure-aware semantic alignment based on lexical chains for cross-lingual understanding. For NLI interpretability, we develop an EDU (Elementary Discourse Unit)-level attribution framework that produces extractive explanations. Extensive experiments demonstrate our approach’s superior performance, achieving significant improvements over both conventional NLI models and large language models. Our work sheds light on the study of NLI and will bring research interest in cross-document cross-lingual context understanding, hallucination elimination and interpretability inference. Our dataset and code are available at https://anonymous.4open.science/r/CDCL-NLI-637E/ for peer review.

2020

Tencent submission for WMT20 Quality Estimation Shared Task
Haijiang Wu | Zixuan Wang | Qingsong Ma | Xinjie Wen | Ruichen Wang | Xiaoli Wang | Yulin Zhang | Zhipeng Yao | Siyao Peng
Proceedings of the Fifth Conference on Machine Translation

This paper presents Tencent’s submission to the WMT20 Quality Estimation (QE) Shared Task: Sentence-Level Post-editing Effort for English-Chinese in Task 2. Our system ensembles two architectures, XLM-based and Transformer-based Predictor-Estimator models. For the XLM-based Predictor-Estimator architecture, the predictor produces two types of contextualized token representations, i.e., masked XLM and non-masked XLM; the LSTM-estimator and Transformer-estimator employ two effective strategies, top-K and multi-head attention, to enhance the sentence feature representation. For the Transformer-based Predictor-Estimator architecture, we improve a top-performing model with three modifications: using multi-decoding in the machine translation module, creating a new model by replacing the Transformer-based predictor with an XLM-based predictor, and finally integrating the two models by a weighted average. Our submission achieves a Pearson correlation of 0.664, ranking first (tied) on English-Chinese.