Ginny Wong
2026
XToM: Exploring the Multilingual Theory of Mind for Large Language Models
Chunkit Chan | Yauwai Yim | Hongchuan Zeng | Zhiying Zou | Xinyuan Cheng | Zhifan Sun | Zheye Deng | Kawai Chung | Yuzhuo Ao | Fan Yixiang | Cheng Jiayang | Ercong Nie | Ginny Wong | Helmut Schmid | Hinrich Schuetze | Simon See | Yangqiu Song
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Theory of Mind (ToM)—the ability to infer mental states in others—is pivotal for human social cognition. Existing evaluations of ToM in LLMs are largely limited to English, neglecting the linguistic diversity that shapes human cognition. This limitation raises a critical question: can LLMs exhibit Multilingual Theory of Mind—the capacity to reason about mental states across diverse linguistic contexts? To address this gap, we present XToM, a rigorously validated multilingual benchmark that evaluates ToM across five languages and incorporates diverse, contextually rich task scenarios. Using XToM, we systematically evaluate LLMs (e.g., DeepSeek R1), revealing a pronounced dissonance: while models excel in multilingual language understanding, their ToM performance varies across languages. Our findings expose limitations in LLMs’ ability to replicate human-like mentalizing across linguistic contexts.
2025
LogiDynamics: Unraveling the Dynamics of Inductive, Abductive and Deductive Logical Inferences in LLM Reasoning
Tianshi Zheng | Cheng Jiayang | Chunyang Li | Haochen Shi | Zihao Wang | Jiaxin Bai | Yangqiu Song | Ginny Wong | Simon See
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Modern large language models (LLMs) employ diverse logical inference mechanisms for reasoning, making the strategic optimization of these approaches critical for advancing their capabilities. This paper systematically investigates the **comparative dynamics** of inductive (System 1) versus abductive/deductive (System 2) inference in LLMs. We utilize a controlled analogical reasoning environment, varying modality (textual, visual, symbolic), difficulty, and task format (MCQ / free-text). Our analysis reveals System 2 pipelines generally excel, particularly in visual/symbolic modalities and harder tasks, while System 1 is competitive for textual and easier problems. Crucially, task format significantly influences their relative advantage, with System 1 sometimes outperforming System 2 in free-text rule-execution. These core findings generalize to broader in-context learning. Furthermore, we demonstrate that advanced System 2 strategies like hypothesis selection and iterative refinement can substantially scale LLM reasoning. This study offers foundational insights and actionable guidelines for strategically deploying logical inference to enhance LLM reasoning.
2024
AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation
Zhaowei Wang | Wei Fan | Qing Zong | Hongming Zhang | Sehyun Choi | Tianqing Fang | Xin Liu | Yangqiu Song | Ginny Wong | Simon See
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Abstraction ability is crucial in human intelligence, which can also benefit various tasks in NLP study. Existing work shows that LLMs are deficient in abstract ability, and how to improve it remains unexplored. In this work, we design the framework AbsInstruct to enhance LLMs’ abstraction ability through instruction tuning. The framework builds instructions with in-depth explanations to assist LLMs in capturing the underlying rationale of abstraction. Meanwhile, we introduce a plausibility estimator to select instructions that are more consistent with the abstraction knowledge of LLMs to be aligned. Then, our framework combines abstraction instructions with general-purpose ones to build a hybrid dataset. Extensive experiments and analyses demonstrate that our framework can considerably enhance LLMs’ abstraction ability with strong generalization performance while maintaining their general instruction-following abilities.
2023
Self-Consistent Narrative Prompts on Abductive Natural Language Inference
Chunkit Chan | Xin Liu | Tsz Ho Chan | Jiayang Cheng | Yangqiu Song | Ginny Wong | Simon See
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
DiscoPrompt: Path Prediction Prompt Tuning for Implicit Discourse Relation Recognition
Chunkit Chan | Xin Liu | Jiayang Cheng | Zihan Li | Yangqiu Song | Ginny Wong | Simon See
Findings of the Association for Computational Linguistics: ACL 2023
Implicit Discourse Relation Recognition (IDRR) is a sophisticated and challenging task to recognize the discourse relations between the arguments in the absence of discourse connectives. The sense labels for each discourse relation follow a hierarchical classification scheme in the annotation process (Prasad et al., 2008), forming a hierarchy structure. Most existing works do not incorporate the hierarchy structure well but focus on the syntax features and the prior knowledge of connectives in the manner of pure text classification. We argue that it is more effective to predict the paths inside the hierarchical tree (e.g., “Comparison -> Contrast -> however”) rather than flat labels (e.g., Contrast) or connectives (e.g., however). We propose a prompt-based path prediction method to utilize the interactive information and intrinsic senses among the hierarchy in IDRR. This is the first work that injects such structure information into pre-trained language models via prompt tuning, and the performance of our solution shows significant and consistent improvement against competitive baselines.
TILFA: A Unified Framework for Text, Image, and Layout Fusion in Argument Mining
Qing Zong | Zhaowei Wang | Baixuan Xu | Tianshi Zheng | Haochen Shi | Weiqi Wang | Yangqiu Song | Ginny Wong | Simon See
Proceedings of the 10th Workshop on Argument Mining
A main goal of Argument Mining (AM) is to analyze an author’s stance. Unlike previous AM datasets focusing only on text, the shared task at the 10th Workshop on Argument Mining introduces a dataset including both texts and images. Importantly, these images contain both visual elements and optical characters. Our new framework, TILFA (A Unified Framework for Text, Image, and Layout Fusion in Argument Mining), is designed to handle this mixed data. It excels at not only understanding text but also detecting optical characters and recognizing layout details in images. Our model significantly outperforms existing baselines, earning our team, KnowComp, the 1st place in the leaderboard of Argumentative Stance Classification subtask in this shared task.
Wasserstein-Fisher-Rao Embedding: Logical Query Embeddings with Local Comparison and Global Transport
Zihao Wang | Weizhi Fei | Hang Yin | Yangqiu Song | Ginny Wong | Simon See
Findings of the Association for Computational Linguistics: ACL 2023
Answering complex queries on knowledge graphs is important but particularly challenging because of the data incompleteness. Query embedding methods address this issue by learning-based models and simulating logical reasoning with set operators. Previous works focus on specific forms of embeddings, but scoring functions between embeddings are underexplored. In contrast to existing scoring functions motivated by local comparison or global transport, this work investigates the local and global trade-off with unbalanced optimal transport theory. Specifically, we embed sets as bounded measures in R endowed with a scoring function motivated by the Wasserstein-Fisher-Rao metric. Such a design also facilitates closed-form set operators in the embedding space. Moreover, we introduce a convolution-based algorithm for linear time computation and a block diagonal kernel to enforce the trade-off. Results show that WFRE is capable of outperforming existing query embedding methods on standard datasets, evaluation sets with combinatorially complex queries, and hierarchical knowledge graphs. Ablation study shows that finding a better local and global trade-off is essential for performance improvement.
COLA: Contextualized Commonsense Causal Reasoning from the Causal Inference Perspective
Zhaowei Wang | Quyet V. Do | Hongming Zhang | Jiayao Zhang | Weiqi Wang | Tianqing Fang | Yangqiu Song | Ginny Wong | Simon See
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Detecting commonsense causal relations (causation) between events has long been an essential yet challenging task. Given that events are complicated, an event may have different causes under various contexts. Thus, exploiting context plays an essential role in detecting causal relations. Meanwhile, previous works about commonsense causation only consider two events and ignore their context, simplifying the task formulation. This paper proposes a new task to detect commonsense causation between two events in an event sequence (i.e., context), called contextualized commonsense causal reasoning. We also design a zero-shot framework: COLA (Contextualized Commonsense Causality Reasoner) to solve the task from the causal inference perspective. This framework obtains rich incidental supervision from temporality and balances covariates from multiple timestamps to remove confounding effects. Our extensive experiments show that COLA can detect commonsense causality more accurately than baselines.
2022
SubeventWriter: Iterative Sub-event Sequence Generation with Coherence Controller
Zhaowei Wang | Hongming Zhang | Tianqing Fang | Yangqiu Song | Ginny Wong | Simon See
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
In this paper, we propose a new task of sub-event generation for an unseen process to evaluate the understanding of the coherence of sub-event actions and objects. To solve the problem, we design SubeventWriter, a sub-event sequence generation framework with a coherence controller. Given an unseen process, the framework can iteratively construct the sub-event sequence by generating one sub-event at each iteration. We also design a very effective coherence controller to decode more coherent sub-events. As our extensive experiments and analysis indicate, SubeventWriter can generate more reliable and meaningful sub-event sequences for unseen processes.
Complex Hyperbolic Knowledge Graph Embeddings with Fast Fourier Transform
Huiru Xiao | Xin Liu | Yangqiu Song | Ginny Wong | Simon See
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
The choice of geometric space for knowledge graph (KG) embeddings can have significant effects on the performance of KG completion tasks. The hyperbolic geometry has been shown to capture the hierarchical patterns due to its tree-like metrics, which addressed the limitations of the Euclidean embedding models. Recent explorations of the complex hyperbolic geometry further improved the hyperbolic embeddings for capturing a variety of hierarchical structures. However, the performance of the hyperbolic KG embedding models for non-transitive relations is still unpromising, while the complex hyperbolic embeddings do not deal with multi-relations. This paper aims to utilize the representation capacity of the complex hyperbolic geometry in multi-relational KG embeddings. To apply the geometric transformations which account for different relations and the attention mechanism in the complex hyperbolic space, we propose to use the fast Fourier transform (FFT) as the conversion between the real and complex hyperbolic space. Constructing the attention-based transformations in the complex space is very challenging, while the proposed Fourier transform-based complex hyperbolic approaches provide a simple and effective solution. Experimental results show that our methods outperform the baselines, including the Euclidean and the real hyperbolic embedding models.
Co-authors
- Simon See 10
- Yangqiu Song 10
- Zhaowei Wang 4
- Chunkit Chan 3
- Tianqing Fang 3
- Xin Liu 3
- Hongming Zhang 3
- Jiayang Cheng 2
- Cheng Jiayang 2
- Haochen Shi 2
- Weiqi Wang 2
- Zihao Wang 2
- Tianshi Zheng 2
- Qing Zong 2
- Yuzhuo Ao 1
- Jiaxin Bai 1
- Tsz Ho Chan 1
- Xinyuan Cheng 1
- Sehyun Choi 1
- Kawai Chung 1
- Zheye Deng 1
- Quyet V. Do 1
- Wei Fan 1
- Weizhi Fei 1
- Zihan Li 1
- Chunyang Li 1
- Xin Liu 1
- Ercong Nie 1
- Helmut Schmid 1
- Hinrich Schuetze 1
- Zhifan Sun 1
- Huiru Xiao 1
- Baixuan Xu 1
- Yauwai Yim 1
- Hang Yin 1
- Fan Yixiang 1
- Hongchuan Zeng 1
- Jiayao Zhang 1
- Zhiying Zou 1