Xiaoxiao Ma
2026
CLARITY: A Framework and Benchmark for Conversational Language Ambiguity and Unanswerability in Interactive NL2SQL Systems
Tabinda Sarwar | Farhad Moghimifar | Cong Duy Vu Hoang | Xiaoxiao Ma | Shawn Chang Xu | Fahimeh Saleh | Poorya Zaremoodi | Avirup Sil | Katrin Kirchhoff
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
NL2SQL systems deployed in industry settings often encounter ambiguous or unanswerable queries, particularly in interactive scenarios with incomplete user clarification. Existing benchmarks typically assume a single source of ambiguity and rely on user interaction for resolution, overlooking realistic failure modes. We introduce Clarity, a framework for automatically generating an NL2SQL benchmark with multi-faceted ambiguities and diverse user behaviors across both single- and multi-turn settings. Using a constraint-driven pipeline, Clarity transforms executable SQL into ambiguous queries, augmented with grounded conversational continuations and schema-level metadata. Empirical evaluation on Spider and BIRD shows that leading NL2SQL systems, including those based on strong LLMs, suffer significant performance degradation under multi-faceted ambiguity. While these systems often detect ambiguity, they struggle to accurately localize and resolve the underlying schema-level sources. Our results highlight the need for more robust ambiguity detection and resolution in industry-grade NL2SQL systems.
2024
On Fake News Detection with LLM Enhanced Semantics Mining
Xiaoxiao Ma | Yuchen Zhang | Kaize Ding | Jian Yang | Jia Wu | Hao Fan
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Large language models (LLMs) have emerged as valuable tools for enhancing textual features in various text-related tasks. Despite their superiority in capturing the lexical semantics between tokens for text analysis, our preliminary study on two popular LLMs, i.e., ChatGPT and Llama2, showcases that simply applying the news embeddings from LLMs is ineffective for fake news detection. Such embeddings only encapsulate the language styles between tokens. Meanwhile, the high-level semantics among named entities and topics, which reveal the deviating patterns of fake news, have been ignored. Therefore, we propose a topic model together with a set of specially designed prompts to extract topics and real entities from LLMs and model the relations among news, entities, and topics as a heterogeneous graph to facilitate investigating news semantics. We then propose a Generalized Page-Rank model and a consistent learning criterion for mining the local and global semantics centered on each news piece through the adaptive propagation of features across the graph. Our model shows superior performance on five benchmark datasets over seven baseline methods, and the efficacy of the key ingredients has been thoroughly validated.