Yuxin Zhang
Papers on this page may belong to the following people: Yuxin Zhang, Yuxin Zhang
2026
RiTeK: A Dataset for Large Language Models Complex Reasoning over Textual Knowledge Graphs in Medicine
Jiatan Huang | Mingchen Li | Zonghai Yao | Dawei Li | Yuxin Zhang | Zhichao Yang | Yongkang Xiao | Feiyun Ouyang | Xiaohan Li | Shuo Han | Hong Yu
Findings of the Association for Computational Linguistics: ACL 2026
Answering complex real-world questions in the medical domain often requires accurate retrieval from medical Textual Knowledge Graphs (medical TKGs), as the relational path information from TKGs could enhance the inference ability of Large Language Models (LLMs). However, the main bottlenecks lie in the scarcity of existing medical TKGs, the limited expressiveness of their topological structures, and the lack of comprehensive evaluations of current retrievers for medical TKGs. To address these challenges, we first develop a dataset for LLMs Complex Reasoning over medical Textual Knowledge Graphs (RiTeK), covering a broad range of topological structures. Specifically, we synthesize realistic user queries integrating diverse topological structures, relational information, and complex textual descriptions. We conduct a rigorous medical expert evaluation process to assess and validate the quality of our synthesized queries. RiTeK also serves as a comprehensive benchmark dataset for evaluating the capabilities of retrieval systems built upon LLMs. By assessing 11 representative retrievers on this benchmark, we observe that existing methods struggle to perform well, revealing notable limitations in current LLM-driven retrieval approaches. These findings highlight the pressing need for more effective retrieval systems tailored for semi-structured data in the medical domain.
2025
NeuSym-RAG: Hybrid Neural Symbolic Retrieval with Multiview Structuring for PDF Question Answering
Ruisheng Cao | Hanchong Zhang | Tiancheng Huang | Zhangyi Kang | Yuxin Zhang | Liangtai Sun | Hanqi Li | Yuxun Miao | Shuai Fan | Lu Chen | Kai Yu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The increasing number of academic papers makes it challenging for researchers to efficiently acquire key details. While retrieval-augmented generation (RAG) shows great promise in large language model (LLM) based automated question answering, previous works often isolate neural and symbolic retrieval despite their complementary strengths. Moreover, conventional single-view chunking neglects the rich structure and layout of PDFs, e.g., sections and tables. In this work, we propose NeuSym-RAG, a hybrid neural symbolic retrieval framework which combines both paradigms in an interactive process. By leveraging multi-view chunking and schema-based parsing, NeuSym-RAG organizes semi-structured PDF content into both a relational database and a vectorstore, enabling LLM agents to iteratively gather context until it is sufficient to generate answers. Experiments on three full PDF-based QA datasets, including a self-annotated one, AirQA-Real, show that NeuSym-RAG consistently outperforms both the vector-based RAG and various structured baselines, highlighting its capacity to unify both retrieval schemes and utilize multiple views.
2022
A Multi-Modal Knowledge Graph for Classical Chinese Poetry
Yuqing Li | Yuxin Zhang | Bin Wu | Ji-Rong Wen | Ruihua Song | Ting Bai
Findings of the Association for Computational Linguistics: EMNLP 2022
Classical Chinese poetry has a long history and is a precious cultural heritage of humankind. Displaying classical Chinese poetry visually helps it cross cultural barriers between countries, making it enjoyable for everyone. In this paper, we construct a multi-modal knowledge graph for classical Chinese poetry (PKG), in which the visual information of words in the poetry is incorporated. We then propose a multi-modal pre-training language model, PKG-Bert, to obtain poetry representations with visual information, which bridges the semantic gap between different modalities. PKG-Bert achieves state-of-the-art performance on the poetry-image retrieval task, showing the effectiveness of incorporating multi-modal knowledge. The large-scale multi-modal knowledge graph of classical Chinese poetry will be released to promote research in the area of classical Chinese culture.