Yiqing Zhang

2026

PubMed Reasoner: Dynamic Reasoning-based Retrieval for Evidence-Grounded Biomedical Question Answering
Yiqing Zhang | Xiaozhong Liu | Fabricio Murai
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Trustworthy biomedical question answering (QA) systems must not only provide accurate answers but also justify them with current, verifiable evidence. Retrieval-augmented approaches partially address this gap but lack mechanisms to iteratively refine poor queries, whereas self-reflection methods kick in only after full retrieval is completed. In this context, we introduce PubMed Reasoner, a biomedical QA agent composed of three stages: **self-critic query refinement** evaluates MeSH terms for coverage, alignment, and redundancy to enhance PubMed queries based on partial (metadata) retrieval; **reflective retrieval** processes articles in batches until sufficient evidence is gathered; and **evidence-grounded response generation** produces answers with explicit citations. PubMed Reasoner with a GPT-4o backbone achieves **78.32%** accuracy on PubMedQA, slightly surpassing human experts, and showing consistent gains on MMLU Clinical Knowledge. Moreover, LLM-as-judge evaluations prefer our responses across: reasoning soundness, evidence grounding, clinical relevance, and trustworthiness. By orchestrating retrieval-first reasoning over authoritative sources, our approach provides practical assistance to clinicians and biomedical researchers while controlling compute and token costs.

2018

pdf bib abs

PubSE: A Hierarchical Model for Publication Extraction from Academic Homepages
Yiqing Zhang | Jianzhong Qi | Rui Zhang | Chuandong Yin
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Publication information in a researcher’s academic homepage provides insights about the researcher’s expertise, research interests, and collaboration networks. We aim to extract all the publication strings from a given academic homepage. This is a challenging task because the publication strings in different academic homepages may be located at different positions with different structures. To capture the positional and structural diversity, we propose an end-to-end hierarchical model named PubSE based on Bi-LSTM-CRF. We further propose an alternating training method for training the model. Experiments on real data show that PubSE outperforms the state-of-the-art models by up to 11.8% in F1-score.

Co-authors

Venues

ACL1
EMNLP1

Fix author