Yuhang Jiang


2026

Interest in biomedical relation extraction (RE) persists in the LLM era because RE remains a prominent way to build knowledge graphs, which in turn ground LLM applications and help prevent hallucinations. Therapy-disease treatment relations from scientific literature are an important relation type because they indicate emerging therapeutic hypotheses and off-label uses being explored in the community. An automatically extracted, evolving knowledge base of such relations would be of great utility to researchers, since manual curation is not viable given the exponential growth of biomedical articles. Toward this end, and given the lack of such datasets in the recent past, we introduce LitTx, a new expert-annotated dataset for identifying treatment relationships discussed in the literature. Besides confirmed or implied positive relations, we also introduce a new "conditional treatment" relation type for cases where hedging or a potential relationship is indicated. Our baseline RE models on this new dataset demonstrate promising results while also revealing clear areas for improvement. To foster innovation and ensure replicability in the biomedical RE community, we release our dataset, code, and annotation guidelines publicly: https://github.com/bionlproc/LitTx_dataset.

2025

Extracting relations from scientific literature is a fundamental task in biomedical NLP because entities and the relations among them drive hypothesis generation and knowledge discovery. As the literature grows rapidly, relation extraction (RE) is indispensable for curating knowledge graphs that serve as computable, structured, symbolic representations. With the rise of LLMs, it is pertinent to examine whether it is better to skip tailoring supervised RE methods, save the annotation burden, and simply use zero-shot RE (ZSRE) via LLM API calls. In this paper, we propose a benchmark of seven biomedical RE datasets with interesting characteristics and evaluate three OpenAI models (GPT-4, o1, and GPT-OSS-120B) for end-to-end ZSRE. We show that LLM-based ZSRE is inching closer to supervised methods in performance on some datasets but still struggles on complex inputs expressing multiple relations with different predicates. Our error analysis reveals scope for improvement.