David A. Clifton
2026
Autonomous Knowledge Graph Exploration with Adaptive Breadth-Depth Retrieval
Joaquin Polonuer | Lucas Vittor | Iñaki Arango | Ayush Noori | David A. Clifton | Luciano Del Corro | Marinka Zitnik
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Retrieving evidence for language model queries from knowledge graphs requires balancing broad search across the graph with multi-hop traversal to follow relational links. Similarity-based retrievers provide coverage but remain shallow, whereas traversal-based methods rely on selecting seed nodes to start exploration, which can fail when queries span multiple entities and relations. We introduce ARK: Adaptive Retriever of Knowledge, a tool-using KG retriever that gives a language model control over this breadth-depth tradeoff using a two-operation toolset: global lexical search over node descriptors and one-hop neighborhood exploration that composes into multi-hop traversal. ARK alternates between breadth-oriented discovery and depth-oriented expansion without depending on a fragile seed selection, a pre-set hop depth, or requiring retrieval training. ARK adapts tool use to queries, using global search for language-heavy queries and neighborhood exploration for relation-heavy queries. On STaRK, ARK reaches 59.1% average Hit@1 and 67.4 average MRR, improving average Hit@1 by up to 31.4% and average MRR by up to 28.0% over retrieval-based and agent-based training-free methods. Finally, we distill ARK’s tool-use trajectories from a large teacher into an 8B model via label-free imitation, improving Hit@1 by +7.0, +26.6, and +13.5 absolute points over the base 8B model on AMAZON, MAG, and PRIME datasets, respectively, while retaining up to 98.5% of the teacher’s Hit@1 rate.
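The two-operation toolset described in the abstract above can be illustrated with a minimal sketch. This is not the ARK implementation: the `ToyKG` class, its method names, the token-overlap scoring, and the toy graph are all assumptions made for illustration, and the tool calls are scripted by hand where ARK would let a language model choose them.

```python
from collections import defaultdict


class ToyKG:
    """Hypothetical in-memory graph exposing two retrieval operations."""

    def __init__(self):
        self.descriptors = {}            # node id -> descriptor text
        self.edges = defaultdict(list)   # node id -> [(relation, neighbor id)]

    def add_node(self, node_id, text):
        self.descriptors[node_id] = text

    def add_edge(self, src, relation, dst):
        # Edges are stored in both directions so traversal can go either way.
        self.edges[src].append((relation, dst))
        self.edges[dst].append((relation, src))

    def search(self, query, k=3):
        """Breadth: global lexical search, scored by token overlap (an assumption)."""
        query_tokens = set(query.lower().split())
        scored = []
        for node_id, text in self.descriptors.items():
            overlap = len(query_tokens & set(text.lower().split()))
            if overlap:
                scored.append((overlap, node_id))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [node_id for _, node_id in scored[:k]]

    def neighbors(self, node_id, relation=None):
        """Depth: one-hop neighborhood, optionally filtered by relation type."""
        return [dst for rel, dst in self.edges.get(node_id, [])
                if relation is None or rel == relation]


kg = ToyKG()
kg.add_node("p1", "knowledge graph retrieval with language models")
kg.add_node("p2", "clinical decision benchmark")
kg.add_node("a1", "author alice")
kg.add_edge("p1", "written_by", "a1")
kg.add_edge("p2", "written_by", "a1")

# Scripted trajectory: one global search, then two one-hop expansions
# that compose into a two-hop traversal (paper -> author -> papers).
seeds = kg.search("graph retrieval")
authors = kg.neighbors(seeds[0], relation="written_by")
papers = kg.neighbors(authors[0], relation="written_by")
```

Here the search call finds an entry point from the query text alone, and chaining `neighbors` calls reaches nodes that share no words with the query, which is the breadth-depth interplay the abstract describes.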
2025
DrAgent: Empowering Large Language Models as Medical Agents for Multi-hop Medical Reasoning
Fenglin Liu | Zheng Li | Hongjian Zhou | Qingyu Yin | Jingfeng Yang | Xin Liu | Zhengyang Wang | Xianfeng Tang | Shiyang Li | Xiang He | Ruijie Wang | Bing Yin | Xiao Gu | Lei Clifton | David A. Clifton
Findings of the Association for Computational Linguistics: EMNLP 2025
Although large language models (LLMs) have been shown to outperform human experts in medical examinations, it remains challenging to adopt LLMs in real-world clinical decision-making, which typically involves multi-hop medical reasoning. Common practices include prompting commercial LLMs and fine-tuning LLMs on medical data. However, in the clinical domain, using commercial LLMs raises privacy concerns regarding sensitive patient data. Fine-tuning competitive medical LLMs for different tasks usually requires extensive data and computing resources, which are difficult to acquire, especially in medical institutions with limited infrastructure. We propose DrAgent, which can build LLMs as agents to deliver accurate medical decision-making and reasoning. In implementation, we take a lightweight LLM as the backbone to collaborate with diverse clinical tools. To make efficient use of data, DrAgent introduces recursive curriculum learning to optimize the LLM in an easy-to-hard progression. The results show that our approach achieves competitive performance on diverse datasets.
2024
Large Language Models Are Poor Clinical Decision-Makers: A Comprehensive Benchmark
Fenglin Liu | Zheng Li | Hongjian Zhou | Qingyu Yin | Jingfeng Yang | Xianfeng Tang | Chen Luo | Ming Zeng | Haoming Jiang | Yifan Gao | Priyanka Nigam | Sreyashi Nag | Bing Yin | Yining Hua | Xuan Zhou | Omid Rohanian | Anshul Thakur | Lei Clifton | David A. Clifton
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The adoption of large language models (LLMs) to assist clinicians has attracted remarkable attention. Existing works mainly adopt the close-ended question-answering (QA) task with answer options for evaluation. However, many clinical decisions involve answering open-ended questions without pre-set options. To better understand LLMs in the clinic, we construct a benchmark, ClinicBench. We first collect eleven existing datasets covering diverse clinical language generation, understanding, and reasoning tasks. Furthermore, we construct six novel datasets and clinical tasks that are complex but common in real-world practice, e.g., open-ended decision-making, long document processing, and emerging drug analysis. We conduct an extensive evaluation of twenty-two LLMs under both zero-shot and few-shot settings. Finally, we invite medical experts to evaluate the clinical usefulness of LLMs.
2023
MiniALBERT: Model Distillation via Parameter-Efficient Recursive Transformers
Mohammadmahdi Nouriborji | Omid Rohanian | Samaneh Kouchaki | David A. Clifton
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
Pre-trained Language Models (LMs) have become an integral part of Natural Language Processing (NLP) in recent years, due to their superior performance in downstream applications. In spite of this resounding success, the usability of LMs is constrained by computational and time complexity, along with their increasing size; an issue that has been referred to as overparameterisation. Different strategies have been proposed in the literature to alleviate these problems, with the aim of creating effective compact models that nearly match the performance of their bloated counterparts with negligible performance losses. One of the most popular techniques in this area of research is model distillation. Another potent but underutilised technique is cross-layer parameter sharing. In this work, we combine these two strategies and present MiniALBERT, a technique for converting the knowledge of fully parameterised LMs (such as BERT) into a compact recursive student. In addition, we investigate the application of bottleneck adapters for layer-wise adaptation of our recursive student, and also explore the efficacy of adapter tuning for fine-tuning of compact models. We test our proposed models on a number of general and biomedical NLP tasks to demonstrate their viability and compare them with the state-of-the-art and other existing compact models. All code used in the experiments and the pre-trained compact models will be made publicly available.
Co-authors
- Lei Clifton 2
- Zheng Li 2
- Fenglin Liu 2
- Omid Rohanian 2
- Xianfeng Tang 2
- Jingfeng Yang 2
- Qingyu Yin 2
- Hongjian Zhou 2
- Iñaki Arango 1
- Luciano Del Corro 1
- Yifan Gao 1
- Xiao Gu 1
- Xiang He 1
- Yining Hua 1
- Haoming Jiang 1
- Samaneh Kouchaki 1
- Shiyang Li 1
- Xin Liu 1
- Chen Luo 1
- Sreyashi Nag 1
- Priyanka Nigam 1
- Ayush Noori 1
- Mohammadmahdi Nouriborji 1
- Joaquin Polonuer 1
- Anshul Thakur 1
- Lucas Vittor 1
- Ruijie Wang 1
- Zhengyang Wang 1
- Bing Yin 1
- Bing Yin 1
- Ming Zeng 1
- Xuan Zhou 1
- Marinka Žitnik 1