Beidi Chen
2026
MedVerse: Efficient and Reliable Medical Reasoning via DAG-Structured Parallel Execution
Jianwen Chen | Xinyu Yang | Peng Xia | Arian Azarang | Yueh Z Lee | Gang Li | Hongtu Zhu | Yun Li | Beidi Chen | Huaxiu Yao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) have demonstrated strong performance and rapid progress in a wide range of medical reasoning tasks. However, their sequential autoregressive decoding forces inherently parallel clinical reasoning, such as differential diagnosis, into a single linear reasoning path, limiting both efficiency and reliability for complex medical problems. To address this, we propose MedVerse, a reasoning framework for complex medical inference that reformulates medical reasoning as a parallelizable directed acyclic graph (DAG) process based on Petri Net theory. The framework adopts a full-stack design across data, model architecture, and system execution. For data creation, we introduce the MedVerse Curator, an automated pipeline that synthesizes knowledge-grounded medical reasoning paths and transforms them into Petri Net–structured representations. At the architectural level, we propose a topology-aware attention mechanism with adaptive position indices that supports parallel reasoning while preserving logical consistency. At the system level, we develop a customized inference engine that supports parallel execution without additional overhead. Empirical evaluations show that MedVerse improves strong general-purpose LLMs by up to 8.9%. Compared to specialized medical LLMs, MedVerse achieves comparable performance with improved clinical reliability, while delivering a 1.3× reduction in inference latency and a 1.7× increase in generation throughput, enabled by its parallel decoding capability.
When "Correct" Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents?
Yibo Peng | James Song | Lei Li | Xinyu Yang | Mihai Christodorescu | Ravi Mangal | Corina S. Pasareanu | Haizhong Zheng | Beidi Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Code agents are increasingly trusted to autonomously fix bugs on platforms such as GitHub, yet their security evaluation focuses almost exclusively on functional correctness. In this paper, we reveal a novel type of threat to real-world code agents: functionally correct yet vulnerable (FCV) patches, which pass all test cases but contain vulnerable code. With our proposed FCV-Attack, we demonstrate that SOTA LLMs (e.g., ChatGPT and Claude) and agent scaffolds (e.g., SWE-agent and OpenHands) are all vulnerable to this FCV threat; across 12 agent-model combinations on SWE-Bench, the attack requires only black-box access and a single query to the code agent. For example, for CWE-538 (information exposure vulnerability), the FCV-Attack attains an attack success rate of 40.7% on GPT-5 Mini + OpenHands. Our results reveal an important security threat overlooked by current evaluation paradigms and urge the development of security-aware defenses for code agents.
2024
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Mostafa Elhoushi | Akshat Shrivastava | Diana Liskovich | Basil Hosmer | Bram Wasti | Liangzhen Lai | Anas Mahmoud | Bilge Acun | Saurabh Agarwal | Ahmed Roman | Ahmed Aly | Beidi Chen | Carole-Jean Wu
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
We present LayerSkip, an end-to-end solution to speed up inference of large language models (LLMs). First, during training we apply layer dropout, with low dropout rates for earlier layers and higher dropout rates for later layers, and an early exit loss where all transformer layers share the same exit. Second, during inference, we show that this training recipe increases the accuracy of early exit at earlier layers, without adding any auxiliary layers or modules to the model. Third, we present a novel self-speculative decoding solution where we exit at early layers and verify and correct with the remaining layers of the model. Our proposed self-speculative decoding approach has a smaller memory footprint than other speculative decoding approaches and benefits from shared compute and activations of the draft and verification stages. We run experiments on different Llama model sizes with different types of training: pretraining from scratch, continual pretraining, finetuning on a specific data domain, and finetuning on a specific task. We implement our inference solution and show speedups of up to 2.16x on summarization for CNN/DM documents, 1.82x on coding, and 2.0x on the TOPv2 semantic parsing task. We open source our code at https://github.com/facebookresearch/LayerSkip.
Co-authors
- Xinyu Yang 2
- Bilge Acun 1
- Saurabh Agarwal 1
- Ahmed Aly 1
- Arian Azarang 1
- Jianwen Chen 1
- Mihai Christodorescu 1
- Mostafa Elhoushi 1
- Basil Hosmer 1
- Liangzhen Lai 1
- Yueh Z Lee 1
- Gang Li 1
- Yun Li 1
- Lei Li 1
- Diana Liskovich 1
- Anas Mahmoud 1
- Ravi Mangal 1
- Corina S. Pasareanu 1
- Yibo Peng 1
- Ahmed Roman 1
- Akshat Shrivastava 1
- James Song 1
- Bram Wasti 1
- Carole-Jean Wu 1
- Peng Xia 1
- Huaxiu Yao 1
- Haizhong Zheng 1
- Hongtu Zhu 1
Venues
- ACL 3