Junyi Chen

2025

Extensive LLM applications demand efficient structured generation, particularly for LR(1) grammars, to produce outputs in specified formats (e.g., JSON). Existing methods primarily parse LR(1) grammars into a pushdown automaton (PDA), leading to runtime execution overhead for context-dependent token processing, which is especially inefficient under large inference batches. To address these issues, we propose Pre³, which exploits deterministic pushdown automata (DPDA) to optimize the efficiency of constrained LLM decoding. First, by **pre**computing **pre**fix-conditioned edges during **pre**processing, Pre³ enables ahead-of-time edge analysis and thus makes parallel transition processing possible. Further, leveraging the prefix-conditioned edges, Pre³ introduces a novel approach that transforms LR(1) transition graphs into DPDA, eliminating the need for runtime path exploration and achieving edge transitions with minimal overhead. Pre³ can be seamlessly integrated into standard LLM inference frameworks, improving time per output token (TPOT) by up to 40% and throughput by up to 36% in our experiments. Our code is available at https://github.com/ModelTC/lightllm.

You Only Query Twice: Multimodal Rumor Detection via Evidential Evaluation from Dual Perspectives
Junyi Chen | Leyuan Liu | Tian Lan | Fan Zhou | Xiaosong Zhang
Proceedings of the 31st International Conference on Computational Linguistics

Current rumor detectors exhibit limitations in fully exploiting responses to the source tweet as essential public opinions, and in explaining and indicating the reliability of the results obtained. Additionally, the joint utilization of both responses and the multimodal source content for detection presents challenges due to the heterogeneous nature of the data points. In this work, to address the first challenge, we initially prompt the Large Language Model (LLM) with both multimodal source content and the corresponding response set to extract contrasting evidence to enable maximal utilization of informative responses. To overcome the second challenge, we introduce an uncertainty-aware evidential evaluator to assess the evidence intensity from the multimodal source content and dual-sided reasoning, from which the final prediction is derived. As we model the second-order probability, we can effectively indicate the model’s uncertainty (i.e., the reliability) of the results. The reasoning from the correct perspective also serves as a natural language-based explanation. To this end, the third challenge is also addressed as we fully leverage the available resources. Extensive experiments validate the effectiveness, uncertainty awareness in predictions, helpful explainability for human judgment, and superior efficiency of our approach compared to contemporary works utilizing LLMs.

Co-authors

Siyu Wu 1

Venues

acl 1
coling 1
