Siyu Wu
2025
Pre3: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation
Junyi Chen | Shihao Bai | Zaijun Wang | Siyu Wu | Chuheng Du | Hailong Yang | Ruihao Gong | Shengzhong Liu | Fan Wu | Guihai Chen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Extensive LLM applications demand efficient structured generation, particularly for LR(1) grammars, to produce outputs in specified formats (e.g., JSON). Existing methods primarily parse LR(1) grammars into a pushdown automaton (PDA), leading to runtime execution overhead for context-dependent token processing, which is especially inefficient under large inference batches. To address these issues, we propose Pre3, which exploits deterministic pushdown automata (DPDA) to optimize constrained LLM decoding efficiency. First, by **pre**computing **pre**fix-conditioned edges during **pre**processing, Pre3 enables ahead-of-time edge analysis and thus makes parallel transition processing possible. Further, leveraging the prefix-conditioned edges, Pre3 introduces a novel approach that transforms LR(1) transition graphs into a DPDA, eliminating the need for runtime path exploration and achieving edge transitions with minimal overhead. Pre3 can be seamlessly integrated into standard LLM inference frameworks, improving time per output token (TPOT) by up to 40% and throughput by up to 36% in our experiments. Our code is available at https://github.com/ModelTC/lightllm.
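To make the determinism argument concrete, here is a minimal illustrative sketch (not the Pre3 implementation; `DPDA_TABLE`, `dpda_step`, and `allowed_tokens` are all hypothetical names) of why a DPDA helps constrained decoding: because each (state, stack top, token) triple has at most one successor, computing the per-step token mask reduces to one table lookup per vocabulary item, with no runtime path exploration.

```python
# Hypothetical sketch: DPDA-style constrained decoding with a precomputed
# transition table. This is an illustration of the idea, not Pre3's code.
from typing import Dict, List, Tuple

State = int
StackSym = str
Token = str

# Tiny hand-written table for a JSON-like fragment: entries map
# (state, stack_top, token) -> (next_state, symbols_to_push).
DPDA_TABLE: Dict[Tuple[State, StackSym, Token], Tuple[State, List[StackSym]]] = {
    (0, "$", "{"): (1, ["$", "OBJ"]),
    (1, "OBJ", "}"): (2, []),
}

def dpda_step(state: State, stack: List[StackSym], token: Token):
    """Advance the automaton by one token: a single O(1) lookup, since
    determinism guarantees at most one applicable transition."""
    key = (state, stack[-1], token)
    if key not in DPDA_TABLE:
        return None  # token rejected by the grammar
    next_state, push = DPDA_TABLE[key]
    return next_state, stack[:-1] + push

def allowed_tokens(state: State, stack: List[StackSym], vocab: List[Token]):
    """Token mask for one decoding step: one lookup per vocabulary item,
    which is what makes masking cheap even under large batches."""
    return [t for t in vocab if (state, stack[-1], t) in DPDA_TABLE]

if __name__ == "__main__":
    state, stack = 0, ["$"]
    print(allowed_tokens(state, stack, ["{", "}", "x"]))  # -> ['{']
    state, stack = dpda_step(state, stack, "{")
    print(allowed_tokens(state, stack, ["{", "}", "x"]))  # -> ['}']
```

A nondeterministic PDA would instead have to explore competing transition paths at decode time; precomputing the deterministic table moves that cost to preprocessing, matching the abstract's description.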
CIKT: A Collaborative and Iterative Knowledge Tracing Framework with Large Language Models
Runze Li | Siyu Wu | Jun Wang | Wei Zhang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Knowledge Tracing (KT) aims to model a student’s learning state over time and predict their future performance. However, traditional KT methods often face challenges in explainability, scalability, and effective modeling of complex knowledge dependencies. While Large Language Models (LLMs) present new avenues for KT, their direct application often struggles with generating structured, explainable student representations and lacks mechanisms for continuous, task-specific refinement. To address these gaps, we propose Collaborative Iterative Knowledge Tracing (CIKT), a framework that harnesses LLMs to enhance both prediction accuracy and explainability. CIKT employs a dual-component architecture: an Analyst generates dynamic, explainable user profiles from student historical responses, and a Predictor utilizes these profiles to forecast future performance. The core of CIKT is a synergistic optimization loop. In this loop, the Analyst is iteratively refined based on the predictive accuracy of the Predictor, which conditions on the generated profiles, and the Predictor is subsequently retrained using these enhanced profiles. Evaluated on multiple educational datasets, CIKT demonstrates significant improvements in prediction accuracy, offers enhanced explainability through its dynamically updated user profiles, and exhibits improved scalability. Our work presents a robust and explainable solution for advancing knowledge tracing systems, effectively bridging the gap between predictive performance and model transparency.
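The synergistic optimization loop can be sketched as alternating updates between the two components. Below is a minimal runnable illustration, assuming hypothetical `Analyst` and `Predictor` stand-ins (none of these class or method names come from the CIKT codebase): the Analyst produces profiles, the Predictor retrains on them, and the Predictor's per-profile accuracy feeds back to refine the Analyst.

```python
# Hypothetical sketch of CIKT's alternating Analyst/Predictor loop.
class Analyst:
    """Stand-in for the LLM that writes an explainable profile
    from a student's response history."""
    def __init__(self):
        self.quality = 0.5  # proxy for profile quality after refinement
    def generate_profile(self, history):
        return {"summary": f"answered {sum(history)}/{len(history)} correctly",
                "quality": self.quality}
    def refine(self, rewards):
        # Stand-in for refinement driven by the Predictor's accuracy signal.
        self.quality = min(1.0, self.quality + 0.1 * sum(rewards) / len(rewards))

class Predictor:
    """Stand-in for the model that forecasts future performance
    conditioned on the generated profiles."""
    def train(self, profiles, labels):
        self.bias = sum(labels) / len(labels)  # trivial majority-rate model
    def accuracy_per_profile(self, profiles, labels):
        return [p["quality"] * ((self.bias >= 0.5) == bool(y))
                for p, y in zip(profiles, labels)]

def cikt_loop(histories, labels, rounds=3):
    analyst, predictor = Analyst(), Predictor()
    for _ in range(rounds):
        profiles = [analyst.generate_profile(h) for h in histories]  # Analyst step
        predictor.train(profiles, labels)                            # Predictor retraining
        rewards = predictor.accuracy_per_profile(profiles, labels)   # feedback signal
        analyst.refine(rewards)                                      # Analyst refinement
    return analyst.quality

if __name__ == "__main__":
    histories = [[1, 0, 1], [0, 0, 1], [1, 1, 1]]
    labels = [1, 0, 1]
    print(f"analyst quality after refinement: {cikt_loop(histories, labels):.2f}")
```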