Joohyung Lee
2026
LLMs as ASP Programmers: Self-Correction Enables Task-Agnostic Nonmonotonic Reasoning
Adam Ishay | Joohyung Lee
Findings of the Association for Computational Linguistics: ACL 2026
Recent large language models (LLMs) have achieved impressive reasoning milestones but continue to struggle with high computational costs, logical inconsistencies, and sharp performance degradation on high-complexity problems. While neuro-symbolic methods attempt to mitigate these issues by coupling LLMs with symbolic reasoners, existing approaches typically rely on monotonic logics (e.g., SMT) that cannot represent defeasible reasoning, an essential component of human cognition. We present “LLM+ASP,” a framework that translates natural language into Answer Set Programming (ASP), a nonmonotonic formalism based on stable model semantics. Unlike prior LLM+ASP approaches that require manually authored knowledge modules, domain-specific prompts, or evaluation restricted to single problem classes, our framework operates without any per-task engineering and applies uniformly across diverse reasoning tasks. Our system uses an automated self-correction loop in which structured feedback from the ASP solver enables iterative refinement. Evaluating across six diverse benchmarks, we demonstrate that: (1) stable model semantics allow LLMs to naturally express default rules and exceptions, outperforming SMT-based alternatives by significant margins on nonmonotonic tasks; (2) iterative self-correction is the primary driver of performance, effectively replacing the need for handcrafted domain knowledge; (3) compact in-context reference guides substantially outperform verbose documentation, revealing a “context rot” phenomenon where excessive context hinders constraint adherence.
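To illustrate what "default rules and exceptions" under stable model semantics look like, here is a minimal Python sketch that enumerates the stable models of a tiny ground program by brute force. It is not the paper's system or a real ASP solver: the rule encoding, the function names `least_model` and `stable_models`, and the bird/penguin example are assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical encoding: a ground rule is (head, positive_body, negative_body).
# Example: ("flies", ("bird",), ("abnormal",)) stands for
#   flies :- bird, not abnormal.

def least_model(definite_rules):
    """Least model of a negation-free program, by forward chaining."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and all(a in model for a in pos):
                model.add(head)
                changed = True
    return model

def stable_models(rules):
    """Brute-force enumeration: test every candidate set of atoms."""
    atoms = sorted({r[0] for r in rules} | {a for r in rules for a in r[1] + r[2]})
    found = []
    for size in range(len(atoms) + 1):
        for candidate in combinations(atoms, size):
            cand = set(candidate)
            # Reduct: drop rules whose "not" part is contradicted by cand,
            # then drop the remaining "not" literals.
            reduct = [(h, pos) for h, pos, neg in rules
                      if not any(a in cand for a in neg)]
            # cand is stable if it is exactly the least model of its reduct.
            if least_model(reduct) == cand:
                found.append(cand)
    return found

base = [
    ("bird", (), ()),                    # bird.
    ("flies", ("bird",), ("abnormal",)),  # default: birds fly
    ("abnormal", ("penguin",), ()),       # exception: penguins are abnormal
]
without_penguin = stable_models(base)
with_penguin = stable_models(base + [("penguin", (), ())])
```

In this sketch `without_penguin` contains a single model in which `flies` holds, while `with_penguin` contains a single model in which it does not: adding a fact retracts an earlier conclusion, which a monotonic logic cannot do.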
2025
Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs
Anshumann | Mohd Abbas Zaidi | Akhil Kedia | Jinwoo Ahn | Taehwak Kwon | Kangwook Lee | Haejun Lee | Joohyung Lee
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Knowledge distillation can be a cost-effective technique to distill knowledge in Large Language Models, if the teacher output logits can be pre-computed and cached. However, successfully applying this to pre-training remains largely unexplored. In this work, we prove that naive approaches for sparse knowledge distillation such as caching Top-K probabilities, while intuitive, provide biased estimates of the teacher probability distribution to the student, resulting in suboptimal performance and calibration. We propose an importance-sampling-based method, ‘Random Sampling Knowledge Distillation’, which provides unbiased estimates, preserves the gradient in expectation, and requires storing significantly sparser logits. Our method enables faster training of student models with marginal overhead (<10%) compared to cross-entropy-based training, while maintaining competitive performance compared to full distillation, across a range of model sizes from 300M to 3B.
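The bias claim can be checked on a toy distribution. The sketch below is an assumption-laden illustration, not the paper's implementation: it compares renormalized Top-K targets against sparse targets built by sampling token ids from the teacher distribution itself (one simple case of importance sampling, where the proposal equals the teacher). The five-token `teacher` distribution and all function names are made up for the example.

```python
import random
from collections import Counter

def top_k_estimate(probs, k):
    """Keep the k largest probabilities and renormalize them to sum to 1."""
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return [probs[i] / mass if i in top else 0.0 for i in range(len(probs))]

def sampled_estimate(probs, n_samples, rng):
    """Draw n_samples token ids from probs; use count / n_samples as targets."""
    draws = rng.choices(range(len(probs)), weights=probs, k=n_samples)
    counts = Counter(draws)
    return [counts[i] / n_samples for i in range(len(probs))]

def mean_estimate(probs, n_samples, trials, seed=0):
    """Average many independent sparse estimates to approximate their expectation."""
    rng = random.Random(seed)
    totals = [0.0] * len(probs)
    for _ in range(trials):
        est = sampled_estimate(probs, n_samples, rng)
        totals = [t + e for t, e in zip(totals, est)]
    return [t / trials for t in totals]

# Hypothetical teacher distribution over a 5-token vocabulary.
teacher = [0.5, 0.25, 0.15, 0.07, 0.03]

# Top-2 caching: tail tokens get zero mass, head tokens are inflated.
top2 = top_k_estimate(teacher, 2)

# Sampling 2 tokens per position: each estimate is equally sparse,
# but the average over many draws approaches the teacher distribution.
avg = mean_estimate(teacher, n_samples=2, trials=20000)
```

Both schemes store at most two entries per position here, yet `top2` never assigns probability to the three tail tokens, whereas `avg` stays close to `teacher` on every token.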
2023
Coupling Large Language Models with Logic Programming for Robust and General Reasoning from Text
Zhun Yang | Adam Ishay | Joohyung Lee
Findings of the Association for Computational Linguistics: ACL 2023
While large language models (LLMs), such as GPT-3, appear to be robust and general, their reasoning ability is not at a level to compete with the best models trained for specific natural language reasoning problems. In this study, we observe that a large language model can serve as a highly effective few-shot semantic parser. It can convert natural language sentences into a logical form that serves as input for answer set programs, a logic-based declarative knowledge representation formalism. The combination results in a robust and general system that can handle multiple question-answering tasks without requiring retraining for each new task. It only needs a few examples to guide the LLM’s adaptation to a specific task, along with reusable ASP knowledge modules that can be applied to multiple tasks. We demonstrate that this method achieves state-of-the-art performance on several NLP benchmarks, including bAbI, StepGame, CLUTRR, and gSCAN. Additionally, it successfully tackles robot planning tasks that an LLM alone fails to solve.
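The pipeline described above (few-shot parsing into atoms, then reuse of an ASP knowledge module) might be wired together roughly as follows. This is a hedged sketch only: the prompt layout, the function names `build_prompt` and `assemble_program`, the example sentences, and the transitivity rule in `SPATIAL_MODULE` are all hypothetical and are not taken from the paper. No LLM or solver is called.

```python
def build_prompt(examples, sentence):
    """Format (sentence, atoms) pairs as few-shot demonstrations,
    followed by the new sentence the LLM should parse."""
    lines = []
    for text, atoms in examples:
        lines.append(f"Sentence: {text}")
        lines.append(f"Atoms: {' '.join(atoms)}")
    lines.append(f"Sentence: {sentence}")
    lines.append("Atoms:")
    return "\n".join(lines)

def assemble_program(parsed_atoms, knowledge_module):
    """Turn parsed atoms into ASP facts and append a reusable rule module."""
    facts = "\n".join(a if a.endswith(".") else a + "." for a in parsed_atoms)
    return facts + "\n" + knowledge_module

# Hypothetical reusable module: "left of" is transitive.
SPATIAL_MODULE = "left(X, Z) :- left(X, Y), left(Y, Z)."

# Hypothetical few-shot demonstrations.
examples = [
    ("A is to the left of B.", ["left(a, b)."]),
    ("C is to the left of D.", ["left(c, d)."]),
]

prompt = build_prompt(examples, "B is to the left of C.")

# Assume the LLM answered the prompt with this atom (stubbed here).
llm_output = ["left(b, c)"]
program = assemble_program(["left(a, b)."] + llm_output, SPATIAL_MODULE)
```

The resulting `program` string is what would be handed to an ASP solver; only the few-shot examples change per task, while a module like `SPATIAL_MODULE` can be shared across tasks.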
2015
Recognizing Social Constructs from Textual Conversation
Somak Aditya | Chitta Baral | Nguyen Ha Vo | Joohyung Lee | Jieping Ye | Zaw Naung | Barry Lumpkin | Jenny Hastings | Richard Scherl | Dawn M. Sweet | Daniela Inclezan
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies