Jinyoung Jo
2025
Large Language Models Exhibit Limited Reasoning Ability on Coding Problems
Jinyoung Jo | Jonah Engelmann | Sean Choi
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Claims that large language models (LLMs) have complex reasoning ability have stirred broad interest, and controversy, among academics and non-academics alike. A popular basis for such claims is LLMs’ ability to solve coding problems, which involves understanding a problem statement and producing code that solves it. Although this ability is a remarkable feat, we argue that it stems from memorization rather than reasoning. We first show that LLMs’ problem-solving ability degrades as problems become more recent, regardless of the difficulty assigned by human experts, likely because less training data is available for more recent problems. We further show that an LLM often fails to solve a problem when presented with a reworded but equivalent problem statement, additionally suggesting limited reasoning ability.
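The evaluation this abstract describes can be pictured with a small harness like the sketch below. It is only an illustration, not the paper's actual setup: generate_solution stands in for any LLM call (stubbed here with a canned answer), and the example problem, its paraphrase, and the test cases are invented for the demo.

    # Minimal sketch of an original-vs-reworded evaluation harness.
    # All names and the stubbed model output are illustrative assumptions.

    def generate_solution(problem_statement: str) -> str:
        """Stand-in for an LLM call; returns Python source defining `solve`."""
        # A real harness would query a model here; we return a canned solution.
        return "def solve(xs):\n    return sum(xs)"

    def passes_tests(source: str, tests: list) -> bool:
        """Execute generated code and check it against (args, expected) pairs."""
        namespace = {}
        try:
            exec(source, namespace)  # a real harness would sandbox this
            solve = namespace["solve"]
            return all(solve(*args) == expected for args, expected in tests)
        except Exception:
            return False

    def pass_rate(statements: list, tests_per_problem: list) -> float:
        """Fraction of problems whose generated solution passes all tests."""
        results = [
            passes_tests(generate_solution(s), t)
            for s, t in zip(statements, tests_per_problem)
        ]
        return sum(results) / len(results)

    # Invented example: one problem, stated two equivalent ways, same tests.
    original = ["Given a list of integers, return their sum."]
    reworded = ["Compute the total obtained by adding every integer in a list."]
    tests = [[(([1, 2, 3],), 6), (([],), 0)]]

    print(pass_rate(original, tests), pass_rate(reworded, tests))

Under the paper's hypothesis, a memorizing model's pass rate would drop on the reworded statements even though the underlying task and the test cases are unchanged.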
2023
SIGMORPHON–UniMorph 2023 Shared Task 0, Part 2: Cognitively Plausible Morphophonological Generalization in Korean
Canaan Breiss | Jinyoung Jo
Proceedings of the 20th SIGMORPHON workshop on Computational Research in Phonetics, Phonology, and Morphology
This paper summarises data collection and curation for Part 2 of the 2023 SIGMORPHON-UniMorph Shared Task 0, which focused on modeling speaker knowledge and generalization of a pair of interacting phonological processes in Korean. We briefly describe how modeling the generalization task could be of interest to researchers in both Natural Language Processing and linguistics, and then summarise the traditional description of the phonological processes that are at the center of the modeling challenge. We then describe the criteria we used to select and code cases of process application in two Korean speech corpora, which served as the primary learning data. We also report the technical details of the experiment we carried out, the results of which served as the primary test data.
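As a concrete picture of the coding step described above, the sketch below records, for each corpus token, which process is at issue and whether it applied. The record fields and the CSV layout are assumptions made for illustration, not the shared task's actual schema.

    # Illustrative sketch of coding corpus tokens for process application.
    # Field names and file format are assumptions, not the task's real schema.

    import csv
    from dataclasses import dataclass

    @dataclass
    class CodedToken:
        underlying: str   # underlying (dictionary) form of the token
        surface: str      # transcribed surface form from the speech corpus
        process: str      # which of the two interacting processes is at issue
        applied: bool     # whether the process applied in this token

    def write_learning_data(tokens: list, path: str) -> None:
        """Dump coded tokens to CSV as learning data for the modeling task."""
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["underlying", "surface", "process", "applied"])
            for t in tokens:
                writer.writerow([t.underlying, t.surface, t.process, int(t.applied)])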