2023
pdf
abs
FinePrompt: Unveiling the Role of Finetuned Inductive Bias on Compositional Reasoning in GPT-4
Jeonghwan Kim
|
Giwon Hong
|
Sung-Hyon Myaeng
|
Joyce Whang
Findings of the Association for Computational Linguistics: EMNLP 2023
Compositional reasoning across texts has been a long-standing challenge in natural language processing. With large language models like GPT-4 taking over the field, prompting techniques such as chain-of-thought (CoT) were proposed to unlock compositional, multi-step reasoning capabilities of LLMs. Despite their success, the prompts demand significant human effort to discover and validate them. Our work draws attention to the idea of transferring task-specific inductive biases from finetuned models to prompts, as a way of improving GPT-4’s compositional reasoning capabilities. To leverage these inductive biases, we formulate prompt templates to ease the transfer of inductive biases. The experimental results on multi-hop question answering and numerical reasoning over text show that our proposed prompt scheme shows competitive zero-shot and few-shot performances compared to existing prompts on complicated reasoning tasks, highlighting the importance of adopting the validated biases of the previous paradigm.
2022
pdf
abs
Exploiting Numerical-Contextual Knowledge to Improve Numerical Reasoning in Question Answering
Jeonghwan Kim
|
Junmo Kang
|
Kyung-min Kim
|
Giwon Hong
|
Sung-Hyon Myaeng
Findings of the Association for Computational Linguistics: NAACL 2022
Numerical reasoning over text is a challenging subtask in question answering (QA) that requires both the understanding of texts and numbers. However, existing language models in these numerical reasoning QA models tend to overly rely on the pre-existing parametric knowledge at inference time, which commonly causes hallucination in interpreting numbers. Our work proposes a novel attention masked reasoning model, the NC-BERT, that learns to leverage the number-related contextual knowledge to alleviate the over-reliance on parametric knowledge and enhance the numerical reasoning capabilities of the QA model. The empirical results suggest that understanding of numbers in their context by reducing the parametric knowledge influence, and refining numerical information in the number embeddings lead to improved numerical reasoning accuracy and performance in DROP, a numerical QA dataset.
pdf
abs
Graph-Induced Transformers for Efficient Multi-Hop Question Answering
Giwon Hong
|
Jeonghwan Kim
|
Junmo Kang
|
Sung-Hyon Myaeng
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
A graph is a suitable data structure to represent the structural information of text. Recently, multi-hop question answering (MHQA) tasks, which require inter-paragraph/sentence linkages, have come to exploit such properties of a graph. Previous approaches to MHQA relied on leveraging the graph information along with the pre-trained language model (PLM) encoders. However, this trend exhibits the following drawbacks: (i) sample inefficiency while training in a low-resource setting; (ii) lack of reusability due to changes in the model structure or input. Our work proposes the Graph-Induced Transformer (GIT) that applies graph-derived attention patterns directly into a PLM, without the need to employ external graph modules. GIT can leverage the useful inductive bias of graphs while retaining the unperturbed Transformer structure and parameters. Our experiments on HotpotQA successfully demonstrate both the sample efficient characteristic of GIT and its capacity to replace the graph modules while preserving model performance.
2021
pdf
abs
Can You Distinguish Truthful from Fake Reviews? User Analysis and Assistance Tool for Fake Review Detection
Jeonghwan Kim
|
Junmo Kang
|
Suwon Shin
|
Sung-Hyon Myaeng
Proceedings of the First Workshop on Bridging Human–Computer Interaction and Natural Language Processing
Customer reviews are useful in providing an indirect, secondhand experience of a product. People often use reviews written by other customers as a guideline prior to purchasing a product. Such behavior signifies the authenticity of reviews in e-commerce platforms. However, fake reviews are increasingly becoming a hassle for both consumers and product owners. To address this issue, we propose You Only Need Gold (YONG), an essential information mining tool for detecting fake reviews and augmenting user discretion. Our experimental results show the poor human performance on fake review detection, substantially improved user capability given our tool, and the ultimate need for user reliance on the tool.
pdf
abs
Leveraging Order-Free Tag Relations for Context-Aware Recommendation
Junmo Kang
|
Jeonghwan Kim
|
Suwon Shin
|
Sung-Hyon Myaeng
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Tag recommendation relies on either a ranking function for top-k tags or an autoregressive generation method. However, the previous methods neglect one of two seemingly conflicting yet desirable characteristics of a tag set: orderlessness and inter-dependency. While the ranking approach fails to address the inter-dependency among tags when they are ranked, the autoregressive approach fails to take orderlessness into account because it is designed to utilize sequential relations among tokens. We propose a sequence-oblivious generation method for tag recommendation, in which the next tag to be generated is independent of the order of the generated tags and the order of the ground truth tags occurring in training data. Empirical results on two different domains, Instagram and Stack Overflow, show that our method is significantly superior to the previous approaches.
pdf
abs
Have You Seen That Number? Investigating Extrapolation in Question Answering Models
Jeonghwan Kim
|
Giwon Hong
|
Kyung-min Kim
|
Junmo Kang
|
Sung-Hyon Myaeng
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Numerical reasoning in machine reading comprehension (MRC) has shown drastic improvements over the past few years. While the previous models for numerical MRC are able to interpolate the learned numerical reasoning capabilities, it is not clear whether they can perform just as well on numbers unseen in the training dataset. Our work rigorously tests state-of-the-art models on DROP, a numerical MRC dataset, to see if they can handle passages that contain out-of-range numbers. One of the key findings is that the models fail to extrapolate to unseen numbers. Presenting numbers as digit-by-digit input to the model, we also propose the E-digit number form that alleviates the lack of extrapolation in models and reveals the need to treat numbers differently from regular words in the text. Our work provides a valuable insight into the numerical MRC models and the way to represent number forms in MRC.