2025
Learning from Hallucinations: Mitigating Hallucinations in LLMs via Internal Representation Intervention
Sora Kadotani | Kosuke Nishida | Kyosuke Nishida
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Large language models (LLMs) sometimes hallucinate facts. Recent studies have shown that the use of a non-factual LLM (anti-expert) has the potential to improve the factuality of a base LLM: anti-expert methods penalize the output probabilities of the base LLM with those of the anti-expert. Such methods are effective in mitigating hallucinations but incur high computational costs because the two LLMs run simultaneously. In this paper, we propose an efficient anti-expert method called the in-model anti-expert. It mitigates hallucinations with a single LLM by intervening in its internal representations to shift them in the direction of improved factuality. Experimental results showed that the proposed method is less costly than the conventional anti-expert method and outperformed existing methods other than the anti-expert method. We confirmed that the proposed method reduced GPU memory overhead from 2.2x to 1.2x and latency overhead from 1.9x to 1.2x.
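To make the cost contrast concrete, here is a minimal PyTorch sketch (not the paper's implementation) of the two schemes: conventional anti-expert decoding, which runs two models at every step, versus a single-model intervention that shifts hidden states along a factuality direction via forward hooks. The LLaMA-style `model.layers` attribute, the `steer` directions, and the `alpha`/`beta` scales are assumptions for illustration.

```python
import torch

@torch.no_grad()
def anti_expert_logits(base_lm, anti_expert_lm, input_ids, alpha=0.5):
    """Conventional anti-expert: run BOTH models each step and penalize the
    base LM's next-token logits with the anti-expert's (roughly 2x cost)."""
    base = base_lm(input_ids).logits[:, -1]
    anti = anti_expert_lm(input_ids).logits[:, -1]
    return base - alpha * anti

def install_intervention(base_lm, steer, beta=4.0):
    """Single-model alternative (sketch): forward hooks shift each layer's
    hidden states along a precomputed per-layer 'factuality' direction."""
    handles = []
    # `base_lm.model.layers` assumes a LLaMA-style module layout.
    for layer, direction in zip(base_lm.model.layers, steer):
        def hook(module, inputs, output, d=direction):
            hidden = output[0] if isinstance(output, tuple) else output
            hidden = hidden + beta * d  # nudge representations toward factual
            return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
        handles.append(layer.register_forward_hook(hook))
    return handles  # call .remove() on each handle to undo the intervention
```

The hooks add only a vector addition per layer, which is why a single forward pass suffices and the memory and latency overheads stay close to 1x.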
2023
Monolingual Phrase Alignment as Parse Forest Mapping
Sora Kadotani | Yuki Arase
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)
We tackle the problem of monolingual phrase alignment conforming to syntactic structures. The existing method formalises the problem as unordered tree mapping; hence, the alignment quality is easily affected by syntactic ambiguities. We address this problem by extending the method to align parse forests rather than 1-best trees, so that syntactic structures and phrase alignments are identified simultaneously. The proposed method achieves efficient alignment by mapping forests on a packed structure. The experimental results indicated that our method improves on the phrase alignment quality of the state-of-the-art method by aligning forests rather than 1-best trees.
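As a toy illustration of the 1-best-tree formulation that the paper generalizes, the sketch below scores an alignment between two constituency trees by recursively matching phrases. The (phrase, children) tree encoding, the `sim` similarity function, and the greedy child matching are all simplifications for illustration; the actual method searches correspondences exactly, and over packed forests rather than single trees.

```python
# Toy sketch of phrase alignment as unordered tree mapping. A tree is a
# (phrase, children) tuple; `sim` is any phrase-similarity function.

def align_score(s_tree, t_tree, sim):
    """Score for mapping the phrase at s_tree onto the phrase at t_tree,
    plus a one-to-one (greedy, for brevity) matching of their subtrees."""
    score = sim(s_tree[0], t_tree[0])
    s_kids, t_kids = list(s_tree[1]), list(t_tree[1])
    while s_kids and t_kids:
        best, a, b = max(((align_score(x, y, sim), x, y)
                          for x in s_kids for y in t_kids),
                         key=lambda t: t[0])
        score += best
        s_kids.remove(a)
        t_kids.remove(b)
    return score
```

Running the same recursion over forest nodes, where each node packs alternative sub-parses of the same span, is what lets the aligner choose the parse and the alignment jointly instead of committing to a possibly wrong 1-best tree.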
2021
Edit Distance Based Curriculum Learning for Paraphrase Generation
Sora Kadotani | Tomoyuki Kajiwara | Yuki Arase | Makoto Onizuka
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop
Curriculum learning has improved the quality of neural machine translation, where only source-side features are considered in the metrics that determine translation difficulty. In this study, we apply curriculum learning to paraphrase generation for the first time. Unlike machine translation, paraphrase generation allows a certain level of semantic discrepancy between source and target, which results in diverse transformations ranging from lexical substitution to reordering of clauses. Hence, estimating the difficulty of a transformation requires considering both source and target contexts. Experiments on formality transfer using GYAFC showed that curriculum learning with edit distance improves the quality of paraphrase generation. Additionally, the proposed method improves the quality of outputs for difficult samples, which previous methods could not achieve.
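The ordering step of such a curriculum is straightforward to sketch: compute an edit distance between each source-target pair, then feed training data from small to large distance (easy rewrites first, heavy reorderings last). The word-level Levenshtein implementation and the toy pairs below are illustrative assumptions; the paper's actual training schedule is not shown.

```python
# Hedged sketch: order paraphrase pairs from easy to hard by the edit
# distance between source and target, the signal used for the curriculum.

def edit_distance(a, b):
    """Word-level Levenshtein distance via a one-row dynamic program."""
    a, b = a.split(), b.split()
    dp = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, wb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (wa != wb))  # substitution
    return dp[-1]

# Illustrative data: sort pairs so training sees small edits first.
pairs = [("gotta go now", "I have to leave now"),
         ("thx a lot!", "Thank you very much.")]
curriculum = sorted(pairs, key=lambda p: edit_distance(*p))
```

Because the distance is computed between source and target, it reflects the size of the transformation itself, not just source-side complexity as in prior machine translation curricula.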