@inproceedings{koo-etal-2025-hawk,
    title = "{HAWK}: Highlighting Entity-aware Knowledge for Alleviating Information Sparsity in Long Contexts",
    author = "Koo, Seonmin and
      Kim, Jinsung and
      Park, Chanjun and
      Lim, Heuiseok",
    editor = "Christodoulopoulos, Christos and
      Chakraborty, Tanmoy and
      Rose, Carolyn and
      Peng, Violet",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2025",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-emnlp.708/",
    doi = "10.18653/v1/2025.findings-emnlp.708",
    pages = "13161--13184",
    isbn = "979-8-89176-335-7",
    abstract = "As the textual data given as the context of various tasks lengthens, having necessary information scattered throughout makes it more difficult for large language models (LLMs) to capture relevant details. This challenge is particularly prominent in tasks such as question answering (QA), where key information is often not evenly distributed within the context. This problem of information sparsity has led to the attempts of various approaches, such as direct context adjustment and retrieval-based methods. However, these approaches typically leverage compressed contexts, which increases the risk that key information may be contained in the dropped portions. Therefore, research from the perspective of addressing the information sparsity while not losing key details in contexts is required. To address this issue, we propose Highlighting entity-AWare Knowledge (HAWK) framework. HAWK consists of three main steps: i) entity extraction, ii) entity-aware subcontext selection, and iii) triplet construction. The core mechanism of HAWK is to highlight key information in a context and structuralize it in an entity-aware manner, facilitating knowledge-enhanced generation. Through extensive experiments and comprehensive analysis, HAWK confirms significant improvements in QA tasks with long contexts, achieving up to a 27.6-point F1 score increase and at least an average win rate of 76.75{\%} over existing methods."
}
Markdown (Informal)
[HAWK: Highlighting Entity-aware Knowledge for Alleviating Information Sparsity in Long Contexts](https://aclanthology.org/2025.findings-emnlp.708/) (Koo et al., Findings 2025)
ACL