Weiyao Luo
2024
Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel Tokens
Weiyao Luo | Suncong Zheng | Heming Xia | Weikang Wang | Yan Lei | Tianyu Liu | Shuang Chen | Zhifang Sui
Findings of the Association for Computational Linguistics: EMNLP 2024
Large language models (LLMs) have shown promising efficacy across various tasks, becoming powerful tools in numerous aspects of human life. However, Transformer-based LLMs suffer performance degradation when modeling long-term contexts because they discard some information to reduce computational overhead. In this work, we propose a simple yet effective method to enable LLMs to take a deep breath, encouraging them to summarize information contained within discrete text chunks. Specifically, we segment the text into multiple chunks and insert a special token <SR> at the end of each chunk. We then modify the attention mask to integrate the chunk's information into the corresponding <SR> token. This enables LLMs to interpret information not only from individual historical tokens but also from the <SR> token, which aggregates the chunk's semantic information. Experiments on language modeling and out-of-domain downstream tasks validate the superiority of our approach.
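A minimal sketch, not the paper's implementation, of how chunked sentinel insertion and a modified attention mask might look. The sentinel id SR_ID, the fixed chunk size, and the choice to restrict each <SR> token to its own chunk are assumptions for illustration only:

```python
import torch

SR_ID = 32001  # hypothetical vocabulary id for the <SR> sentinel token

def insert_sentinels(token_ids, chunk_size):
    """Split token_ids into fixed-size chunks and append <SR> after each chunk."""
    out = []
    for i in range(0, len(token_ids), chunk_size):
        out.extend(token_ids[i:i + chunk_size])
        out.append(SR_ID)
    return out

def build_attention_mask(token_ids):
    """Causal mask in which each <SR> token attends only to its own chunk,
    so the chunk's information is aggregated into that sentinel position;
    later tokens can still attend to earlier <SR> tokens as usual."""
    n = len(token_ids)
    mask = torch.tril(torch.ones(n, n, dtype=torch.bool))  # standard causal mask
    chunk_start = 0
    for pos, tok in enumerate(token_ids):
        if tok == SR_ID:
            mask[pos, :chunk_start] = False  # cut the sentinel off from earlier chunks
            chunk_start = pos + 1
    return mask

# usage: ids = insert_sentinels(list_of_token_ids, chunk_size=128)
#        attn_mask = build_attention_mask(ids)
```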
FaGANet: An Evidence-Based Fact-Checking Model with Integrated Encoder Leveraging Contextual Information
Weiyao Luo | Junfeng Ran | Zailong Tian | Sujian Li | Zhifang Sui
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
In the face of the rapidly growing spread of false and misleading information in the real world, manual evidence-based fact-checking becomes increasingly challenging and time-consuming. To tackle this issue, we propose FaGANet, an automated and accurate fact-checking model that leverages the power of sentence-level attention and a graph attention network to enhance performance. The model integrates encoder-only models with a graph attention network, effectively fusing claim and evidence information for accurate identification of even well-disguised misinformation. Experimental results showcase the significant improvement in accuracy achieved by FaGANet, as well as its state-of-the-art performance on the evidence-based fact-checking task. We release our code and data at https://github.com/WeiyaoLuo/FaGANet.
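A minimal sketch, not the released FaGANet code, of the general idea of fusing encoder outputs with graph attention over claim and evidence sentences. The encoder checkpoint, the fully connected graph construction, the hidden size, and the three-way label head are assumptions for illustration only:

```python
import torch
from transformers import AutoTokenizer, AutoModel
from torch_geometric.nn import GATConv

# Hypothetical setup: encode the claim and each evidence sentence separately,
# then fuse them with a graph attention layer over a fully connected graph.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def encode(sentences):
    """Return the [CLS] embedding for each sentence."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        return encoder(**batch).last_hidden_state[:, 0]  # (num_sentences, hidden)

claim = "The earth orbits the sun."
evidence = ["Heliocentrism places the sun at the center.",
            "The earth completes one orbit per year."]
nodes = encode([claim] + evidence)  # node features: claim node first, then evidence

# Fully connected, bidirectional edges between all claim/evidence nodes.
n = nodes.size(0)
edge_index = torch.tensor([[i, j] for i in range(n) for j in range(n) if i != j]).t()

gat = GATConv(nodes.size(1), 128, heads=4, concat=False)
fused = gat(nodes, edge_index)              # (n, 128) evidence-aware representations
logits = torch.nn.Linear(128, 3)(fused[0])  # classify the claim node (3 labels assumed)
```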