Abstract
Recent progress in pretrained Transformer-based language models has shown great success in learning contextual representations of text. However, due to the quadratic complexity of self-attention, most pretrained Transformer models can only handle relatively short text. Modeling very long documents remains a challenge. In this work, we propose to use a graph attention network on top of an available pretrained Transformer model to learn document embeddings. This graph attention network allows us to leverage the high-level semantic structure of the document. In addition, based on our graph document model, we design a simple contrastive learning strategy to pretrain our models on a large unlabeled corpus. Empirically, we demonstrate the effectiveness of our approaches in document classification and document retrieval tasks.
- Anthology ID:
- 2021.findings-emnlp.327
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2021
- Month:
- November
- Year:
- 2021
- Address:
- Punta Cana, Dominican Republic
- Editors:
- Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue:
- Findings
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3874–3884
- URL:
- https://aclanthology.org/2021.findings-emnlp.327
- DOI:
- 10.18653/v1/2021.findings-emnlp.327
- Cite (ACL):
- Peng Xu, Xinchi Chen, Xiaofei Ma, Zhiheng Huang, and Bing Xiang. 2021. Contrastive Document Representation Learning with Graph Attention Networks. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 3874–3884, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Contrastive Document Representation Learning with Graph Attention Networks (Xu et al., Findings 2021)
- PDF:
- https://preview.aclanthology.org/ingest-2024-clasp/2021.findings-emnlp.327.pdf
- Data
- IMDb Movie Reviews, MS MARCO, OpenWebText, WebText