Abstract
Term co-occurrence in a sentence or paragraph is a powerful and often overlooked feature for text matching in document retrieval. In our experiments with matching email-style query messages to webpages, such term co-occurrence helped greatly to filter and rank documents, compared to matching document-size bags-of-words. The paper presents the results of the experiments as well as a text-matching model where the query shapes the vector space, a document is modelled by two or three vectors in this vector space, and the query-document similarity score depends on the length of the vectors and the relationships between them.
- Anthology ID:
- C16-1222
- Volume:
- Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
- Month:
- December
- Year:
- 2016
- Address:
- Osaka, Japan
- Editors:
- Yuji Matsumoto, Rashmi Prasad
- Venue:
- COLING
- Publisher:
- The COLING 2016 Organizing Committee
- Pages:
- 2356–2365
- URL:
- https://aclanthology.org/C16-1222
- Cite (ACL):
- Eriks Sneiders. 2016. Text Retrieval by Term Co-occurrences in a Query-based Vector Space. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 2356–2365, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal):
- Text Retrieval by Term Co-occurrences in a Query-based Vector Space (Sneiders, COLING 2016)
- PDF:
- https://preview.aclanthology.org/emnlp-22-attachments/C16-1222.pdf