Cross-referencing Using Fine-grained Topic Modeling

Jeffrey Lund, Piper Armstrong, Wilson Fearn, Stephen Cowley, Emily Hales, Kevin Seppi


Abstract
Cross-referencing, which links passages of text to other related passages, can be a valuable study aid for facilitating comprehension of a text. However, cross-referencing requires, first, a comprehensive thematic knowledge of the entire corpus and, second, a focused search through the corpus to find useful connections. As a result, cross-reference resources are prohibitively expensive to produce and exist only for the most well-studied texts (e.g., religious texts). We develop a topic-based system for automatically producing candidate cross-references that can be easily verified by human annotators. Our system uses fine-grained topic modeling with thousands of highly nuanced and specific topics to identify topically related verse pairs. We demonstrate that our system can be cost-effective compared to having annotators acquire the expertise necessary to produce cross-reference resources unaided.
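
To make the idea of "topically related verse pairs" concrete, the following is a minimal sketch, not the authors' implementation. It assumes each verse already has a topic distribution from some topic model, and it ranks verse pairs by cosine similarity; the abstract does not state which model or similarity measure the paper uses, so the names `verse_topics`, `cosine`, and `candidate_pairs`, the toy three-topic vectors, and the choice of cosine similarity are all illustrative assumptions.

```python
import math
from itertools import combinations


def cosine(p, q):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def candidate_pairs(verse_topics, top_k=2):
    """Return the top_k verse pairs ranked by topic similarity.

    verse_topics maps a verse id to its topic distribution (hypothetical
    input; a real system would obtain these from a trained topic model).
    """
    scored = [
        (cosine(verse_topics[a], verse_topics[b]), a, b)
        for a, b in combinations(sorted(verse_topics), 2)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(a, b, score) for score, a, b in scored[:top_k]]


# Toy data: three verses over three topics (made-up numbers).
verse_topics = {
    "verse_1": [0.7, 0.2, 0.1],
    "verse_2": [0.6, 0.3, 0.1],
    "verse_3": [0.1, 0.1, 0.8],
}

top = candidate_pairs(verse_topics, top_k=1)
print(top)
```

In this toy setup the highest-ranked pair would be handed to a human annotator for verification, which mirrors the candidate-then-verify workflow the abstract describes. Scoring every pair is quadratic in the number of verses, so a real corpus would likely need a cheaper candidate-generation step.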
Anthology ID:
N19-1399
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Jill Burstein, Christy Doran, Thamar Solorio
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
3978–3987
URL:
https://aclanthology.org/N19-1399
DOI:
10.18653/v1/N19-1399
Cite (ACL):
Jeffrey Lund, Piper Armstrong, Wilson Fearn, Stephen Cowley, Emily Hales, and Kevin Seppi. 2019. Cross-referencing Using Fine-grained Topic Modeling. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3978–3987, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Cross-referencing Using Fine-grained Topic Modeling (Lund et al., NAACL 2019)
PDF:
https://preview.aclanthology.org/fix-dup-bibkey/N19-1399.pdf