Beyond Text: Characterizing Domain Expert Needs in Document Research

Sireesh Gururaja, Nupoor Gandhi, Jeremiah Milbauer, Emma Strubell


Abstract
Working with documents is a key part of almost any knowledge work, from contextualizing research in a literature review to reviewing legal precedent. Recently, as their capabilities have expanded, primarily text-based NLP systems have often been billed as able to assist or even automate this kind of work. But to what extent are these systems able to model these tasks as experts conceptualize and perform them now? In this study, we interview sixteen domain experts across two domains to understand their processes of document research, and compare those processes to the current state of NLP systems. We find that our participants' processes are idiosyncratic, iterative, and rely extensively on the social context of a document in addition to its content, and that approaches in NLP and adjacent fields that explicitly center the document as an object, rather than as merely a container for text, tend to better reflect our participants' priorities. We call on the NLP community to more carefully consider the role of the document in building useful tools that are accessible, personalizable, iterative, and socially aware.
Anthology ID:
2025.findings-acl.244
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venues:
Findings | WS
Publisher:
Association for Computational Linguistics
Pages:
4732–4745
URL:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.findings-acl.244/
Cite (ACL):
Sireesh Gururaja, Nupoor Gandhi, Jeremiah Milbauer, and Emma Strubell. 2025. Beyond Text: Characterizing Domain Expert Needs in Document Research. In Findings of the Association for Computational Linguistics: ACL 2025, pages 4732–4745, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Beyond Text: Characterizing Domain Expert Needs in Document Research (Gururaja et al., Findings 2025)
PDF:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.findings-acl.244.pdf