Relations of Linguistic Features and Medical Text Preferences are Nontrivial

Davis Bartels, Brandon Colelough, Dina Demner-Fushman


Abstract
We study how simple linguistic features relate to reader preferences in medical question answering. Our dataset contains answers to medical questions ranked in order of quality. We examine eight interpretable features of the answer text, including length in words, average words per sentence, percentage of polysyllabic words, medical named entity density, perplexity, coherence, and dependency distance. We find substantial variation across annotators in both the strength and direction of these relationships. Answer length shows some of the strongest associations and predictive signals, but preferences are not consistent across annotators: some favor longer answers and others favor shorter ones. A leave-one-out ablation study shows the relative impact of each feature on the predictive accuracy of our models. Overall, these results suggest that linguistic form can influence reader preference in medical text, but that these effects vary across readers and may be more complex than simple linear correlations.
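As a rough illustration of the surface features named in the abstract, the sketch below computes length in words, average words per sentence, and percentage of polysyllabic words for a single answer. It is not the authors' implementation: the function names, the regex-based tokenization, the vowel-group syllable heuristic, and the three-syllable threshold for "polysyllabic" are all assumptions made for this example.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of consecutive vowels (assumed heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def surface_features(text: str) -> dict:
    """Compute three simple surface features for one answer text.

    Assumptions: sentences end at '.', '!' or '?'; words are runs of
    letters or apostrophes; a word is polysyllabic at three or more
    approximate syllables.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    n_words = len(words)
    n_poly = sum(1 for w in words if count_syllables(w) >= 3)
    return {
        "length_words": n_words,
        "avg_words_per_sentence": n_words / len(sentences) if sentences else 0.0,
        "pct_polysyllabic": 100.0 * n_poly / n_words if n_words else 0.0,
    }


if __name__ == "__main__":
    print(surface_features("Take ibuprofen daily. Rest well."))
```

The remaining features listed in the abstract (medical named entity density, perplexity, coherence, dependency distance) would need a parser, an entity recognizer, or a language model, so they are left out of this sketch.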
Anthology ID:
2026.bionlp-1.87
Volume:
BioNLP 2026
Month:
July
Year:
2026
Address:
San Diego, California
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Kirk Roberts, Junichi Tsujii
Venues:
BioNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
1080–1088
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.87/
Cite (ACL):
Davis Bartels, Brandon Colelough, and Dina Demner-Fushman. 2026. Relations of Linguistic Features and Medical Text Preferences are Nontrivial. In BioNLP 2026, pages 1080–1088, San Diego, California. Association for Computational Linguistics.
Cite (Informal):
Relations of Linguistic Features and Medical Text Preferences are Nontrivial (Bartels et al., BioNLP 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.87.pdf