A Test Collection for Part-of-Speech Tagging and Word Sense Disambiguation

Robert Krovetz


Abstract
We evaluate a focused test collection at the intersection of part-of-speech tagging and word-sense disambiguation. The collection targets words such as train, novel, and lean, where part-of-speech contrasts align with clear meaning differences. We use it to detect regressions across tagger versions, track quantitative and qualitative progress over time, and test robustness to orthographic variation. Experiments with the Stanford and TnT taggers show 68% accuracy, compared with 92% for a recent spaCy transformer model. Earlier taggers erred mainly on noun–verb distinctions; spaCy’s errors more often involve noun–adjective distinctions. Uppercase text roughly doubles error rates for all taggers. We discuss common problems and propose directions for future testing.
Anthology ID:
2026.lrec-main.925
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resource Association
Pages:
11813–11821
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.925/
Cite (ACL):
Robert Krovetz. 2026. A Test Collection for Part-of-Speech Tagging and Word Sense Disambiguation. International Conference on Language Resources and Evaluation, main:11813–11821.
Cite (Informal):
A Test Collection for Part-of-Speech Tagging and Word Sense Disambiguation (Krovetz, LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.925.pdf