Examining Temporality in Document Classification

Xiaolei Huang, Michael J. Paul


Abstract
Many corpora span broad periods of time. Language processing models trained during one time period may not work well in future time periods, and the best model may depend on specific times of year (e.g., people might describe hotels differently in reviews during the winter versus the summer). This study investigates how document classifiers trained on documents from certain time intervals perform on documents from other time intervals, considering both seasonal intervals (intervals that repeat across years, e.g., winter) and non-seasonal intervals (e.g., specific years). We show experimentally that classification performance varies over time, and that performance can be improved by using a standard domain adaptation approach to adjust for changes in time.
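The abstract does not say which domain adaptation approach is used. As a minimal sketch only, the code below assumes one common choice, feature augmentation, in which each time interval is treated as a domain and every feature is duplicated into a shared copy and an interval-specific copy. The names `season_of` and `augment`, the `general::` prefix, and the Northern Hemisphere month-to-season mapping are all hypothetical and are not taken from the paper.

```python
from typing import Dict


def season_of(month: int) -> str:
    """Map a month number (1-12) to a seasonal interval.

    Assumption: Northern Hemisphere meteorological seasons.
    """
    if month not in range(1, 13):
        raise ValueError(f"month must be in 1..12, got {month}")
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "fall"


def augment(features: Dict[str, float], domain: str) -> Dict[str, float]:
    """Duplicate each feature into a shared copy and a domain-specific copy.

    A classifier trained on the augmented features can learn one weight
    that holds across all time intervals and another that applies only
    within the given interval.
    """
    augmented: Dict[str, float] = {}
    for name, value in features.items():
        augmented[f"general::{name}"] = value
        augmented[f"{domain}::{name}"] = value
    return augmented


# Hypothetical bag-of-words counts for a hotel review written in July.
review = {"pool": 1.0, "cold": 2.0}
augmented_review = augment(review, season_of(7))
```

For a non-seasonal interval, the same `augment` call could take a year label such as `"2018"` as the domain instead of a season name.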
Anthology ID: P18-2110
Volume: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month: July
Year: 2018
Address: Melbourne, Australia
Editors: Iryna Gurevych, Yusuke Miyao
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 694–699
URL: https://aclanthology.org/P18-2110
DOI: 10.18653/v1/P18-2110
Cite (ACL): Xiaolei Huang and Michael J. Paul. 2018. Examining Temporality in Document Classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 694–699, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal): Examining Temporality in Document Classification (Huang & Paul, ACL 2018)
PDF: https://preview.aclanthology.org/nschneid-patch-1/P18-2110.pdf
Presentation: P18-2110.Presentation.pdf
Video: https://preview.aclanthology.org/nschneid-patch-1/P18-2110.mp4
Code: xiaoleihuang/Domain_Adaptation_ACL2018