Abstract
Language use varies across different demographic factors, such as gender, age, and geographic location. However, most existing document classification methods ignore demographic variability. In this study, we examine empirically how text data can vary across four demographic factors: gender, age, country, and region. We propose a multitask neural model to account for demographic variations via adversarial training. In experiments on four English-language social media datasets, we find that classification performance improves when adapting for user factors.
- Anthology ID:
- S19-1015
- Volume:
- Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota
- Editors:
- Rada Mihalcea, Ekaterina Shutova, Lun-Wei Ku, Kilian Evang, Soujanya Poria
- Venue:
- *SEM
- SIGs:
- SIGSEM | SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 136–146
- URL:
- https://aclanthology.org/S19-1015
- DOI:
- 10.18653/v1/S19-1015
- Cite (ACL):
- Xiaolei Huang and Michael J. Paul. 2019. Neural User Factor Adaptation for Text Classification: Learning to Generalize Across Author Demographics. In Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019), pages 136–146, Minneapolis, Minnesota. Association for Computational Linguistics.
- Cite (Informal):
- Neural User Factor Adaptation for Text Classification: Learning to Generalize Across Author Demographics (Huang & Paul, *SEM 2019)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/S19-1015.pdf
- Code:
- xiaoleihuang/NUFA