Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts

Tatiana Litvinova, Pavel Seredin, Olga Litvinova, Olga Zagorovskaya


Abstract
Differences in the frequencies of some parts of speech (POS), particularly function words, and in lexical diversity between male and female speech have been pointed out in a number of papers. Classifiers that use exclusively context-independent parameters have proved to be highly effective. However, some issues remain to be addressed: most studies have been performed for English, and the genre and topic of the texts are sometimes neglected. The aim of this paper is to investigate the association between context-independent parameters of Russian written texts and the gender of their authors, and to design predictive regression models. A number of correlations were found. The data obtained are in good agreement with the results reported for other languages. A model based on the 5 parameters with the highest correlation coefficients was designed.
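As a rough illustration of the kind of context-independent parameters the abstract refers to, the sketch below computes a type-token ratio and the share of tokens that belong to a small word list. It is not the authors' procedure: the regex tokenizer, the function names, and the `FUNCTION_WORDS` set are assumptions made for this example, and the paper's POS frequencies would need a proper Russian tagger.

```python
import re

# Hypothetical, deliberately tiny list of Russian function words (an assumption,
# not the inventory used in the paper).
FUNCTION_WORDS = {"и", "в", "не", "что", "на"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens.

    In Python 3, \\w matches Unicode word characters, so Cyrillic letters are
    covered (digits and underscores are matched as well).
    """
    return re.findall(r"\w+", text.lower())


def type_token_ratio(text: str) -> float:
    """Number of distinct word forms divided by the number of tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def function_word_share(text: str, words: set[str] = FUNCTION_WORDS) -> float:
    """Fraction of tokens that appear in the given word list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in words) / len(tokens)


sample = "я думаю что он думает что я прав"
print(type_token_ratio(sample))     # 6 distinct forms / 8 tokens = 0.75
print(function_word_share(sample))  # "что" occurs twice in 8 tokens = 0.25
```

The type-token ratio depends on text length, so comparisons of this kind are usually made on texts of similar size or on fixed-length samples.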
Anthology ID:
W17-4909
Volume:
Proceedings of the Workshop on Stylistic Variation
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Julian Brooke, Thamar Solorio, Moshe Koppel
Venue:
Style-Var
Publisher:
Association for Computational Linguistics
Pages:
69–73
URL:
https://aclanthology.org/W17-4909
DOI:
10.18653/v1/W17-4909
Cite (ACL):
Tatiana Litvinova, Pavel Seredin, Olga Litvinova, and Olga Zagorovskaya. 2017. Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts. In Proceedings of the Workshop on Stylistic Variation, pages 69–73, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts (Litvinova et al., Style-Var 2017)
PDF:
https://preview.aclanthology.org/teach-a-man-to-fish/W17-4909.pdf