Abstract
Personality profiling is the task of detecting personality traits of authors based on their writing style. Several personality typologies exist; however, the Myers-Briggs Type Indicator (MBTI) is particularly popular in the non-scientific community, and many people use it to analyse their own personality and talk about the results online. Therefore, large amounts of self-assessed data on MBTI are readily available on social-media platforms such as Twitter. We present a novel corpus of tweets annotated with the MBTI personality type and gender of their author for six Western European languages (Dutch, German, French, Italian, Portuguese and Spanish). We outline the corpus creation and annotation, show statistics of the obtained data distributions and present first baselines on Myers-Briggs personality profiling and gender prediction for all six languages.
- Anthology ID:
- L16-1258
- Volume:
- Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
- Month:
- May
- Year:
- 2016
- Address:
- Portorož, Slovenia
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 1632–1637
- URL:
- https://aclanthology.org/L16-1258
- Cite (ACL):
- Ben Verhoeven, Walter Daelemans, and Barbara Plank. 2016. TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1632–1637, Portorož, Slovenia. European Language Resources Association (ELRA).
- Cite (Informal):
- TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling (Verhoeven et al., LREC 2016)
- PDF:
- https://preview.aclanthology.org/landing_page/L16-1258.pdf