Sociolinguistic Corpus of WhatsApp Chats in Spanish among College Students

Alejandro Dorantes, Gerardo Sierra, Tlauhlia Yamín Donohue Pérez, Gemma Bel-Enguix, Mónica Jasso Rosales


Abstract
This work presents the Sociolinguistic Corpus of WhatsApp Chats in Spanish among College Students, a corpus of raw data for general use. Its purpose is to offer data for the study of language and interactions via Instant Messaging (IM) among undergraduate students. Our paper gives an overview of both the corpus's content and its demographic metadata. Furthermore, it presents the research currently being conducted with the corpus, namely on parenthetical expressions, orality traits, and code-switching. This work also includes a brief outline of similar corpora and recent studies in the field of IM.
Anthology ID:
W18-3501
Volume:
Proceedings of the Sixth International Workshop on Natural Language Processing for Social Media
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Lun-Wei Ku, Cheng-Te Li
Venue:
SocialNLP
Publisher:
Association for Computational Linguistics
Pages:
1–6
URL:
https://aclanthology.org/W18-3501
DOI:
10.18653/v1/W18-3501
Cite (ACL):
Alejandro Dorantes, Gerardo Sierra, Tlauhlia Yamín Donohue Pérez, Gemma Bel-Enguix, and Mónica Jasso Rosales. 2018. Sociolinguistic Corpus of WhatsApp Chats in Spanish among College Students. In Proceedings of the Sixth International Workshop on Natural Language Processing for Social Media, pages 1–6, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Sociolinguistic Corpus of WhatsApp Chats in Spanish among College Students (Dorantes et al., SocialNLP 2018)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/W18-3501.pdf