Nicolas Rey-Villamizar


2016

Semi-supervised CLPsych 2016 Shared Task System Submission
Nicolas Rey-Villamizar | Prasha Shrestha | Thamar Solorio | Farig Sadeque | Steven Bethard | Ted Pedersen
Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology

Overview for the Second Shared Task on Language Identification in Code-Switched Data
Giovanni Molina | Fahad AlGhamdi | Mahmoud Ghoneim | Abdelati Hawwari | Nicolas Rey-Villamizar | Mona Diab | Thamar Solorio
Proceedings of the Second Workshop on Computational Approaches to Code Switching

Analysis of Anxious Word Usage on Online Health Forums
Nicolas Rey-Villamizar | Prasha Shrestha | Farig Sadeque | Steven Bethard | Ted Pedersen | Arjun Mukherjee | Thamar Solorio
Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis

Why Do They Leave: Modeling Participation in Online Depression Forums
Farig Sadeque | Ted Pedersen | Thamar Solorio | Prasha Shrestha | Nicolas Rey-Villamizar | Steven Bethard
Proceedings of the Fourth International Workshop on Natural Language Processing for Social Media

Age and Gender Prediction on Health Forum Data
Prasha Shrestha | Nicolas Rey-Villamizar | Farig Sadeque | Ted Pedersen | Steven Bethard | Thamar Solorio
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Health support forums have become a rich source of data that can be used to improve health care outcomes. A user profile, including information such as age and gender, can support targeted analysis of forum data. However, users do not always disclose their age and gender, so it is desirable to extract this information automatically from their content. To the best of our knowledge, no resource for author profiling of health forum data currently exists. Here we present a large corpus, with close to 85,000 users, for author profiling, and we outline our approach and benchmark results for automatically detecting a user’s age and gender from their forum posts. We use a mix of features from a user’s text together with forum-specific features to obtain accuracy well above the baseline, which suggests that both our dataset and our method are useful and valid.