Wei Quan




2019

CLPsych2019 Shared Task: Predicting Suicide Risk Level from Reddit Posts on Multiple Forums
Victor Ruiz | Lingyun Shi | Wei Quan | Neal Ryan | Candice Biernesser | David Brent | Rich Tsui
Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology

We aimed to predict an individual’s suicide risk level from longitudinal posts on Reddit discussion forums. By participating in the CLPsych2019 shared task competition, we received two annotated datasets: a training dataset with 496 users (31,553 posts) and a test dataset with 125 users (9,610 posts). We submitted results from our three best-performing machine-learning models: SVM, Naïve Bayes (NB), and an ensemble model. Each model assigned a user’s suicide risk level to one of four categories: no risk, low risk, moderate risk, or severe risk. Among the three models, the ensemble model had the best macro-averaged F1 score, 0.379, on the holdout test dataset. The NB model performed best on two additional binary-classification tasks: no risk vs. flagged risk (any risk level other than no risk), with an F1 score of 0.836, and no or low risk vs. urgent risk (moderate or severe risk), with an F1 score of 0.736. We conclude that the NB model may serve as a tool for identifying users with flagged or urgent suicide risk based on longitudinal posts on Reddit discussion forums.
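
The three scores reported in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the label strings, the helper names (`f1_for_class`, `macro_f1`, `binary_f1`, `to_flagged`, `to_urgent`), and the toy predictions are all hypothetical, and averaging the macro F1 over all four levels is an assumption, since the abstract does not say which classes enter the average. The binary tasks are obtained by collapsing the four levels as the abstract describes.

```python
from typing import Callable, Hashable, Sequence

# Hypothetical label strings for the four risk levels named in the abstract.
LEVELS = ["no", "low", "moderate", "severe"]


def f1_for_class(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
                 positive: Hashable) -> float:
    """F1 score treating `positive` as the positive class (0.0 if undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             labels: Sequence[str] = LEVELS) -> float:
    """Unweighted mean of per-class F1 (assumption: all four levels count)."""
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


def to_flagged(level: str) -> bool:
    """Flagged risk: any level other than no risk."""
    return level != "no"


def to_urgent(level: str) -> bool:
    """Urgent risk: moderate or severe."""
    return level in ("moderate", "severe")


def binary_f1(y_true: Sequence[str], y_pred: Sequence[str],
              collapse: Callable[[str], bool]) -> float:
    """F1 of the positive class after collapsing four levels to two."""
    return f1_for_class([collapse(t) for t in y_true],
                        [collapse(p) for p in y_pred], True)


# Toy data, invented purely for illustration (one label per user).
gold = ["no", "low", "moderate", "severe", "no", "severe"]
pred = ["no", "moderate", "moderate", "low", "low", "severe"]

print("macro F1:  ", macro_f1(gold, pred))
print("flagged F1:", binary_f1(gold, pred, to_flagged))
print("urgent F1: ", binary_f1(gold, pred, to_urgent))
```

Because the macro average weights every class equally, a class the model never predicts correctly pulls the score down sharply, which is one reason a four-way macro F1 can be much lower than the binary F1 scores on the same predictions.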