Aileen Joan Vicente


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2024

pdf bib
Language Identification of Philippine Creole Spanish: Discriminating Chavacano From Related Languages
Aileen Joan Vicente | Charibeth Cheng
Proceedings of the Eleventh Workshop on NLP for Similar Languages, Varieties, and Dialects (VarDial 2024)

Chavacano is a Spanish Creole widely spoken in the southern regions of the Philippines. It is one of the many Philippine languages yet to be studied computationally. This paper presents the development of a language identification model of Chavacano to distinguish it from languages that influence its creolization using character convolutional networks. Unlike studies that discriminated similar languages based on geographical proximity, this paper reports a similarity focused on the creolization of a language. We established the similarity of Chavacano and its related languages, Spanish, Portuguese, Cebuano, and Hiligaynon, from the number of common words in the corpus for all languages. We report an accuracy of 93% for the model generated using ten filters with a filter width of 5. The training experiments reveal that increasing the filter width, number of filters, or training epochs is unnecessary even if the accuracy increases because the generated models present irregular learning behavior or may have already been overfitted. This study also demonstrates that the character features extracted from convolutional neural networks, similar to n-grams, are sufficient in identifying Chavacano. Future work on the language identification of Chavacano includes improving classification accuracy for short or code-switched texts for practical applications such as social media sensors for disaster response and management.

2017

pdf bib
#ActuallyDepressed: Characterization of Depressed Tumblr Users’ Online Behavior from Rules Generation Machine Learning Technique
Czarina Rae Cahutay | Aileen Joan Vicente
Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation

pdf bib
Automatic Categorization of Tagalog Documents Using Support Vector Machines
April Dae Bation | Aileen Joan Vicente | Erlyn Manguilimotan
Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation