Elizaveta Loginova-Clouet

Also published as: Elizaveta Clouet, Elizaveta Loginova Clouet


Ubuntu-fr: A Large and Open Corpus for Multi-modal Analysis of Online Written Conversations
Nicolas Hernandez | Soufian Salim | Elizaveta Loginova Clouet
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present a large, free French corpus of online written conversations extracted from the Ubuntu platform’s forums, mailing lists and IRC channels. The corpus is meant to support multi-modal and diachronic studies of online written conversations. We chose to build the corpus around a robust metadata model based upon strong principles, such as the “stand off” annotation principle. We detail the model, explain how the data was collected and processed (in terms of metadata, text and conversation), and describe the corpus’s contents through a series of meaningful statistics. A portion of the corpus (about 4,700 sentences from emails, forum posts and chat messages sent in November 2014) is annotated in terms of dialogue acts and sentiment. We discuss how we adapted our dialogue act taxonomy from the DIT++ annotation scheme and how the data was annotated, before presenting our results as well as a brief qualitative analysis of the annotated data.


LINA: Identifying Comparable Documents from Wikipedia
Emmanuel Morin | Amir Hazem | Florian Boudin | Elizaveta Loginova-Clouet
Proceedings of the Eighth Workshop on Building and Using Comparable Corpora


Splitting of Compound Terms in non-Prototypical Compounding Languages
Elizaveta Clouet | Béatrice Daille
Proceedings of the First Workshop on Computational Approaches to Compound Analysis (ComAComA 2014)


Multilingual Compound Splitting (Segmentation Multilingue des Mots Composés) [in French]
Elizaveta Loginova-Clouet | Béatrice Daille
Proceedings of TALN 2013 (Volume 2: Short Papers)