Amélie Paulus


2012

Texto4Science: a Quebec French Database of Annotated Short Text Messages
Philippe Langlais | Patrick Drouin | Amélie Paulus | Eugénie Rompré Brodeur | Florent Cottin
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The Quebec French part of the international sms4science project, called texto4science, was launched in October 2009. Over a period of 10 months, we collected slightly more than 7,000 SMS messages, which we carefully annotated. This database is now ready to be used by the community. The purpose of this article is to describe the effort that went into designing this database and to provide some analysis of the main linguistic phenomena that we annotated. We also report on a sociolinguistic survey that we conducted within the project.