Rachel Panckhurst
2017
Typologies pour l’annotation de textes non standard en français (Typologies for the annotation of non-standard French texts)
Louise Tarrade | Cédric Lopez | Rachel Panckhurst | Georges Antoniadis
Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. Volume 2 - Articles courts
The task of automatically normalising messages from computer-mediated communication requires a preliminary step: identifying the linguistic phenomena involved. In this article, we propose two typologies for the annotation of non-standard French texts, at the morpho-lexical and morpho-syntactic levels respectively. These typologies were developed by reconciling existing typologies and refining them in parallel with a manual annotation of tweets and SMS messages.
2014
Towards Electronic SMS Dictionary Construction: An Alignment-based Approach
Cédric Lopez | Reda Bestandji | Mathieu Roche | Rachel Panckhurst
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
In this paper, we propose a method for aligning text messages (entitled AlignSMS) in order to automatically build an SMS dictionary. An extract of 100 text messages from the 88milSMS corpus (Panckhurst et al., 2013, 2014) was used as an initial test. More than 90,000 authentic text messages in French were collected from the general public by a group of academics in the south of France in the context of the sud4science project (http://www.sud4science.org). This project is itself part of a vast international SMS data collection project, entitled sms4science (http://www.sms4science.org, Fairon et al., 2006, Cougnon, 2014). After corpus collation, pre-processing and anonymisation (Accorsi et al., 2012, Patel et al., 2013), we discuss how raw anonymised text messages can be transcoded into normalised text messages, using a statistical alignment method. The future objective is to set up a hybrid (symbolic/statistical) approach based on both grammar rules and our statistical AlignSMS method.
2008
Classification Procedures for Software Evaluation
Muriel Amar | Sophie David | Rachel Panckhurst | Lisa Whistlecroft
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
We outline a methodological classification of software evaluation approaches in general. This classification was initiated partly owing to involvement in a biennial European competition (the European Academic Software Award, EASA) which was held for over a decade. The evaluation grid used in EASA gradually became obsolete and inappropriate in recent years, and therefore needed to be revised. In order to do this, it was important to situate the competition in relation to other software evaluation procedures. A methodological perspective for the classification is adopted rather than a conceptual one, since a number of difficulties arise with the latter. We focus on three main questions: What to evaluate? How to evaluate? Who evaluates? The classification is therefore hybrid: it allows one to account for the most common evaluation approaches and is also an observatory. Two main approaches are differentiated: system and usage. We conclude that any evaluation always constructs its own object, and the objects to be evaluated only partially determine the evaluation which can be applied to them. Generally speaking, this allows one to begin apprehending what type of knowledge is objectified when one or another approach is chosen.