Emmanuel Schang


Automatic Speech Recognition and Query By Example for Creole Languages Documentation
Cécile Macaire | Didier Schwab | Benjamin Lecouteux | Emmanuel Schang
Findings of the Association for Computational Linguistics: ACL 2022

We investigate the exploitation of self-supervised models for two Creole languages with few resources: Gwadloupéyen and Morisien. Automatic language processing tools are almost non-existent for these two languages. We propose to use about one hour of annotated data to design an automatic speech recognition system for each language. We evaluate how much data is needed to obtain a query-by-example system that is usable by linguists. Moreover, our experiments show that multilingual self-supervised models are not necessarily the most efficient for Creole languages.


Temporal@ODIL project: Adapting ISO-TimeML to syntactic treebanks for the temporal annotation of spoken speech
Jean-Yves Antoine | Jakub Wasczuk | Anaïs Lefeuvre-Haftermeyer | Lotfi Abouda | Emmanuel Schang | Agata Savary
Proceedings of the 13th Joint ISO-ACL Workshop on Interoperable Semantic Annotation (ISA-13)


Covering various Needs in Temporal Annotation: a Proposal of Extension of ISO TimeML that Preserves Upward Compatibility
Anaïs Lefeuvre-Halftermeyer | Jean-Yves Antoine | Alain Couillault | Emmanuel Schang | Lotfi Abouda | Agata Savary | Denis Maurel | Iris Eshkol | Delphine Battistelli
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper reports a critical analysis of the ISO TimeML standard, in the light of several temporal annotation experiments conducted on spoken French. It shows that the norm suffers from weaknesses that should be corrected to fit a larger variety of needs in NLP and in corpus linguistics. We present our proposal for improvements to the norm before it is revised by the ISO Committee in 2017. These modifications mainly concern: (1) enrichments of well-identified features of the norm: temporal function of TIMEX time expressions, additional types for TLINK temporal relations; (2) deeper modifications concerning the units or features annotated: clarification between time and tense for EVENT units, coherence of representation between temporal signals (the SIGNAL unit) and TIMEX modifiers (the MOD feature); (3) a recommendation to perform temporal annotation on top of a syntactic (rather than lexical) layer (temporal annotation on a treebank).


Tense and Time Annotations : a Contribution to TimeML Improvement (Annotation de la temporalité en corpus : contribution à l’amélioration de la norme TimeML) [in French]
Anaïs Lefeuvre | Jean-Yves Antoine | Agata Savary | Emmanuel Schang | Lotfi Abouda | Denis Maurel | Iris Eshkol
Proceedings of TALN 2014 (Volume 2: Short Papers)

ANCOR_Centre, a large free spoken French coreference corpus: description of the resource and reliability measures
Judith Muzerelle | Anaïs Lefeuvre | Emmanuel Schang | Jean-Yves Antoine | Aurore Pelletier | Denis Maurel | Iris Eshkol | Jeanne Villaneau
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This article presents ANCOR_Centre, a French coreference corpus, available under the Creative Commons Licence. With a size of around 500,000 words, the corpus is large enough to serve the needs of data-driven approaches in NLP and represents one of the largest coreference resources currently available. The corpus focuses exclusively on spoken language and aims to represent a variety of spoken genres. ANCOR_Centre includes anaphora as well as coreference relations which involve nominal and pronominal mentions. The paper describes in detail the annotation scheme and the reliability measures computed on the resource.


ANCOR, the first large French speaking corpus of conversational speech annotated in coreference to be freely available (ANCOR, premier corpus de français parlé d’envergure annoté en coréférence et distribué librement) [in French]
Judith Muzerelle | Anaïs Lefeuvre | Jean-Yves Antoine | Emmanuel Schang | Denis Maurel | Jeanne Villaneau | Iris Eshkol
Proceedings of TALN 2013 (Volume 2: Short Papers)


Décrire la morphologie des verbes en ikota au moyen d’une métagrammaire (Describing the Morphology of Verbs in Ikota using a Metagrammar) [in French]
Denys Duchier | Brunelle Magnana Ekoukou | Yannick Parmentier | Simon Petitjean | Emmanuel Schang
JEP-TALN-RECITAL 2012, Workshop TALAf 2012: Traitement Automatique des Langues Africaines (TALAf 2012: African Language Processing)

Describing São Tomense Using a Tree-Adjoining Meta-Grammar
Emmanuel Schang | Denys Duchier | Brunelle Magnana Ekoukou | Yannick Parmentier | Simon Petitjean
Proceedings of the 11th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+11)