Alignment Data base for a Sign Language Concordancer

Marion Kaczmarek, Michael Filhol


Abstract
This article deals with building a data base of alignments between parallel French-LSF segments. This data base is meant to be searched using a concordancer, which we are also designing. We wish to equip Sign Language translators with tools similar to those used in text-to-text translation, and such tools need language resources to feed them. Sign Language corpora already exist, but they do not match our needs: to support a Sign Language concordancer, the corpus must be parallel and provide varied examples of vocabulary and grammatical constructions. We started with a parallel corpus of 40 short news items and 120 SL videos, which we aligned manually by segments of various lengths. We describe the methodology we used and how we define our segments and alignments. The last part concerns how we hope to allow the data base to keep growing in the near future.
Anthology ID:
2020.lrec-1.744
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
6069–6072
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.744
Cite (ACL):
Marion Kaczmarek and Michael Filhol. 2020. Alignment Data base for a Sign Language Concordancer. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 6069–6072, Marseille, France. European Language Resources Association.
Cite (Informal):
Alignment Data base for a Sign Language Concordancer (Kaczmarek & Filhol, LREC 2020)
PDF:
https://preview.aclanthology.org/emnlp22-frontmatter/2020.lrec-1.744.pdf