Towards a Corsican Basic Language Resource Kit

Laurent Kevers, Stella Retali-Medori


Abstract
Natural language processing (NLP) resources and tools for Corsican are virtually non-existent. Our inventory contains only a few digital resources, namely lexical or corpus databases, that require adaptation work. Our objective is to use the Banque de Données Langue Corse (BDLC) project to improve the availability of resources and tools for the Corsican language and, in the long term, to provide a complete Basic Language Resource Kit (BLARK). We have defined a roadmap setting out the actions to be undertaken: collecting corpora and setting up a consultation interface (concordancer), a language detection tool, an electronic dictionary, and a part-of-speech tagger. First results on these topics have already been obtained and are presented in this article. Some elements are also available on our project page (http://bdlc.univ-corse.fr/tal/).
Anthology ID:
2020.lrec-1.332
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
2726–2735
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.332
Cite (ACL):
Laurent Kevers and Stella Retali-Medori. 2020. Towards a Corsican Basic Language Resource Kit. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 2726–2735, Marseille, France. European Language Resources Association.
Cite (Informal):
Towards a Corsican Basic Language Resource Kit (Kevers & Retali-Medori, LREC 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.lrec-1.332.pdf