French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime

Véronique Moriceau, Xavier Tannier


Abstract
In this paper, we describe the development of French resources for the extraction and normalization of temporal expressions with HeidelTime, a open-source multilingual, cross-domain temporal tagger. HeidelTime extracts temporal expressions from documents and normalizes them according to the TIMEX3 annotation standard. Several types of temporal expressions are extracted: dates, times, durations and temporal sets. French resources have been evaluated in two different ways: on the French TimeBank corpus, a corpus of newspaper articles in French annotated according to the ISO-TimeML standard, and on a user application for automatic building of event timelines. Results on the French TimeBank are quite satisfaying as they are comparable to those obtained by HeidelTime in English and Spanish on newswire articles. Concerning the user application, we used two temporal taggers for the preprocessing of the corpus in order to compare their performance and results show that the performances of our application on French documents are better with HeidelTime. The French resources and evaluation scripts are publicly available with HeidelTime.
Anthology ID:
L14-1382
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
3239–3243
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/45_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Véronique Moriceau and Xavier Tannier. 2014. French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3239–3243, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime (Moriceau & Tannier, LREC 2014)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/45_Paper.pdf