Preserving Semantic Information from Old Dictionaries: Linking Senses of the ‘Altfranzösisches Wörterbuch’ to WordNet

Achim Stein


Abstract
Historical dictionaries of the pre-digital period are important resources for the study of older languages. Taking the example of the ‘Altfranzösisches Wörterbuch’, an Old French dictionary published from 1925 onwards, this contribution shows how the printed dictionaries can be turned into a more easily accessible and more sustainable lexical database, even though a full-text retro-conversion is too costly. Over 57,000 German sense definitions were identified in uncorrected OCR output. For verbs and nouns, 34,000 senses of more than 20,000 lemmas were matched with GermaNet, a semantic network for German, and, in a second step, linked to synsets of the English WordNet. These results are relevant for the automatic processing of Old French, for the annotation and exploitation of Old French text corpora, and for the philological study of Old French in general.
Anthology ID:
2020.lrec-1.374
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3063–3068
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.374
Cite (ACL):
Achim Stein. 2020. Preserving Semantic Information from Old Dictionaries: Linking Senses of the ‘Altfranzösisches Wörterbuch’ to WordNet. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 3063–3068, Marseille, France. European Language Resources Association.
Cite (Informal):
Preserving Semantic Information from Old Dictionaries: Linking Senses of the ‘Altfranzösisches Wörterbuch’ to WordNet (Stein, LREC 2020)
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/2020.lrec-1.374.pdf