Organizing and Improving a Database of French Word Formation Using Formal Concept Analysis

Nyoman Juniarta, Olivier Bonami, Nabil Hathout, Fiammetta Namer, Yannick Toussaint


Abstract
We apply Formal Concept Analysis (FCA) to organize Démonette2, a French derivational database, and to improve its quality by detecting both missing and spurious derivations. We represent each derivational family as a graph. Because derivational families can be related to one another by the subgraph relation, FCA can group families and arrange them in a partially ordered set (poset). This poset is also useful for improving the database: a family is regarded as a possible anomaly (meaning that it may have missing and/or spurious derivations) if its derivational graph is almost, but not completely, identical to the graphs of a large number of other families.
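The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: each family graph is flattened to a set of edge labels so that the subgraph relation becomes plain set inclusion, the family names (`F1` to `F5`) and edge labels are hypothetical toy data rather than Démonette2 entries, and the `min_support` and `max_diff` thresholds are invented for illustration.

```python
from itertools import combinations

# Hypothetical toy context: family name -> set of derivational edges.
# These are NOT real Démonette2 entries.
families = {
    "F1": {"V->N_agent", "V->N_action", "V->Adj"},
    "F2": {"V->N_agent", "V->N_action", "V->Adj"},
    "F3": {"V->N_agent", "V->N_action", "V->Adj"},
    "F4": {"V->N_agent", "V->N_action"},
    "F5": {"V->N_agent"},
}

def intent(objs, context):
    """Edges shared by every family in objs (all edges if objs is empty)."""
    shared = set().union(*context.values())
    for name in objs:
        shared &= context[name]
    return frozenset(shared)

def extent(edges, context):
    """Families whose edge set contains every edge in edges."""
    return frozenset(n for n, e in context.items() if edges <= e)

def concepts(context):
    """All formal concepts, by closing every subset of families.
    Exponential in the number of families: suitable for toy data only."""
    found = set()
    names = list(context)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            i = intent(subset, context)
            found.add((extent(i, context), i))
    return found

def near_misses(context, min_support=3, max_diff=1):
    """Flag families that differ by at most max_diff edges from an edge set
    shared by at least min_support families (assumed thresholds)."""
    patterns = {}
    for name, edges in context.items():
        patterns.setdefault(frozenset(edges), []).append(name)
    flagged = {}
    for name, edges in context.items():
        for pattern, members in patterns.items():
            diff = pattern ^ edges  # edges present in one set but not the other
            if 0 < len(diff) <= max_diff and len(members) >= min_support:
                flagged[name] = sorted(diff)
    return flagged

lattice = concepts(families)
anomalies = near_misses(families)
```

On this toy data, the concepts ordered by inclusion of their extents form a small poset, and `near_misses` reports the family that lacks a single edge found in a frequent pattern as a candidate for manual review. A real family graph carries more structure than a flat edge set, so this only approximates the subgraph-based grouping described above.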
Anthology ID:
2022.lrec-1.422
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3969–3976
URL:
https://aclanthology.org/2022.lrec-1.422
Cite (ACL):
Nyoman Juniarta, Olivier Bonami, Nabil Hathout, Fiammetta Namer, and Yannick Toussaint. 2022. Organizing and Improving a Database of French Word Formation Using Formal Concept Analysis. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3969–3976, Marseille, France. European Language Resources Association.
Cite (Informal):
Organizing and Improving a Database of French Word Formation Using Formal Concept Analysis (Juniarta et al., LREC 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2022.lrec-1.422.pdf