Abstract
The use/use for relationship in a thesaurus is usually more complex than the (para-)synonymy recommended in the ISO-2788 standard, which describes the content of these controlled vocabularies. A non-preferred term can refer to multiple preferred terms, and only the latter are relevant in controlled indexing; this makes the relationship difficult to use in automatic annotation applications, because it generates ambiguous cases. In this paper, we present the CARROT algorithm, designed to rank the output of our Information Extraction pipeline, and show how it can be used to select the relevant preferred term among several candidates. This selection, based on the structure of the thesaurus, is meant to suggest keywords to human annotators in order to ease and speed up their daily work. We achieve a 95% success rate and discuss these results along with perspectives for this experiment.
- Anthology ID:
- 2007.jeptalnrecital-long.18
- Volume:
- Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs
- Month:
- June
- Year:
- 2007
- Address:
- Toulouse, France
- Editors:
- Nabil Hathout, Philippe Muller
- Venue:
- JEP/TALN/RECITAL
- Publisher:
- ATALA
- Pages:
- 185–194
- URL:
- https://aclanthology.org/2007.jeptalnrecital-long.18
- Cite (ACL):
- Véronique Malaisé, Luit Gazendam, and Hennie Brugman. 2007. Disambiguating automatic semantic annotation based on a thesaurus structure. In Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, pages 185–194, Toulouse, France. ATALA.
- Cite (Informal):
- Disambiguating automatic semantic annotation based on a thesaurus structure (Malaisé et al., JEP/TALN/RECITAL 2007)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/2007.jeptalnrecital-long.18.pdf