Wauter Bosma
With the proliferation of applications sharing information represented in multiple ontologies, the development of automatic methods for robust and accurate ontology matching will be crucial to their success. Connecting and merging existing semantic networks is perhaps one of the most challenging tasks in knowledge engineering. This paper presents a new approach for automatically aligning a very large domain ontology of species to WordNet within the framework of the KYOTO project. The approach relies on a knowledge-based Word Sense Disambiguation algorithm that accurately assigns WordNet synsets to the concepts represented in Species 2000.
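The abstract does not say which knowledge-based disambiguation algorithm is used, so the following is only a minimal sketch of the general idea, using a simplified gloss-overlap (Lesk-style) heuristic as a stand-in. The sense inventory, sense identifiers, glosses, and the function name `disambiguate` are all invented for illustration; a real system would query WordNet instead.

```python
# Toy sense inventory. Every sense id and gloss below is hypothetical and
# stands in for WordNet synsets; nothing here is taken from the paper.
TOY_SENSES = {
    "bass": {
        "bass.n.fish": "lean flesh of a saltwater fish of the family Serranidae",
        "bass.n.music": "the lowest part of the musical range",
    },
}

STOPWORDS = {"the", "of", "a", "an", "in", "and"}


def tokens(text):
    """Lowercase, whitespace-split, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def disambiguate(term, context_terms, senses=TOY_SENSES):
    """Return the sense id whose gloss shares the most words with the context.

    context_terms is assumed to hold labels of neighbouring ontology concepts
    (for example the parent taxon). Ties go to the alphabetically first id.
    """
    candidates = senses.get(term, {})
    if not candidates:
        return None
    context = set()
    for label in context_terms:
        context |= tokens(label)
    return max(
        sorted(candidates),
        key=lambda sense_id: len(tokens(candidates[sense_id]) & context),
    )


best = disambiguate("bass", ["Serranidae", "saltwater fish"])
```

Using the labels of neighbouring concepts as context is one plausible way to exploit the ontology structure; the paper itself may rely on a different signal.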
A variety of methods exist for extracting terms and relations between terms from a corpus, each with its own strengths and weaknesses. Rather than simply pooling their results, we apply the extraction methods so that the results of one method serve as input to another. This gives us the leverage to find terms and relations that would otherwise not be found. Our goal is to create a semantic model of a domain. To that end, we aim to find the complete terminology of the domain, consisting of terms and relations such as hyponymy and meronymy, connected to generic wordnets and ontologies. Terms are ranked by domain relevance only as a final step, after terminology extraction is completed. Because term relations make up a large part of the semantics of a term, we estimate a term's relevance from its relations to other terms, in addition to occurrence and document frequencies. In the KYOTO project, we apply language-neutral terminology extraction from a parsed corpus for seven languages.
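The chaining described above can be illustrated with a small sketch. It is not the KYOTO implementation: the two extraction methods (a frequency filter and a "X such as Y" pattern), the scoring formula, the `relation_weight` parameter, and all function names are assumptions chosen to keep the example short. The sketch works on raw strings rather than a parsed corpus.

```python
import re
from collections import Counter

STOPWORDS = {"such", "as", "the", "a", "of", "and"}


def tokenize(doc):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", doc.lower()) if w not in STOPWORDS]


def frequent_terms(docs, min_count=2):
    """Method 1 (assumed): keep words seen at least min_count times."""
    counts = Counter(w for d in docs for w in tokenize(d))
    return {w for w, c in counts.items() if c >= min_count}


def hyponyms_of(docs, known_terms):
    """Method 2 (assumed): 'X such as Y' yields (Y, X) if X is already a term."""
    relations = set()
    for d in docs:
        for hyper, hypo in re.findall(r"([a-z]+) such as ([a-z]+)", d.lower()):
            if hyper in known_terms:
                relations.add((hypo, hyper))
    return relations


def rank(docs, terms, relations, relation_weight=1.0):
    """Final step: score = occurrences + document frequency + weighted relation count."""
    tf = Counter(w for d in docs for w in tokenize(d))
    df = Counter(w for d in docs for w in set(tokenize(d)))
    degree = Counter(t for pair in relations for t in pair)
    scores = {t: tf[t] + df[t] + relation_weight * degree[t] for t in terms}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def build_terminology(docs):
    seeds = frequent_terms(docs)            # output of method 1 ...
    relations = hyponyms_of(docs, seeds)    # ... is the input to method 2
    terms = seeds | {hypo for hypo, _ in relations}
    return terms, relations, rank(docs, terms, relations)


docs = ["birds such as herons nest here", "birds migrate south"]
terms, relations, ranked = build_terminology(docs)
```

In this toy corpus "herons" occurs only once, so the frequency filter alone would miss it; it enters the terminology because the pattern method is seeded with "birds". Ranking happens only after both extraction steps have finished, mirroring the order stated in the abstract.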