Chiharu Narawa
2012
Classifying Standard Linguistic Processing Functionalities based on Fundamental Data Operation Types
Yoshihiko Hayashi
|
Chiharu Narawa
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
iIt is often argued that a set of standard linguistic processing functionalities should be identified,with each of them given a formal specification. We would benefit from the formal specifications; for example, the semi-automated composition of a complex language processing workflow could be enabled in due time. This paper extracts a standard set of linguistic processing functionalities and tries to classify them formally. To do this, we first investigated prominent types of language Web services/linguistic processors by surveying a Web-based language service infrastructure and published NLP toolkits. We next induced a set of standard linguistic processing functionalities by carefully investigating each of the linguistic processor types. The standard linguistic processing functionalities was then characterized by the input/output data types, as well as the required data operation types, which were also derived from the investigation. As a result, we came up with an ontological depiction that classifies linguistic processors and linguistic processing functionalities with respect to the fundamental data operation types. We argue that such an ontological depiction can explicitly describe the functional aspects of a linguistic processing functionality.
2010
LAF/GrAF-grounded Representation of Dependency Structures
Yoshihiko Hayashi
|
Thierry Declerck
|
Chiharu Narawa
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
This paper shows that a LAF/GrAF-based annotation schema can be used for the adequate representation of syntactic dependency structures possibly in many languages. We first argue that there are at least two types of textual units that can be annotated with dependency information: words/tokens and chunks/phrases. We especially focus on importance of the latter dependency unit: it is particularly useful for representing Japanese dependency structures, known as Kakari-Uke structure. Based on this consideration, we then discuss a sub-typing of GrAF to represent the corresponding dependency structures. We derive three node types, two edge types, and the associated constraints for properly representing both the token-based and the chunk-based dependency structures. We finally propose a wrapper program that, as a proof of concept, converts output data from different dependency parsers in proprietary XML formats to the GrAF-compliant XML representation. It partially proves the value of an international standard like LAF/GrAF in the Web service context: an existing dependency parser can be, in a sense, standardized, once wrapped by a data format conversion process.
2008
Ontologizing Lexicon Access Functions based on an LMF-based Lexicon Taxonomy
Yoshihiko Hayashi
|
Chiharu Narawa
|
Monica Monachini
|
Claudia Soria
|
Nicoletta Calzolari
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
This paper discusses ontologization of lexicon access functions in the context of a service-oriented language infrastructure, such as the Language Grid. In such a language infrastructure, an access function to a lexical resource, embodied as an atomic Web service, plays a crucially important role in composing a composite Web service tailored to a users specific requirement. To facilitate the composition process involving service discovery, planning and invocation, the language infrastructure should be ontology-based; hence the ontologization of a range of lexicon functions is highly required. In a service-oriented environment, lexical resources however can be classified from a service-oriented perspective rather than from a lexicographically motivated standard. Hence to address the issue of interoperability, the taxonomy for lexical resources should be ground to principled and shared lexicon ontology. To do this, we have ontologized the standardized lexicon modeling framework LMF, and utilized it as a foundation to stipulate the service-oriented lexicon taxonomy and the corresponding ontology for lexicon access functions. This paper also examines a possible solution to fill the gap between the ontological descriptions and the actual Web service API by adopting a W3C recommendation SAWSDL, with which Web service descriptions can be linked with the domain ontology.