José M. García Miguel

Also published as: José M. García-Miguel, José M. Garcia-Miguel


2016

CORILSE is a computerized corpus of Spanish Sign Language (Lengua de Signos Española, LSE). It consists of a set of recordings from different discourse genres by Galician signers living in the city of Vigo. In this paper we describe its annotation system, developed on the basis of pre-existing ones (mostly the model of the Auslan corpus). This includes primary annotation of id-glosses for manual signs, annotation of non-manual components, and secondary annotation of grammatical categories and relations, because this corpus is being built for grammatical analysis, in particular of argument structures in LSE. Until now the annotation has been done essentially by hand, which is a slow and time-consuming task. The need to facilitate this process has led us to engage in the development of automatic or semi-automatic tools for manual and facial recognition. Finally, we also present the web repository that will make the corpus available to different types of users and will allow its exploitation for research purposes and other applications (e.g. teaching of LSE or design of tasks for signed language assessment).

2012

This paper will present the design of a Galician syntactic corpus with application to intonation modeling. A corpus of around 3000 sentences was designed with variation in the syntactic structure and the number of accent groups, and recorded by a professional speaker to study the influence on the prosodic structure.

2010

This is an overall description of ADESSE ("Base de datos de verbos, Alternancias de Diátesis y Esquemas Sintáctico-Semánticos del Español"), an online database (http://adesse.uvigo.es/) with syntactic and semantic information for all clauses in a corpus of Spanish. The manually annotated corpus has 1.5 million words, 159,000 clauses and 3,450 different verb lemmas. ADESSE is an expanded version of BDS ("Base de datos sintácticos del español actual"), which contains the grammatical features of verbs and verb arguments in the corpus. ADESSE has added semantic features such as verb sense, verb class and semantic role of arguments to make possible a detailed syntactic and semantic corpus-based characterization of verb valency. Each verb entry in the database is described in terms of valency potential and valency realizations (diatheses). The former includes a set of semantic roles of participants in a particular event type and a classification into a conceptual hierarchy of process types. Valency realizations are described in terms of correspondences of voice, syntactic functions and categories, and semantic roles. Verb senses are discriminated at two levels: a more abstract level linked to a valency potential, and more specific verb senses taking into account particular lexical instantiations of arguments.
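
The distinction between valency potential and valency realizations can be illustrated with a small data-structure sketch. This is only an illustration under stated assumptions, not ADESSE's actual schema: the class names (`Argument`, `Realization`, `VerbSense`), the field names, and the role and function labels used in the example are all hypothetical and chosen for readability.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Argument:
    """One realized argument: a semantic role paired with its syntactic coding."""
    role: str      # semantic role (hypothetical label)
    function: str  # syntactic function (hypothetical label)
    category: str  # syntactic category (hypothetical label)


@dataclass(frozen=True)
class Realization:
    """A valency realization (diathesis): voice plus role/function/category pairings."""
    voice: str
    arguments: tuple[Argument, ...]


@dataclass
class VerbSense:
    """An abstract verb sense linked to a valency potential."""
    lemma: str
    process_type: tuple[str, ...]  # path in a conceptual hierarchy of process types
    roles: tuple[str, ...]         # valency potential: roles of the event type
    realizations: list[Realization] = field(default_factory=list)

    def add(self, realization: Realization) -> None:
        """Record a realization, rejecting roles outside the valency potential."""
        unknown = {a.role for a in realization.arguments} - set(self.roles)
        if unknown:
            raise ValueError(f"roles not in valency potential: {sorted(unknown)}")
        self.realizations.append(realization)

    def schemes(self) -> Counter:
        """Count recorded realizations by (voice, syntactic functions) pattern."""
        return Counter(
            (r.voice, tuple(a.function for a in r.arguments))
            for r in self.realizations
        )


# Invented example values, for illustration only (not taken from the database).
sense = VerbSense(
    lemma="dar",
    process_type=("Transfer",),
    roles=("Giver", "Gift", "Recipient"),
)
sense.add(Realization("active", (
    Argument("Giver", "subject", "NP"),
    Argument("Gift", "direct object", "NP"),
    Argument("Recipient", "indirect object", "PP"),
)))
sense.add(Realization("active", (
    Argument("Giver", "subject", "NP"),
    Argument("Gift", "direct object", "NP"),
)))
```

In this sketch the valency potential is stored once per verb sense, while each clause-level realization only records how a subset of those roles is expressed, so `sense.schemes()` gives a frequency profile of the syntactic patterns recorded for the sense.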