Filipe Nunes

2009

2008

LX-Service: Web Services of Language Technology for Portuguese
António Branco | Francisco Costa | Pedro Martins | Filipe Nunes | João Silva | Sara Silveira
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In the present paper we report on the development of a cluster of language technology web services for Portuguese that we named LX-Service. These web services allow client applications to interact directly with language processing tools over the Internet. This way of making language technology available was motivated by the need to integrate it into an eLearning environment. In particular, it was motivated by the development of new multilingual functionalities that were aimed at extending a Learning Management System and that needed to draw on the output of some of those tools in a distributed and remote fashion. This specific usage situation, however, happens to be representative of a typical and recurrent setup in the use of language processing tools across different settings and projects. Therefore, the approach reported here not only offers a solution to the specific problem that immediately motivated it, but also contributes some first steps toward what we see as an important paradigm shift in the way language technology can be distributed so as to unleash its full potential and impact.

2006

Open Resources and Tools for the Shallow Processing of Portuguese: The TagShare Project
Florbela Barreto | António Branco | Eduardo Ferreira | Amália Mendes | Maria Fernanda Bacelar do Nascimento | Filipe Nunes | João Ricardo Silva
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper presents the TagShare project and the linguistic resources and tools for the shallow processing of Portuguese developed in its scope. These resources include a 1 million token corpus that has been accurately hand-annotated with a variety of linguistic information, as well as several state-of-the-art shallow processing tools capable of automatically producing that type of annotation. At present, the linguistic annotations in the corpus are sentence and paragraph boundaries, token boundaries, morphosyntactic POS categories, values of inflection features, lemmas and named entities. Hence, the set of tools comprises a sentence chunker, a tokenizer, a POS tagger, nominal and verbal analyzers and lemmatizers, a verbal conjugator, a nominal inflector, and a named-entity recognizer, some of which underlie several online services.

Co-authors

Sara Silveira 2

Florbela Barreto 1

Amália Mendes 1

Maria Fernanda Bacelar do Nascimento 1
