Paulo Fernandes
2022
PortiLexicon-UD: a Portuguese Lexical Resource according to Universal Dependencies Model
Lucelene Lopes | Magali Duran | Paulo Fernandes | Thiago Pardo
Proceedings of the Thirteenth Language Resources and Evaluation Conference
This paper presents PortiLexicon-UD, a large and freely available lexicon for Portuguese that provides morphosyntactic information according to the Universal Dependencies model. This lexical resource includes part-of-speech (PoS) tags, lemmas, and morphological information for words, with 1,221,218 entries (counting a word more than once when it occurs with different combinations of PoS tag, lemma, and morphological features). We report the lexicon creation process, its computational data structure, and its evaluation over an annotated corpus, showing that it has high language coverage and good-quality data.
2012
A Fast, Memory Efficient, Scalable and Multilingual Dictionary Retriever
Paulo Fernandes | Lucelene Lopes | Carlos A. Prolo | Afonso Sales | Renata Vieira
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
This paper presents a novel approach to dictionary retrieval. The approach is based on a very efficient and scalable theoretical structure called Multi-Terminal Multi-valued Decision Diagrams (MTMDD). This structure allows the definition of very large, even multilingual, dictionaries without a significant increase in memory demands and with virtually no additional processing cost. Besides the general idea of the approach, this paper describes the technologies involved and their implementation in a software package called WAGGER. Finally, we present some usage examples and possible applications of this dictionary retriever.
Co-authors
- Lucelene Lopes 2
- Carlos A. Prolo 1
- Afonso Sales 1
- Renata Vieira 1
- Magali Sanches Duran 1
Venues
- LREC 2