Lucas Avanço


2015

A Qualitative Analysis of a Corpus of Opinion Summaries based on Aspects
Roque López | Thiago Pardo | Lucas Avanço | Pedro Filho | Alessandro Bokan | Paula Cardoso | Márcio Dias | Fernando Nóbrega | Marco Cabezudo | Jackson Souza | Andressa Zacarias | Eloize Seno | Ariani Di Felippo
Proceedings of the 9th Linguistic Annotation Workshop

A Normalizer for UGC in Brazilian Portuguese
Magali Sanches Duran | Maria das Graças Volpe Nunes | Lucas Avanço
Proceedings of the Workshop on Noisy User-generated Text

2014

NILC_USP: An Improved Hybrid System for Sentiment Analysis in Twitter Messages
Pedro Balage Filho | Lucas Avanço | Thiago Pardo | Maria das Graças Volpe Nunes
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Some Issues on the Normalization of a Corpus of Products Reviews in Portuguese
Magali Sanches Duran | Lucas Avanço | Sandra Aluísio | Thiago Pardo | Maria da Graça Volpe Nunes
Proceedings of the 9th Web as Corpus Workshop (WaC-9)

A Large Corpus of Product Reviews in Portuguese: Tackling Out-Of-Vocabulary Words
Nathan Hartmann | Lucas Avanço | Pedro Balage | Magali Duran | Maria das Graças Volpe Nunes | Thiago Pardo | Sandra Aluísio
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Web 2.0 has enabled an unprecedented boom in communication. With the widespread use of computers and mobile devices, anyone, in practically any language, may post comments on the web. As a result, formal language is not necessarily used. In these communicative situations, language is marked by the absence of complex syntactic structures and by the presence of internet slang, missing diacritics, repeated vowels, chat-speak abbreviations, emoticons, and colloquial expressions. Such language use poses severe new challenges for Natural Language Processing (NLP) tools and applications, which so far have focused on well-written texts. In this work, we report the construction of a large web corpus of product reviews in Brazilian Portuguese and the analysis of its lexical phenomena. This analysis supports the development of a lexical normalization tool that, in future work, is intended to make standard NLP tools usable for web opinion mining and summarization.