Proceedings of the 9th Web as Corpus Workshop (WaC-9)

Felix Bildhauer, Roland Schäfer (Editors)

Finding Viable Seed URLs for Web Corpora: A Scouting Approach and Comparative Study of Available Sources
Adrien Barbaresi

Focused Web Corpus Crawling
Roland Schäfer | Adrien Barbaresi | Felix Bildhauer

Less Destructive Cleaning of Web Documents by Using Standoff Annotation
Maik Stührenberg

Some Issues on the Normalization of a Corpus of Products Reviews in Portuguese
Magali Sanches Duran | Lucas Avanço | Sandra Aluísio | Thiago Pardo | Maria da Graça Volpe Nunes

{bs,hr,sr}WaC - Web Corpora of Bosnian, Croatian and Serbian
Nikola Ljubešić | Filip Klubička

The PAISÀ Corpus of Italian Web Texts
Verena Lyding | Egon Stemle | Claudia Borghetti | Marco Brunello | Sara Castagnoli | Felice Dell’Orletta | Henrik Dittmann | Alessandro Lenci | Vito Pirrelli