Workshop on Web as Corpus (2016)


Proceedings of the 10th Web as Corpus Workshop
Paul Cook | Stefan Evert | Roland Schäfer | Egon Stemle

Automatic Classification by Topic Domain for Meta Data Generation, Web Corpus Evaluation, and Corpus Comparison
Roland Schäfer | Felix Bildhauer

Efficient construction of metadata-enhanced web corpora
Adrien Barbaresi

Topically-focused Blog Corpora for Multiple Languages
Andrew Salway | Dag Elgesem | Knut Hofland | Øystein Reigem | Lubos Steskal

The Challenges and Joys of Analysing Ongoing Language Change in Web-based Corpora: a Case Study
Anne Krause

Using the Web and Social Media as Corpora for Monitoring the Spread of Neologisms. The case of ‘rapefugee’, ‘rapeugee’, and ‘rapugee’.
Quirin Würschinger | Mohammad Fazleh Elahi | Desislava Zhekova | Hans-Jörg Schmid

EmpiriST 2015: A Shared Task on the Automatic Linguistic Annotation of Computer-Mediated Communication and Web Corpora
Michael Beißwenger | Sabine Bartsch | Stefan Evert | Kay-Michael Würzner

SoMaJo: State-of-the-art tokenization for German web and social media texts
Thomas Proisl | Peter Uhrig

UdS-(retrain|distributional|surface): Improving POS Tagging for OOV Words in German CMC and Web Data
Jakob Prange | Andrea Horbach | Stefan Thater

Babler - Data Collection from the Web to Support Speech Recognition and Keyword Search
Gideon Mendels | Erica Cooper | Julia Hirschberg

A Global Analysis of Emoji Usage
Nikola Ljubešić | Darja Fišer

Genre classification for a corpus of academic webpages
Erika Dalan | Serge Sharoff

On Bias-free Crawling and Representative Web Corpora
Roland Schäfer

EmpiriST: AIPHES - Robust Tokenization and POS-Tagging for Different Genres
Steffen Remus | Gerold Hintz | Chris Biemann | Christian M. Meyer | Darina Benikova | Judith Eckle-Kohler | Margot Mieskes | Thomas Arnold

bot.zen @ EmpiriST 2015 - A minimally-deep learning PoS-tagger (trained for German CMC and Web data)
Egon Stemle

LTL-UDE @ EmpiriST 2015: Tokenization and PoS Tagging of Social Media Text
Tobias Horsmann | Torsten Zesch