Peter Berck


Annotating Speech, Attitude and Perception Reports
Corien Bary | Leopold Hess | Kees Thijs | Peter Berck | Iris Hendrickx
Proceedings of the 11th Linguistic Annotation Workshop

We present REPORTS, an annotation scheme for speech, attitude and perception reports. The scheme makes it possible to annotate the various text elements involved in such reports (e.g. embedding entity, complement, complement head) and their relations in a uniform way, which in turn facilitates the automatic extraction of information on, for example, complementation and vocabulary distribution. We also present the Ancient Greek corpus RAG (Thucydides’ History of the Peloponnesian War), to which we have applied this scheme using the annotation tool BRAT. We discuss some of the issues, both theoretical and practical, that we encountered, show how the corpus helps in answering specific questions, and conclude that REPORTS fit our needs well.


Memory-based Grammatical Error Correction
Antal van den Bosch | Peter Berck
Proceedings of the Seventeenth Conference on Computational Natural Language Learning: Shared Task


Memory-based text correction for preposition and determiner errors
Antal van den Bosch | Peter Berck
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP


Exploring and Enriching a Language Resource Archive via the Web
Marc Kemps-Snijders | Alex Klassmann | Claus Zinn | Peter Berck | Albert Russel | Peter Wittenburg
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

The “download first, then process” paradigm is still the predominant working method in the research community. The web-based paradigm, however, offers many advantages from a tool development and data management perspective, as it allows quick adaptation to changing research environments. Moreover, new ways of combining tools and data are increasingly becoming available and will eventually enable a true web-based workflow approach, thus challenging the “download first, then process” paradigm. The necessary infrastructure for managing, exploring and enriching language resources via the Web will need to be delivered by projects like CLARIN and DARIAH.


ANNEX - a web-based Framework for Exploiting Annotated Media Resources
Peter Berck | Albert Russel
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Manual annotation of various media streams, time series data and text sequences is still very time-consuming work that has to be carried out in many areas of linguistics and beyond. Based on many theoretical discussions and practical experiences, professional tools such as ELAN have been deployed that support researchers in their work. Most of these annotation tools operate on local computers. However, since more and more language resources are stored in web-accessible archives, researchers want to profit from the new possibilities. ANNEX was developed to fill this gap: it allows web-based analysis of complex annotated media streams, i.e., users do not have to download resources or download and install programs. Simply by using a normal web browser they can start their linguistic work. Due to the architecture of the Internet, ANNEX does not yet offer the option to create annotations, but this feature will come. Users also have to be aware that media streaming does not offer accuracy as high as on local computers.

Ontology-based Language Archive Utilization
Peter Berck | Hans-Jörg Bibiko | Marc Kemps-Snijders | Albert Russel | Peter Wittenburg
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

At the MPI for Psycholinguistics a large archive of language resources has been created with contributions from many different individual researchers and research projects. All of these resources, in particular annotated media streams and multimedia lexica, are accessible via the web and can be utilized with the help of web-based utilization frameworks. The archive therefore lends itself to motivating users to operate across the boundaries of single corpora and to supporting cross-language work. This, however, can only be done when the problems of interoperability, in particular at the level of linguistic encoding, can be solved in an efficient way. Two Max Planck Institutes are cooperating to build a framework that allows users to easily create their own practical ontologies and, if desired, to relate their concepts to central ontologies.


Unsupervised Discovery of Phonological Categories through Supervised Learning of Morphological Rules
Walter Daelemans | Peter Berck | Steven Gillis
COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics

MBT: A Memory-Based Part of Speech Tagger-Generator
Walter Daelemans | Jakub Zavrel | Peter Berck | Steven Gillis
Fourth Workshop on Very Large Corpora