Peter Baumann


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2014

pdf bib
Using Resource-Rich Languages to Improve Morphological Analysis of Under-Resourced Languages
Peter Baumann | Janet Pierrehumbert
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The world-wide proliferation of digital communications has created the need for language and speech processing systems for under-resourced languages. Developing such systems is challenging if only small data sets are available, and the problem is exacerbated for languages with highly productive morphology. However, many under-resourced languages are spoken in multi-lingual environments together with at least one resource-rich language and thus have numerous borrowings from resource-rich languages. Based on this insight, we argue that readily available resources from resource-rich languages can be used to bootstrap the morphological analyses of under-resourced languages with complex and productive morphological systems. In a case study of two such languages, Tagalog and Zulu, we show that an easily obtainable English wordlist can be deployed to seed a morphological analysis algorithm from a small training set of conversational transcripts. Our method achieves a precision of 100% and identifies 28 and 66 of the most productive affixes in Tagalog and Zulu, respectively.

pdf bib
Discosuite - A parser test suite for German discontinuous structures
Wolfgang Maier | Miriam Kaeshammer | Peter Baumann | Sandra Kübler
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Parser evaluation traditionally relies on evaluation metrics which deliver a single aggregate score over all sentences in the parser output, such as PARSEVAL. However, for the evaluation of parser performance concerning a particular phenomenon, a test suite of sentences is needed in which this phenomenon has been identified. In recent years, the parsing of discontinuous structures has received a rising interest. Therefore, in this paper, we present a test suite for testing the performance of dependency and constituency parsers on non-projective dependencies and discontinuous constituents for German. The test suite is based on the newly released TIGER treebank version 2.2. It provides a unique possibility of benchmarking parsers on non-local syntactic relationships in German, for constituents and dependencies. We include a linguistic analysis of the phenomena that cause discontinuity in the TIGER annotation, thereby closing gaps in previous literature. The linguistic phenomena we investigate include extraposition, a placeholder/repeated element construction, topicalization, scrambling, local movement, parentheticals, and fronting of pronouns.