Michaela Atterer


2009

2008

Web count statistics gathered from search engines have been widely used as a resource in a variety of NLP tasks. For some tasks, however, the information they exploit is not fine-grained enough. We propose an inverted index over grammatical relations as a fast and reliable resource to access more general and also more detailed frequency information. To build the index, we use a dependency parser to parse a large corpus. We extract binary dependency relations, such as he-subj-say (“he” is the subject of “say”) as index terms and construct the index using publicly available open-source indexing software. The unit we index over is the sentence. The index can be used to extract grammatical relations and frequency counts for these relations. The framework also provides the possibility to search for partial dependencies (say, the frequency of “he” occurring in subject position), words, strings and a combination of these. One possible application is the disambiguation of syntactic structures.
Question Answering systems are systems that enable the user to ask questions in natural language and to also receive an answer in natural language. Most existing systems, however, are constructed for the English language, and it is not clear in how far these approaches are also applicable to other languages. A richer morphology, greater syntactic variability, and smaller fraction of webpages available in the language are just some issues that complicate the construction of systems for German. In this paper, we present a modular Question Answering System for German which uses several morphological resources to increase recall. Nouns are converted into verbs, verbs into nouns, and the tenses of verbs are modified. We use a web search engine as a back end to allow for open-domain Question Answering. A POS-tagger is employed to identify answer candidates which are then filtered and tiled. The system is shown to achieve a higher recall than other systems for German.

2007

2006

2004

2002