Ulrich Schäfer

Also published as: Ulrich Schaefer, Ulrich Schafer


2012

A Graphical Citation Browser for the ACL Anthology
Benjamin Weitz | Ulrich Schäfer
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Navigation in large scholarly paper collections is tedious and not well supported in most scientific digital libraries. We describe a novel browser-based graphical tool implemented using HTML5 Canvas. It displays citation information extracted from the paper text to support useful navigation. The tool is implemented using a client/server architecture. A citation graph of the digital library is built in the memory of the server. On the client side, edges of the displayed citation (sub)graph surrounding a document are labeled with keywords signifying the kind of citation made from one document to another. These keywords were extracted using NLP tools such as a tokenizer, sentence boundary detection and part-of-speech tagging applied to the text extracted from the original PDF papers (currently 22,500). By clicking on an edge, the user can inspect the corresponding citation sentence in context, in most cases also highlighted in the original PDF layout. The system is publicly accessible as part of the ACL Anthology Searchbench.

A Fully Coreference-annotated Corpus of Scholarly Papers from the ACL Anthology
Ulrich Schäfer | Christian Spurk | Jörg Steffen
Proceedings of COLING 2012: Posters

Extracting glossary sentences from scholarly articles: A comparative evaluation of pattern bootstrapping and deep analysis
Melanie Reiplinger | Ulrich Schäfer | Magdalena Wolska
Proceedings of the ACL-2012 Special Workshop on Rediscovering 50 Years of Discoveries

Towards an ACL Anthology Corpus with Logical Document Structure. An Overview of the ACL 2012 Contributed Task
Ulrich Schäfer | Jonathon Read | Stephan Oepen
Proceedings of the ACL-2012 Special Workshop on Rediscovering 50 Years of Discoveries

Combining OCR Outputs for Logical Document Structure Markup. Technical Background to the ACL 2012 Contributed Task
Ulrich Schäfer | Benjamin Weitz
Proceedings of the ACL-2012 Special Workshop on Rediscovering 50 Years of Discoveries

2011

Ensemble-style Self-training on Citation Classification
Cailing Dong | Ulrich Schäfer
Proceedings of 5th International Joint Conference on Natural Language Processing

The ACL Anthology Searchbench
Ulrich Schäfer | Bernd Kiefer | Christian Spurk | Jörg Steffen | Rui Wang
Proceedings of the ACL-HLT 2011 System Demonstrations

2010

DL Meet FL: A Bidirectional Mapping between Ontologies and Linguistic Knowledge
Hans-Ulrich Krieger | Ulrich Schäfer
Coling 2010: Posters

Scientific Authoring Support: A Tool to Navigate in Typed Citation Graphs
Ulrich Schäfer | Uwe Kasterka
Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics and Writing: Writing Processes and Authoring Aids

Towards an Integrated Architecture for Composite Language Services and Multiple Linguistic Processing Components
Arif Bramantoro | Ulrich Schäfer | Toru Ishida
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Web services are increasingly being used in the natural language processing community as a way to increase the interoperability amongst language resources. This paper extends our previous work on integrating two different platforms, i.e. Heart of Gold and Language Grid. The Language Grid is an infrastructure built on top of the Internet to provide distributed language services. Heart of Gold is known as a middleware architecture for integrating deep and shallow natural language processing components. The new feature of the integrated architecture is the combination of composite language services in the Language Grid and the multiple linguistic processing components in Heart of Gold to provide a better quality of language resources available on the Web. Thus, language resources with different characteristics can be combined based on the concept of service-oriented computing, with different treatment for each combination. Having Heart of Gold fully integrated in the Language Grid environment would contribute to the heterogeneity of language services.

2008

Extracting and Querying Relations in Scientific Papers on Language Technology
Ulrich Schäfer | Hans Uszkoreit | Christian Federmann | Torsten Marek | Yajing Zhang
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We describe methods for extracting interesting factual relations from scientific texts in computational linguistics and language technology taken from the ACL Anthology. We use a hybrid NLP architecture with shallow preprocessing for increased robustness and domain-specific, ontology-based named entity recognition, followed by a deep HPSG parser running the English Resource Grammar (ERG). The extracted relations in the MRS (minimal recursion semantics) format are simplified and generalized using WordNet. The resulting “quriples” are stored in a database from where they can be retrieved (again using abstraction methods) by relation-based search. The query interface is embedded in a web browser-based application we call the Scientist’s Workbench. It supports researchers in editing and online-searching scientific papers.

2006

Automatic Testing and Evaluation of Multilingual Language Technology Resources and Components
Ulrich Schäfer | Daniel Beck
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

We describe SProUTomat, a tool for daily building, testing and evaluating a complex general-purpose multilingual natural language text processor including its linguistic resources (lingware). Software and lingware are developed, maintained and extended in a distributed manner by multiple authors and projects, i.e., the source code stored in a version control system is modified frequently. The modular design of different, dedicated lingware modules like tokenizers, morphology, gazetteers, type hierarchy, rule formalism on the one hand increases flexibility and re-usability, but on the other hand may lead to fragility with respect to changes. Therefore, frequent testing as known from software engineering is necessary also for lingware to warrant a high level of quality and overall stability of the system. We describe the build, testing and evaluation methods for LT software and lingware we have developed on the basis of the open source, platform-independent Apache Ant tool and the configurable evaluation tool JTaCo.

OntoNERdIE – Mapping and Linking Ontologies to Named Entity Recognition and Information Extraction Resources
Ulrich Schäfer
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

We describe an implemented offline procedure that maps OWL/RDF-encoded ontologies with large, dynamically maintained instance data to named entity recognition (NER) and information extraction (IE) engine resources, preserving hierarchical concept information and links back to the ontology concepts and instances. The main motivations are (i) improving NER/IE precision and recall in closed domains, (ii) exploiting linguistic knowledge (context, inflection, anaphora) for identifying ontology instances in texts more robustly, (iii) giving full access to ontology instances and concepts in natural language processing results, e.g. for subsequent ontology queries, navigation or inference, (iv) avoiding duplication of work in development and maintenance of similar resources in independent places, namely lingware and ontologies. We show an application in hybrid deep-shallow natural language processing that is used, e.g., for question analysis in closed domains. Further applications could be automatic hyperlinking or other innovative semantic-web related applications.

Preprocessing and Tokenisation Standards in DELPH-IN Tools
Benjamin Waldron | Ann Copestake | Ulrich Schäfer | Bernd Kiefer
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

We discuss preprocessing and tokenisation standards within DELPH-IN, a large scale open-source collaboration providing multiple independent multilingual shallow and deep processors. We discuss (i) a component-specific XML interface format which has been used for some time to interface preprocessor results to the PET parser, and (ii) our implementation of a more generic XML interface format influenced heavily by the (ISO working draft) Morphosyntactic Annotation Framework (MAF). Our generic format encapsulates the information which may be passed from the preprocessing stage to a parser: it uses standoff-annotation, a lattice for the representation of structural ambiguity, intra-annotation dependencies and allows for highly structured annotation content. This work builds on the existing Heart of Gold middleware system, and previous work on Robust Minimal Recursion Semantics (RMRS) as part of an inter-component interface. We give examples of usage with a number of the DELPH-IN processing components and deep grammars.

Middleware for Creating and Combining Multi-dimensional NLP Markup
Ulrich Schäfer
Proceedings of the 5th Workshop on NLP and XML (NLPXML-2006): Multi-Dimensional Markup in Natural Language Processing

2004

The DeepThought Core Architecture Framework
Ulrich Callmeier | Andreas Eisele | Ulrich Schäfer | Melanie Siegel
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

WHAT: An XSLT-based Infrastructure for the Integration of Natural Language Processing Components
Ulrich Schäfer
Proceedings of the HLT-NAACL 2003 Workshop on Software Engineering and Architecture of Language Technology Systems (SEALTS)

Integrated Shallow and Deep Parsing: TopP Meets HPSG
Anette Frank | Markus Becker | Berthold Crysmann | Bernd Kiefer | Ulrich Schäfer
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

Integrating Information Extraction and Automatic Hyperlinking
Stephan Busemann | Witold Drozdzynski | Hans-Ulrich Krieger | Jakub Piskorski | Ulrich Schaefer | Hans Uszkoreit | Feiyu Xu
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics

2002

An Integrated Architecture for Shallow and Deep Processing
Berthold Crysmann | Anette Frank | Bernd Kiefer | Stefan Mueller | Guenter Neumann | Jakub Piskorski | Ulrich Schaefer | Melanie Siegel | Hans Uszkoreit | Feiyu Xu | Markus Becker | Hans-Ulrich Krieger
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

1994

TDL – A Type Description Language for Constraint-Based Grammars
Hans-Ulrich Krieger | Ulrich Schafer
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics