Sebastian Sulger


2016

Discontinuous Genitives in Hindi/Urdu
Sebastian Sulger
Proceedings of the Workshop on Discontinuous Structures in Natural Language Processing

2014

Automatic Detection of Causal Relations in German Multilogs
Tina Bögel | Annette Hautli-Janisz | Sebastian Sulger | Miriam Butt
Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)

Towards Identifying Hindi/Urdu Noun Templates in Support of a Large-Scale LFG Grammar
Sebastian Sulger | Ashwini Vaidya
Proceedings of the Fifth Workshop on South and Southeast Asian Natural Language Processing

2013

ParGramBank: The ParGram Parallel Treebank
Sebastian Sulger | Miriam Butt | Tracy Holloway King | Paul Meurer | Tibor Laczkó | György Rákosi | Cheikh Bamba Dione | Helge Dyvik | Victoria Rosén | Koenraad De Smedt | Agnieszka Patejuk | Özlem Çetinoğlu | I Wayan Arka | Meladel Mistica
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

Identifying Urdu Complex Predication via Bigram Extraction
Miriam Butt | Tina Bögel | Annette Hautli | Sebastian Sulger | Tafseer Ahmed
Proceedings of COLING 2012

A Reference Dependency Bank for Analyzing Complex Predicates
Tafseer Ahmed | Miriam Butt | Annette Hautli | Sebastian Sulger
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

A recurring problem in NLP for the languages of South Asia is the treatment of complex predicates (CPs). This paper presents a first approach to the analysis of CPs in the context of dependency bank development. The efforts originate in theoretical work on CPs done within Lexical-Functional Grammar (LFG), but are intended to provide a guideline for analyzing different types of CPs in an independent framework. Although we focus on CPs in Hindi and Urdu, the design of the dependencies is kept general enough to account for CP constructions across languages.

2011

Extracting and Classifying Urdu Multiword Expressions
Annette Hautli | Sebastian Sulger
Proceedings of the ACL 2011 Student Session

2010

Transliterating Urdu for a Broad-Coverage Urdu/Hindi LFG Grammar
Muhammad Kamran Malik | Tafseer Ahmed | Sebastian Sulger | Tina Bögel | Atif Gulzar | Ghulam Raza | Sarmad Hussain | Miriam Butt
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper, we present a system for transliterating the Arabic-based script of Urdu to a Roman transliteration scheme. The system is integrated into a larger system consisting of a morphology module, implemented via finite-state technologies, and a computational LFG grammar of Urdu that was developed with the grammar development platform XLE (Crouch et al. 2008). Our long-term goal is to handle Hindi alongside Urdu; the two languages are very similar with respect to syntax and lexicon, so one grammar can be used to cover both. However, they differ in script: Hindi is written in Devanagari, while Urdu uses an Arabic-based script. By abstracting away to a common Roman transliteration scheme in the respective transliterators, our system can be enabled to handle both languages in parallel. We discuss the pipeline architecture of the Urdu-Roman transliterator, mention several linguistic and orthographic issues, and present the integration of the transliterator into the LFG parsing system.