Liesbeth Augustinus


2017

Universal Dependencies for Afrikaans
Peter Dirix | Liesbeth Augustinus | Daniel van Niekerk | Frank Van Eynde
Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies (UDW 2017)

2016

AfriBooms: An Online Treebank for Afrikaans
Liesbeth Augustinus | Peter Dirix | Daniel van Niekerk | Ineke Schuurman | Vincent Vandeghinste | Frank Van Eynde | Gerhard van Huyssteen
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Compared to well-resourced languages such as English and Dutch, natural language processing (NLP) tools for Afrikaans are still scarce. In the context of the AfriBooms project, KU Leuven and the North-West University collaborated to develop a first, small treebank, a dependency parser, and an easy-to-use online linguistic search engine for Afrikaans, intended for researchers and students in the humanities and social sciences. The search tool is based on a similar tool for Dutch, GrETEL, a user-friendly search engine that allows users to query a treebank by means of a natural language example instead of a formal search instruction.

Poly-GrETEL: Cross-Lingual Example-based Querying of Syntactic Constructions
Liesbeth Augustinus | Vincent Vandeghinste | Tom Vanallemeersch
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present Poly-GrETEL, an online tool which enables syntactic querying in parallel treebanks, based on the monolingual GrETEL environment. We provide online access to the Europarl parallel treebank for Dutch and English, allowing users to query the treebank using either an XPath expression or an example sentence in order to look for similar constructions. We provide automatic alignments between the nodes. By combining example-based query functionality with node alignments, we limit the need for users to be familiar with the query language and the structure of the trees in the source and target languages, thus facilitating the use of parallel corpora for comparative linguistics and translation studies.

2013

The IPP Effect in Afrikaans: A Corpus Analysis
Liesbeth Augustinus | Peter Dirix
Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013)

Example-Based Treebank Querying with GrETEL – Now Also for Spoken Dutch
Liesbeth Augustinus | Vincent Vandeghinste | Ineke Schuurman | Frank Van Eynde
Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013)

2012

Example-Based Treebank Querying
Liesbeth Augustinus | Vincent Vandeghinste | Frank Van Eynde
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The recent construction of large linguistic treebanks for spoken and written Dutch (e.g. CGN, LASSY, Alpino) has created new and exciting opportunities for the empirical investigation of Dutch syntax and semantics. However, the exploitation of those treebanks requires knowledge of specific data structures and query languages such as XPath. Linguists who are unfamiliar with formal languages are often reluctant to learn such a language. In order to make treebank querying more attractive for non-technical users, we developed GrETEL (Greedy Extraction of Trees for Empirical Linguistics), a query engine in which linguists can use natural language examples as a starting point for searching the LASSY treebank without knowledge of tree representations or formal query languages. By allowing linguists to search for constructions similar to the example they provide, we hope to bridge the gap between traditional and computational linguistics. Two case studies are conducted to provide a concrete demonstration of the tool. The architecture of the tool is optimised for searching the LASSY treebank, but the approach can be adapted to other treebank layouts.
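
To make the idea of example-based querying more concrete, the following is a minimal sketch of how a parsed example could be turned into an XPath expression. It is an illustration only, not GrETEL's actual implementation: the tree is a hand-made, heavily simplified Alpino-style parse, and the function names (`to_xpath`, `build_query`) and the choice of which attributes to keep (`KEEP`) are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hand-made, heavily simplified Alpino-style parse of "hij leest een boek".
# Real LASSY trees carry many more attributes; this is an assumption for illustration.
EXAMPLE = """
<node cat="smain" rel="--">
  <node rel="su" pt="vnw" word="hij"/>
  <node rel="hd" pt="ww" word="leest"/>
  <node cat="np" rel="obj1">
    <node rel="det" pt="lid" word="een"/>
    <node rel="hd" pt="n" word="boek"/>
  </node>
</node>
"""

# Attributes copied into the query. Dropping `word` generalises the example
# from one specific sentence to a syntactic construction (hypothetical choice).
KEEP = ("rel", "cat", "pt")


def to_xpath(node):
    """Turn an example subtree into a nested XPath node test."""
    tests = [f'@{a}="{node.get(a)}"' for a in KEEP if node.get(a) is not None]
    # Each child becomes a nested predicate, so the query mirrors the tree shape.
    tests += [to_xpath(child) for child in node]
    return "node[" + " and ".join(tests) + "]"


def build_query(xml_text):
    """Build an XPath expression that matches the example anywhere in a tree."""
    return "//" + to_xpath(ET.fromstring(xml_text))


if __name__ == "__main__":
    print(build_query(EXAMPLE))
```

The resulting expression uses nested predicates, which the standard-library `xml.etree.ElementTree` module cannot evaluate; running it against a treebank would require a full XPath processor. The sketch therefore only shows the query-construction step that spares the user from writing XPath by hand.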