Ruli Manurung

Also published as: R. Manurung


2018

Ambiguity is a problem we frequently face in Natural Language Processing. Word Sense Disambiguation (WSD) is a task to determine the correct sense of an ambiguous word. However, research in WSD for Indonesian is still rare to find. The availability of English-Indonesian parallel corpora and WordNet for both languages can be used as training data for WSD by applying Cross-Lingual WSD method. This training data is used as an input to build a model using supervised machine learning algorithms. Our research also examines the use of Word Embedding features to build the WSD model.

2017

The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets. In 2017, the task was devoted to learning dependency parsers for a large number of languages, in a real-world setting without any gold-standard annotation on input. All test sets followed a unified annotation scheme, namely that of Universal Dependencies. In this paper, we define the task and evaluation methodology, describe how the data sets were prepared, report and analyze the main results, and provide a brief categorization of the different approaches of the participating systems.

2015

2014

2012

2010

2008

2006

As part of a project to construct an interactive program which will encourage children to play with language by building jokes, we have developed a large lexical database, closely based on WordNet. As well as the standard WordNet information about part of speech, synonymy, hyponymy, etc, we have added phonetic representations and symbolic links allowing attachment of pictures. All information is represented in a relational database, allowing powerful searches using SQL via a Java API. The lexicon has a facility to label subsets of the lexicon with symbolic names, and we are working to incorporate some educationally relevant word lists as sublexicons. This should also allow us to improve the familiarity ratings which the lexicon assigns to words.