István Nagy T.

Also published as: Istvan Nagy, István T. Nagy, István Nagy

2014

4FX: Light Verb Constructions in a Multilingual Parallel Corpus
Anita Rácz | István Nagy T. | Veronika Vincze
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper, we describe 4FX, a quadrilingual (English-Spanish-German-Hungarian) parallel corpus annotated for light verb constructions (LVCs). We present the annotation process and report statistical data on the frequency of LVCs in each language. We also report inter-annotator agreement rates and highlight some interesting facts and tendencies based on a comparison of the multilingual data from the four corpora. According to the frequency of LVC categories and the calculated Kendall's coefficient for the four corpora, we found that Spanish and German are very similar to each other, Hungarian is also similar to both, but German differs from all these three. The qualitative and quantitative data analysis might prove useful in theoretical linguistic research for all four languages. Moreover, the corpus will be an excellent testbed for the development and evaluation of machine-learning-based methods for extracting or identifying light verb constructions in these four languages.

VPCTagger: Detecting Verb-Particle Constructions With Syntax-Based Methods
István Nagy T. | Veronika Vincze
Proceedings of the 10th Workshop on Multiword Expressions (MWE)

2013

Dependency Parsing for Identifying Hungarian Light Verb Constructions
Veronika Vincze | János Zsibrita | István Nagy T.
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Full-coverage Identification of English Light Verb Constructions
István Nagy T. | Veronika Vincze | Richárd Farkas
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Identifying English and Hungarian Light Verb Constructions: A Contrastive Approach
Veronika Vincze | István Nagy T. | Richárd Farkas
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

HunOr: A Hungarian-Russian Parallel Corpus
Martina Katalin Szabó | Veronika Vincze | István Nagy T.
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In this paper, we present HunOr, the first multi-domain Hungarian-Russian parallel corpus. Some of the corpus texts have been manually aligned and split into sentences, and their named entities have also been annotated, while the other parts are automatically aligned at the sentence level and POS-tagged. The corpus contains texts from the domains of literature, official language use and science; we would also like to add texts from the news domain. In the future, we plan to carry out a syntactic annotation of the HunOr corpus, which will further enhance its usability in various NLP fields such as transfer-based machine translation and cross-lingual information retrieval.
