Stefanie Hegele


2024

Surveying the Technology Support of Languages
Annika Grützner-Zahn | Federico Gaspari | Maria Giagkou | Stefanie Hegele | Andy Way | Georg Rehm
Proceedings of the Second International Workshop Towards Digital Language Equality (TDLE): Focusing on Sustainability @ LREC-COLING 2024

Many of the world’s languages are left behind when it comes to Language Technology applications, since most of these are available only in a limited number of languages, creating a digital divide that affects millions of users worldwide. It is crucial, therefore, to monitor and quantify the progress of technology support for individual languages, which also enables comparisons across language communities. In this way, efforts can be directed towards reducing language barriers, promoting economic and social inclusion, and ensuring that all citizens can use their preferred language in the digital age. This paper critically reviews and compares recent quantitative approaches to measuring technology support for languages. Despite using different approaches and methodologies, the findings of all analysed papers demonstrate the unequal distribution of technology support and emphasise the existence of a digital divide among languages.

2021

European Language Grid: A Joint Platform for the European Language Technology Community
Georg Rehm | Stelios Piperidis | Kalina Bontcheva | Jan Hajic | Victoria Arranz | Andrejs Vasiļjevs | Gerhard Backfried | Jose Manuel Gomez-Perez | Ulrich Germann | Rémi Calizzano | Nils Feldhus | Stefanie Hegele | Florian Kintzel | Katrin Marheinecke | Julian Moreno-Schneider | Dimitris Galanis | Penny Labropoulou | Miltos Deligiannis | Katerina Gkirtzou | Athanasia Kolovou | Dimitris Gkoumas | Leon Voukoutis | Ian Roberts | Jana Hamrlova | Dusan Varis | Lukas Kacena | Khalid Choukri | Valérie Mapelli | Mickaël Rigault | Julija Melnika | Miro Janosik | Katja Prinz | Andres Garcia-Silva | Cristian Berrio | Ondrej Klejch | Steve Renals
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

Europe is a multilingual society, in which dozens of languages are spoken. The only option to enable and to benefit from multilingualism is through Language Technologies (LT), i.e., Natural Language Processing and Speech Technologies. We describe the European Language Grid (ELG), which is targeted to evolve into the primary platform and marketplace for LT in Europe by providing one umbrella platform for the European LT landscape, including research and industry, enabling all stakeholders to upload, share and distribute their services, products and resources. At the end of our EU project, which will establish a legal entity in 2022, the ELG will provide access to approx. 1300 services for all European languages as well as thousands of data sets.

2020

The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe
Georg Rehm | Katrin Marheinecke | Stefanie Hegele | Stelios Piperidis | Kalina Bontcheva | Jan Hajič | Khalid Choukri | Andrejs Vasiļjevs | Gerhard Backfried | Christoph Prinz | José Manuel Gómez-Pérez | Luc Meertens | Paul Lukowicz | Josef van Genabith | Andrea Lösch | Philipp Slusallek | Morten Irgens | Patrick Gatellier | Joachim Köhler | Laure Le Bars | Dimitra Anastasiou | Albina Auksoriūtė | Núria Bel | António Branco | Gerhard Budin | Walter Daelemans | Koenraad De Smedt | Radovan Garabík | Maria Gavriilidou | Dagmar Gromann | Svetla Koeva | Simon Krek | Cvetana Krstev | Krister Lindén | Bernardo Magnini | Jan Odijk | Maciej Ogrodniczuk | Eiríkur Rögnvaldsson | Mike Rosner | Bolette Pedersen | Inguna Skadiņa | Marko Tadić | Dan Tufiș | Tamás Váradi | Kadri Vider | Andy Way | François Yvon
Proceedings of the Twelfth Language Resources and Evaluation Conference

Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality. However, language barriers impacting business, cross-lingual and cross-cultural communication are still omnipresent. Language Technologies (LTs) are a powerful means to break down these barriers. While the last decade has seen various initiatives that created a multitude of approaches and technologies tailored to Europe’s specific needs, there is still an immense level of fragmentation. At the same time, AI has become an increasingly important concept in the European Information and Communication Technology area. For a few years now, AI – including many opportunities, synergies but also misconceptions – has been overshadowing every other topic. We present an overview of the European LT landscape, describing funding programmes, activities, actions and challenges in the different countries with regard to LT, including the current state of play in industry and the LT market. We present a brief overview of the main LT-related activities on the EU level in the last ten years and develop strategic guidance with regard to four key dimensions.

European Language Grid: An Overview
Georg Rehm | Maria Berger | Ela Elsholz | Stefanie Hegele | Florian Kintzel | Katrin Marheinecke | Stelios Piperidis | Miltos Deligiannis | Dimitris Galanis | Katerina Gkirtzou | Penny Labropoulou | Kalina Bontcheva | David Jones | Ian Roberts | Jan Hajič | Jana Hamrlová | Lukáš Kačena | Khalid Choukri | Victoria Arranz | Andrejs Vasiļjevs | Orians Anvari | Andis Lagzdiņš | Jūlija Meļņika | Gerhard Backfried | Erinç Dikici | Miroslav Janosik | Katja Prinz | Christoph Prinz | Severin Stampler | Dorothea Thomas-Aniola | José Manuel Gómez-Pérez | Andres Garcia Silva | Christian Berrío | Ulrich Germann | Steve Renals | Ondrej Klejch
Proceedings of the Twelfth Language Resources and Evaluation Conference

With 24 official EU and many additional languages, multilingualism in Europe and an inclusive Digital Single Market can only be enabled through Language Technologies (LTs). European LT business is dominated by hundreds of SMEs and a few large players. Many are world-class, with technologies that outperform the global players. However, European LT business is also fragmented – by nation states, languages, verticals and sectors, significantly holding back its impact. The European Language Grid (ELG) project addresses this fragmentation by establishing the ELG as the primary platform for LT in Europe. The ELG is a scalable cloud platform, providing, in an easy-to-integrate way, access to hundreds of commercial and non-commercial LTs for all European languages, including running tools and services as well as data sets and resources. Once fully operational, it will enable the commercial and non-commercial European LT community to deposit and upload their technologies and data sets into the ELG, to deploy them through the grid, and to connect with other resources. The ELG will boost the Multilingual Digital Single Market towards a thriving European LT community, creating new jobs and opportunities. Furthermore, the ELG project organises two open calls for up to 20 pilot projects. It also sets up 32 national competence centres and the European LT Council for outreach and coordination purposes.

2018

Language Technology for Multilingual Europe: An Analysis of a Large-Scale Survey regarding Challenges, Demands, Gaps and Needs
Georg Rehm | Stefanie Hegele
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Large-scale news entity sentiment analysis
Ralf Steinberger | Stefanie Hegele | Hristo Tanev | Leonida Della Rocca
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

We work on detecting positive or negative sentiment towards named entities in very large volumes of news articles. The aim is to monitor changes over time, as well as to work towards media bias detection by comparing differences across news sources and countries. With view to applying the same method to dozens of languages, we use linguistically light-weight methods: searching for positive and negative terms in bags of words around entity mentions (also considering negation). Evaluation results are good and better than a third-party baseline system, but precision is not sufficiently high to display the results publicly in our multilingual news analysis system Europe Media Monitor (EMM). In this paper, we focus on describing our effort to improve the English language results by avoiding the biggest sources of errors. We also present new work on using a syntactic parser to identify safe opinion recognition rules, such as predicative structures in which sentiment words directly refer to an entity. The precision of this method is good, but recall is very low.