Khalid Choukri
Also published as: Kalid Choukri, K. Choukri
2024
Proceedings of the Workshop on Legal and Ethical Issues in Human Language Technologies @ LREC-COLING 2024
Ingo Siegert | Khalid Choukri
Proceedings of the Workshop on Legal and Ethical Issues in Human Language Technologies @ LREC-COLING 2024
Compliance by Design Methodologies in the Legal Governance Schemes of European Data Spaces
Kossay Talmoudi | Khalid Choukri | Isabelle Gavanon
Proceedings of the Workshop on Legal and Ethical Issues in Human Language Technologies @ LREC-COLING 2024
Creating novel ways of sharing data to boost the digital economy has been one of the growing priorities of the European Union. In order to realise a set of data-sharing modalities, the European Union funds several projects that aim to put in place Common Data Spaces. These infrastructures are set to be a catalyser for the data economy. However, many hurdles face their implementation. Legal compliance is still one of the major ambiguities of European Common Data Spaces and many initiatives intend to proactively integrate legal compliance schemes in the architecture of sectoral Data Spaces. The various initiatives must navigate a complex web of cross-cutting legal frameworks, including contract law, data protection, intellectual property, protection of trade secrets, competition law, European sovereignty, and cybersecurity obligations. As the conceptualisation of Data Spaces evolves and shows signs of differentiation from one sector to another, it is important to showcase the legal repercussions of the options of centralisation and decentralisation that can be observed in different Data Spaces. This paper will thus delve into their legal requirements and attempt to sketch out a stepping stone for understanding legal governance in data spaces.
Common European Language Data Space
Georg Rehm | Stelios Piperidis | Khalid Choukri | Andrejs Vasiļjevs | Katrin Marheinecke | Victoria Arranz | Aivars Bērziņš | Miltos Deligiannis | Dimitris Galanis | Maria Giagkou | Katerina Gkirtzou | Dimitris Gkoumas | Annika Grützner-Zahn | Athanasia Kolovou | Penny Labropoulou | Andis Lagzdiņš | Elena Leitner | Valérie Mapelli | Hélène Mazo | Simon Ostermann | Stefania Racioppa | Mickaël Rigault | Leon Voukoutis
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
The Common European Language Data Space (LDS) is an integral part of the EU data strategy, which aims at developing a single market for data. Its decentralised technical infrastructure and governance scheme are currently being developed by the LDS project, which also has dedicated tasks for proof-of-concept prototypes, handling legal aspects, raising awareness and promoting the LDS through events and social media channels. The LDS is part of a broader vision for establishing all necessary components to develop European large language models.
2023
FINDINGS OF THE IWSLT 2023 EVALUATION CAMPAIGN
Milind Agarwal | Sweta Agrawal | Antonios Anastasopoulos | Luisa Bentivogli | Ondřej Bojar | Claudia Borg | Marine Carpuat | Roldano Cattoni | Mauro Cettolo | Mingda Chen | William Chen | Khalid Choukri | Alexandra Chronopoulou | Anna Currey | Thierry Declerck | Qianqian Dong | Kevin Duh | Yannick Estève | Marcello Federico | Souhir Gahbiche | Barry Haddow | Benjamin Hsu | Phu Mon Htut | Hirofumi Inaguma | Dávid Javorský | John Judge | Yasumasa Kano | Tom Ko | Rishu Kumar | Pengwei Li | Xutai Ma | Prashant Mathur | Evgeny Matusov | Paul McNamee | John P. McCrae | Kenton Murray | Maria Nadejde | Satoshi Nakamura | Matteo Negri | Ha Nguyen | Jan Niehues | Xing Niu | Atul Kr. Ojha | John E. Ortega | Proyag Pal | Juan Pino | Lonneke van der Plas | Peter Polák | Elijah Rippeth | Elizabeth Salesky | Jiatong Shi | Matthias Sperber | Sebastian Stüker | Katsuhito Sudoh | Yun Tang | Brian Thompson | Kevin Tran | Marco Turchi | Alex Waibel | Mingxuan Wang | Shinji Watanabe | Rodolfo Zevallos
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)
This paper reports on the shared tasks organized by the 20th IWSLT Conference. The shared tasks address 9 scientific challenges in spoken language translation: simultaneous and offline translation, automatic subtitling and dubbing, speech-to-speech translation, multilingual, dialect and low-resource speech translation, and formality control. The shared tasks attracted a total of 38 submissions by 31 teams. The growing interest towards spoken language translation is also witnessed by the constantly increasing number of shared task organizers and contributors to the overview paper, almost evenly distributed across industry and academia.
2022
MAPA Project: Ready-to-Go Open-Source Datasets and Deep Learning Technology to Remove Identifying Information from Text Documents
Victoria Arranz | Khalid Choukri | Montse Cuadros | Aitor García Pablos | Lucie Gianola | Cyril Grouin | Manuel Herranz | Patrick Paroubek | Pierre Zweigenbaum
Proceedings of the Workshop on Ethical and Legal Issues in Human Language Technologies and Multilingual De-Identification of Sensitive Data In Language Resources within the 13th Language Resources and Evaluation Conference
This paper presents the outcomes of the MAPA project: a set of annotated corpora for 24 languages of the European Union and an open-source customisable toolkit able to detect and substitute sensitive information in text documents from any domain, using state-of-the-art, deep learning-based named entity recognition techniques. In the context of the project, the toolkit has been developed and tested on administrative, legal and medical documents, obtaining state-of-the-art results. As a result of the project, 24 dataset packages have been released and the de-identification toolkit is available as open source.
Legal and Ethical Challenges in Recording Air Traffic Control Speech
Mickaël Rigault | Claudia Cevenini | Khalid Choukri | Martin Kocour | Karel Veselý | Igor Szoke | Petr Motlicek | Juan Pablo Zuluaga-Gomez | Alexander Blatt | Dietrich Klakow | Allan Tart | Pavel Kolčárek | Jan Černocký
Proceedings of the Workshop on Ethical and Legal Issues in Human Language Technologies and Multilingual De-Identification of Sensitive Data In Language Resources within the 13th Language Resources and Evaluation Conference
In this paper, the authors detail the legal and ethical issues faced during the ATCO2 project, which aims to develop tools that automatically collect and transcribe air traffic conversations, especially those between pilots and air traffic control towers. The authors discuss issues related to intellectual property, public data, privacy, and general ethics that arise in the collection of air-traffic control speech.
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Nicoletta Calzolari | Frédéric Béchet | Philippe Blache | Khalid Choukri | Christopher Cieri | Thierry Declerck | Sara Goggi | Hitoshi Isahara | Bente Maegaard | Joseph Mariani | Hélène Mazo | Jan Odijk | Stelios Piperidis
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Language Resources to Support Language Diversity – the ELRA Achievements
Valérie Mapelli | Victoria Arranz | Khalid Choukri | Hélène Mazo
Proceedings of the Thirteenth Language Resources and Evaluation Conference
This article highlights ELRA’s latest achievements in the field of Language Resources (LRs) identification, sharing and production. It also reports on ELRA’s involvement in several national and international projects, as well as in the organization of events for the support of LRs and related Language Technologies, including for under-resourced languages. Over the past few years, ELRA, together with its operational agency ELDA, has continued to increase its catalogue offer of LRs, establishing worldwide partnerships for the production of various types of LRs (SMS, tweets, crawled data, MT aligned data, speech LRs, sentiment-based data, etc.). Through their consistent involvement in EU-funded projects, ELRA and ELDA have helped improve access to multilingual information in the context of the pandemic, develop tools for the de-identification of texts in the legal and medical domains, support the EU eTranslation Machine Translation system, and set up a European platform providing access to both resources and services. In December 2019, ELRA co-organized the LT4All conference, whose main topics were Language Technologies for enabling linguistic diversity and multilingualism worldwide. Moreover, although LREC was cancelled in 2020, ELRA published the LREC 2020 proceedings for the Main conference and Workshops papers, and carried on its dissemination activities while targeting the new LREC edition for 2022.
2021
European Language Grid: A Joint Platform for the European Language Technology Community
Georg Rehm | Stelios Piperidis | Kalina Bontcheva | Jan Hajic | Victoria Arranz | Andrejs Vasiļjevs | Gerhard Backfried | Jose Manuel Gomez-Perez | Ulrich Germann | Rémi Calizzano | Nils Feldhus | Stefanie Hegele | Florian Kintzel | Katrin Marheinecke | Julian Moreno-Schneider | Dimitris Galanis | Penny Labropoulou | Miltos Deligiannis | Katerina Gkirtzou | Athanasia Kolovou | Dimitris Gkoumas | Leon Voukoutis | Ian Roberts | Jana Hamrlova | Dusan Varis | Lukas Kacena | Khalid Choukri | Valérie Mapelli | Mickaël Rigault | Julija Melnika | Miro Janosik | Katja Prinz | Andres Garcia-Silva | Cristian Berrio | Ondrej Klejch | Steve Renals
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations
Europe is a multilingual society, in which dozens of languages are spoken. The only option to enable and to benefit from multilingualism is through Language Technologies (LT), i.e., Natural Language Processing and Speech Technologies. We describe the European Language Grid (ELG), which is targeted to evolve into the primary platform and marketplace for LT in Europe by providing one umbrella platform for the European LT landscape, including research and industry, enabling all stakeholders to upload, share and distribute their services, products and resources. At the end of our EU project, which will establish a legal entity in 2022, the ELG will provide access to approx. 1300 services for all European languages as well as thousands of data sets.
2020
The Multilingual Anonymisation Toolkit for Public Administrations (MAPA) Project
Ēriks Ajausks | Victoria Arranz | Laurent Bié | Aleix Cerdà-i-Cucó | Khalid Choukri | Montse Cuadros | Hans Degroote | Amando Estela | Thierry Etchegoyhen | Mercedes García-Martínez | Aitor García-Pablos | Manuel Herranz | Alejandro Kohan | Maite Melero | Mike Rosner | Roberts Rozis | Patrick Paroubek | Artūrs Vasiļevskis | Pierre Zweigenbaum
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
We describe the MAPA project, funded under the Connecting Europe Facility programme, whose goal is the development of an open-source de-identification toolkit for all official European Union languages. The project runs from January 2020 to December 2021.
Proceedings of the 1st International Workshop on Language Technology Platforms
Georg Rehm | Kalina Bontcheva | Khalid Choukri | Jan Hajič | Stelios Piperidis | Andrejs Vasiļjevs
Proceedings of the 1st International Workshop on Language Technology Platforms
ELRI: A Decentralised Network of National Relay Stations to Collect, Prepare and Share Language Resources
Thierry Etchegoyhen | Borja Anza Porras | Andoni Azpeitia | Eva Martínez Garcia | José Luis Fonseca | Patricia Fonseca | Paulo Vale | Jane Dunne | Federico Gaspari | Teresa Lynn | Helen McHugh | Andy Way | Victoria Arranz | Khalid Choukri | Hervé Pusset | Alexandre Sicard | Rui Neto | Maite Melero | David Perez | António Branco | Ruben Branco | Luís Gomes
Proceedings of the 1st International Workshop on Language Technology Platforms
We describe the European Language Resource Infrastructure (ELRI), a decentralised network to help collect, prepare and share language resources. The infrastructure was developed within a project co-funded by the Connecting Europe Facility Programme of the European Union, and has been deployed in the four Member States participating in the project, namely France, Ireland, Portugal and Spain. ELRI provides sustainable and flexible means to collect and share language resources via National Relay Stations, to which members of public institutions can freely subscribe. The infrastructure includes fully automated data processing engines to facilitate the preparation, sharing and wider reuse of useful language resources that can help optimise human and automated translation services in the European Union.
Proceedings of the Twelfth Language Resources and Evaluation Conference
Nicoletta Calzolari | Frédéric Béchet | Philippe Blache | Khalid Choukri | Christopher Cieri | Thierry Declerck | Sara Goggi | Hitoshi Isahara | Bente Maegaard | Joseph Mariani | Hélène Mazo | Asuncion Moreno | Jan Odijk | Stelios Piperidis
Proceedings of the Twelfth Language Resources and Evaluation Conference
The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe
Georg Rehm | Katrin Marheinecke | Stefanie Hegele | Stelios Piperidis | Kalina Bontcheva | Jan Hajič | Khalid Choukri | Andrejs Vasiļjevs | Gerhard Backfried | Christoph Prinz | José Manuel Gómez-Pérez | Luc Meertens | Paul Lukowicz | Josef van Genabith | Andrea Lösch | Philipp Slusallek | Morten Irgens | Patrick Gatellier | Joachim Köhler | Laure Le Bars | Dimitra Anastasiou | Albina Auksoriūtė | Núria Bel | António Branco | Gerhard Budin | Walter Daelemans | Koenraad De Smedt | Radovan Garabík | Maria Gavriilidou | Dagmar Gromann | Svetla Koeva | Simon Krek | Cvetana Krstev | Krister Lindén | Bernardo Magnini | Jan Odijk | Maciej Ogrodniczuk | Eiríkur Rögnvaldsson | Mike Rosner | Bolette Pedersen | Inguna Skadiņa | Marko Tadić | Dan Tufiș | Tamás Váradi | Kadri Vider | Andy Way | François Yvon
Proceedings of the Twelfth Language Resources and Evaluation Conference
Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality. However, language barriers impacting business, cross-lingual and cross-cultural communication are still omnipresent. Language Technologies (LTs) are a powerful means to break down these barriers. While the last decade has seen various initiatives that created a multitude of approaches and technologies tailored to Europe’s specific needs, there is still an immense level of fragmentation. At the same time, AI has become an increasingly important concept in the European Information and Communication Technology area. For a few years now, AI – including many opportunities, synergies but also misconceptions – has been overshadowing every other topic. We present an overview of the European LT landscape, describing funding programmes, activities, actions and challenges in the different countries with regard to LT, including the current state of play in industry and the LT market. We present a brief overview of the main LT-related activities on the EU level in the last ten years and develop strategic guidance with regard to four key dimensions.
European Language Grid: An Overview
Georg Rehm | Maria Berger | Ela Elsholz | Stefanie Hegele | Florian Kintzel | Katrin Marheinecke | Stelios Piperidis | Miltos Deligiannis | Dimitris Galanis | Katerina Gkirtzou | Penny Labropoulou | Kalina Bontcheva | David Jones | Ian Roberts | Jan Hajič | Jana Hamrlová | Lukáš Kačena | Khalid Choukri | Victoria Arranz | Andrejs Vasiļjevs | Orians Anvari | Andis Lagzdiņš | Jūlija Meļņika | Gerhard Backfried | Erinç Dikici | Miroslav Janosik | Katja Prinz | Christoph Prinz | Severin Stampler | Dorothea Thomas-Aniola | José Manuel Gómez-Pérez | Andres Garcia Silva | Christian Berrío | Ulrich Germann | Steve Renals | Ondrej Klejch
Proceedings of the Twelfth Language Resources and Evaluation Conference
With 24 official EU and many additional languages, multilingualism in Europe and an inclusive Digital Single Market can only be enabled through Language Technologies (LTs). European LT business is dominated by hundreds of SMEs and a few large players. Many are world-class, with technologies that outperform the global players. However, European LT business is also fragmented – by nation states, languages, verticals and sectors, significantly holding back its impact. The European Language Grid (ELG) project addresses this fragmentation by establishing the ELG as the primary platform for LT in Europe. The ELG is a scalable cloud platform, providing, in an easy-to-integrate way, access to hundreds of commercial and non-commercial LTs for all European languages, including running tools and services as well as data sets and resources. Once fully operational, it will enable the commercial and non-commercial European LT community to deposit and upload their technologies and data sets into the ELG, to deploy them through the grid, and to connect with other resources. The ELG will boost the Multilingual Digital Single Market towards a thriving European LT community, creating new jobs and opportunities. Furthermore, the ELG project organises two open calls for up to 20 pilot projects. It also sets up 32 national competence centres and the European LT Council for outreach and coordination purposes.
Making Metadata Fit for Next Generation Language Technology Platforms: The Metadata Schema of the European Language Grid
Penny Labropoulou | Katerina Gkirtzou | Maria Gavriilidou | Miltos Deligiannis | Dimitris Galanis | Stelios Piperidis | Georg Rehm | Maria Berger | Valérie Mapelli | Michael Rigault | Victoria Arranz | Khalid Choukri | Gerhard Backfried | José Manuel Gómez-Pérez | Andres Garcia-Silva
Proceedings of the Twelfth Language Resources and Evaluation Conference
The current scientific and technological landscape is characterised by the increasing availability of data resources and processing tools and services. In this setting, metadata have emerged as a key factor facilitating management, sharing and usage of such digital assets. In this paper we present ELG-SHARE, a rich metadata schema catering for the description of Language Resources and Technologies (processing and generation services and tools, models, corpora, term lists, etc.), as well as related entities (e.g., organizations, projects, supporting documents, etc.). The schema powers the European Language Grid platform that aims to be the primary hub and marketplace for industry-relevant Language Technology in Europe. ELG-SHARE has been based on various metadata schemas, vocabularies, and ontologies, as well as related recommendations and guidelines.
2018
ELRI - European Language Resources Infrastructure
Thierry Etchegoyhen | Borja Anza Porras | Andoni Azpeitia | Eva Martínez Garcia | Paulo Vale | José Luis Fonseca | Teresa Lynn | Jane Dunne | Federico Gaspari | Andy Way | Victoria Arranz | Khalid Choukri | Vladimir Popescu | Pedro Neiva | Rui Neto | Maite Melero | David Perez Fernandez | Antonio Branco | Ruben Branco | Luis Gomes
Proceedings of the 21st Annual Conference of the European Association for Machine Translation
We describe the European Language Resources Infrastructure project, whose main aim is the provision of an infrastructure to help collect, prepare and share language resources that can in turn improve translation services in Europe.
2011
Evaluation Methodology and Results for English-to-Arabic MT
Olivier Hamon | Khalid Choukri
Proceedings of Machine Translation Summit XIII: Papers
2007
End-to-end evaluation of a speech-to-speech translation system in TC-STAR
Olivier Hamon | Djamel Mostefa | Khalid Choukri
Proceedings of Machine Translation Summit XI: Papers
Assessing human and automated quality judgments in the French MT evaluation campaign CESTA
Olivier Hamon | Anthony Hartley | Andrei Popescu-Belis | Khalid Choukri
Proceedings of Machine Translation Summit XI: Papers
Proceedings of the Workshop on Automatic procedures in MT evaluation
Gregor Thurmair | Khalid Choukri | Bente Maegaard
Proceedings of the Workshop on Automatic procedures in MT evaluation
MT evaluation & TC-STAR
Khalid Choukri | Olivier Hamon | Djamel Mostefa
Proceedings of the Workshop on Automatic procedures in MT evaluation
2005
Evaluation of Machine Translation with Predictive Metrics beyond BLEU/NIST: CESTA Evaluation Campaign # 1
Sylvain Surcin | Olivier Hamon | Antony Hartley | Martin Rajman | Andrei Popescu-Belis | Widad Mustafa El Hadi | Ismaïl Timimi | Marianne Dabbadie | Khalid Choukri
Proceedings of Machine Translation Summit X: Papers
In this paper, we report on the results of a full-size evaluation campaign of various MT systems. This campaign is novel compared to the classical DARPA/NIST MT evaluation campaigns in the sense that French is the target language, and that it includes an experiment of meta-evaluation of various metrics claiming to better predict different attributes of translation quality. We first describe the campaign, its context, its protocol and the data we used. Then we summarise the results obtained by the participating systems and discuss the meta-evaluation of the metrics used.
Co-authors
- Stelios Piperidis 20
- Victoria Arranz 18
- Valérie Mapelli 18
- Nicoletta Calzolari 17
- Bente Maegaard 15
- Djamel Mostefa 15
- Olivier Hamon 14
- Jan Odijk 12
- Joseph Mariani 11
- Hélène Mazo 10
- Asunción Moreno 10
- Thierry Declerck 8
- Georg Rehm 7
- Vladimir Popescu 6
- Andrejs Vasiļjevs 6
- Sara Goggi 5
- Penny Labropoulou 5
- Gerhard Backfried 4
- Kalina Bontcheva 4
- Christopher Cieri 4
- Miltos Deligiannis 4
- Dimitrios Galanis 4
- Katerina Gkirtzou 4
- José Manuel Gómez-Pérez 4
- Jan Hajic 4
- Steven Krauwer 4
- Katrin Marheinecke 4
- Monica Monachini 4
- Ismail Timimi 4
- Henk van den Heuvel 4
- Frederic Bechet 3
- António Branco 3
- Marianne Dabbadie 3
- Riccardo Del Gratta 3
- Thierry Etchegoyhen 3
- Andres Garcia-Silva 3
- Federico Gaspari 3
- Guillaume Gravier 3
- Widad Mustafa El Hadi 3
- Stefanie Hegele 3
- Hitoshi Isahara 3
- Lin Liu 3
- Kevin McTait 3
- Maite Melero 3
- Patrick Paroubek 3
- Andrei Popescu-Belis 3
- Mickaël Rigault 3
- Michael Rosner 3
- Claudia Soria 3
- Daniel Tapias 3
- Herbert Tropf 3
- Andy Way 3
- Mustafa Yaseen 3
- Jeffrey Allen 2
- Borja Anza Porras 2
- Mohamed Attia 2
- Andoni Azpeitia 2
- Meritxell Fernández Barrera 2
- Núria Bel 2
- Maria Berger 2
- Philippe Blache 2
- Jean-François Bonastre 2
- Ruben Branco 2
- Stéphane Chaudiron 2
- Emanuela Cresti 2
- Montse Cuadros 2
- Jane Dunne 2
- Ossama Emam 2
- Hanne Fersøe 2
- José Luis Fonseca 2
- Olivier Galibert 2
- Sylvain Galliano 2
- Aitor García-Pablos 2
- Maria Gavriilidou 2
- Maria Gavrilidou 2
- Oren Gedge 2
- Edouard Geoffrois 2
- Ulrich Germann 2
- Dimitris Gkoumas 2
- Luís Gomes 2
- Jana Hamrlová 2
- Anthony Hartley 2
- Manuel Herranz 2
- Dorota Iskra 2
- Lukáš Kačena 2
- Florian Kintzel 2
- Ondřej Klejch 2
- Athanasia Kolovou 2
- Andis Lagzdiņš 2
- Teresa Lynn 2
- Andrea Lösch 2
- Bernardo Magnini 2
- Audrey Mance 2
- Philippe Martin 2
- Eva Martínez Garcia 2
- Jūlija Meļņika 2
- Nicolas Moreau 2
- Antonio Moreno-Sandoval 2
- Maria Fernanda Bacelar do Nascimento 2
- Rui Neto 2
- Albino Nogueiras 2
- Jungyeul Park 2
- Niklas Paulsson 2
- Christoph Prinz 2
- Katja Prinz 2
- Valeria Quochi 2
- Martin Rajman 2
- Steve Renals 2
- Ian Roberts 2
- Sophie Rosset 2
- Eric Sanders 2
- Rainer Siemund 2
- Sylvain Surcin 2
- Paulo Vale 2
- Jean Veronis 2
- Leon Voukoutis 2
- Alex Waibel 2
- Imed Zitouni 2
- Pierre Zweigenbaum 2
- Josef van Genabith 2
- Gilles Adda 1
- Milind Agarwal 1
- Sweta Agrawal 1
- Ēriks Ajausks 1
- Adil Al-Kufaishi 1
- Dimitra Anastasiou 1
- Antonios Anastasopoulos 1
- Jean-Yves Antoine 1
- Orians Anvari 1
- Mohammed Atiyya 1
- Albina Auksoriūtė 1
- Paola Baroni 1
- Chomicha Bendahman 1
- Luisa Bentivogli 1
- Cristian Berrio 1
- Christian Berrío 1
- Romaric Besançon 1
- Frédéric Bimbot 1
- Laurent Bié 1
- Alexander Blatt 1
- Claude Blum 1
- Ondřej Bojar 1
- Hélène Bonneau-Maynard 1
- Claudia Borg 1
- Jamal Borno 1
- Karim Boudahmane 1
- Caroline Bousquet-Vernhettes 1
- Lou Boves 1
- Olivier Boëffard 1
- Martin Braschler 1
- Sylvie Brunessaux 1
- Gerhard Budin 1
- Susanne Burger 1
- Davide Buscaldi 1
- Aivars Bērziņš 1
- Rémi Calizzano 1
- Marine Carpuat 1
- Matthieu Carré 1
- Roldano Cattoni 1
- Aleix Cerdà-i-Cucó 1
- Mauro Cettolo 1
- Claudia Cevenini 1
- Delphine Charlet 1
- Laurent Charnay 1
- Francis Charpentier 1
- Mingda Chen 1
- William Chen 1
- Noureddine Chenfour 1
- Yun-Chuang Chiao 1
- Alexandra Chronopoulou 1
- Antonio Cid 1
- Pere Comas 1
- Anna Currey 1
- Walter Daelemans 1
- Koenraad De Smedt 1
- Hans Degroote 1
- Laurence Devillers 1
- Erinç Dikici 1
- Qianqian Dong 1
- Mehmet Uğur Doğan 1
- Christoph Draxler 1
- Kevin Duh 1
- Ela Elsholz 1
- Amando Estela 1
- Yannick Estève 1
- Stephan Euler 1
- Nikos Fakotakis 1
- Daniele Falavigna 1
- Marcello Federico 1
- Nils Feldhus 1
- Christian Fluhr 1
- Dominique Fohr 1
- Patricia Fonseca 1
- Hélène François 1
- Jochen Friedrich 1
- Christian Fügen 1
- Souhir Gahbiche 1
- Franck Gandcher 1
- Aldo Gangemi 1
- Radovan Garabík 1
- Marie-Neige Garcia 1
- Mercedes García-Martínez 1
- Patrick Gatellier 1
- Isabelle Gavanon 1
- Maria Giagkou 1
- Lucie Gianola 1
- Voula Giouli 1
- Christian Girardi 1
- Simo Goddijn 1
- Christian Gollan 1
- Julio Gonzalo 1
- Jérôme Goulian 1
- Marko Grobelnik 1
- Dagmar Gromann 1
- Cyril Grouin 1
- Annika Grützner-Zahn 1
- Salah Haamid 1
- Bassam Haddad 1
- Barry Haddow 1
- Phil Hall 1
- Antony Hartley 1
- Koiti Hasida 1
- Barbara Heuft 1
- Benjamin Hsu 1
- Phu Mon Htut 1
- Harald Höge 1
- Nancy Ide 1
- Hirofumi Inaguma 1
- Morten Irgens 1
- Miroslav Janosik 1
- Miro Janosik 1
- Dávid Javorský 1
- David Jones 1
- John Judge 1
- Lise Damsgaard Jørgensen 1
- Paweł Kamocki 1
- Yasumasa Kano 1
- Dietrich Klakow 1
- Michael Kluck 1
- Tom Ko 1
- Martin Kocour 1
- Svetla Koeva 1
- Alejandro Kohan 1
- Muntsin Kolss 1
- Pavel Kolčárek 1
- Simon Krek 1
- Cvetana Krstev 1
- Rishu Kumar 1
- Joachim Köhler 1
- Lori Lamel 1
- D. Terence Langendoen 1
- Laure Le Bars 1
- Elena Leitner 1
- Jeremy Leixa 1
- Alessandro Lenci 1
- Johannes Leveling 1
- Pengwei Li 1
- Børge Lindberg 1
- Krister Lindén 1
- Hrafn Loftsson 1
- Paul Lukowicz 1
- Xutai Ma 1
- Giulio Maltese 1
- Michele Mammini 1
- Emmanuel Maragoudakis 1
- Prashant Mathur 1
- Evgeny Matusov 1
- John Philip McCrae 1
- Helen McHugh 1
- Paul McNamee 1
- Luc Meertens 1
- Odile Mella 1
- Chafic Mokbel 1
- Massimo Moneglia 1
- Julian Moreno Schneider 1
- Petr Motlicek 1
- Abdelhak Mouradi 1
- Chafic Mukbel 1
- Kenton Murray 1
- Maria Nadejde 1
- Satoshi Nakamura 1
- Maria Nava 1
- Matteo Negri 1
- Pedro Neiva 1
- Ha Nguyen 1
- Jan Niehues 1
- Mahtab Nikkhou 1
- Xing Niu 1
- Maciej Ogrodniczuk 1
- Atul Kr. Ojha 1
- John E. Ortega 1
- Simon Ostermann 1
- Proyag Pal 1
- Martha Palmer 1
- Harris Papageorgiou 1
- Bolette Sandford Pedersen 1
- Carol Peters 1
- Yann Philip 1
- Juan Pino 1
- Elisabeth Pinto 1
- Peter Polák 1
- Hervé Pusset 1
- James Pustejovsky 1
- David Pérez 1
- David Pérez-Fernández 1
- Stefania Racioppa 1
- Ahmed Ragheb 1
- Mohsen Rashwan 1
- Gaël Richard 1
- Michael Rigault 1
- Elijah Rippeth 1
- Laurent Romary 1
- Paolo Rosso 1
- Roberts Rozis 1
- Irene Russo 1
- Eirikur Rögnvaldsson 1
- Houda Saadane 1
- Elizabeth Salesky 1
- Eileen Schnur 1
- Hosni Seffih 1
- Nasredine Semmar 1
- Francesco Senia 1
- Mostafa Shahin 1
- Sherrie Shammass 1
- Jiatong Shi 1
- Alexandre Sicard 1
- Ingo Siegert 1
- Inguna Skadiņa 1
- Philipp Slusallek 1
- Lilli Smal 1
- Matthias Sperber 1
- Christian Spurk 1
- Severin Stampler 1
- Rainer Stiefelhagen 1
- Sebastian Stüker 1
- Katsuhito Sudoh 1
- Igor Szoke 1
- Marko Tadić 1
- Kossay Talmoudi 1
- Yun Tang 1
- Allan Tart 1
- Dorothea Thomas-Aniola 1
- Brian Thompson 1
- Gregor Thurmair 1
- Takenobu Tokunaga 1
- Antonio Toral 1
- Kevin Tran 1
- Anastasios Tsopanoglou 1
- Dan Tufiş 1
- Marco Turchi 1
- Jordi Turmo 1
- Marisa Ulivieri 1
- Dusan Varis 1
- Artūrs Vasiļevskis 1
- Myriam Vergnes 1
- Karel Veselý 1
- Kadri Vider 1
- Nadine Vigouroux 1
- Jeanne Villaneau 1
- Tamás Váradi 1
- Mingxuan Wang 1
- Shinji Watanabe 1
- François Yvon 1
- Rodolfo Zevallos 1
- Juan Pablo Zuluaga Gomez 1
- Lonneke van der Plas 1
- Jan Černocký 1