Bente Maegaard

Also published as: B. Maegaard


2022

pdf bib
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Nicoletta Calzolari | Frédéric Béchet | Philippe Blache | Khalid Choukri | Christopher Cieri | Thierry Declerck | Sara Goggi | Hitoshi Isahara | Bente Maegaard | Joseph Mariani | Hélène Mazo | Jan Odijk | Stelios Piperidis
Proceedings of the Thirteenth Language Resources and Evaluation Conference

2020

pdf bib
Proceedings of the Twelfth Language Resources and Evaluation Conference
Nicoletta Calzolari | Frédéric Béchet | Philippe Blache | Khalid Choukri | Christopher Cieri | Thierry Declerck | Sara Goggi | Hitoshi Isahara | Bente Maegaard | Joseph Mariani | Hélène Mazo | Asuncion Moreno | Jan Odijk | Stelios Piperidis
Proceedings of the Twelfth Language Resources and Evaluation Conference

pdf
Interoperability in an Infrastructure Enabling Multidisciplinary Research: The case of CLARIN
Franciska de Jong | Bente Maegaard | Darja Fišer | Dieter van Uytvanck | Andreas Witt
Proceedings of the Twelfth Language Resources and Evaluation Conference

CLARIN is a European Research Infrastructure providing access to language resources and technologies for researchers in the humanities and social sciences. It supports the use and study of language data in general and aims to increase the potential for comparative research of cultural and societal phenomena across the boundaries of languages and disciplines, all in line with the European agenda for Open Science. Data infrastructures such as CLARIN have recently begun to engage with emerging frameworks for the federation of infrastructural services, such as the European Open Science Cloud, and with the integration of services resulting from multidisciplinary collaboration into federated services for the wider social sciences and humanities (SSH) domain. In this paper we describe the interoperability requirements that arise from the existing ambitions and the emerging frameworks. The interoperability theme is addressed at several levels, including organisation and ecosystem, design of workflow services, data curation, performance measurement and collaboration.

2018

bib
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Nicoletta Calzolari | Khalid Choukri | Christopher Cieri | Thierry Declerck | Sara Goggi | Koiti Hasida | Hitoshi Isahara | Bente Maegaard | Joseph Mariani | Hélène Mazo | Asuncion Moreno | Jan Odijk | Stelios Piperidis | Takenobu Tokunaga
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf
CLARIN: Towards FAIR and Responsible Data Science Using Language Resources
Franciska de Jong | Bente Maegaard | Koenraad De Smedt | Darja Fišer | Dieter Van Uytvanck
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

bib
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Nicoletta Calzolari | Khalid Choukri | Thierry Declerck | Sara Goggi | Marko Grobelnik | Bente Maegaard | Joseph Mariani | Hélène Mazo | Asuncion Moreno | Jan Odijk | Stelios Piperidis
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

pdf
Providing a Catalogue of Language Resources for Commercial Users
Bente Maegaard | Lina Henriksen | Andrew Joscelyne | Vesna Lusicky | Margaretha Mazura | Sussi Olsen | Claus Povlsen | Philippe Wacker
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Language resources (LRs) are indispensable for the development of tools for machine translation (MT) and various kinds of computer-assisted translation (CAT). In particular, language corpora, both parallel and monolingual, are considered most important, for instance for MT, not only SMT but also hybrid MT. The Language Technology Observatory will provide easy access to information about LRs deemed useful for MT and other translation tools through its LR Catalogue. To determine which aspects of an LR are useful for MT practitioners, a user study was carried out, providing a guide to the most relevant metadata and the most relevant quality criteria. We have seen that many resources exist which are useful for MT and similar work, but the majority are for (academic) research or educational use only, and as such are not available for commercial use. Our work has revealed a list of gaps: a coverage gap, an awareness gap, a quality gap and a quantity gap. The paper ends with recommendations for a forward-looking strategy.

2014

bib
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Nicoletta Calzolari | Khalid Choukri | Thierry Declerck | Hrafn Loftsson | Bente Maegaard | Joseph Mariani | Asuncion Moreno | Jan Odijk | Stelios Piperidis
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

pdf
Encompassing a spectrum of LT users in the CLARIN-DK Infrastructure
Lina Henriksen | Dorte Haltrup Hansen | Bente Maegaard | Bolette Sandford Pedersen | Claus Povlsen
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

CLARIN-DK is a platform with language resources constituting the Danish part of the European infrastructure CLARIN ERIC. Unlike some other language-based infrastructures, CLARIN-DK is not solely a repository for upload and storage of data, but also a platform of web services permitting the user to process data in various ways. This involves considerable complications in relation to workflow requirements. The CLARIN-DK interface must guide the user through the necessary steps of a workflow, even when the user is inexperienced and perhaps has an unclear conception of the requested results. This paper describes a user-driven approach to creating a user interface specification for CLARIN-DK. We indicate how different user profiles determined different crucial interface design options. We also describe some use cases established to give illustrative examples of how the platform may facilitate research.

2012

bib
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Nicoletta Calzolari | Khalid Choukri | Thierry Declerck | Mehmet Uğur Doğan | Bente Maegaard | Joseph Mariani | Asuncion Moreno | Jan Odijk | Stelios Piperidis
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

pdf
Creation and use of Language Resources in a Question-Answering eHealth System
Ulrich Andersen | Anna Braasch | Lina Henriksen | Csaba Huszka | Anders Johannsen | Lars Kayser | Bente Maegaard | Ole Norgaard | Stefan Schulz | Jürgen Wedekind
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

ESICT (Experience-oriented Sharing of health knowledge via Information and Communication Technology) is an ongoing research project funded by the Danish Council for Strategic Research. It aims at developing a health/disease related information system based on information technology, language technology, and formalized medical knowledge. The formalized medical knowledge consists partly of the terminology database SNOMED CT and partly of authorized medical texts in the domain. The system will allow users to ask questions in Danish and will provide natural language answers. Currently, the project is pursuing three fundamentally different methods for question answering, and they are all described to some extent in this paper. A system prototype will handle questions related to diabetes and heart diseases. This paper concentrates on the methods employed for question answering and the language resources that are utilized. Some resources, such as SNOMED CT, already existed; others, such as a corpus of sample questions, had to be created.

2010

bib
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Nicoletta Calzolari | Khalid Choukri | Bente Maegaard | Joseph Mariani | Jan Odijk | Stelios Piperidis | Mike Rosner | Daniel Tapias
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

pdf
Resource and Service Centres as the Backbone for a Sustainable Service Infrastructure
Peter Wittenburg | Nuria Bel | Lars Borin | Gerhard Budin | Nicoletta Calzolari | Eva Hajicova | Kimmo Koskenniemi | Lothar Lemnitzer | Bente Maegaard | Maciej Piasecki | Jean-Marie Pierrel | Stelios Piperidis | Inguna Skadina | Dan Tufis | Remco van Veenendaal | Tamas Váradi | Martin Wynne
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Research infrastructures are currently being designed and established in many disciplines, since they all suffer from enormous fragmentation of their resources and tools. In the domain of language resources and tools, the CLARIN initiative has been funded since 2008 to overcome many of the integration and interoperability hurdles. CLARIN can build on knowledge and work from many projects carried out in recent years and aims to build stable and robust services that can be used by researchers. Service centres that have the potential to be persistent and that adhere to the criteria established by CLARIN will play an important role here. In the last year of the so-called preparatory phase, these centres are developing four use cases that demonstrate how the various pillars CLARIN has been working on can be integrated. All four use cases fulfil the criterion of being cross-national.

pdf
Cooperation for Arabic Language Resources and Tools — The MEDAR Project
Bente Maegaard | Mohamed Attia | Khalid Choukri | Olivier Hamon | Steven Krauwer | Mustafa Yaseen
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The paper describes some of the work carried out within the European funded project MEDAR. The project has three streams of activity: the technical stream, the cooperation stream and the dissemination stream. MEDAR first updated the existing surveys and BLARK for Arabic, and then the technical stream focused on machine translation. The consortium identified a number of freely available MT systems and then customized two versions of the well-known MOSES package. The consortium also addressed the need to package MOSES for English to Arabic (while the main MT stream is on Arabic to English). For performance assessment purposes, the partners produced test data that allowed an evaluation campaign to be carried out with five different systems (including some from outside the consortium) and two online ones. Both the MT baselines and the collected data will be made available via the ELRA catalogue. The cooperation stream focuses mostly on a cooperation roadmap for the region, directed towards Human Language Technologies (HLT) for Arabic in general. The purpose of the roadmap is to outline areas and priorities for collaboration, both between EU countries and Arabic speaking countries and in general: between countries, between universities, and last but not least between universities and industry.

2008

bib
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Nicoletta Calzolari | Khalid Choukri | Bente Maegaard | Joseph Mariani | Jan Odijk | Stelios Piperidis | Daniel Tapias
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

pdf
MEDAR: Collaboration between European and Mediterranean Arabic Partners to Support the Development of Language Technology for Arabic
Bente Maegaard | Mohammed Atiyya | Khalid Choukri | Steven Krauwer | Chafic Mokbel | Mustafa Yaseen
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

After the successful completion of the NEMLAR project 2003-2005, a new opportunity for a project was opened by the European Commission, and a group of largely the same partners is now executing the MEDAR project. MEDAR will update the existing surveys and BLARK for Arabic, and will then focus on machine translation (and other tools for translation) and information retrieval, with a focus on language resources, tools and evaluation for these applications. A very important part of the MEDAR project is to reinforce and extend the NEMLAR network and to create a cooperation roadmap for Human Language Technologies for Arabic. It is expected that the cooperation roadmap will attract wide attention from other parties and that it can help create a larger platform for collaborative projects. Finally, the project will focus on dissemination of knowledge about existing resources and tools, as well as actors and activities; this will happen through a newsletter, a website and an international conference which will follow up on the Cairo conference of 2004. Dissemination to user communities will also be important, e.g. through participation in translators' conferences. The goal of these activities is to create a stronger and lasting collaboration between EU countries and Arabic speaking countries.

pdf bib
Proceedings of the 12th Annual Conference of the European Association for Machine Translation
John Hutchins | Walther v. Hahn | Bente Maegaard
Proceedings of the 12th Annual Conference of the European Association for Machine Translation

pdf
Domain specific MT in use
Lene Offersgaard | Claus Povlsen | Lisbeth Almsten | Bente Maegaard
Proceedings of the 12th Annual Conference of the European Association for Machine Translation

2007

pdf bib
Proceedings of Machine Translation Summit XI: Papers
Bente Maegaard
Proceedings of Machine Translation Summit XI: Papers

pdf bib
Proceedings of the Workshop on Automatic procedures in MT evaluation
Gregor Thurmair | Khalid Choukri | Bente Maegaard
Proceedings of the Workshop on Automatic procedures in MT evaluation

2006

pdf bib
Proceedings of the 11th Annual Conference of the European Association for Machine Translation
Viggo Hansen | Bente Maegaard
Proceedings of the 11th Annual Conference of the European Association for Machine Translation

bib
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Nicoletta Calzolari | Khalid Choukri | Aldo Gangemi | Bente Maegaard | Joseph Mariani | Jan Odijk | Daniel Tapias
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

pdf
Building Annotated Written and Spoken Arabic LRs in NEMLAR Project
M. Yaseen | M. Attia | B. Maegaard | K. Choukri | N. Paulsson | S. Haamid | S. Krauwer | C. Bendahman | H. Fersøe | M. Rashwan | B. Haddad | C. Mukbel | A. Mouradi | A. Al-Kufaishi | M. Shahin | N. Chenfour | A. Ragheb
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

The NEMLAR project (Network for Euro-Mediterranean LAnguage Resource and human language technology development and support, www.nemlar.org) was supported by the EC, with partners from Europe and Arabic countries. Its objective was to build a network of specialized partners to promote and support the development of Arabic Language Resources (LRs) in the Mediterranean region. The project focused on identifying the state of the art of LRs in the region, assessing priority requirements through consultations with language industry and communication players, and establishing a protocol for developing and identifying a Basic Language Resource Kit (BLARK) for Arabic. The BLARK is defined as the minimal set of language resources necessary for any pre-competitive research and education, as well as for the development of crucial components for any future NLP industry. Following the identification of high-priority resources, the NEMLAR partners agreed to focus on and produce three main resources: 1) an annotated Arabic written corpus of about 500 K words, 2) an Arabic speech corpus for TTS applications of 2x5 hours, and 3) an Arabic broadcast news speech corpus of 40 hours of Modern Standard Arabic. For each of the three LRs, the underlying linguistic models and assumptions, technical specifications, methodologies for collection and building, and validation and verification mechanisms were defined and applied.

pdf
The MULINCO corpus and corpus platform
Bente Maegaard | Lene Offersgaard | Lina Henriksen | Hanne Jansen | Xavier Lepetit | Costanza Navarretta | Claus Povlsen
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

The MULINCO project (MUltiLINgual Corpus of the University of Copenhagen) started in early 2005. The purpose of this cross-disciplinary project is to create a corpus platform for education and research in monolingual and translation studies. The project covers two main types of corpus texts: literary and non-literary. The platform is being developed using available tools as far as possible and integrating them in a very open architecture. In this paper we describe the current status and future developments of both the text and tool side of the corpus platform, and we show some examples of student exercises taking advantage of tagged and aligned texts.

pdf
KUNSTI - Knowledge Generation for Norwegian Language Technology
Bente Maegaard | Jens-Erik Fenstad | Lars Ahrenberg | Knut Kvale | Katarina Mühlenbock | Bernt-Erik Heid
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

KUNSTI is the Norwegian national language technology programme, running 2001-2006 inclusive. The goal of the programme is to boost Norwegian language technology research. In this paper we describe the background, the objectives, the methodology applied in the management of the programme, the projects selected, and our first conclusions. We also describe national programmes from Sweden, France and Germany and compare objectives and methods.

pdf
The BLARK concept and BLARK for Arabic
Bente Maegaard | Steven Krauwer | Khalid Choukri | Lise Damsgaard Jørgensen
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

The EU project NEMLAR (Network for Euro-Mediterranean LAnguage Resources) on Arabic language resources carried out two surveys on the availability of Arabic LRs in the region, and on industrial requirements. The project also worked out a BLARK (Basic Language Resource Kit) for Arabic. In this paper we describe the further development of the BLARK concept made during the work on a BLARK for Arabic, as well as the results for Arabic.

2005

pdf bib
Frontmatter
Bente Maegaard
Proceedings of the 10th EAMT Conference: Practical applications of machine translation

2004

pdf
The NEMLAR project on Arabic language resources
Bente Maegaard
Proceedings of the 9th EAMT Workshop: Broadening horizons of machine translation and its applications

pdf
Industrial Needs for Language Resources
Bente Maegaard
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf
Corporate Voice, Tone of Voice and Controlled Language Techniques
Lina Henriksen | Bart Jongejan | Bente Maegaard
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf
ENABLER Thematic Network of National Projects: Technical, Strategic and Political Issues of LRs
Nicoletta Calzolari | Khalid Choukri | Maria Gavrilidou | Bente Maegaard | Paola Baroni | Hanne Fersøe | Alessandro Lenci | Valérie Mapelli | Monica Monachini | Stelios Piperidis
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf
NEMLAR - An Arabic Language Resources Project
Bente Maegaard
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2001

pdf bib
Proceedings of Machine Translation Summit VIII
Bente Maegaard
Proceedings of Machine Translation Summit VIII

1999


Human Language Technologies - possibilities in the EU 5th Framework Programme for Research and Technological Development
Bente Maegaard
EAMT Workshop: EU and the new languages

1998

pdf
Summary of the concluding session
Dimitrios Theologitis | Bente Maegaard
Proceedings of the 1998 EAMT Workshop: Translation technology: integration in the workflow environment

pdf bib
Proceedings of the 11th Nordic Conference of Computational Linguistics (NODALIDA 1998)
Bente Maegaard
Proceedings of the 11th Nordic Conference of Computational Linguistics (NODALIDA 1998)

1997

pdf
Whither MT?
Bente Maegaard
Proceedings of Machine Translation Summit VI: Plenaries

MT started out as a ‘technology push’: more than 50 years ago, researchers had the bright idea of doing translation with the use of the newly developed computers. MT remained in the technology push area for many years. However, in the nineties we are seeing the ‘market pull’ beginning to play a role, and there are good reasons to believe that this trend will continue. MT is going where the market and the users want it to go, and MT will prosper in the future. MT will be available electronically over the network, and MT will be available in environments which also offer a variety of other tools for translation, as well as tools for other types of information management. In research and in the development of new technologies, MT will also develop further, e.g. along the lines of knowledge-based MT, advanced integration of different analysis techniques (rule-based, statistics-based, etc.), integration with speech, etc.

pdf
Evaluation of Language Tools
Bente Maegaard
Proceedings of Translating and the Computer 19

pdf
The workflow in a document production environment using translation tools
Bente Maegaard
EAMT Workshop: Language Technology in your Organization?

pdf
Summary and conclusions
Dimitri Theologitis | Bente Maegaard
EAMT Workshop: Language Technology in your Organization?

1996

pdf
Evaluation of NLP systems
Bente Maegaard
COLING 1996 Volume 2: The 16th International Conference on Computational Linguistics

pdf
PaTrans - A Patent Translation System
Bjarne Orsnes | Bradley Music | Bente Maegaard
COLING 1996 Volume 2: The 16th International Conference on Computational Linguistics

1995

pdf
Eurotra, history and results
Bente Maegaard
Proceedings of Machine Translation Summit V

1988

pdf
Designing and testing linguistic development phases in machine translation project
Bente Maegaard
Coling Budapest 1988 Volume 1: International Conference on Computational Linguistics

1987

pdf bib
Third Conference of the European Chapter of the Association for Computational Linguistics
Bente Maegaard
Third Conference of the European Chapter of the Association for Computational Linguistics

1984

pdf
Regelformalismer til brug ved datamatisk lingvistik (Rule formalisms for use in computational linguistics) [In Danish]
Bente Maegaard
Proceedings of the 4th Nordic Conference of Computational Linguistics (NODALIDA 1983)

1982

pdf
The Transfer of Finite Verb Forms in a Machine Translation System
Bente Maegaard
Coling 1982 Abstracts: Proceedings of the Ninth International Conference on Computational Linguistics Abstracts

1979

pdf bib
Proceedings of the 2nd Nordic Conference of Computational Linguistics (NODALIDA 1979)
Bente Maegaard
Proceedings of the 2nd Nordic Conference of Computational Linguistics (NODALIDA 1979)

pdf
Strukturering af lingvistiske data til brug ved maskinoversættelse (Structuring of linguistic data for use in machine translation) [In Danish]
Bente Maegaard | Hanne Ruus
Proceedings of the 2nd Nordic Conference of Computational Linguistics (NODALIDA 1979)

1977

pdf
DANwORD – Hyppighedsundersøgelser i moderne dansk (DANwORD – Frequency surveys in modern Danish) [In Danish]
Bente Maegaard | Hanne Ruus
Proceedings of the 1st Nordic Conference of Computational Linguistics (NODALIDA 1977)

1973

pdf
Segmentation of French Sentences
Bente Maegaard | Ebbe Spang-Hanssen
COLING 1973 Volume 2: Computational And Mathematical Linguistics: Proceedings of the International Conference on Computational Linguistics
