Harold Somers

Also published as: Harold L. Somers, H.L. Somers

Making a sow’s ear out of a silk purse: (mis)using online MT services as bilingual dictionaries
Federico Gaspari | Harold Somers
Proceedings of Translating and the Computer 29

Medical spoken language translation: What do the users really need?
Harold Somers
Proceedings of Translating and the Computer 29

Theoretical and methodological issues regarding the use of language technologies for patients with limited English proficiency
Harold Somers
Proceedings of the 11th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages: Papers

2006

Detecting Inappropriate Use of Free Online Machine Translation by Language Students. A Special Case of Plagiarism Detection
Harold Somers | Federico Gaspari | Ana Niño
Proceedings of the 11th Annual Conference of the European Association for Machine Translation

Developing Speech Synthesis for Under-Resourced Languages by “Faking it”: An Experiment with Somali
Harold Somers | Gareth Evans | Zeinab Mohamed
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Speech synthesis or text-to-speech (TTS) systems are currently available for a number of the world's major languages, but for thousands of other, unsupported, languages no such technology is available. While awaiting the development of such technology, we propose using an existing TTS system for a major language (the base language, BL) to "fake" TTS for an unsupported language (the target language, TL). This paper describes the factors which determine the choice of a suitable BL for a given TL, and describes an experiment with a fake Somali TTS system evaluated in the real-life situation of a doctor-patient dialogue. 28 Somali participants were asked to judge the comprehensibility of 25 short Somali sentences recorded with a German TTS system. Results suggest that "faking it" provides reasonable stop-gap TTS for unsupported languages.

Language Engineering and the Pathway to Healthcare: A User-Oriented View
Harold Somers
Proceedings of the First International Workshop on Medical Speech Translation

2005

DEMOCRAT: Deciding between Multiple Outputs Created by Automatic Translation
Menno van Zaanen | Harold Somers
Proceedings of Machine Translation Summit X: Papers

Faking it: Synthetic Text-to-speech Synthesis for Under-resourced Languages – Experimental Design
Harold Somers
Proceedings of the Australasian Language Technology Workshop 2005

Round-trip Translation: What Is It Good For?
Harold Somers
Proceedings of the Australasian Language Technology Workshop 2005

2004

Latest challenges to MT R&D
Harold Somers
Proceedings of the 10th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

2003

Evaluating commercial spoken language translation software
Harold Somers | Yuri Sugita
Proceedings of Machine Translation Summit IX: Papers

While spoken language translation remains a research goal, a crude form of it is widely available commercially for Japanese–English as a pipeline concatenation of speech-to-text recognition (SR), text-to-text translation (MT) and text-to-speech synthesis (SS). This paper proposes and illustrates an evaluation methodology for this noisy channel which tries to quantify the relative amount of degradation in translation quality due to each of the contributing modules. A small pilot experiment involving word-accuracy rate for the SR, and a fidelity evaluation for the MT and SS modules is proposed in which subjects are asked to paraphrase translated and/or synthesised sentences from a tourist’s phrasebook. Results show (as expected) that MT is the “noisiest” channel, with SS contributing least noise. The concatenation of the three channels is worse than could be predicted from the performance of each as individual tasks.

Prolog models of classical approaches to MT
Harold Somers
Workshop on Teaching Translation Technologies and Tools

This paper describes a number of “toy” MT systems written in Prolog, designed as programming exercises and illustrations of various approaches to MT. The systems include a dumb word-for-word system, a DCG-based “transfer” system, an interlingua-based system with an LFG-like interface structure, a first-generation-like Russian-English system, an interactive system, and an implementation based on early example-based MT.

Computer-based Support for Patients with Limited English
Harold Somers | Hermione Lovel
Proceedings of the 7th International EAMT workshop on MT and other language technology tools, Improving MT through other language technology tools, Resource and tools for building MT at EACL 2003

2002

What are we celebrating today?
Harold Somers
Workshop on machine translation roadmap

2001

EBMT seen as case-based reasoning
Harold Somers
Workshop on Example-Based Machine Translation

This paper looks at EBMT from the perspective of the Case-based Reasoning (CBR) paradigm. We attempt to describe the task of machine translation (MT) seen as a potential application of CBR, and attempt to describe MT in standard CBR terms. The aim is to see if other applications of CBR can suggest better ways to approach EBMT.

Three perspectives on MT in the classroom
Harold Somers
Workshop on Teaching Machine Translation

This paper considers the role of translation software, especially Machine Translation (MT), in curricula for students of computational linguistics, for trainee translators and for language learners. These three sets of students have differing needs and interests, although there is some overlap between them. A brief historical view of MT in the classroom is given, including comments on the author’s 25 years of experience in the field. This is followed by discussion and examples of strategies for teaching about MT and related aspects of Language Engineering and Information Technology for the three types of student.

2000

Is MT software documentation appropriate for MT users?
David Mowatt | Harold Somers
Proceedings of the Fourth Conference of the Association for Machine Translation in the Americas: User Studies

This paper discusses an informal methodology for evaluating Machine Translation software documentation with reference to a case study, in which a number of currently available MT packages are evaluated. Different types of documentation style are discussed, as well as different user profiles. It is found that documentation is often inadequate in identifying the level of linguistic background and knowledge necessary to use translation software, and in explaining technical (linguistic) terms needed to use the software effectively. In particular, the level of knowledge and training needed to use the software is often incompatible with the user profile implied by the documentation. Also, guidance on how to perform more complex tasks, which may be especially idiosyncratic, is often inadequate or missing altogether.

Evaluating Machine Translation: the Cloze Procedure Revisited
Harold Somers | Elizabeth Wild
Proceedings of Translating and the Computer 22

1999

Sources of linguistic knowledge for minority languages
Harold L. Somers
Proceedings of Machine Translation Summit VII

Language Engineering (LE) products and resources for the world’s “major” languages are steadily increasing, but there remains a major gap as regards less widely-used languages. This paper considers the current situation regarding LE resources for some of the languages in question, and some proposals for rectifying this situation are made, including techniques based on adapting existing resources and “knowledge extraction” techniques from machine-readable corpora.

Aligning Phonetic Segments for Children’s Articulation Assessment
Harold Somers
Computational Linguistics, Volume 25, Number 2, June 1999

1998

Survey of methodological approaches to MT
Harold Somers
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Tutorial Descriptions

Similarity metrics for aligning children’s articulation data
Harold L. Somers
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

Similarity Metrics for Aligning Children’s Articulation Data
Harold L. Somers
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

An Attempt to Use Weighted Cusums to Identify Sublanguages
Harold Somers
New Methods in Language Processing and Computational Natural Language Learning

Extracting Recurrent Phrases and Terms from Texts Using a Purely Statistical Method
Zhao-Ming Gao | Harold Somers
Proceedings of the 12th Pacific Asia Conference on Language, Information and Computation

1997

The Current State of Machine Translation
Harold L. Somers
Proceedings of Machine Translation Summit VI: Plenaries

This paper aims to survey the current state of research, development and use of Machine Translation (MT). Under ‘research’ the role of linguistics is discussed, and contrasted with research in ‘analogy-based’ MT. The range of languages covered by MT systems is discussed, and the lack of development for minority languages noted. The new research area of spoken language translation (SLT) is reviewed, with some major differences between SLT and text MT described. Under ‘use and users’ we discuss tools for users: Translation Memory, bilingual concordances and software to help checking for mistranslations. The use of MT on the World Wide Web is also discussed, regarding pre- and post-editing, the impact of ‘controlled language’ is reviewed, and finally a proposal is made that MT users can revise the input text in the light of errors that the system makes, thus ‘post-editing the source text’.

Machine Translation and Minority Languages
Harold Somers
Proceedings of Translating and the Computer 19
