Julie Carson-Berndsen

Also published as: Julie Carson, Julle Carson-Berndsen


Domain-Informed Probing of wav2vec 2.0 Embeddings for Phonetic Features
Patrick Cormac English | John D. Kelleher | Julie Carson-Berndsen
Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

In recent years large transformer model architectures have become available which provide a novel means of generating high-quality vector representations of speech audio. These transformers make use of an attention mechanism to generate representations enhanced with contextual and positional information from the input sequence. Previous works have explored the capabilities of these models with regard to performance in tasks such as speech recognition and speaker verification, but there has not been a significant inquiry as to the manner in which the contextual information provided by the transformer architecture impacts the representation of phonetic information within these models. In this paper, we report the results of a number of probing experiments on the representations generated by the wav2vec 2.0 model’s transformer component, with regard to the encoding of phonetic categorization information within the generated embeddings. We find that the contextual information generated by the transformer’s operation results in enhanced capture of phonetic detail by the model, and allows for distinctions to emerge in acoustic data that are otherwise difficult to separate.


The Influence of Regional Pronunciation Variation on Children’s Spelling and the Potential Benefits of Accent Adapted Spellcheckers
Emma O’Neill | Joe Kenny | Anthony Ventresque | Julie Carson-Berndsen
Proceedings of the 25th Conference on Computational Natural Language Learning

A child who is unfamiliar with the correct spelling of a word often employs a “sound it out” approach: breaking the word down into its constituent sounds and then choosing letters to represent the identified sounds. This often results in a misspelling that is orthographically very different to the intended target. Recently, efforts have been made to develop phonetic-based spellcheckers to tackle the more deviant nature of children’s misspellings. However, little work has been done to investigate the potential of spelling correction tools that incorporate regional pronunciation variation. If a child must first identify the sounds that make up a word, it stands to reason their pronunciation would influence this process. We investigate this hypothesis along with the feasibility and potential benefits of adapting spelling correction tools to more specific language variants, particularly Irish Accented English. We use misspelling data from schoolchildren across Ireland to adapt an existing English phonetic-based spellchecker and demonstrate improvements in performance. These results not only prompt consideration of language varieties in the development of spellcheckers but also contribute to existing literature on the role of regional accent in the acquisition of writing proficiency.


English to Indonesian Transliteration to Support English Pronunciation Practice
Amalia Zahra | Julie Carson-Berndsen
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The work presented in this paper explores the use of Indonesian transliteration to support English pronunciation practice. It is mainly aimed at Indonesian speakers who have little or no English language skills. The approach combines a rule-based and a statistical method. The rules of English-Phone-to-Indonesian-Grapheme mapping are implemented with a Finite State Transducer (FST), followed by a statistical method, namely a grapheme-based trigram language model. The generated Indonesian transliteration was used to support the learners, whose speech was then recorded. The speech recordings were evaluated by 19 participants: 8 native English speakers and 11 non-native speakers. The results show that the transliteration contributes positively to the improvement of the learners' English pronunciation.

Evaluating expressive speech synthesis from audiobook corpora for conversational phrases
Éva Székely | Joao Paulo Cabral | Mohamed Abou-Zleikha | Peter Cahill | Julie Carson-Berndsen
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Audiobooks are a rich resource of large quantities of natural sounding, highly expressive speech. In our previous research we have shown that it is possible to detect different expressive voice styles represented in a particular audiobook, using unsupervised clustering to group the speech corpus of the audiobook into smaller subsets representing the detected voice styles. These subsets of corpora of different voice styles reflect the various ways a speaker uses their voice to express involvement and affect, or imitate characters. This study is an evaluation of the detection of voice styles in an audiobook in the application of expressive speech synthesis. A further aim of this study is to investigate the usability of audiobooks as a language resource for expressive speech synthesis of utterances of conversational speech. Two evaluations have been carried out to assess the effect of the genre transfer: transmitting expressive speech from read aloud literature to conversational phrases with the application of speech synthesis. The first evaluation revealed that listeners have different voice style preferences for a particular conversational phrase. The second evaluation showed that it is possible for users of speech synthesis systems to learn the characteristics of a voice style well enough to make reliable predictions about what a certain utterance will sound like when synthesised using that voice style.

Rapidly Testing the Interaction Model of a Pronunciation Training System via Wizard-of-Oz
Joao Paulo Cabral | Mark Kane | Zeeshan Ahmed | Mohamed Abou-Zleikha | Éva Székely | Amalia Zahra | Kalu Ogbureke | Peter Cahill | Julie Carson-Berndsen | Stephan Schlögl
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper describes a prototype of a computer-assisted pronunciation training system called MySpeech. The interface of the MySpeech system is web-based and it currently enables users to practice pronunciation by listening to speech spoken by native speakers and tuning their speech production to correct any mispronunciations detected by the system. This practice exercise is offered across different topics and difficulty levels. An experiment was conducted in this work that combines the MySpeech service with the WebWOZ Wizard-of-Oz platform (http://www.webwoz.com), in order to improve the human-computer interaction (HCI) of the service and the feedback that it provides to the user. The employed Wizard-of-Oz method enables a human (who acts as a wizard) to give feedback to the practising user, while the user is not aware that there is another person involved in the communication. This experiment made it possible to quickly test an HCI model before its implementation on the MySpeech system. It also allowed the collection of input data from the wizard that can be used to improve the proposed model. Another outcome of the experiment was the preliminary evaluation of the pronunciation learning service in terms of user satisfaction, which would be difficult to conduct before integrating the HCI part.

WinkTalk: a demonstration of a multimodal speech synthesis platform linking facial expressions to expressive synthetic voices
Éva Székely | Zeeshan Ahmed | João P. Cabral | Julie Carson-Berndsen
Proceedings of the Third Workshop on Speech and Language Processing for Assistive Technologies

Hierarchical Phrase-Based MT for Phonetic Representation-Based Speech Translation
Zeeshan Ahmed | Jie Jiang | Julie Carson-Berndsen | Peter Cahill | Andy Way
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers

The paper presents a novel technique for speech translation using hierarchical phrase-based statistical machine translation (HPB-SMT). The system is based on translation of speech from phone sequences, as opposed to the conventional approach of speech translation from word sequences. The technique facilitates speech translation by allowing a machine translation (MT) system to access phonetic information. This enables the MT system to act as both a word recognition and a translation component. This results in better performance than conventional speech translation approaches by recovering from recognition errors with the help of a source language model, translation model and target language model. For this purpose, the MT translation models are adapted to work on source language phones using a grapheme-to-phoneme component. The source-side phonetic confusions are handled using a confusion network. The results on the IWSLT'10 English-Chinese translation task show a significant improvement in translation quality. In this paper, results for HPB-SMT are compared with previously published results of a phrase-based statistical machine translation (PB-SMT) system (Baseline). The HPB-SMT system outperforms PB-SMT in this regard.


Phonetic Representation-Based Speech Translation
Jie Jiang | Zeeshan Ahmed | Julie Carson-Berndsen | Peter Cahill | Andy Way
Proceedings of Machine Translation Summit XIII: Papers


Lattice Score Based Data Cleaning for Phrase-Based Statistical Machine Translation
Jie Jiang | Julie Carson-Berndsen | Andy Way
Proceedings of the 14th Annual conference of the European Association for Machine Translation


Acquiring Reusable Multilingual Phonotactic Resources
Julie Carson-Berndsen | Robert Kelly
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

A Multilingual Phonological Resource Toolkit for Ubiquitous Speech Technology
Daniel Aioanei | Julie Carson-Berndsen | Anja Geumann | Robert Kelly | Moritz Neugebauer | Stephen Wilson
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Automatic Acquisition of Feature-Based Phonotactic Resources
Julie Carson-Berndsen | Robert Kelly | Moritz Neugebauer
Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology: Current Themes in Computational Phonology and Morphology


XiSTS - XML in Speech Technology Systems
Michael Walsh | Stephen Wilson | Julie Carson-Berndsen
COLING-02: The 2nd Workshop on NLP and XML (NLPXML-2002)


Web tools for introductory computational linguistics
Dafydd Gibbon | Julie Carson-Berndsen
EACL 1999: Computer and Internet Supported Education in Language and Speech Technology


Event Relations at the Phonetics/Phonology Interface
Julie Carson-Berndsen | Dafydd Gibbon
COLING 1992 Volume 4: The 14th International Conference on Computational Linguistics


Phonological Processing of Speech Variants
Julle Carson-Berndsen
COLING 1990 Volume 3: Papers presented to the 13th International Conference on Computational Linguistics


Unification and Transduction in Computational Phonology
Julie Carson
Coling Budapest 1988 Volume 1: International Conference on Computational Linguistics