Statistical Evaluation of Pronunciation Encoding

Iris Merkus, Florian Schiel


Abstract
In this study we investigate whether newly created pronunciation encodings can be evaluated automatically as being correct or as containing a potential error. Using a cascaded triphone detector and phonotactical n-gram modeling with an optimal Bayesian threshold, we classify unknown pronunciation transcripts as either 'probably faulty' or 'probably correct'. Transcripts tagged 'probably faulty' are forwarded to an expert for manual inspection, while transcripts tagged 'probably correct' are passed without further inspection. An evaluation of the new method on the German PHONOLEX lexical resource shows that, with a tolerable error margin of approximately 3% faulty transcriptions, a major reduction in work effort during the production of a new lexical resource can be achieved.
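The thresholding idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a plain bigram model with add-one smoothing, leaves out the cascaded triphone detector entirely, and uses a hand-picked threshold in place of the optimal Bayesian threshold. The function names, the boundary symbol, the toy phone sequences, and the threshold value are all hypothetical.

```python
import math
from collections import Counter

BOUNDARY = "#"  # hypothetical word-boundary symbol, not taken from the paper


def train_bigram_model(transcripts):
    """Count phone bigrams over a list of trusted transcripts."""
    bigrams, contexts, inventory = Counter(), Counter(), set()
    for seq in transcripts:
        padded = [BOUNDARY] + list(seq) + [BOUNDARY]
        inventory.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, len(inventory)


def avg_log_prob(seq, model):
    """Average per-bigram log probability, with add-one smoothing (an assumption)."""
    bigrams, contexts, inventory_size = model
    padded = [BOUNDARY] + list(seq) + [BOUNDARY]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + inventory_size)
        total += math.log(p)
    return total / (len(padded) - 1)


def classify(seq, model, threshold):
    """Tag a transcript by comparing its score with a fixed threshold."""
    if avg_log_prob(seq, model) >= threshold:
        return "probably correct"
    return "probably faulty"


# Toy training data standing in for a verified lexicon (hypothetical).
model = train_bigram_model([
    ["h", "a", "l", "o"],
    ["h", "a", "t"],
    ["t", "a", "l"],
])
THRESHOLD = -1.5  # hand-picked for this toy data, not a Bayes-optimal value

for candidate in (["h", "a", "l"], ["l", "h", "t"]):
    print(candidate, classify(candidate, model, THRESHOLD))
```

In a setting like the one described, the threshold would be estimated from transcripts with known labels rather than fixed by hand, and only the transcripts tagged 'probably faulty' would go to the expert.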
Anthology ID:
L12-1199
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
981–985
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/391_Paper.pdf
Cite (ACL):
Iris Merkus and Florian Schiel. 2012. Statistical Evaluation of Pronunciation Encoding. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 981–985, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Statistical Evaluation of Pronunciation Encoding (Merkus & Schiel, LREC 2012)