Abstract
The Linguistic Data Consortium and Georgetown University Press are collaborating to create updated editions of bilingual dictionaries that had originally been published in the 1960s for English-speaking learners of Moroccan, Syrian and Iraqi Arabic. In their first editions, these dictionaries used ad hoc Latin-alphabet orthography for each colloquial Arabic dialect, but adopted some properties of Arabic-based writing (collation order of Arabic headwords, clitic attachment to word forms in example phrases). Despite their common features, there are notable differences among the three books that impede comparisons across the dialects, as well as comparisons of each dialect to Modern Standard Arabic. In updating these volumes, we use both Arabic script and International Phonetic Alphabet orthographies; the former provides a common basis for word recognition across dialects, while the latter provides dialect-specific pronunciations. Our goal is to preserve the full content of the original publications, supplement the Arabic headword inventory with new usages, and produce a uniform lexicon structure expressible via the Lexical Markup Framework (LMF, ISO 24613). To this end, we developed a relational database schema that applies consistently to each dialect, and HTTP-based tools for searching, editing, workflow, review and inventory management.
- Anthology ID:
- L12-1245
- Volume:
- Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
- Month:
- May
- Year:
- 2012
- Address:
- Istanbul, Turkey
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 269–274
- URL:
- http://www.lrec-conf.org/proceedings/lrec2012/pdf/461_Paper.pdf
- Cite (ACL):
- David Graff and Mohamed Maamouri. 2012. Developing LMF-XML Bilingual Dictionaries for Colloquial Arabic Dialects. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 269–274, Istanbul, Turkey. European Language Resources Association (ELRA).
- Cite (Informal):
- Developing LMF-XML Bilingual Dictionaries for Colloquial Arabic Dialects (Graff & Maamouri, LREC 2012)