EMILLE, A 67-Million Word Corpus of Indic Languages: Data Collection, Mark-up and Harmonisation
Paul Baker, Andrew Hardie, Tony McEnery, Hamish Cunningham, Rob Gaizauskas
- Anthology ID:
- L02-1319
- Volume:
- Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)
- Month:
- May
- Year:
- 2002
- Address:
- Las Palmas, Canary Islands - Spain
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- URL:
- http://www.lrec-conf.org/proceedings/lrec2002/pdf/319.pdf
- Cite (ACL):
- Paul Baker, Andrew Hardie, Tony McEnery, Hamish Cunningham, and Rob Gaizauskas. 2002. EMILLE, A 67-Million Word Corpus of Indic Languages: Data Collection, Mark-up and Harmonisation. In Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02), Las Palmas, Canary Islands - Spain. European Language Resources Association (ELRA).
- Cite (Informal):
- EMILLE, A 67-Million Word Corpus of Indic Languages: Data Collection, Mark-up and Harmonisation (Baker et al., LREC 2002)