Abstract
In this paper we describe the annotation of COMPARA, currently the largest post-edited parallel corpus that includes Portuguese. We describe the motivation, the results so far, and the way the corpus is being annotated. We also provide the first grounded results about syntactic ambiguity in Portuguese. Finally, we discuss some interesting problems that arise in this connection.
- Anthology ID:
- L06-1175
- Volume:
- Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
- Month:
- May
- Year:
- 2006
- Address:
- Genoa, Italy
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- URL:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/309_pdf.pdf
- Cite (ACL):
- Diana Santos and Susana Inácio. 2006. Annotating COMPARA, a Grammar-aware Parallel Corpus. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
- Cite (Informal):
- Annotating COMPARA, a Grammar-aware Parallel Corpus (Santos & Inácio, LREC 2006)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/309_pdf.pdf