Abstract
With our experiment, we show how we can detect and annotate clausal coordinate ellipsis with Constraint Grammar rules. We focus on an elliptical structure in which there are two coordinated clauses and the latter one lacks a verb. For example, the sentence "This belongs to me and that to you" demonstrates the ellipsis in question, namely gapping. The Constraint Grammar rules are made for a Finnish parsebank, FinnTreeBank. The FinnTreeBank project is building a parsebank in the dependency syntactic framework, in which verbs are central since other sentence elements depend on them. Without correct detection of omitted verbs, the syntactic analysis of the whole sentence fails. In the experiment, we detect gapping based on morphology and the linear order of the words, without using syntactic or semantic information. The test corpus, Finnish Wikipedia, is morphologically analyzed but not disambiguated. Even with an ambiguous morphological analysis, the results show that 89.9% of the detected sentences are elliptical, making the rules accurate enough to be used in the creation of FinnTreeBank. Once we have a morphologically disambiguated corpus, we can write more accurate rules and expect better results.
- Anthology ID:
- L12-1146
- Volume:
- Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
- Month:
- May
- Year:
- 2012
- Address:
- Istanbul, Turkey
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 1955–1959
- URL:
- http://www.lrec-conf.org/proceedings/lrec2012/pdf/313_Paper.pdf
- Cite (ACL):
- Kristiina Muhonen and Tanja Purtonen. 2012. Rule-Based Detection of Clausal Coordinate Ellipsis. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1955–1959, Istanbul, Turkey. European Language Resources Association (ELRA).
- Cite (Informal):
- Rule-Based Detection of Clausal Coordinate Ellipsis (Muhonen & Purtonen, LREC 2012)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2012/pdf/313_Paper.pdf