Improved Statistical Measures to Assess Natural Language Parser Performance across Domains

Barbara Plank


Abstract
We examine the performance of three dependency parsing systems, in particular, their performance variation across Wikipedia domains. We assess the performance variation of (i) Alpino, a deep grammar-based system coupled with a statistical disambiguation versus (ii) MST and Malt, two purely data-driven statistical dependency parsing systems. The question is how the performance of each parser correlates with simple statistical measures of the text (e.g. sentence length, unknown word rate, etc.). This would give us an idea of how sensitive the different systems are to domain shifts, i.e. which system is more in need for domain adaptation techniques. To this end, we extend the statistical measures used by Zhang and Wang (2009) for English and evaluate the systems on several Wikipedia domains by focusing on a freer word-order language, Dutch. The results confirm the general findings of Zhang and Wang (2009), i.e. different parsing systems have different sensitivity against various statistical measure of the text, where the highest correlation to parsing accuracy was found for the measure we added, sentence perplexity.
Anthology ID:
L10-1551
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/801_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Barbara Plank. 2010. Improved Statistical Measures to Assess Natural Language Parser Performance across Domains. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Improved Statistical Measures to Assess Natural Language Parser Performance across Domains (Plank, LREC 2010)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/801_Paper.pdf