Abstract
This paper presents a new technique for selecting the correct parse of ambiguous sentences based on a probabilistic analysis of lexical cooccurrences in semantic forms. The method is called “Semco” (for semantic cooccurrence analysis) and is specifically targeted at the differential distribution of such cooccurrences in correct and incorrect parses. It uses Bayesian Estimation for the cooccurrence probabilities to achieve higher accuracy for sparse data than the more common Maximum Likelihood Estimation would. It has been tested on the Wall Street Journal corpus (in the Penn Treebank) and shown to find the correct parse of 60.9% of parseable sentences of 6-20 words.
- Anthology ID:
- 1997.iwpt-1.15
- Volume:
- Proceedings of the Fifth International Workshop on Parsing Technologies
- Month:
- September 17-20
- Year:
- 1997
- Address:
- Boston/Cambridge, Massachusetts, USA
- Editors:
- Anton Nijholt, Robert C. Berwick, Harry C. Bunt, Bob Carpenter, Eva Hajicova, Mark Johnson, Aravind Joshi, Ronald Kaplan, Martin Kay, Bernard Lang, Alon Lavie, Makoto Nagao, Mark Steedman, Masaru Tomita, K. Vijay-Shanker, David Weir, Kent Wittenburg, Mats Wiren
- Venue:
- IWPT
- SIG:
- SIGPARSE
- Publisher:
- Association for Computational Linguistics
- Pages:
- 113–122
- URL:
- https://aclanthology.org/1997.iwpt-1.15
- Cite (ACL):
- Eirik Hektoen. 1997. Probabilistic Parse Selection based on Semantic Cooccurrences. In Proceedings of the Fifth International Workshop on Parsing Technologies, pages 113–122, Boston/Cambridge, Massachusetts, USA. Association for Computational Linguistics.
- Cite (Informal):
- Probabilistic Parse Selection based on Semantic Cooccurrences (Hektoen, IWPT 1997)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/1997.iwpt-1.15.pdf
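
The abstract contrasts Bayesian Estimation with Maximum Likelihood Estimation for sparse cooccurrence data. The following is a minimal sketch of why that distinction can matter, not the estimator used in the paper: it assumes a simple Beta(1, 1) prior and uses the posterior mean as the Bayesian estimate. The function names, the prior, and the counts are all hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def mle_estimate(correct, total):
    """Maximum likelihood estimate: the raw relative frequency."""
    if total == 0:
        raise ValueError("MLE is undefined with no observations")
    return Fraction(correct, total)


def bayes_estimate(correct, total, alpha=1, beta=1):
    """Posterior mean under a Beta(alpha, beta) prior.

    The Beta(1, 1) default is an assumption made for this sketch;
    the abstract does not state which prior the paper uses.
    """
    return Fraction(correct + alpha, total + alpha + beta)


# Hypothetical sparse data: a cooccurrence observed in only 2 parses,
# both of which happened to be correct.
print(mle_estimate(2, 2))    # relative frequency
print(bayes_estimate(2, 2))  # estimate pulled toward the prior mean of 1/2
```

With only two observations, the relative frequency jumps straight to certainty, while the posterior mean stays more cautious and is still defined when a cooccurrence has never been observed.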