Abstract
We describe an approach to robust domain-independent syntactic parsing of unrestricted naturally-occurring (English) input. The technique involves parsing sequences of part-of-speech and punctuation labels using a unification-based grammar coupled with a probabilistic LR parser. We describe the coverage of several corpora using this grammar and report the results of a parsing experiment using probabilities derived from bracketed training data. We report the first substantial experiments to assess the contribution of punctuation to deriving an accurate syntactic analysis, by parsing identical texts both with and without naturally-occurring punctuation marks.
- Anthology ID:
- 1995.iwpt-1.8
- Volume:
- Proceedings of the Fourth International Workshop on Parsing Technologies
- Month:
- September 20-24
- Year:
- 1995
- Address:
- Prague and Karlovy Vary, Czech Republic
- Editors:
- Eva Hajicova, Bernard Lang, Robert Berwick, Harry Bunt, Bob Carpenter, Ken Church, Aravind Joshi, Ronald Kaplan, Martin Kay, Makoto Nagao, Anton Nijholt, Mark Steedman, Henry Thompson, Masaru Tomita, K. Vijay-Shanker, Yorick Wilks, Kent Wittenburg
- Venues:
- IWPT | WS
- SIG:
- SIGPARSE
- Publisher:
- Association for Computational Linguistics
- Pages:
- 48–58
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/1995.iwpt-1.8/
- Cite (ACL):
- Ted Briscoe and John Carroll. 1995. Developing and Evaluating a Probabilistic LR Parser of Part-of-Speech and Punctuation Labels. In Proceedings of the Fourth International Workshop on Parsing Technologies, pages 48–58, Prague and Karlovy Vary, Czech Republic. Association for Computational Linguistics.
- Cite (Informal):
- Developing and Evaluating a Probabilistic LR Parser of Part-of-Speech and Punctuation Labels (Briscoe & Carroll, IWPT 1995)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/1995.iwpt-1.8.pdf