Yaqin Yang


2015

Recovering dropped pronouns from Chinese text messages
Yaqin Yang | Yalin Liu | Nianwen Xue
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2013

Dependency-based empty category detection via phrase structure trees
Nianwen Xue | Yaqin Yang
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Distant annotation of Chinese tense and modality
Nianwen Xue | Yuchen Zhang | Yaqin Yang
Proceedings of the IWCS 2013 Workshop on Annotation of Modal Meanings in Natural Language (WAMM)

2012

Chinese Comma Disambiguation for Discourse Analysis
Yaqin Yang | Nianwen Xue
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Annotating dropped pronouns in Chinese newswire text
Elizabeth Baran | Yaqin Yang | Nianwen Xue
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We propose an annotation framework to explicitly identify dropped subject pronouns in Chinese. We acknowledge and specify 10 concrete pronouns that exist as words in Chinese and 4 abstract pronouns that do not correspond to Chinese words but that are recognized conceptually by native Chinese speakers. These abstract pronouns are identified as "unspecified", "pleonastic", "event", and "existential" and are argued to exist cross-linguistically. We trained two annotators, fluent in Chinese, and adjudicated their annotations to form a gold standard. We achieved an inter-annotator agreement kappa of .6 and an observed agreement of .7. We found that annotators had the most difficulty with the abstract pronouns, such as "unspecified" and "event", but we posit that further specification and training have the potential to significantly improve these results. We believe that this annotated data will help improve Machine Translation models that translate from Chinese to a non-pro-drop language, like English, that requires all subject pronouns to be explicit.

2011

Chinese sentence segmentation as comma classification
Nianwen Xue | Yaqin Yang
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

A Machine Learning-Based Coreference Detection System for OntoNotes
Yaqin Yang | Nianwen Xue | Peter Anick
Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task

2010

Chasing the ghost: recovering empty categories in the Chinese Treebank
Yaqin Yang | Nianwen Xue
Coling 2010: Posters