Shu Zhang


2016

A Distribution-based Model to Learn Bilingual Word Embeddings
Hailong Cao | Tiejun Zhao | Shu Zhang | Yao Meng
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We introduce a distribution-based model to learn bilingual word embeddings from monolingual data. It is simple and effective, and it does not require any parallel data or seed lexicon. We take advantage of the fact that word embeddings are usually dense, real-valued, low-dimensional vectors, so their distribution can be accurately estimated. We propose a novel cross-lingual learning objective that directly matches the distribution of word embeddings in one language with that in the other language. During joint learning, we dynamically estimate the distributions of word embeddings in the two languages and minimize the dissimilarity between them through the standard backpropagation algorithm. The learned bilingual word embeddings group each word and its translations together in the shared vector space. We demonstrate the utility of the learned embeddings on the task of finding word-to-word translations from monolingual corpora. Our model achieves encouraging performance on data in both related languages and substantially different languages.

2015

Bidirectional Long Short-Term Memory Networks for Relation Classification
Shu Zhang | Dequan Zheng | Xinchen Hu | Ming Yang
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation

2013

Semi-supervised Classification of Twitter Messages for Organization Name Disambiguation
Shu Zhang | Jianwei Wu | Dequan Zheng | Yao Meng | Hao Yu
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Cross-Lingual Link Discovery between Chinese and English Wiki Knowledge Bases
Qingliang Miao | Huayu Lu | Shu Zhang | Yao Meng
Proceedings of the 27th Pacific Asia Conference on Language, Information, and Computation (PACLIC 27)

2012

Part-of-Speech Tagging for Chinese-English Mixed Texts with Dynamic Features
Jiayi Zhao | Xipeng Qiu | Shu Zhang | Feng Ji | Xuanjing Huang
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Extracting and Visualizing Semantic Relationships from Chinese Biomedical Text
Qingliang Miao | Shu Zhang | Bo Zhang | Hao Yu
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation

An Adaptive Method for Organization Name Disambiguation with Feature Reinforcing
Shu Zhang | Jianwei Wu | Dequan Zheng | Yao Meng | Hao Yu
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation

2011

Automatic Wrapper Generation and Maintenance
Yingju Xia | Yuhang Yang | Shu Zhang | Hao Yu
Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation

Supervised and Semi-supervised Methods based Organization Name Disambiguity
Shu Zhang | Hao Yu
Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation

2010

Structure-Aware Review Mining and Summarization
Fangtao Li | Chao Han | Minlie Huang | Xiaoyan Zhu | Ying-Ju Xia | Shu Zhang | Hao Yu
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Extracting Product Features and Sentiments from Chinese Customer Reviews
Shu Zhang | Wenjie Jia | Yingju Xia | Yao Meng | Hao Yu
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

With the growing interest in opinion mining from web data, more work is focusing on mining English and Chinese reviews. Addressing the problem of product opinion mining, this paper describes our language resources in detail and applies them to the task of extracting product features and sentiments. Unlike traditional unsupervised methods, a supervised method is used to identify product features, combining domain knowledge and lexical information. Nearest vicinity match and syntactic tree based methods are proposed to identify the opinions regarding the product features. A multi-level analysis module is proposed to determine the sentiment orientation of the opinions. In experiments on the electronic reviews of COAE 2008, the validity of the product features identified by CRFs and of the two opinion word identification methods is tested and compared. The results show that the resource is well utilized in this task and that our proposed method is valid.

2009

A Bootstrapping Method for Finer-Grained Opinion Mining Using Graph Model
Shu Zhang | Yingju Xia | Yao Meng | Hao Yu
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2