Tianyong Hao


A Self-supervised Joint Training Framework for Document Reranking
Xiaozhi Zhu | Tianyong Hao | Sijie Cheng | Fu Lee Wang | Hai Liu
Findings of the Association for Computational Linguistics: NAACL 2022

Pretrained language models such as BERT have been successfully applied to a wide range of natural language processing tasks and have also achieved impressive performance on document reranking tasks. Recent work indicates that further pretraining language models on task-specific datasets before fine-tuning helps improve reranking performance. However, pre-training tasks such as masked language modeling and next sentence prediction are based on the context of documents rather than encouraging the model to understand the content of queries in the document reranking task. In this paper, we propose a new self-supervised joint training framework (SJTF) with a self-supervised method called Masked Query Prediction (MQP) to establish semantic relations between given queries and positive documents. The framework randomly masks a token of the query, encodes the masked query paired with positive documents, and uses a linear layer as a decoder to predict the masked token. In addition, MQP is used to jointly optimize the models with the supervised ranking objective during the fine-tuning stage, without an extra further pre-training stage. Extensive experiments on the MS MARCO passage ranking and TREC Robust datasets show that models trained with our framework obtain significant improvements over the original models.

中文糖尿病问题分类体系及标注语料库构建研究(The Construction of Question Taxonomy and An Annotated Chinese Corpus for Diabetes Question Classification)
Xiaobo Qian (钱晓波) | Wenxiu Xie (谢文秀) | Shaopei Long (龙绍沛) | Murong Lan (兰牧融) | Yuanyuan Mu (慕媛媛) | Tianyong Hao (郝天永)
Proceedings of the 21st Chinese National Conference on Computational Linguistics



T-Know: a Knowledge Graph-based Question Answering and Information Retrieval System for Traditional Chinese Medicine
Ziqing Liu | Enwei Peng | Shixing Yan | Guozheng Li | Tianyong Hao
Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations

T-Know is a knowledge service system based on a constructed knowledge graph of Traditional Chinese Medicine (TCM). Using authorized and anonymized clinical records, medicine clinical guidelines, teaching materials, classic medical books, academic publications, etc., as data resources, the system extracts triples from free texts to build a TCM knowledge graph using our natural language processing methods. On the basis of the knowledge graph, a deep learning algorithm is implemented for single-round question understanding and multi-round dialogue. In addition, the TCM knowledge graph is also used to support human-computer interactive knowledge retrieval by normalizing search keywords to medical terminology.

Annotating Measurable Quantitative Information in Language: for an ISO Standard
Tianyong Hao | Haotai Wang | Xinyu Cao | Kiyong Lee
Proceedings of the 14th Joint ACL-ISO Workshop on Interoperable Semantic Annotation


The representation and extraction of quantitative information
Tianyong Hao | Yunyan We | Jiaqi Qiang | Haitao Wang | Kiyong Lee
Proceedings of the 13th Joint ISO-ACL Workshop on Interoperable Semantic Annotation (ISA-13)


Towards Automatic Question Answering over Social Media by Learning Question Equivalence Patterns
Tianyong Hao | Wenyin Liu | Eugene Agichtein
Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics in a World of Social Media