Yang Jiao


2022

aiML at the FinNLP-2022 ERAI Task: Combining Classification and Regression Tasks for Financial Opinion Mining
Zhaoxuan Qin | Jinan Zou | Qiaoyang Luo | Haiyao Cao | Yang Jiao
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)

Identifying high-quality financial posts among investor opinions is of great significance for investors. This paper therefore addresses the shared task on evaluating the rationales of amateur investors (ERAI) and presents our solutions. The pairwise comparison task aims to identify, from each pair of posts, the post that will trigger higher MPP and ML values. The goal of the unsupervised ranking task is to find the top 10% of posts with higher MPP and ML values. We first model the shared task as text classification and regression problems. We then propose a multi-learning approach that uses financial-domain pre-trained models and multiple linear classifiers for factor combinations to better integrate the relationships and information in the training data. According to the official results, our method achieves 48.28% and 52.87% accuracy for MPP and ML on the pairwise task, and 14.02% and -4.17% for MPP and ML on the unsupervised ranking task. Our source code is available.

2020

SeNsER: Learning Cross-Building Sensor Metadata Tagger
Yang Jiao | Jiacheng Li | Jiaman Wu | Dezhi Hong | Rajesh Gupta | Jingbo Shang
Findings of the Association for Computational Linguistics: EMNLP 2020

Sensor metadata tagging, akin to the named entity recognition task, provides key contextual information (e.g., measurement type and location) about sensors for running smart building applications. Unfortunately, sensor metadata in different buildings often follows distinct naming conventions. Therefore, learning a tagger currently requires extensive annotations on a per-building basis. In this work, we propose a novel framework, SeNsER, which learns a sensor metadata tagger for a new building based on its raw metadata and an existing fully annotated building. It leverages the commonality between different buildings: at the character level, it employs bidirectional neural language models to capture the shared underlying patterns between two buildings and thus regularizes the feature learning process; at the word level, it uses the k-mers existing in the fully annotated building as features. During inference, we further incorporate information obtained from sources such as Wikipedia as prior knowledge. SeNsER shows promising results in extensive experiments on multiple real-world buildings.