Abstract
In this paper, we provide a simple and effective baseline for classifying both patents and papers into the well-established Cooperative Patent Classification (CPC) scheme. We propose a label-informative classifier based on the Wide & Deep structure, where the Wide part encodes string-level similarities between texts and labels, and the Deep part captures semantic-level similarities via non-linear transformations. Our model is trained on millions of patents and transfers to papers through a distantly supervised training set and domain-specific features. Extensive experiments show that our model achieves performance comparable to the state-of-the-art model used in industry on both patents and papers. The output of this work should facilitate the searching, granting and filing of innovative ideas for patent examiners, attorneys and researchers.
- Anthology ID:
- D19-1344
- Volume:
- Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Editors:
- Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
- Venues:
- EMNLP | IJCNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3438–3443
- URL:
- https://aclanthology.org/D19-1344
- DOI:
- 10.18653/v1/D19-1344
- Cite (ACL):
- Muyao Niu and Jie Cai. 2019. A Label Informative Wide & Deep Classifier for Patents and Papers. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3438–3443, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- A Label Informative Wide & Deep Classifier for Patents and Papers (Niu & Cai, EMNLP-IJCNLP 2019)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/D19-1344.pdf
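
The Wide & Deep structure summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the token-overlap features, the single tanh hidden layer, and every function and parameter name below are assumptions chosen for illustration. It only shows the general pattern in which a linear score over string-level text-label features (Wide) is added to a non-linear score over vector representations (Deep) before a sigmoid.

```python
import math


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, not the paper's preprocessing)."""
    return set(text.lower().split())


def wide_features(text, label_description):
    """String-level similarity between a document and a label description.

    Returns [Jaccard overlap, fraction of label tokens found in the text].
    Both features are illustrative stand-ins for the paper's Wide features.
    """
    text_tokens = tokenize(text)
    label_tokens = tokenize(label_description)
    shared = len(text_tokens & label_tokens)
    union = len(text_tokens | label_tokens)
    jaccard = shared / union if union else 0.0
    coverage = shared / len(label_tokens) if label_tokens else 0.0
    return [jaccard, coverage]


def deep_score(text_vec, label_vec, hidden_weights, output_weights):
    """Semantic-level score: one tanh hidden layer over the concatenated vectors."""
    x = list(text_vec) + list(label_vec)
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)))
        for row in hidden_weights
    ]
    return sum(w * h for w, h in zip(output_weights, hidden))


def wide_deep_probability(wide, wide_weights, deep, bias=0.0):
    """Add the linear Wide logit to the Deep logit and squash with a sigmoid."""
    logit = sum(w * f for w, f in zip(wide_weights, wide)) + deep + bias
    return 1.0 / (1.0 + math.exp(-logit))
```

In practice the weights would be learned jointly from labeled patents. Here they are plain lists so the data flow stays visible: `wide_features` and `deep_score` each produce a score for one text-label pair, and `wide_deep_probability` combines them into a single probability for that label.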