Peiyan Wang
Also published as:
裴岩 王
This paper focuses on the task of generating concept sememe trees to study whether Large Language Models (LLMs) can understand and generate domain conceptual knowledge. A concept sememe tree is a hierarchical structure that represents lexical meaning by combining sememes and their relationships. To this end, we introduce the Neighbor Semantic Structure (NSS) and a Chain-of-Thought (CoT) prompting method to evaluate the effectiveness of various LLMs in generating accurate and comprehensive sememe trees across different domains. The NSS, guided by conceptual metaphors, identifies terms that exhibit significant external systematicity within a hierarchical relational network and incorporates them as examples in the learning process of LLMs. Meanwhile, the CoT prompting method guides LLMs through a systematic analysis of a term's intrinsic core concepts, essential attributes, and semantic relationships, enabling the generation of concept sememe trees. We conduct experiments using datasets drawn from four authoritative terminology manuals and evaluate different LLMs. The experimental results indicate that LLMs are capable of capturing and representing the conceptual knowledge of domain-specific terms. Moreover, integrating NSS examples with a structured CoT process allows LLMs to explore domain conceptual knowledge more deeply, leading to the generation of highly accurate concept sememe trees.
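As a rough illustration of the hierarchical structure this abstract describes, a concept sememe tree could be represented as nodes that each carry a sememe and a relation to their parent. This is only a sketch: the class name `SememeNode`, the relation labels, and the example term are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SememeNode:
    """One node of a hypothetical concept sememe tree."""
    sememe: str
    relation: Optional[str] = None  # relation to the parent; None for the root
    children: List["SememeNode"] = field(default_factory=list)

    def add(self, sememe: str, relation: str) -> "SememeNode":
        """Attach a child sememe under this node and return the child."""
        child = SememeNode(sememe, relation)
        self.children.append(child)
        return child

    def render(self, depth: int = 0) -> List[str]:
        """Return the tree as indented lines, one per node."""
        label = self.sememe if self.relation is None else f"{self.relation}: {self.sememe}"
        lines = ["  " * depth + label]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines


# Invented example for the term "lathe"; the sememes and relations are assumptions.
root = SememeNode("machine")
root.add("cut", "function").add("metal", "patient")
root.add("rotate", "manner")
tree_lines = root.render()
print("\n".join(tree_lines))
```

A nested structure like this makes it straightforward to compare a generated tree against a reference tree node by node, although the evaluation procedure used in the paper is not stated here.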
Manufacturing specifications are documents that describe the techniques, processes, and components involved in manufacturing. With the development of smart manufacturing, there is a growing demand for named entity recognition (NER) resources and techniques for manufacturing-specific named entities. In this paper, we introduce a corpus of Chinese manufacturing specifications, named MS-NERC, which includes 4,424 sentences and 16,383 entities. We also propose an entity recognizer named Trainable State Transducer (TST), which is initialized with a finite state transducer describing the morphological patterns of entities. It can directly recognize entities based on prior morphological knowledge without training. Experimental results show that TST achieves an overall F1 score of 82.05% on morphology-specific entities in the zero-shot setting. TST can be further improved through training, after which it outperforms neural methods in both the few-shot and rich-resource settings. We believe that our corpus and model will be valuable resources for NER research not only in manufacturing but also in other low-resource domains.
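To make the idea of recognizing entities from prior pattern knowledge without training more concrete, the sketch below runs a tiny hand-written finite-state machine over token classes and emits BIO tags. It is a simplified stand-in, not the paper's TST: it is not trainable, the token classes, the transition table, the `QTY` label, and the English example tokens are all assumptions, and the actual morphological patterns used for MS-NERC are not given here.

```python
import re
from typing import Dict, List, Tuple


def token_class(token: str) -> str:
    """Map a token to a coarse class (hypothetical classes for illustration)."""
    if re.fullmatch(r"\d+(\.\d+)?", token):
        return "NUM"
    if token in {"mm", "cm", "kg", "MPa"}:
        return "UNIT"
    return "OTHER"


# Transition table: (state, token class) -> next state. State 0 is the start state.
TRANSITIONS: Dict[Tuple[int, str], int] = {
    (0, "NUM"): 1,
    (1, "UNIT"): 2,
}
ACCEPTING = {2}


def tag(tokens: List[str], label: str = "QTY") -> List[str]:
    """Tag the longest pattern match starting at each position with BIO labels."""
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        state, j, last_accept = 0, i, -1
        # Follow transitions as far as possible, remembering the last accepting point.
        while j < len(tokens) and (state, token_class(tokens[j])) in TRANSITIONS:
            state = TRANSITIONS[(state, token_class(tokens[j]))]
            j += 1
            if state in ACCEPTING:
                last_accept = j
        if last_accept > i:
            tags[i] = f"B-{label}"
            for k in range(i + 1, last_accept):
                tags[k] = f"I-{label}"
            i = last_accept
        else:
            i += 1
    return tags


tags = tag(["drill", "a", "5", "mm", "hole", "3", "times"])
print(tags)
```

In this sketch the number "3" is left untagged because no unit follows it, which shows how a pattern must reach an accepting state before a span is emitted.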