2024
EpLSA: Synergy of Expert-prefix Mixtures and Task-Oriented Latent Space Adaptation for Diverse Generative Reasoning
Fujun Zhang | Xiangdong Su | Jiang Li | Rong Yan | Guanglai Gao
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Existing models for diverse generative reasoning still struggle to generate multiple unique and plausible results. Through an in-depth examination, we argue that it is critical to leverage a mixture of experts as prefixes to enhance the diversity of the generated results and to perform task-oriented adaptation in the latent space of the generation model to improve the quality of the responses. To this end, we propose EpLSA, a model based on the synergy of expert-prefix mixtures and task-oriented latent space adaptation for diverse generative reasoning. Specifically, we use expert-prefix mixtures to encourage the model to create multiple responses with different semantics, and we design a loss function to address the problem that the expert-prefixes interfere with the semantics. Meanwhile, we design a task-oriented adaptation block that adapts the pre-trained encoder within the generation model more effectively to the pre-trained decoder in the latent space, further improving the quality of the generated text. Extensive experiments on three different types of generative reasoning tasks demonstrate that EpLSA outperforms existing baseline models in terms of both the quality and diversity of the generated outputs. Our code is publicly available at https://github.com/IMU-MachineLearningSXD/EpLSA.
Exploring the Synergy of Dual-path Encoder and Alignment Module for Better Graph-to-Text Generation
Tianxin Zhao | Yingxin Liu | Xiangdong Su | Jiang Li | Guanglai Gao
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Mainstream approaches treat knowledge graph-to-text (KG-to-text) generation as a sequence-to-sequence task and fine-tune a pre-trained language model (PLM) to generate the target text from the linearized knowledge graph. However, the linearization of knowledge graphs and the structure of PLMs lead to the loss of a large amount of graph structure information. Moreover, PLMs lack an explicit graph-text alignment strategy because of the discrepancy between structural and textual information. To solve these two problems, we propose a synergetic KG-to-text model with a dual-path encoder, an alignment module, and a guidance module. The dual-path encoder consists of a graph structure encoder and a text encoder, which together better encode the structure and text information of the knowledge graph. The alignment module contains a two-layer Transformer block and an MLP block, which align and integrate the information from the dual-path encoder. The guidance module combines an improved pointer network and an MLP block to avoid erroneously generated entities and to ensure the fluency and accuracy of the generated text. Our approach obtains very competitive performance on three benchmark datasets. Our code is available from https://github.com/IMu-MachineLearningsxD/G2T.
L²GC: Lorentzian Linear Graph Convolutional Networks for Node Classification
Qiuyu Liang | Weihua Wang | Feilong Bao | Guanglai Gao
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Linear Graph Convolutional Networks (GCNs) are used to classify nodes in graph data. However, we note that most existing linear GCN models perform neural network operations in Euclidean space, which does not explicitly capture the tree-like hierarchical structure exhibited in real-world datasets that are modeled as graphs. In this paper, we introduce hyperbolic space into linear GCN and propose a novel framework for Lorentzian linear GCN. Specifically, we map the learned features of graph nodes into hyperbolic space, and then perform a Lorentzian linear feature transformation to capture the underlying tree-like structure of the data. Experimental results on standard citation network datasets with semi-supervised learning show that our approach yields new state-of-the-art accuracy of 74.7% on Citeseer and 81.3% on PubMed. Furthermore, we observe that our approach can be trained up to two orders of magnitude faster than other nonlinear GCN models on the PubMed dataset. Our code is publicly available at https://github.com/llqy123/LLGC-master.
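The step of mapping node features into hyperbolic space can be illustrated with the standard exponential map at the origin of the Lorentz (hyperboloid) model. This is only a minimal sketch under assumptions not stated in the abstract: curvature is fixed at -1, and the helper names `lorentz_inner` and `exp_map_origin` are hypothetical and not taken from the authors' code.

```python
import math

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def exp_map_origin(v):
    """Map a Euclidean feature vector v onto the unit hyperboloid.

    Assumption: curvature is -1 and v is read as a tangent vector
    at the hyperboloid origin (1, 0, ..., 0).
    """
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        # The zero vector maps to the origin of the hyperboloid.
        return [1.0] + [0.0] * len(v)
    scale = math.sinh(norm) / norm
    return [math.cosh(norm)] + [scale * a for a in v]

# A 2-dimensional feature becomes a 3-dimensional point on the hyperboloid.
point = exp_map_origin([0.3, -0.4])
```

Every mapped point satisfies the hyperboloid constraint that its Lorentzian inner product with itself equals -1, which is a convenient sanity check for any implementation of this map.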
TransERR: Translation-based Knowledge Graph Embedding via Efficient Relation Rotation
Jiang Li | Xiangdong Su | Fujun Zhang | Guanglai Gao
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
This paper presents a translation-based knowledge graph embedding method via efficient relation rotation (TransERR), a straightforward yet effective alternative to traditional translation-based knowledge graph embedding models. Different from previous translation-based models, TransERR encodes knowledge graphs in the hypercomplex-valued space, which gives it a higher degree of translation freedom in mining latent information between the head and tail entities. To further minimize the translation distance, TransERR adaptively rotates the head entity and the tail entity with their corresponding unit quaternions, which are learnable in model training. We also provide mathematical proofs to demonstrate the ability of TransERR to model various relation patterns, including symmetry, antisymmetry, inversion, composition, and subrelation patterns. The experiments on 10 benchmark datasets validate the effectiveness and the generalization of TransERR. The results also indicate that TransERR can better encode large-scale datasets with fewer parameters than previous translation-based models. Our code and datasets are available at https://github.com/dellixx/TransERR.
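The idea of rotating the head and tail entities with unit quaternions before measuring a translation distance can be sketched as follows. This is a hedged illustration, not the authors' implementation: each entity is reduced to a single quaternion, and the distance form `|h ⊗ r_h + r - t ⊗ r_t|` and the function names `hamilton`, `normalize`, and `translation_distance` are assumptions made for this example.

```python
import math

def hamilton(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def norm(q):
    """Euclidean norm of a quaternion."""
    return math.sqrt(sum(x * x for x in q))

def normalize(q):
    """Scale a non-zero quaternion to unit length."""
    n = norm(q)
    return tuple(x / n for x in q)

def translation_distance(h, r_h, r, t, r_t):
    """Assumed distance: |h (x) r_h + r - t (x) r_t| with unit r_h, r_t."""
    rotated_head = hamilton(h, normalize(r_h))
    rotated_tail = hamilton(t, normalize(r_t))
    diff = tuple(a + b - c for a, b, c in zip(rotated_head, r, rotated_tail))
    return norm(diff)

identity = (1.0, 0.0, 0.0, 0.0)
head = (1.0, 2.0, 3.0, 4.0)
tail = (2.0, 2.0, 2.0, 2.0)
relation = tuple(b - a for a, b in zip(head, tail))
```

With identity rotations the sketch reduces to a plain translation-style distance, so `translation_distance(head, identity, relation, tail, identity)` is zero when `relation` equals `tail - head`. Multiplying by a unit quaternion leaves the norm of an entity unchanged, so the rotation only changes its direction.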
2023
TeAST: Temporal Knowledge Graph Embedding via Archimedean Spiral Timeline
Jiang Li | Xiangdong Su | Guanglai Gao
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Temporal knowledge graph embedding (TKGE) models are commonly utilized to infer missing facts and facilitate reasoning and decision-making in temporal knowledge graph based systems. However, existing methods fuse temporal information into entities, potentially leading to the evolution of entity information and limiting the link prediction performance of TKG. Meanwhile, current TKGE models often lack the ability to simultaneously model important relation patterns and provide interpretability, which hinders their effectiveness and potential applications. To address these limitations, we propose a novel TKGE model which encodes Temporal knowledge graph embeddings via Archimedean Spiral Timeline (TeAST), which maps relations onto the corresponding Archimedean spiral timeline and transforms quadruple completion into a 3rd-order tensor completion problem. Specifically, the Archimedean spiral timeline ensures that relations that occur simultaneously are placed on the same timeline, and all relations evolve over time. Meanwhile, we present a novel temporal spiral regularizer to make the spiral timeline orderly. In addition, we provide mathematical proofs to demonstrate the ability of TeAST to encode various relation patterns. Experimental results show that our proposed model significantly outperforms existing TKGE methods. Our code is available at https://github.com/IMU-MachineLearningSXD/TeAST.
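For readers unfamiliar with the curve, an Archimedean spiral has the polar equation r = a + b·θ, so the radius grows linearly with the angle. The sketch below only illustrates that geometry. Placing timestamp k at angle k·step is an assumption made for this example, and the names `spiral_point` and `timeline` are hypothetical; none of this reflects how TeAST actually parameterizes its timeline.

```python
import math

def spiral_point(theta, a=0.0, b=1.0):
    """Cartesian point on the Archimedean spiral r = a + b * theta."""
    r = a + b * theta
    return (r * math.cos(theta), r * math.sin(theta))

def timeline(num_timestamps, step=0.5, a=0.0, b=1.0):
    """Place timestamp k at angle k * step (an illustrative assumption)."""
    return [spiral_point(k * step, a, b) for k in range(num_timestamps)]

points = timeline(5)
# Distance of each timestamp from the spiral centre.
radii = [math.hypot(x, y) for x, y in points]
```

Because the radius increases monotonically with the angle, later timestamps sit farther from the centre, which gives the points a natural ordering along the curve.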
How Well Apply Simple MLP to Incomplete Utterance Rewriting?
Jiang Li | Xiangdong Su | Xinlan Ma | Guanglai Gao
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Incomplete utterance rewriting (IUR) aims to restore the incomplete utterance with sufficient context information for comprehension. This paper introduces a simple yet efficient IUR method. Different from prior studies, we first employ only a one-layer MLP architecture to mine latent semantic information between joint utterances for the IUR task (MIUR). After that, we construct a joint feature matrix to predict the token type and thus restore the incomplete utterance. The well-designed network and simple architecture make our method significantly superior to existing methods in terms of quality and inference speed. Our code is available at https://github.com/IMU-MachineLearningSXD/MIUR.
2020
Incorporating Inner-word and Out-word Features for Mongolian Morphological Segmentation
Na Liu | Xiangdong Su | Haoran Zhang | Guanglai Gao | Feilong Bao
Proceedings of the 28th International Conference on Computational Linguistics
Mongolian morphological segmentation is regarded as a crucial preprocessing step in many Mongolian-related NLP applications and has received extensive attention. Recently, end-to-end segmentation approaches with long short-term memory (LSTM) networks have achieved excellent results. However, the inner-word features among characters in the word and the out-word features from context are not well utilized in the segmentation process. In this paper, we propose a neural network incorporating inner-word and out-word features for Mongolian morphological segmentation. The network consists of two encoders and one decoder. The inner-word encoder uses a self-attention mechanism to capture the inner-word features of the target word. The out-word encoder employs a two-layer BiLSTM network to extract out-word features from the sentence. Then, the decoder adopts a multi-head double attention layer to fuse the inner-word and out-word features and produce the segmentation result. The evaluation experiment compares the proposed network with the baselines and explores the effectiveness of the sub-modules.
2018
A LSTM Approach with Sub-Word Embeddings for Mongolian Phrase Break Prediction
Rui Liu | Feilong Bao | Guanglai Gao | Hui Zhang | Yonghe Wang
Proceedings of the 27th International Conference on Computational Linguistics
In this paper, we first apply word embeddings that focus on sub-word units to the Mongolian Phrase Break (PB) prediction task, using a Long Short-Term Memory (LSTM) model. Mongolian is an agglutinative language. Each root can be followed by several suffixes to form probably millions of words, but the existing Mongolian corpus is not large enough to build robust whole-word embeddings; the task therefore suffers from a serious data sparsity problem, which makes Mongolian PB prediction very difficult. To solve this problem, we look at the sub-word units in Mongolian words and encode their information into a meaningful representation, then feed it to an LSTM to decode the best corresponding PB label. Experimental results show that the proposed model significantly outperforms the traditional CRF model using manual features and obtains a 7.49% F-Measure gain.
2016
Mongolian Named Entity Recognition System with Rich Features
Weihua Wang | Feilong Bao | Guanglai Gao
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
In this paper, we first build a manually annotated named entity corpus of Mongolian. Then, we propose three morphological processing methods and study comprehensive features, including syllable features, lexical features, context features, morphological features and semantic features, in Mongolian named entity recognition. Moreover, we evaluate the influence of word cluster features on the system and finally combine all features together. The experimental results show that segmenting each suffix into an individual token achieves better results than deleting suffixes or using the suffixes as features. The system based on segmenting suffixes with all proposed features yields a benchmark result of F-measure=84.65 on this corpus.