Haojin Yang
2020
Best Student Forcing: A Simple Training Mechanism in Adversarial Language Generation
Jonathan Sauder | Ting Hu | Xiaoyin Che | Goncalo Mordido | Haojin Yang | Christoph Meinel
Proceedings of the Twelfth Language Resources and Evaluation Conference
Language models trained with Maximum Likelihood Estimation (MLE) have been the mainstream solution in Natural Language Generation (NLG) for years. Recently, various approaches based on Generative Adversarial Nets (GANs) have also been proposed. While offering exciting new prospects, GANs in NLG still reportedly suffer from training instability and mode collapse, and are therefore outperformed by conventional MLE models. In this work, we propose techniques for improving GANs in NLG, namely Best Student Forcing (BSF), a novel yet simple adversarial training mechanism in which generated sequences of high quality are selected as temporary ground truth to further train the generator. We also use an ensemble of discriminators to increase training stability and sample diversity. Evaluation shows that the combination of BSF and multiple discriminators consistently performs better than previous GAN approaches over various metrics, and outperforms a baseline MLE in terms of Fréchet Distance, a recently proposed metric capturing both sample quality and diversity.
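The abstract describes one BSF update as: sample from the generator, score the samples with a discriminator ensemble, and teacher-force the generator on the highest-scoring samples as temporary ground truth. Below is a minimal PyTorch sketch of such a step, assuming illustrative interfaces (generator.sample, per-sequence discriminator scores, and a selection size k) that are not taken from the paper's code:

```python
import torch
import torch.nn.functional as F

def bsf_step(generator, discriminators, optimizer, batch_size, k):
    """One hypothetical Best Student Forcing update.

    Assumed interfaces (not the paper's released code):
      generator.sample(n) -> (n, seq_len) token ids
      generator(tokens)   -> (n, seq_len, vocab) next-token logits
      each discriminator(samples) -> (n,) quality scores
    """
    # Sample candidate sequences from the current generator.
    samples = generator.sample(batch_size)

    # Average quality score over the discriminator ensemble.
    with torch.no_grad():
        scores = torch.stack([d(samples) for d in discriminators]).mean(dim=0)

    # Keep the k highest-scoring sequences as temporary ground truth.
    best = samples[scores.topk(k).indices]

    # Standard teacher-forced MLE loss on the selected "best students".
    logits = generator(best[:, :-1])
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           best[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```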
2017
Traversal-Free Word Vector Evaluation in Analogy Space
Xiaoyin Che | Nico Ring | Willi Raschkowski | Haojin Yang | Christoph Meinel
Proceedings of the 2nd Workshop on Evaluating Vector Space Representations for NLP
In this paper, we propose an alternative evaluation metric for word analogy questions (A is to B as C is to D) in word vector evaluation. Unlike the traditional method, which predicts the fourth word from the given three, we measure similarity directly on the “relations” of the two given word pairs, effectively shifting the relation vectors into a new analogy space. Cosine and Euclidean distances are then calculated as measurements. Observations and experiments show that the proposed analogy space evaluation offers a more comprehensive assessment of word vectors on word analogy questions. Meanwhile, computational complexity is remarkably reduced by avoiding traversal of the vocabulary.
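The relation-vector comparison the abstract describes reduces each analogy question to two distance computations. A minimal NumPy sketch, with the function name and the word-to-vector mapping assumed for illustration:

```python
import numpy as np

def analogy_space_scores(vecs, a, b, c, d):
    """Score "a : b :: c : d" directly in analogy space by comparing the
    relation vectors (b - a) and (d - c), rather than searching the whole
    vocabulary for the best fourth word. `vecs` maps words to vectors.
    """
    r1 = vecs[b] - vecs[a]          # relation of the first pair
    r2 = vecs[d] - vecs[c]          # relation of the second pair
    cos = r1 @ r2 / (np.linalg.norm(r1) * np.linalg.norm(r2))
    euc = np.linalg.norm(r1 - r2)   # Euclidean distance between relations
    return cos, euc
```

Because no candidate word is predicted, the per-question cost is constant rather than linear in the vocabulary size, which is the traversal-free property the title refers to.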
2016
Punctuation Prediction for Unsegmented Transcript Based on Word Vector
Xiaoyin Che | Cheng Wang | Haojin Yang | Christoph Meinel
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
In this paper we propose an approach to predicting punctuation marks for unsegmented speech transcripts. The approach is purely lexical, with pre-trained word vectors as the only input. A Deep Neural Network (DNN) or Convolutional Neural Network (CNN) model is trained to classify whether a punctuation mark should be inserted after the third word of a five-word sequence, and which kind of punctuation mark it should be. TED talks from the IWSLT dataset are used in both the training and evaluation phases. The proposed approach proves its effectiveness by achieving better results than the state-of-the-art lexical solution working with the same type of data, especially when predicting punctuation position only.
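A minimal sketch of the five-word sliding-window setup the abstract describes, assuming a hypothetical label set and classifier interface (the paper's actual DNN/CNN architectures and label inventory may differ):

```python
import numpy as np

# Assumed label set for illustration; the paper's classes may differ.
PUNCT = {0: None, 1: ",", 2: ".", 3: "?"}

def window_features(words, i, vecs, dim):
    """Feature vector for position i: the five-word window in which the
    word after which punctuation may be inserted is the third word.
    Out-of-range or out-of-vocabulary slots are padded with zeros.
    """
    window = [vecs[words[j]] if 0 <= j < len(words) and words[j] in vecs
              else np.zeros(dim)
              for j in range(i - 2, i + 3)]
    return np.concatenate(window)   # shape: (5 * dim,)

def punctuate(words, vecs, dim, classifier):
    """Insert the predicted mark (if any) after each word.
    `classifier` maps a feature vector to an integer class id."""
    out = []
    for i, w in enumerate(words):
        out.append(w)
        mark = PUNCT[classifier(window_features(words, i, vecs, dim))]
        if mark is not None:
            out.append(mark)
    return " ".join(out)
```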
Co-authors
- Xiaoyin Che 3
- Christoph Meinel 3
- Nico Ring 1
- Willi Raschkowski 1
- Cheng Wang 1