Xing Shi


Consecutive Question Generation via Dynamic Multitask Learning
Yunji Li | Sujian Li | Xing Shi
Findings of the Association for Computational Linguistics: EMNLP 2022

In this paper, we propose the task of consecutive question generation (CQG), which generates a set of logically related question-answer pairs to understand a whole passage, with a comprehensive consideration of the aspects including accuracy, coverage, and informativeness. To achieve this, we first examine the four key elements of CQG, i.e., question, answer, rationale, and context history, and propose a novel dynamic multitask framework with one main task generating a question-answer pair, and four auxiliary tasks generating other elements. It directly helps the model generate good questions through both joint training and self-reranking. At the same time, to fully explore the worth-asking information in a given passage, we make use of the reranking losses to sample the rationales and search for the best question series globally. Finally, we measure our strategy by QA data augmentation and manual evaluation, as well as a novel application of generated question-answer pairs on DocNLI. We prove that our strategy can improve question generation significantly and benefit multiple related NLP tasks.


Ebrahim Ansari | Amittai Axelrod | Nguyen Bach | Ondřej Bojar | Roldano Cattoni | Fahim Dalvi | Nadir Durrani | Marcello Federico | Christian Federmann | Jiatao Gu | Fei Huang | Kevin Knight | Xutai Ma | Ajay Nagesh | Matteo Negri | Jan Niehues | Juan Pino | Elizabeth Salesky | Xing Shi | Sebastian Stüker | Marco Turchi | Alexander Waibel | Changhan Wang
Proceedings of the 17th International Conference on Spoken Language Translation

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured this year six challenge tracks: (i) Simultaneous speech translation, (ii) Video speech translation, (iii) Offline speech translation, (iv) Conversational speech translation, (v) Open domain translation, and (vi) Non-native speech translation. A total of teams participated in at least one of the tracks. This paper introduces each track’s goal, data and evaluation metrics, and reports the results of the received submissions.

DiDi’s Machine Translation System for WMT2020
Tanfang Chen | Weiwei Wang | Wenyang Wei | Xing Shi | Xiangang Li | Jieping Ye | Kevin Knight
Proceedings of the Fifth Conference on Machine Translation

This paper describes the DiDi AI Labs’ submission to the WMT2020 news translation shared task. We participate in the translation direction of Chinese->English. In this direction, we use the Transformer as our baseline model and integrate several techniques for model enhancement, including data filtering, data selection, back-translation, fine-tuning, model ensembling, and re-ranking. As a result, our submission achieves a BLEU score of 36.6 in Chinese->English.


Speeding Up Neural Machine Translation Decoding by Shrinking Run-time Vocabulary
Xing Shi | Kevin Knight
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We speed up Neural Machine Translation (NMT) decoding by shrinking run-time target vocabulary. We experiment with two shrinking approaches: Locality Sensitive Hashing (LSH) and word alignments. Using the latter method, we get a 2x overall speed-up over a highly-optimized GPU implementation, without hurting BLEU. On certain low-resource language pairs, the same methods improve BLEU by 0.5 points. We also report a negative result for LSH on GPUs, due to relatively large overhead, though it was successful on CPUs. Compared with Locality Sensitive Hashing (LSH), decoding with word alignments is GPU-friendly, orthogonal to existing speedup methods and more robust across language pairs.

Hafez: an Interactive Poetry Generation System
Marjan Ghazvininejad | Xing Shi | Jay Priyadarshi | Kevin Knight
Proceedings of ACL 2017, System Demonstrations


Generating Topical Poetry
Marjan Ghazvininejad | Xing Shi | Yejin Choi | Kevin Knight
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Does String-Based Neural MT Learn Source Syntax?
Xing Shi | Inkit Padhi | Kevin Knight
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Why Neural Translations are the Right Length
Xing Shi | Kevin Knight | Deniz Yuret
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing


How to Speak a Language without Knowing It
Xing Shi | Kevin Knight | Heng Ji
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)