Shiqi Zhang


Goal-oriented Vision-and-Dialog Navigation via Reinforcement Learning
Yan Cao | Keting Lu | David DeFazio | Shiqi Zhang
Findings of the Association for Computational Linguistics: EMNLP 2022

Vision-and-dialog navigation is a recent benchmark for evaluating the AI capabilities of perception, interaction, and decision making. While existing methods developed for this benchmark have demonstrated great success, they mostly rely on large datasets, where data collection can be a challenge, and the learned policies are not adaptive to domain changes. In this paper, we focus on a new problem, referred to as goal-oriented vision-and-dialog navigation (GVDN), where an agent uses reinforcement learning techniques to compute dialog-navigation policies from trial and error. A robot conducts visual navigation to locate target objects, and can talk to a remote human operator as needed. Our remote human is able to provide guidance on navigation only if the robot correctly conveys its location through dialog. Experiments have been conducted using photo-realistic simulation environments. Results suggest that our agent outperforms competitive baselines in success rate.


Label-Enhanced Hierarchical Contextualized Representation for Sequential Metaphor Identification
Shuqun Li | Liang Yang | Weidong He | Shiqi Zhang | Jingjie Zeng | Hongfei Lin
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Recent metaphor identification approaches mainly consider the contextual text features within a sentence or introduce external linguistic features to the model. However, they usually ignore the extra information that the data can provide, such as contextual metaphor information and broader discourse information. In this paper, we propose a model augmented with hierarchical contextualized representation to extract more information at both the sentence level and the discourse level. At the sentence level, we leverage the metaphor information of the words in the sentence other than the target word to strengthen the reasoning ability of our model via a novel label-enhanced contextualized representation. At the discourse level, a position-aware global memory network is adopted to learn long-range dependencies among occurrences of the same word within a discourse. Finally, our model combines the representations obtained from these two parts. Experimental results on two tasks of the VUA dataset show that our model outperforms every other state-of-the-art method that likewise uses no external knowledge beyond what the pre-trained language model contains.


Learning and Reasoning for Robot Dialog and Navigation Tasks
Keting Lu | Shiqi Zhang | Peter Stone | Xiaoping Chen
Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Reinforcement learning and probabilistic reasoning algorithms aim at learning from interaction experiences and reasoning with probabilistic contextual knowledge, respectively. In this research, we develop algorithms for robot task completion, while looking into the complementary strengths of reinforcement learning and probabilistic reasoning techniques. The robots learn from trial-and-error experiences to augment their declarative knowledge base, and the augmented knowledge can be used for speeding up the learning process in potentially different tasks. We have implemented and evaluated the developed algorithms using mobile robots conducting dialog and navigation tasks. From the results, we see that our robot’s performance can be improved by both reasoning with human knowledge and learning from task-completion experience. More interestingly, the robot was able to learn from navigation tasks to improve its dialog strategies.

Adaptive Dialog Policy Learning with Hindsight and User Modeling
Yan Cao | Keting Lu | Xiaoping Chen | Shiqi Zhang
Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Reinforcement learning (RL) methods have been widely used for learning dialog policies. Sample efficiency, i.e., the efficiency of learning from limited dialog experience, is particularly important in RL-based dialog policy learning, because interacting with people is costly and low-quality dialog policies produce a very poor user experience. In this paper, we develop LHUA (Learning with Hindsight, User modeling, and Adaptation) that, for the first time, enables dialog agents to adaptively learn with hindsight from both simulated and real users. Simulation and hindsight provide the dialog agent with more experience and more (positive) reinforcement, respectively. Experimental results suggest that LHUA outperforms competitive baselines from the literature, including its no-simulation, no-adaptation, and no-hindsight counterparts.


Sentence Weighting for Neural Machine Translation Domain Adaptation
Shiqi Zhang | Deyi Xiong
Proceedings of the 27th International Conference on Computational Linguistics

In this paper, we propose a new sentence weighting method for the domain adaptation of neural machine translation. We introduce a domain similarity metric to evaluate the relevance between a sentence and an entire available domain dataset. The similarity of each sentence to the target domain is calculated with various methods. The computed similarity is then integrated into the training objective to weight sentences. The adaptation results on both the IWSLT Chinese-English TED task and a task with only synthetic parallel training data show that our sentence weighting method achieves a significant improvement over strong baselines.
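
The abstract above does not give the weighted objective itself, so the following is only a minimal sketch of one common way a per-sentence weight can enter a training loss: a weighted negative log-likelihood. The function name `weighted_nll`, the normalization by the total weight, and the example values are assumptions made for illustration and are not taken from the paper.

```python
def weighted_nll(sentence_log_probs, weights):
    """Weighted negative log-likelihood over a batch of sentence pairs.

    sentence_log_probs: log p(target | source) for each sentence pair
        (hypothetical model outputs, one float per sentence).
    weights: a domain-similarity weight per sentence, assumed non-negative.

    Dividing by the total weight is an assumption of this sketch; it keeps
    the loss on the same scale as an unweighted average.
    """
    if len(sentence_log_probs) != len(weights):
        raise ValueError("need exactly one weight per sentence")
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(w * lp for w, lp in zip(weights, sentence_log_probs))
    return -weighted_sum / total_weight


# Two sentences: the first is assumed to be closer to the target domain.
log_probs = [-1.0, -3.0]
uniform = weighted_nll(log_probs, [1.0, 1.0])
in_domain_heavy = weighted_nll(log_probs, [3.0, 1.0])
print(uniform, in_domain_heavy)
```

With uniform weights the loss reduces to the ordinary average; raising the weight of a sentence makes its log-probability count for more in the objective, which is the general effect sentence weighting aims for.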