Conference on Empirical Methods in Natural Language Processing (2016)


pdf (full)
bib (full)
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
Jian Su | Kevin Duh | Xavier Carreras

pdf bib
Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles
James Cross | Liang Huang

pdf bib
Rule Extraction for Tree-to-Tree Transducers by Cost Minimization
Pascual Martínez-Gómez | Yusuke Miyao

pdf
A Neural Network for Coordination Boundary Prediction
Jessica Ficler | Yoav Goldberg

pdf
Using Left-corner Parsing to Encode Universal Structural Constraints in Grammar Induction
Hiroshi Noji | Yusuke Miyao | Mark Johnson

pdf
Distinguishing Past, On-going, and Future Events: The EventStatus Corpus
Ruihong Huang | Ignacio Cases | Dan Jurafsky | Cleo Condoravdi | Ellen Riloff

pdf
Nested Propositions in Open Information Extraction
Nikita Bhutani | H. V. Jagadish | Dragomir Radev

pdf
A Position Encoding Convolutional Neural Network Based on Dependency Tree for Relation Classification
Yunlun Yang | Yunhai Tong | Shulei Ma | Zhi-Hong Deng

pdf
Learning to Recognize Discontiguous Entities
Aldrian Obaja Muis | Wei Lu

pdf
Modeling Human Reading with Neural Attention
Michael Hahn | Frank Keller

pdf
Comparing Computational Cognitive Models of Generalization in a Language Acquisition Task
Libby Barak | Adele E. Goldberg | Suzanne Stevenson

pdf
Rationalizing Neural Predictions
Tao Lei | Regina Barzilay | Tommi Jaakkola

pdf
Deep Multi-Task Learning with Shared Memory for Text Classification
Pengfei Liu | Xipeng Qiu | Xuanjing Huang

pdf
Natural Language Comprehension with the EpiReader
Adam Trischler | Zheng Ye | Xingdi Yuan | Philip Bachman | Alessandro Sordoni | Kaheer Suleman

pdf
Creating Causal Embeddings for Question Answering with Minimal Supervision
Rebecca Sharp | Mihai Surdeanu | Peter Jansen | Peter Clark | Michael Hammond

pdf
Improving Semantic Parsing via Answer Type Inference
Semih Yavuz | Izzeddin Gür | Yu Su | Mudhakar Srivatsa | Xifeng Yan

pdf
Semantic Parsing to Probabilistic Programs for Situated Question Answering
Jayant Krishnamurthy | Oyvind Tafjord | Aniruddha Kembhavi

pdf
Event participant modelling with neural networks
Ottokar Tilk | Vera Demberg | Asad Sayeed | Dietrich Klakow | Stefan Thater

pdf
Context-Dependent Sense Embedding
Lin Qiu | Kewei Tu | Yong Yu

pdf
Jointly Embedding Knowledge Graphs and Logical Rules
Shu Guo | Quan Wang | Lihong Wang | Bin Wang | Li Guo

pdf
Learning Connective-based Word Representations for Implicit Discourse Relation Identification
Chloé Braud | Pascal Denis

pdf
Aspect Level Sentiment Classification with Deep Memory Network
Duyu Tang | Bing Qin | Ting Liu

pdf
Lifelong-RL: Lifelong Relaxation Labeling for Separating Entities and Aspects in Opinion Targets
Lei Shu | Bing Liu | Hu Xu | Annice Kim

pdf
Learning Sentence Embeddings with Auxiliary Tasks for Cross-Domain Sentiment Classification
Jianfei Yu | Jing Jiang

pdf
Attention-based LSTM Network for Cross-Lingual Sentiment Classification
Xinjie Zhou | Xiaojun Wan | Jianguo Xiao

pdf
Neural versus Phrase-Based Machine Translation Quality: a Case Study
Luisa Bentivogli | Arianna Bisazza | Mauro Cettolo | Marcello Federico

pdf
Zero-Resource Translation with Multi-Lingual Neural Machine Translation
Orhan Firat | Baskaran Sankaran | Yaser Al-Onaizan | Fatos T. Yarman Vural | Kyunghyun Cho

pdf
Memory-enhanced Decoder for Neural Machine Translation
Mingxuan Wang | Zhengdong Lu | Hang Li | Qun Liu

pdf
Semi-Supervised Learning of Sequence Models with Method of Moments
Zita Marinho | André F. T. Martins | Shay B. Cohen | Noah A. Smith

pdf
Learning from Explicit and Implicit Supervision Jointly For Algebra Word Problems
Shyam Upadhyay | Ming-Wei Chang | Kai-Wei Chang | Wen-tau Yih

pdf
TweeTime: A Minimally Supervised Method for Recognizing and Normalizing Time Expressions in Twitter
Jeniya Tabassum | Alan Ritter | Wei Xu

pdf
Language as a Latent Variable: Discrete Generative Models for Sentence Compression
Yishu Miao | Phil Blunsom

pdf
Globally Coherent Text Generation with Neural Checklist Models
Chloé Kiddon | Luke Zettlemoyer | Yejin Choi

pdf
A Dataset and Evaluation Metrics for Abstractive Compression of Sentences and Short Paragraphs
Kristina Toutanova | Chris Brockett | Ke M. Tran | Saleema Amershi

pdf
PaCCSS-IT: A Parallel Corpus of Complex-Simple Sentences for Automatic Text Simplification
Dominique Brunato | Andrea Cimino | Felice Dell’Orletta | Giulia Venturi

pdf
Discourse Parsing with Attention-based Hierarchical Neural Networks
Qi Li | Tianshi Li | Baobao Chang

pdf
Multi-view Response Selection for Human-Computer Conversation
Xiangyang Zhou | Daxiang Dong | Hua Wu | Shiqi Zhao | Dianhai Yu | Hao Tian | Xuan Liu | Rui Yan

pdf
Variational Neural Discourse Relation Recognizer
Biao Zhang | Deyi Xiong | Jinsong Su | Qun Liu | Rongrong Ji | Hong Duan | Min Zhang

pdf
Event Detection and Co-reference with Minimal Supervision
Haoruo Peng | Yangqiu Song | Dan Roth

pdf
Learning Term Embeddings for Taxonomic Relation Identification Using Dynamic Weighting Neural Network
Anh Tuan Luu | Yi Tay | Siu Cheung Hui | See Kiong Ng

pdf
Relation Schema Induction using Tensor Factorization with Side Information
Madhav Nimishakavi | Uday Singh Saini | Partha Talukdar

pdf
Supervised Distributional Hypernym Discovery via Domain Adaptation
Luis Espinosa-Anke | Jose Camacho-Collados | Claudio Delli Bovi | Horacio Saggion

pdf
Latent Tree Language Model
Tomáš Brychcín

pdf
Comparing Data Sources and Architectures for Deep Visual Representation Learning in Semantics
Douwe Kiela | Anita Lilla Verő | Stephen Clark

pdf
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui | Dong Huk Park | Daylen Yang | Anna Rohrbach | Trevor Darrell | Marcus Rohrbach

pdf
The Structured Weighted Violations Perceptron Algorithm
Rotem Dror | Roi Reichart

pdf
How Transferable are Neural Networks in NLP Applications?
Lili Mou | Zhao Meng | Rui Yan | Ge Li | Yan Xu | Lu Zhang | Zhi Jin

pdf
Morphological Priors for Probabilistic Neural Word Embeddings
Parminder Bhatia | Robert Guthrie | Jacob Eisenstein

pdf
Automatic Cross-Lingual Similarization of Dependency Grammars for Tree-based Machine Translation
Wenbin Jiang | Wen Zhang | Jinan Xu | Rangjia Cai

pdf
IRT-based Aggregation Model of Crowdsourced Pairwise Comparison for Evaluating Machine Translations
Naoki Otani | Toshiaki Nakazawa | Daisuke Kawahara | Sadao Kurohashi

pdf
Variational Neural Machine Translation
Biao Zhang | Deyi Xiong | Jinsong Su | Hong Duan | Min Zhang

pdf
Towards a Convex HMM Surrogate for Word Alignment
Andrei Simion | Michael Collins | Cliff Stein

pdf
Solving Verbal Questions in IQ Test by Knowledge-Powered Word Embedding
Huazheng Wang | Fei Tian | Bin Gao | Chengjieren Zhu | Jiang Bian | Tie-Yan Liu

pdf
Long Short-Term Memory-Networks for Machine Reading
Jianpeng Cheng | Li Dong | Mirella Lapata

pdf
On Generating Characteristic-rich Question Sets for QA Evaluation
Yu Su | Huan Sun | Brian Sadler | Mudhakar Srivatsa | Izzeddin Gür | Zenghui Yan | Xifeng Yan

pdf
Learning to Translate for Multilingual Question Answering
Ferhan Ture | Elizabeth Boschee

pdf
A Semiparametric Model for Bayesian Reader Identification
Ahmed Abdelwahab | Reinhold Kliegl | Niels Landwehr

pdf
Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora
William L. Hamilton | Kevin Clark | Jure Leskovec | Dan Jurafsky

pdf
Attention-based LSTM for Aspect-level Sentiment Classification
Yequan Wang | Minlie Huang | Xiaoyan Zhu | Li Zhao

pdf
Recursive Neural Conditional Random Fields for Aspect-based Sentiment Analysis
Wenya Wang | Sinno Jialin Pan | Daniel Dahlmeier | Xiaokui Xiao

pdf
Extracting Aspect Specific Opinion Expressions
Abhishek Laddha | Arjun Mukherjee

pdf
Emotion Distribution Learning from Texts
Deyu Zhou | Xuan Zhang | Yin Zhou | Quan Zhao | Xin Geng

pdf
Building an Evaluation Scale using Item Response Theory
John P. Lalor | Hao Wu | Hong Yu

pdf
WordRank: Learning Word Embeddings via Robust Ranking
Shihao Ji | Hyokun Yun | Pinar Yanardag | Shin Matsushima | S. V. N. Vishwanathan

pdf
Exploring Semantic Representation in Brain Activity Using Word Embeddings
Yu-Ping Ruan | Zhen-Hua Ling | Yu Hu

pdf
AMR Parsing with an Incremental Joint Model
Junsheng Zhou | Feiyu Xu | Hans Uszkoreit | Weiguang Qu | Ran Li | Yanhui Gu

pdf
Identifying Dogmatism in Social Media: Signals and Models
Ethan Fast | Eric Horvitz

pdf
Enhanced Personalized Search using Social Data
Dong Zhou | Séamus Lawless | Xuan Wu | Wenyu Zhao | Jianxun Liu

pdf
Effective Greedy Inference for Graph-based Non-Projective Dependency Parsing
Ilan Tchernowitz | Liron Yedidsion | Roi Reichart

pdf
Generating Abbreviations for Chinese Named Entities Using Recurrent Neural Network with Dynamic Dictionary
Qi Zhang | Jin Qian | Ya Guo | Yaqian Zhou | Xuanjing Huang

pdf
Neural Network for Heterogeneous Annotations
Hongshen Chen | Yue Zhang | Qun Liu

pdf
LAMB: A Good Shepherd of Morphologically Rich Languages
Sebastian Ebert | Thomas Müller | Hinrich Schütze

pdf
Fast Coupled Sequence Labeling on Heterogeneous Annotations via Context-aware Pruning
Zhenghua Li | Jiayuan Chao | Min Zhang | Jiwen Yang

pdf
Unsupervised Neural Dependency Parsing
Yong Jiang | Wenjuan Han | Kewei Tu

pdf
Generating Coherent Summaries of Scientific Articles Using Coherence Patterns
Daraksha Parveen | Mohsen Mesgar | Michael Strube

pdf
News Stream Summarization using Burst Information Networks
Tao Ge | Lei Cui | Baobao Chang | Sujian Li | Ming Zhou | Zhifang Sui

pdf
Rationale-Augmented Convolutional Neural Networks for Text Classification
Ye Zhang | Iain Marshall | Byron C. Wallace

pdf
Transferring User Interests Across Websites with Unstructured Text for Cold-Start Recommendation
Yu-Yang Huang | Shou-De Lin

pdf
Speculation and Negation Scope Detection via Convolutional Neural Networks
Zhong Qian | Peifeng Li | Qiaoming Zhu | Guodong Zhou | Zhunchen Luo | Wei Luo

pdf
Analyzing Linguistic Knowledge in Sequential Model of Sentence
Peng Qian | Xipeng Qiu | Xuanjing Huang

pdf
Keyphrase Extraction Using Deep Recurrent Neural Networks on Twitter
Qi Zhang | Yang Wang | Yeyun Gong | Xuanjing Huang

pdf
Solving and Generating Chinese Character Riddles
Chuanqi Tan | Furu Wei | Li Dong | Weifeng Lv | Ming Zhou

pdf
Structured prediction models for RNN based sequence labeling in clinical text
Abhyuday Jagannatha | Hong Yu

pdf
Learning to Represent Review with Tensor Decomposition for Spam Detection
Xuepeng Wang | Kang Liu | Shizhu He | Jun Zhao

pdf
Stance Detection with Bidirectional Conditional Encoding
Isabelle Augenstein | Tim Rocktäschel | Andreas Vlachos | Kalina Bontcheva

pdf
Modeling Skip-Grams for Event Detection with Convolutional Neural Networks
Thien Huu Nguyen | Ralph Grishman

pdf
Porting an Open Information Extraction System from English to German
Tobias Falke | Gabriel Stanovsky | Iryna Gurevych | Ido Dagan

pdf
Named Entity Recognition for Novel Types by Transfer Learning
Lizhen Qu | Gabriela Ferraro | Liyuan Zhou | Weiwei Hou | Timothy Baldwin

pdf
Extracting Subevents via an Effective Two-phase Approach
Allison Badgett | Ruihong Huang

pdf
Gaussian Visual-Linguistic Embedding for Zero-Shot Recognition
Tanmoy Mukherjee | Timothy Hospedales

pdf
Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions
Arijit Ray | Gordon Christie | Mohit Bansal | Dhruv Batra | Devi Parikh

pdf
Sort Story: Sorting Jumbled Images and Captions into Stories
Harsh Agrawal | Arjun Chandrasekaran | Dhruv Batra | Devi Parikh | Mohit Bansal

pdf
Human Attention in Visual Question Answering: Do Humans and Deep Networks look at the same regions?
Abhishek Das | Harsh Agrawal | Larry Zitnick | Devi Parikh | Dhruv Batra

pdf
Recurrent Residual Learning for Sequence Classification
Yiren Wang | Fei Tian

pdf
Richer Interpolative Smoothing Based on Modified Kneser-Ney Language Modeling
Ehsan Shareghi | Trevor Cohn | Gholamreza Haffari

pdf
A General Regularization Framework for Domain Adaptation
Wei Lu | Hai Leong Chieu | Jonathan Löfgren

pdf
Coverage Embedding Models for Neural Machine Translation
Haitao Mi | Baskaran Sankaran | Zhiguo Wang | Abe Ittycheriah

pdf
Neural Morphological Analysis: Encoding-Decoding Canonical Segments
Katharina Kann | Ryan Cotterell | Hinrich Schütze

pdf
Exploiting Mutual Benefits between Syntax and Semantic Roles using Neural Network
Peng Shi | Zhiyang Teng | Yue Zhang

pdf
The Effects of Data Size and Frequency Range on Distributional Semantic Models
Magnus Sahlgren | Alessandro Lenci

pdf
Multi-Granularity Chinese Word Embedding
Rongchao Yin | Quan Wang | Peng Li | Rui Li | Bin Wang

pdf
Numerically Grounded Language Models for Semantic Error Correction
Georgios Spithourakis | Isabelle Augenstein | Sebastian Riedel

pdf
Towards Semi-Automatic Generation of Proposition Banks for Low-Resource Languages
Alan Akbik | Vishwajeet Kumar | Yunyao Li

pdf
A Hierarchical Model of Reviews for Aspect-based Sentiment Analysis
Sebastian Ruder | Parsa Ghaffari | John G. Breslin

pdf
Are Word Embedding-based Features Useful for Sarcasm Detection?
Aditya Joshi | Vaibhav Tripathi | Kevin Patel | Pushpak Bhattacharyya | Mark Carman

pdf
Weakly Supervised Tweet Stance Classification by Relational Bootstrapping
Javid Ebrahimi | Dejing Dou | Daniel Lowd

pdf
The Gun Violence Database: A new task and data set for NLP
Ellie Pavlick | Heng Ji | Xiaoman Pan | Chris Callison-Burch

pdf
Fluency detection on communication networks
Tom Lippincott | Benjamin Van Durme

pdf
Characterizing the Language of Online Communities and its Relation to Community Reception
Trang Tran | Mari Ostendorf

pdf
Joint Transition-based Dependency Parsing and Disfluency Detection for Automatic Speech Recognition Texts
Masashi Yoshikawa | Hiroyuki Shindo | Yuji Matsumoto

pdf
Real-Time Speech Emotion and Sentiment Recognition for Interactive Dialogue Systems
Dario Bertero | Farhad Bin Siddique | Chien-Sheng Wu | Yan Wan | Ricky Ho Yin Chan | Pascale Fung

pdf
A Neural Network Architecture for Multilingual Punctuation Generation
Miguel Ballesteros | Leo Wanner

pdf
Neural Headline Generation on Abstract Meaning Representation
Sho Takase | Jun Suzuki | Naoaki Okazaki | Tsutomu Hirao | Masaaki Nagata

pdf
Robust Gram Embeddings
Taygun Kekeç | David M. J. Tax

pdf
SimpleScience: Lexical Simplification of Scientific Terminology
Yea-Seul Kim | Jessica Hullman | Matthew Burgess | Eytan Adar

pdf
Automatic Features for Essay Scoring – An Empirical Study
Fei Dong | Yue Zhang

pdf
Semantic Parsing with Semi-Supervised Sequential Autoencoders
Tomáš Kočiský | Gábor Melis | Edward Grefenstette | Chris Dyer | Wang Ling | Phil Blunsom | Karl Moritz Hermann

pdf
Equation Parsing: Mapping Sentences to Grounded Equations
Subhro Roy | Shyam Upadhyay | Dan Roth

pdf
Automatic Extraction of Implicit Interpretations from Modal Constructions
Jordan Sanders | Eduardo Blanco

pdf
Understanding Negation in Positive Terms Using Syntactic Dependencies
Zahra Sarabi | Eduardo Blanco

pdf
Demographic Dialectal Variation in Social Media: A Case Study of African-American English
Su Lin Blodgett | Lisa Green | Brendan O’Connor

pdf
Understanding Language Preference for Expression of Opinion and Sentiment: What do Hindi-English Speakers do on Twitter?
Koustav Rudra | Shruti Rijhwani | Rafiya Begum | Kalika Bali | Monojit Choudhury | Niloy Ganguly

pdf
Detecting and Characterizing Events
Allison Chaney | Hanna Wallach | Matthew Connelly | David Blei

pdf
Convolutional Neural Network Language Models
Ngoc-Quan Pham | German Kruszewski | Gemma Boleda

pdf
Generalizing and Hybridizing Count-based and Neural Language Models
Graham Neubig | Chris Dyer

pdf
Reasoning about Pragmatics with Neural Listeners and Speakers
Jacob Andreas | Dan Klein

pdf
Generating Topical Poetry
Marjan Ghazvininejad | Xing Shi | Yejin Choi | Kevin Knight

pdf
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li | Will Monroe | Alan Ritter | Dan Jurafsky | Michel Galley | Jianfeng Gao

pdf
Neural Text Generation from Structured Data with Application to the Biography Domain
Rémi Lebret | David Grangier | Michael Auli

pdf
What makes a convincing argument? Empirical analysis and detecting attributes of convincingness in Web argumentation
Ivan Habernal | Iryna Gurevych

pdf
Recognizing Implicit Discourse Relations via Repeated Reading: Neural Networks with Multi-Level Attention
Yang Liu | Sujian Li

pdf
Antecedent Selection for Sluicing: Structure and Content
Pranav Anand | Daniel Hardt

pdf
Intra-Sentential Subject Zero Anaphora Resolution using Multi-Column Convolutional Neural Network
Ryu Iida | Kentaro Torisawa | Jong-Hoon Oh | Canasai Kruengkrai | Julien Kloetzer

pdf
An Unsupervised Probability Model for Speech-to-Translation Alignment of Low-Resource Languages
Antonios Anastasopoulos | David Chiang | Long Duong

pdf
HUME: Human UCCA-Based Evaluation of Machine Translation
Alexandra Birch | Omri Abend | Ondřej Bojar | Barry Haddow

pdf
Improving Multilingual Named Entity Recognition with Wikipedia Entity Type Mapping
Jian Ni | Radu Florian

pdf
Learning Crosslingual Word Embeddings without Bilingual Corpora
Long Duong | Hiroshi Kanayama | Tengfei Ma | Steven Bird | Trevor Cohn

pdf
Sequence-to-Sequence Learning as Beam-Search Optimization
Sam Wiseman | Alexander M. Rush

pdf
Online Segment to Segment Neural Transduction
Lei Yu | Jan Buys | Phil Blunsom

pdf
Sequence-Level Knowledge Distillation
Yoon Kim | Alexander M. Rush

pdf
Controlling Output Length in Neural Encoder-Decoders
Yuta Kikuchi | Graham Neubig | Ryohei Sasano | Hiroya Takamura | Manabu Okumura

pdf
Poet Admits // Mute Cypher: Beam Search to find Mutually Enciphering Poetic Texts
Cole Peterson | Alona Fyshe

pdf
All Fingers are not Equal: Intensity of References in Scientific Articles
Tanmoy Chakraborty | Ramasuri Narayanam

pdf
Improving Users’ Demographic Prediction via the Videos They Talk about
Yuan Wang | Yang Xiao | Chao Ma | Zhen Xiao

pdf
AFET: Automatic Fine-Grained Entity Typing by Hierarchical Partial-Label Embedding
Xiang Ren | Wenqi He | Meng Qu | Lifu Huang | Heng Ji | Jiawei Han

pdf
Mining Inference Formulas by Goal-Directed Random Walks
Zhuoyu Wei | Jun Zhao | Kang Liu

pdf
Lifted Rule Injection for Relation Embeddings
Thomas Demeester | Tim Rocktäschel | Sebastian Riedel

pdf
Key-Value Memory Networks for Directly Reading Documents
Alexander Miller | Adam Fisch | Jesse Dodge | Amir-Hossein Karimi | Antoine Bordes | Jason Weston

pdf
Analyzing Framing through the Casts of Characters in the News
Dallas Card | Justin Gross | Amber Boydstun | Noah A. Smith

pdf
The Teams Corpus and Entrainment in Multi-Party Spoken Dialogues
Diane Litman | Susannah Paletz | Zahra Rahimi | Stefani Allegretti | Caitlin Rice

pdf
Personalized Emphasis Framing for Persuasive Message Generation
Tao Ding | Shimei Pan

pdf
Cross Sentence Inference for Process Knowledge
Samuel Louvan | Chetan Naik | Sadhana Kumaravel | Heeyoung Kwon | Niranjan Balasubramanian | Peter Clark

pdf
Toward Socially-Infused Information Extraction: Embedding Authors, Mentions, and Entities
Yi Yang | Ming-Wei Chang | Jacob Eisenstein

pdf
Phonologically Aware Neural Model for Named Entity Recognition in Low Resource Transfer Settings
Akash Bharadwaj | David Mortensen | Chris Dyer | Jaime Carbonell

pdf
Long-Short Range Context Neural Networks for Language Modeling
Youssef Oualil | Mittul Singh | Clayton Greenberg | Dietrich Klakow

pdf
Jointly Learning Grounded Task Structures from Language Instruction and Visual Demonstration
Changsong Liu | Shaohua Yang | Sari Saba-Sadiya | Nishant Shukla | Yunzhong He | Song-Chun Zhu | Joyce Chai

pdf
Resolving Language and Vision Ambiguities Together: Joint Segmentation & Prepositional Attachment Resolution in Captioned Scenes
Gordon Christie | Ankit Laddha | Aishwarya Agrawal | Stanislaw Antol | Yash Goyal | Kevin Kochersberger | Dhruv Batra

pdf
Charagram: Embedding Words and Sentences via Character n-grams
John Wieting | Mohit Bansal | Kevin Gimpel | Karen Livescu

pdf
Length bias in Encoder Decoder Models and a Case for Global Conditioning
Pavel Sountsov | Sunita Sarawagi

pdf
Does String-Based Neural MT Learn Source Syntax?
Xing Shi | Inkit Padhi | Kevin Knight

pdf
Exploiting Source-side Monolingual Data in Neural Machine Translation
Jiajun Zhang | Chengqing Zong

pdf
Phrase-based Machine Translation is State-of-the-Art for Automatic Grammatical Error Correction
Marcin Junczys-Dowmunt | Roman Grundkiewicz

pdf
Incorporating Discrete Translation Lexicons into Neural Machine Translation
Philip Arthur | Graham Neubig | Satoshi Nakamura

pdf
Transfer Learning for Low-Resource Neural Machine Translation
Barret Zoph | Deniz Yuret | Jonathan May | Kevin Knight

pdf
MixKMeans: Clustering Question-Answer Archives
Deepak P

pdf
It Takes Three to Tango: Triangulation Approach to Answer Ranking in Community Question Answering
Preslav Nakov | Lluís Màrquez | Francisco Guzmán

pdf
Character-Level Question Answering with Attention
Xiaodong He | David Golub

pdf
Learning to Generate Textual Data
Guillaume Bouchard | Pontus Stenetorp | Sebastian Riedel

pdf
A Theme-Rewriting Approach for Generating Algebra Word Problems
Rik Koncel-Kedziorski | Ioannis Konstas | Luke Zettlemoyer | Hannaneh Hajishirzi

pdf
Context-Sensitive Lexicon Features for Neural Sentiment Analysis
Zhiyang Teng | Duy-Tin Vo | Yue Zhang

pdf
Event-Driven Emotion Cause Extraction with Corpus Construction
Lin Gui | Dongyin Wu | Ruifeng Xu | Qin Lu | Yu Zhou

pdf
Neural Sentiment Classification with User and Product Attention
Huimin Chen | Maosong Sun | Cunchao Tu | Yankai Lin | Zhiyuan Liu

pdf
Cached Long Short-Term Memory Neural Networks for Document-Level Sentiment Classification
Jiacheng Xu | Danlu Chen | Xipeng Qiu | Xuanjing Huang

pdf
Deep Neural Networks with Massive Learned Knowledge
Zhiting Hu | Zichao Yang | Ruslan Salakhutdinov | Eric Xing

pdf
De-Conflated Semantic Representations
Mohammad Taher Pilehvar | Nigel Collier

pdf
Improving Sparse Word Representations with Distributional Inference for Semantic Composition
Thomas Kober | Julie Weeds | Jeremy Reffin | David Weir

pdf
Modelling Interaction of Sentence Pair with Coupled-LSTMs
Pengfei Liu | Xipeng Qiu | Yaqian Zhou | Jifan Chen | Xuanjing Huang

pdf
Universal Decompositional Semantics on Universal Dependencies
Aaron Steven White | Drew Reisinger | Keisuke Sakaguchi | Tim Vieira | Sheng Zhang | Rachel Rudinger | Kyle Rawlins | Benjamin Van Durme

pdf
Friends with Motives: Using Text to Infer Influence on SCOTUS
Yanchuan Sim | Bryan Routledge | Noah A. Smith

pdf
Verb Phrase Ellipsis Resolution Using Discriminative and Margin-Infused Algorithms
Kian Kenyon-Dean | Jackie Chi Kit Cheung | Doina Precup

pdf
Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser
Adhiguna Kuncoro | Miguel Ballesteros | Lingpeng Kong | Chris Dyer | Noah A. Smith

pdf
LSTM Shift-Reduce CCG Parsing
Wenduan Xu

pdf
An Evaluation of Parser Robustness for Ungrammatical Sentences
Homa B. Hashemi | Rebecca Hwa

pdf
Neural Shift-Reduce CCG Semantic Parsing
Dipendra Kumar Misra | Yoav Artzi

pdf
Syntactic Parsing of Web Queries
Xiangyan Sun | Haixun Wang | Yanghua Xiao | Zhongyuan Wang

pdf
Unsupervised Text Recap Extraction for TV Series
Hongliang Yu | Shikun Zhang | Louis-Philippe Morency

pdf
On- and Off-Topic Classification and Semantic Annotation of User-Generated Software Requirements
Markus Dollmann | Michaela Geierhos

pdf
Deceptive Review Spam Detection via Exploiting Task Relatedness and Unlabeled Data
Zhen Hai | Peilin Zhao | Peng Cheng | Peng Yang | Xiao-Li Li | Guangxia Li

pdf
Regularizing Text Categorization with Clusters of Words
Konstantinos Skianis | François Rousseau | Michalis Vazirgiannis

pdf
Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads
Ji He | Mari Ostendorf | Xiaodong He | Jianshu Chen | Jianfeng Gao | Lihong Li | Li Deng

pdf
Non-Literal Text Reuse in Historical Texts: An Approach to Identify Reuse Transformations and its Application to Bible Reuse
Maria Moritz | Andreas Wiederhold | Barbara Pavlek | Yuri Bizzoni | Marco Büchler

pdf
A Graph Degeneracy-based Approach to Keyword Extraction
Antoine Tixier | Fragkiskos Malliaros | Michalis Vazirgiannis

pdf
Predicting the Relative Difficulty of Single Sentences With and Without Surrounding Context
Elliot Schumacher | Maxine Eskenazi | Gwen Frishkoff | Kevyn Collins-Thompson

pdf
A Neural Approach to Automated Essay Scoring
Kaveh Taghipour | Hwee Tou Ng

pdf
Non-uniform Language Detection in Technical Writing
Weibo Wang | Abidalrahman Moh’d | Aminul Islam | Axel Soto | Evangelos Milios

pdf
Adapting Grammatical Error Correction Based on the Native Language of Writers with Neural Network Joint Models
Shamil Chollampatt | Duc Tam Hoang | Hwee Tou Ng

pdf
Orthographic Syllable as basic unit for SMT between Related Languages
Anoop Kunchukuttan | Pushpak Bhattacharyya

pdf
Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge
Nicholas Locascio | Karthik Narasimhan | Eduardo DeLeon | Nate Kushman | Regina Barzilay

pdf
Supervised Keyphrase Extraction as Positive Unlabeled Learning
Lucas Sterckx | Cornelia Caragea | Thomas Demeester | Chris Develder

pdf
Learning to Answer Questions from Wikipedia Infoboxes
Alvaro Morales | Varot Premtoon | Cordelia Avery | Sue Felshin | Boris Katz

pdf
Timeline extraction using distant supervision and joint inference
Savelie Cornegruta | Andreas Vlachos

pdf
Combining Supervised and Unsupervised Ensembles for Knowledge Base Population
Nazneen Fatema Rajani | Raymond Mooney

pdf
Character Sequence Models for Colorful Words
Kazuya Kawakami | Chris Dyer | Bryan Routledge | Noah A. Smith

pdf
Analyzing the Behavior of Visual Question Answering Models
Aishwarya Agrawal | Dhruv Batra | Devi Parikh

pdf
Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
Subhashini Venugopalan | Lisa Anne Hendricks | Raymond Mooney | Kate Saenko

pdf
Representing Verbs with Rich Contexts: an Evaluation on Verb Similarity
Emmanuele Chersoni | Enrico Santus | Alessandro Lenci | Philippe Blache | Chu-Ren Huang

pdf
Speed-Accuracy Tradeoffs in Tagging with Variable-Order CRFs and Structured Sparsity
Tim Vieira | Ryan Cotterell | Jason Eisner

pdf
Learning Robust Representations of Text
Yitong Li | Trevor Cohn | Timothy Baldwin

pdf
Modified Dirichlet Distribution: Allowing Negative Parameters to Induce Stronger Sparsity
Kewei Tu

pdf
Gated Word-Character Recurrent Language Model
Yasumasa Miyamoto | Kyunghyun Cho

pdf
Unsupervised Word Alignment by Agreement Under ITG Constraint
Hidetaka Kamigaito | Akihiro Tamura | Hiroya Takamura | Manabu Okumura | Eiichiro Sumita

pdf
Training with Exploration Improves a Greedy Stack LSTM Parser
Miguel Ballesteros | Yoav Goldberg | Chris Dyer | Noah A. Smith

pdf
Capturing Argument Relationship for Chinese Semantic Role Labeling
Lei Sha | Sujian Li | Baobao Chang | Zhifang Sui | Tingsong Jiang

pdf
BrainBench: A Brain-Image Test Suite for Distributional Semantic Models
Haoyan Xu | Brian Murphy | Alona Fyshe

pdf
Evaluating Induced CCG Parsers on Grounded Semantic Parsing
Yonatan Bisk | Siva Reddy | John Blitzer | Julia Hockenmaier | Mark Steedman

pdf
Vector-space models for PPDB paraphrase ranking in context
Marianna Apidianaki

pdf
Interpreting Neural Networks to Improve Politeness Comprehension
Malika Aubakirova | Mohit Bansal

pdf
Does ‘well-being’ translate on Twitter?
Laura Smith | Salvatore Giorgi | Rishi Solanki | Johannes Eichstaedt | H. Andrew Schwartz | Muhammad Abdul-Mageed | Anneke Buffone | Lyle Ungar

pdf
Beyond Canonical Texts: A Computational Analysis of Fanfiction
Smitha Milli | David Bamman

pdf
Using Syntactic and Semantic Context to Explore Psychodemographic Differences in Self-reference
Masoud Rouhizadeh | Lyle Ungar | Anneke Buffone | H. Andrew Schwartz

pdf
Learning to Identify Metaphors from a Corpus of Proverbs
Gözde Özbal | Carlo Strapparava | Serra Sinem Tekiroğlu | Daniele Pighin

pdf
An Embedding Model for Predicting Roll-Call Votes
Peter Kraft | Hirsh Jain | Alexander M. Rush

pdf
Natural Language Model Re-usability for Scaling to Different Domains
Young-Bum Kim | Alexandre Rochette | Ruhi Sarikaya

pdf
Leveraging Sentence-level Information with Encoder LSTM for Semantic Slot Filling
Gakuto Kurata | Bing Xiang | Bowen Zhou | Mo Yu

pdf
AMR-to-text generation as a Traveling Salesman Problem
Linfeng Song | Yue Zhang | Xiaochang Peng | Zhiguo Wang | Daniel Gildea

pdf
Learning to Capitalize with Character-Level Recurrent Neural Networks: An Empirical Study
Raymond Hendy Susanto | Hai Leong Chieu | Wei Lu

pdf
The Effects of the Content of FOMC Communications on US Treasury Rates
Christopher Rohlfs | Sunandan Chakraborty | Lakshminarayanan Subramanian

pdf
Learning to refine text based recommendations
Youyang Gu | Tao Lei | Regina Barzilay | Tommi Jaakkola

pdf
There’s No Comparison: Reference-less Evaluation Metrics in Grammatical Error Correction
Courtney Napoles | Keisuke Sakaguchi | Joel Tetreault

pdf
Cultural Shift or Linguistic Drift? Comparing Two Computational Measures of Semantic Change
William L. Hamilton | Jure Leskovec | Dan Jurafsky

pdf
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
Chia-Wei Liu | Ryan Lowe | Iulian Serban | Mike Noseworthy | Laurent Charlin | Joelle Pineau

pdf
Addressee and Response Selection for Multi-Party Conversation
Hiroki Ouchi | Yuta Tsuboi

pdf
Nonparametric Bayesian Models for Spoken Language Understanding
Kei Wakabayashi | Johane Takeuchi | Kotaro Funakoshi | Mikio Nakano

pdf
Conditional Generation and Snapshot Learning in Neural Dialogue Systems
Tsung-Hsien Wen | Milica Gašić | Nikola Mrkšić | Lina M. Rojas-Barahona | Pei-Hao Su | Stefan Ultes | David Vandyke | Steve Young

pdf
Relations such as Hypernymy: Identifying and Exploiting Hearst Patterns in Distributional Vectors for Lexical Entailment
Stephen Roller | Katrin Erk

pdf
SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity
Daniela Gerz | Ivan Vulić | Felix Hill | Roi Reichart | Anna Korhonen

pdf
POLY: Mining Relational Paraphrases from Multilingual Sentences
Adam Grycner | Gerhard Weikum

pdf
Exploiting Sentence Similarities for Better Alignments
Tao Li | Vivek Srikumar

pdf
Bi-directional Attention with Agreement for Dependency Parsing
Hao Cheng | Hao Fang | Xiaodong He | Jianfeng Gao | Li Deng

pdf
Anchoring and Agreement in Syntactic Annotations
Yevgeni Berzak | Yan Huang | Andrei Barbu | Anna Korhonen | Boris Katz

pdf
Tense Manages to Predict Implicative Behavior in Verbs
Ellie Pavlick | Chris Callison-Burch

pdf
Who did What: A Large-Scale Person-Centered Cloze Dataset
Takeshi Onishi | Hai Wang | Mohit Bansal | Kevin Gimpel | David McAllester

pdf
Building compositional semantics and higher-order inference system for a wide-coverage Japanese CCG parser
Koji Mineshima | Ribeka Tanaka | Pascual Martínez-Gómez | Yusuke Miyao | Daisuke Bekki

pdf
Learning to Generate Compositional Color Descriptions
Will Monroe | Noah D. Goodman | Christopher Potts

pdf
A Decomposable Attention Model for Natural Language Inference
Ankur Parikh | Oscar Täckström | Dipanjan Das | Jakob Uszkoreit

pdf
Deep Reinforcement Learning for Mention-Ranking Coreference Models
Kevin Clark | Christopher D. Manning

pdf
A Stacking Gated Neural Architecture for Implicit Discourse Relation Classification
Lianhui Qin | Zhisong Zhang | Hai Zhao

pdf
Insertion Position Selection Model for Flexible Non-Terminals in Dependency Tree-to-Tree Machine Translation
Toshiaki Nakazawa | John Richardson | Sadao Kurohashi

pdf
Why Neural Translations are the Right Length
Xing Shi | Kevin Knight | Deniz Yuret

Supervised Attentions for Neural Machine Translation
Haitao Mi | Zhiguo Wang | Abe Ittycheriah

Learning principled bilingual mappings of word embeddings while preserving monolingual invariance
Mikel Artetxe | Gorka Labaka | Eneko Agirre

Measuring the behavioral impact of machine translation quality improvements with A/B testing
Ben Russell | Duncan Gillespie

Creating a Large Benchmark for Open Information Extraction
Gabriel Stanovsky | Ido Dagan

Bilingually-constrained Synthetic Data for Implicit Discourse Relation Recognition
Changxing Wu | Xiaodong Shi | Yidong Chen | Yanzhou Huang | Jinsong Su

Transition-Based Dependency Parsing with Heuristic Backtracking
Jacob Buckman | Miguel Ballesteros | Chris Dyer

Word Ordering Without Syntax
Allen Schmaltz | Alexander M. Rush | Stuart Shieber

Morphological Segmentation Inside-Out
Ryan Cotterell | Arun Kumar | Hinrich Schütze

Parsing as Language Modeling
Do Kook Choe | Eugene Charniak

Human-in-the-Loop Parsing
Luheng He | Julian Michael | Mike Lewis | Luke Zettlemoyer

Unsupervised Timeline Generation for Wikipedia History Articles
Sandro Bauer | Simone Teufel

Encoding Temporal Information for Time-Aware Link Prediction
Tingsong Jiang | Tianyu Liu | Tao Ge | Lei Sha | Sujian Li | Baobao Chang | Zhifang Sui

Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning
Karthik Narasimhan | Adam Yala | Regina Barzilay

Global Neural CCG Parsing with Optimality Guarantees
Kenton Lee | Mike Lewis | Luke Zettlemoyer

Learning a Lexicon and Translation Model from Phoneme Lattices
Oliver Adams | Graham Neubig | Trevor Cohn | Steven Bird | Quoc Truong Do | Satoshi Nakamura

SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar | Jian Zhang | Konstantin Lopyrev | Percy Liang



Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

Practical Neural Networks for NLP: From Theory to Code
Chris Dyer | Yoav Goldberg | Graham Neubig

This tutorial aims to bring NLP researchers up to speed with the current techniques in deep learning and neural networks, and show them how they can turn their ideas into practical implementations. We will start with simple classification models (logistic regression and multilayer perceptrons) and cover more advanced patterns that come up in NLP such as recurrent networks for sequence tagging and prediction problems, structured networks (e.g., compositional architectures based on syntax trees), structured output spaces (sequences and trees), attention for sequence-to-sequence transduction, and feature induction for complex algorithm states. A particular emphasis will be on learning to represent complex objects as recursive compositions of simpler objects. This representation will characterize standard objects in NLP, such as the composition of characters and morphemes into words, and words into sentences and documents. In addition, new opportunities such as learning to embed "algorithm states" such as those used in transition-based parsing and other sequential structured prediction models (for which effective features may be difficult to engineer by hand) will be covered.

Everything in the tutorial will be grounded in code: we will show how to program seemingly complex neural-net models using toolkits based on the computation-graph formalism. Computation graphs decompose complex computations into a DAG, with nodes representing inputs, target outputs, parameters, or (sub)differentiable functions (e.g., "tanh", "matrix multiply", and "softmax"), and edges representing data dependencies. These graphs can be run "forward" to make predictions and compute errors (e.g., log loss, squared error) and then "backward" to compute derivatives with respect to model parameters. In particular we'll cover the Python bindings of the CNN library. CNN has been designed from the ground up for NLP applications, dynamically structured NNs, rapid prototyping, and a transparent data and execution model.
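
The forward and backward passes over a computation graph can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the CNN library's API: the `Node` class and the helpers `add`, `mul`, `tanh`, `squared_error`, `backward`, and `run_example` are hypothetical names chosen for this example, and every value is a scalar rather than a tensor.

```python
import math


class Node:
    """One scalar node of a computation graph (hypothetical minimal design)."""

    def __init__(self, value, parents=(), local_grads=()):
        self.value = value              # result of the forward computation
        self.parents = parents          # nodes this node depends on (incoming edges)
        self.local_grads = local_grads  # d(self)/d(parent), one entry per parent
        self.grad = 0.0                 # d(loss)/d(self), filled in by backward()


def add(a, b):
    return Node(a.value + b.value, (a, b), (1.0, 1.0))


def mul(a, b):
    return Node(a.value * b.value, (a, b), (b.value, a.value))


def tanh(a):
    t = math.tanh(a.value)
    return Node(t, (a,), (1.0 - t * t,))


def squared_error(pred, target):
    diff = pred.value - target.value
    return Node(diff * diff, (pred, target), (2.0 * diff, -2.0 * diff))


def backward(loss):
    """Propagate d(loss)/d(node) from the loss back to every ancestor."""
    order, seen = [], set()

    def visit(node):
        # Depth-first search: parents are appended before their children.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent in node.parents:
            visit(parent)
        order.append(node)

    visit(loss)
    loss.grad = 1.0
    # Walk in reverse topological order so a node's gradient is complete
    # before it is pushed to its parents (chain rule).
    for node in reversed(order):
        for parent, local in zip(node.parents, node.local_grads):
            parent.grad += node.grad * local


def run_example():
    # Inputs, parameters, and target are leaf nodes (illustrative values).
    x, y = Node(0.5), Node(1.0)
    w, b = Node(-1.2), Node(0.3)
    # Forward pass: prediction and error.
    pred = tanh(add(mul(w, x), b))
    loss = squared_error(pred, y)
    # Backward pass: derivatives with respect to the parameters.
    backward(loss)
    return loss, w, b
```

After `run_example()`, `w.grad` and `b.grad` hold the derivatives of the squared error with respect to the two parameters, which is what a gradient-based optimizer would consume.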

Advanced Markov Logic Techniques for Scalable Joint Inference in NLP
Deepak Venugopal | Vibhav Gogate | Vincent Ng

In the early days of the statistical NLP era, many language processing tasks were tackled using the so-called pipeline architecture: the given task is broken into a series of sub-tasks such that the output of one sub-task is an input to the next sub-task in the sequence. The pipeline architecture is appealing for various reasons, including modularity, modeling convenience, and manageable computational complexity. However, it suffers from the error propagation problem: errors made in one sub-task are propagated to the next sub-task in the sequence, leading to poor accuracy on that sub-task, which in turn leads to more errors downstream. Another disadvantage associated with it is lack of feedback: errors made in a sub-task are often not corrected using knowledge uncovered while solving another sub-task down the pipeline.

Realizing these weaknesses, researchers have turned to joint inference approaches in recent years. One such approach involves the use of Markov logic, which is defined as a set of weighted first-order logic formulas and, at a high level, unifies first-order logic with probabilistic graphical models. It is an ideal modeling language (knowledge representation) for compactly representing relational and uncertain knowledge in NLP. In a typical use case of MLNs in NLP, the application designer describes the background knowledge using a few first-order logic sentences and then uses software packages such as Alchemy, Tuffy, and Markov the beast to perform learning and inference (prediction) over the MLN. However, despite its obvious advantages, over the years, researchers and practitioners have found it difficult to use MLNs effectively in many NLP applications. The main reason for this is that it is hard to scale inference and learning algorithms for MLNs to the large datasets and complex models that are typical in NLP.

In this tutorial, we will introduce the audience to recent advances in scaling up inference and learning in MLNs as well as new approaches to make MLNs a "black box" for NLP applications (with only minor tuning required on the part of the user). Specifically, we will introduce attendees to a key idea that has emerged in the MLN research community over the last few years, lifted inference, which refers to inference techniques that take advantage of symmetries (e.g., synonyms), both exact and approximate, in the MLN. We will describe how these next-generation inference techniques can be used to perform effective joint inference. We will also present our new software package for inference and learning in MLNs, Alchemy 2.0, which is based on lifted inference, focusing primarily on how it can be used to scale up inference and learning in large models and datasets for applications such as semantic similarity determination, information extraction and question answering.
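
To make the scaling problem and the symmetry idea concrete, here is a toy sketch in Python. It assumes the standard Markov logic distribution, in which a world x has probability exp(sum_i w_i * n_i(x)) / Z, where n_i(x) counts the true groundings of formula i. The domain, the single formula Smokes(p) => Cancer(p), its weight, and all function names are hypothetical choices for illustration; this is not the algorithm implemented in Alchemy 2.0, only the simplest case in which interchangeable individuals let the partition function Z be computed without enumerating every world.

```python
import itertools
import math

PEOPLE = ["anna", "bob", "carl"]  # hypothetical domain of constants
WEIGHT = 1.5                      # hypothetical weight of Smokes(p) => Cancer(p)


def count_true_groundings(world):
    """world maps person -> (smokes, cancer); count groundings where the implication holds."""
    return sum(1 for smokes, cancer in world.values() if (not smokes) or cancer)


def brute_force_partition():
    """Sum unnormalized weights over all 4^n worlds (exponential in the domain size)."""
    states = [(s, c) for s in (False, True) for c in (False, True)]
    total = 0.0
    for assignment in itertools.product(states, repeat=len(PEOPLE)):
        world = dict(zip(PEOPLE, assignment))
        total += math.exp(WEIGHT * count_true_groundings(world))
    return total


def lifted_partition():
    """Exploit symmetry: every person contributes the same factor, so compute it once.

    For one person, three of the four (smokes, cancer) states satisfy the
    implication and one violates it, giving a factor of 3 * e^w + 1.
    """
    per_person = 3.0 * math.exp(WEIGHT) + 1.0
    return per_person ** len(PEOPLE)


def world_probability(world, z):
    """Probability of a single world given the partition function z."""
    return math.exp(WEIGHT * count_true_groundings(world)) / z
```

Both functions return the same value of Z, but the brute-force version visits 4^n worlds while the symmetric version does constant work per domain, which is the kind of saving lifted inference aims for on much richer models.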

Lifelong Machine Learning for Natural Language Processing
Zhiyuan Chen | Bing Liu

Machine learning (ML) has been successfully used as a prevalent approach to solving numerous NLP problems. However, the classic ML paradigm learns in isolation. That is, given a dataset, an ML algorithm is executed on the dataset to produce a model without using any related or prior knowledge. Although this type of isolated learning is very useful, it also has serious limitations as it does not accumulate knowledge learned in the past and use the knowledge to help future learning, which is the hallmark of human learning and human intelligence. Lifelong machine learning (LML) aims to achieve this capability. Specifically, it aims to design and develop computational learning systems and algorithms that learn as humans do, i.e., retaining the results learned in the past, abstracting knowledge from them, and using the knowledge to help future learning. In this tutorial, we will introduce the existing research on LML and show that LML is very suitable for NLP tasks and has the potential to help NLP make major progress.


Neural Networks for Sentiment Analysis
Yue Zhang | Duy Tin Vo

Sentiment analysis has been a major research topic in natural language processing (NLP). Traditionally, the problem has been attacked using discrete models and manually-defined sparse features. Over the past few years, neural network models have received increased research efforts in most subareas of sentiment analysis, giving highly promising results. A main reason is the capability of neural models to automatically learn dense features that capture subtle semantic information over words, sentences and documents, which are difficult to model using traditional discrete features based on words and n-gram patterns. This tutorial gives an introduction to neural network models for sentiment analysis, discussing the mathematics of word embeddings, sequence models and tree-structured models and their use in sentiment analysis on the word, sentence and document levels, and fine-grained sentiment analysis. The tutorial covers a range of neural network models (e.g. CNN, RNN, RecNN, LSTM) and their extensions, which are employed in four main subtasks of sentiment analysis:
Sentiment-oriented embeddings;
Sentence-level sentiment;
Document-level sentiment;
Fine-grained sentiment.

The content of the tutorial is divided into 3 sections of 1 hour each. We assume that the audience is familiar with linear algebra and basic neural network structures, and introduce the mathematical details of the most typical models. First, we will introduce the sentiment analysis task and basic concepts related to neural network models for sentiment analysis, and show detailed approaches to integrating sentiment information into embeddings. Sentence-level models will be described in the second section. Finally, we will discuss neural network models used for document-level and fine-grained sentiment.


Continuous Vector Spaces for Cross-language NLP Applications
Rafael E. Banchs

The mathematical metaphor offered by the geometric concept of distance in vector spaces with respect to semantics and meaning has been proven to be useful in many monolingual natural language processing applications. There is also some recent and strong evidence that this paradigm can also be useful in the cross-language setting. In this tutorial, we present and discuss some of the most recent advances on exploiting the vector space model paradigm in specific cross-language natural language processing applications, along with a comprehensive review of the theoretical background behind them.

First, the tutorial introduces some fundamental concepts of distributional semantics and vector space models. More specifically, the concepts of the distributional hypothesis and term-document matrices are reviewed, followed by a brief discussion on linear and non-linear dimensionality reduction techniques and their implications for the parallel distributed approach to semantic cognition. Next, some classical examples of using vector space models in monolingual natural language processing applications are presented. Specific examples in the areas of information retrieval, related term identification and semantic compositionality are described.

Then, the tutorial focuses its attention on the use of the vector space model paradigm in cross-language applications. To this end, some recent examples are presented and discussed in detail, addressing the specific problems of cross-language information retrieval, cross-language sentence matching, and machine translation. Some of the most recent developments in the area of Neural Machine Translation are also discussed.

Finally, the tutorial concludes with a discussion about current and future research problems related to the use of vector space models in cross-language settings. Future avenues for scientific research are described, with major emphasis on the extension from vector and matrix representations to tensors, as well as the problem of encoding word position information into the vector-based representations.
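
The term-document matrix and the geometric notion of semantic relatedness mentioned above can be illustrated with a small monolingual sketch in Python. The four toy documents and the function names are hypothetical, raw counts are used without any weighting or dimensionality reduction, and cosine similarity stands in for "distance"; the tutorial itself may use different choices.

```python
import math
from collections import Counter

# Hypothetical toy corpus: two documents about pets, two about finance.
DOCS = [
    "cat dog pet",
    "dog cat animal",
    "stock market price",
    "market price stock trade",
]


def term_document_matrix(docs):
    """Return (vocab, rows): rows[i][j] is the count of vocab[i] in docs[j]."""
    counts = [Counter(doc.split()) for doc in docs]
    vocab = sorted({token for doc in docs for token in doc.split()})
    rows = [[c[term] for c in counts] for term in vocab]
    return vocab, rows


def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def term_similarity(term_a, term_b, docs=DOCS):
    """Compare two terms by the documents they occur in (distributional hypothesis)."""
    vocab, rows = term_document_matrix(docs)
    return cosine(rows[vocab.index(term_a)], rows[vocab.index(term_b)])
```

Terms that occur in the same documents, such as `cat` and `dog` here, get row vectors pointing in the same direction, while terms from unrelated documents, such as `cat` and `stock`, get orthogonal rows.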


Methods and Theories for Large-scale Structured Prediction
Xu Sun | Yansong Feng

Many important NLP tasks are cast as structured prediction problems, which try to predict certain forms of structured output from the input. Examples of structured prediction include POS tagging, named entity recognition, PCFG parsing, dependency parsing, machine translation, and many others. When applying structured prediction to a specific NLP task, the following challenges arise:
1. Model selection: Among various models/algorithms with different characteristics, which one should we choose for a specific NLP task?
2. Training: How to train the model parameters effectively and efficiently?
3. Overfitting: To achieve good accuracy on test data, it is important to control the overfitting from the training data. How to control the overfitting risk for structured prediction?

This tutorial will provide a clear overview of recent advances in structured prediction methods and theories, and address the above issues when we apply structured prediction to NLP tasks. We will introduce large margin methods (e.g., perceptrons, MIRA), graphical models (e.g., CRFs), and deep learning methods (e.g., RNN, LSTM), and show the respective advantages and disadvantages for NLP applications. For the training algorithms, we will introduce online/stochastic training methods, and we will introduce parallel online/stochastic learning algorithms and theories to speed up the training (e.g., the Hogwild algorithm). For controlling the overfitting from training data, we will introduce the weight regularization methods, structure regularization, and implicit regularization methods.
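
As one concrete instance of the perceptron family mentioned above, the following Python sketch trains a plain (non-averaged) structured perceptron for sequence tagging, with Viterbi decoding over emission and transition features. The two training sentences, the tag set, the feature templates, and all function names are hypothetical choices for illustration, and no regularization or parallel training is included.

```python
from collections import defaultdict

TAGS = ["DET", "NOUN", "VERB"]  # hypothetical tag set
TRAIN = [                       # hypothetical toy training data
    ("the dog barks".split(), ["DET", "NOUN", "VERB"]),
    ("a cat sleeps".split(), ["DET", "NOUN", "VERB"]),
]


def features(words, tag_seq):
    """Count emission (word, tag) and transition (previous tag, tag) features."""
    feats = defaultdict(int)
    prev = "<s>"
    for word, tag in zip(words, tag_seq):
        feats[("emit", word, tag)] += 1
        feats[("trans", prev, tag)] += 1
        prev = tag
    return feats


def viterbi(words, tags, weights):
    """Return the highest-scoring tag sequence under the current weights."""
    # best[i][t]: score of the best tagging of words[:i + 1] that ends in tag t.
    best = [{t: weights[("trans", "<s>", t)] + weights[("emit", words[0], t)]
             for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + weights[("trans", p, t)]) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + weights[("emit", words[i], t)]
            back[i][t] = prev
    # Follow back-pointers from the best final tag.
    seq = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        seq.append(back[i][seq[-1]])
    return seq[::-1]


def train(data, tags, epochs=5):
    """Online training: decode, then move weights toward gold and away from the prediction."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tags, weights)
            if pred != gold:
                for feat, count in features(words, gold).items():
                    weights[feat] += count
                for feat, count in features(words, pred).items():
                    weights[feat] -= count
    return weights
```

Calling `train(TRAIN, TAGS)` and then `viterbi(words, TAGS, weights)` decodes a sentence with the learned weights. The update touches only features on which the gold and predicted sequences differ, which is what makes the method a natural fit for the online/stochastic training setting the abstract describes.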