08-1009_0	Recently , a number of data-driven distortion models , based on lexical features and relative distance , have been proposed to compensate for this weakness ( )	RB , DT NN IN JJ NN NNS , VVN IN JJ NNS CC JJ NN , VHP VBN VVN TO VV IN DT NN ( )	BackGround	GRelated	Positive
08-1009_1	Following ( ) , we conduct a targeted evaluation; we only draw our evaluation pairs from the uncohesive subset targeted by our constraint	VVG ( ) , PP VVP DT JJ NN PP RB VV PP$ NN NNS IN DT JJ NN VVN IN PP$ NN	Fundamental	Idea	Neutral
08-1009_2	Fox ( ) showed that cohesion is held in the vast majority of cases for English-French , while Cherry and Lin ( ) have shown it to be a strong feature for word alignment	NP ( ) VVD IN/that NN VBZ VVN IN DT JJ NN IN NNS IN NN , IN NP CC NP ( ) VHP VVN PP TO VB DT JJ NN IN NN NN	BackGround	GRelated	Neutral
08-1009_2	Fox ( ) demonstrated and counted cases where cohesion was not maintained in hand-aligned sentence-pairs , while Cherry and Lin ( ) showed that a soft cohesion constraint is superior to a hard constraint for word alignment	NP ( ) VVN CC VVN NNS WRB NN VBD RB VVN IN JJ NNS , IN NP CC NP ( ) VVD IN/that DT JJ NN NN VBZ JJ TO DT JJ NN IN NN NN	BackGround	GRelated	Neutral
08-1009_4	The most successful attempts at syntax-enhanced phrasal SMT have directly targeted movement modeling: Zens et al. ( ) modified a phrasal decoder with ITG constraints , while a number of researchers have employed syntax-driven source reordering before decoding begins ( )	DT RBS JJ NNS IN JJ JJ NP VHP RB VVN NN NN NP NP NP ( ) VVN DT JJ NN IN NP NNS , IN DT NN IN NNS VHP VVN JJ NN VVG IN VVG VVZ ( )	BackGround	GRelated	Positive
08-1009_4	Our experimental set-up is modeled after the human evaluation presented in ( )	PP$ JJ NN VBZ VVN IN DT JJ NN VVN IN ( )	Fundamental	Basis	Neutral
08-1009_4	Following ( ) , we provide the annotators with only short sentences: those with source sentences between 10 and 25 tokens long	VVG ( ) , PP VVP DT NNS IN JJ JJ NN DT IN NN NNS IN CD CC CD NNS RB	Fundamental	Idea	Neutral
08-1009_5	These approaches were eventually superseded by tree transducers and tree substitution grammars , which allow translation events to span subtree units , providing several advantages , including the ability to selectively produce uncohesive translations ( )	DT NNS VBD RB VVN IN NN NNS CC NN NN NNS , WDT VVP NN NNS TO NN NN NNS , VVG JJ NNS , VVG DT NN TO RB VV JJ NNS ( )	BackGround	GRelated	Positive
08-1009_6	Syntactic cohesion 1 is the notion that all movement occurring during translation can be explained by permuting children in a parse tree ( )	JJ NN CD VBZ DT NN IN/that DT NN VVG IN NN MD VB VVN IN VVG NNS IN DT VVP NN ( )	BackGround	GRelated	Neutral
08-1009_6	Previous approaches to measuring the cohesion of a sentence pair have worked with a word alignment ( )	JJ NNS TO VVG DT NN IN DT NN NN VHP VVN IN DT NN NN ( )	BackGround	GRelated	Neutral
08-1009_8	If one assumes arbitrary movement is possible , that alone is sufficient to show the problem to be NP-complete ( )	IN PP VVZ JJ NN VBZ JJ , WDT RB VBZ JJ TO VV DT NN TO VB JJ ( )	BackGround	GRelated	Positive
08-1009_9	We test our cohesion-enhanced Moses decoder trained using 688K sentence pairs of Europarl French-English data , provided by the SMT 2006 Shared Task ( )	PP VVP PP$ JJ NN NN VVN VVG JJ NN NNS IN JJ NP NNS , VVN IN DT NP CD NP NP ( )	Fundamental	Basis	Neutral
08-1009_10	Phrase-based decoding ( ) is a dominant formalism in statistical machine translation	JJ VVG ( ) VBZ DT JJ NN IN JJ NN NN	BackGround	GRelated	Neutral
08-1009_10	Early experiments with syntactically-informed phrases ( ) , and syntactic re-ranking of K-best lists ( ) produced mostly negative results	JJ NNS IN JJ NNS ( ) , CC JJ NN IN NP NNS ( ) VVN RB JJ NNS	BackGround	GRelated	Negative
08-1009_10	Restricting phrases to syntactic constituents has been shown to harm performance ( ) , so we tighten our definition of a violation to disregard cases where the only point of overlap is obscured by our phrasal resolution	VVG NNS TO JJ NNS VHZ VBN VVN TO NN NN ( ) , RB PP VV PP$ NN IN DT NN TO VV NNS WRB DT JJ NN IN VV VBZ VVD IN PP$ JJ NN	BackGround	SRelated	Negative
08-1009_11	We compare against an unmodified baseline decoder , as well as a decoder enhanced with a lexical reordering model ( )	PP VVP IN DT JJ NN NN , RB RB IN DT NN VVN IN DT JJ VVG NN ( )	Compare	Compare	Neutral
08-1009_12	We modify the Moses decoder ( ) to translate head-annotated sentences	PP VV DT NN NN ( ) TO VV JJ NNS	Fundamental	Basis	Neutral
08-1009_14	Following Lin and Cherry ( ) , we define a head span to be the projection of a single token e i onto the target phrase sequence: spanH (e i  ,T ,a m) = [a i  ,a i] and the subtree span to be the projection of the subtree rooted at ei: spanS (e i ,T ,a r( l) =   min  a , ,  max a k Consider the simple phrasal translation shown in Figure 1 along with a dependency tree for the English source	VVG NP CC NP ( ) , PP VV DT NN NN TO VB DT NN IN DT JJ JJ NN NP IN DT NN NN JJ NP NN NP NP NP NP SYM NP NP NN NN CC DT NN NN TO VB DT NN IN DT NN VVN IN JJ NNS JJ NP NN NN JJ NN SYM NP DT , , VV DT NN VV DT JJ JJ NN VVN IN NP CD IN IN DT NN NN IN DT JJ NN	Fundamental	Idea	Neutral
08-1009_15	English dependency trees are provided by Minipar ( )	JJ NN NNS VBP VVN IN NP ( )	Fundamental	Basis	Neutral
08-1009_16	Word alignments are provided by GIZA++ ( ) with grow-diag-final combination , with infrastructure for alignment combination and phrase extraction provided by the shared task	NN NNS VBP VVN IN NP ( ) IN JJ NN , IN NN IN NN NN CC NN NN VVN IN DT VVN NN	Fundamental	Basis	Neutral
08-1009_18	We first present our soft cohesion constraint's effect on BLEU score ( ) for both our dev-test and test sets	PP RB VV PP$ JJ NN NNS NN IN NP NN ( ) IN CC PP$ NN CC NN NNS	Fundamental	Basis	Neutral
08-1009_21	Weights for the log-linear model are set using MERT , as implemented by Venugopal and Vogel ( )	NNS IN DT JJ NN VBP VVN VVG NP , RB VVN IN NP CC NP ( )	Fundamental	Idea	Neutral
08-1009_23	Early methods for syntactic SMT held to this assumption in its entirety ( )	JJ NNS IN JJ NP VVD TO DT NN IN PP$ NN ( )	BackGround	GRelated	Neutral
08-1009_26	This will take the form of a check performed each time a hypothesis is extended , similar to the ITG constraint for phrasal SMT ( )	DT MD VV DT NN IN DT NN VVD DT NN DT NN VBZ JJ , JJ TO DT NP NN IN JJ NP ( )	Fundamental	Idea	Neutral
08-1010_0	One approach is to leverage underlying word alignment quality such as in Ayan and Dorr ( )	CD NN VBZ TO VV VVG NN NN NN JJ IN IN NP CC NP ( )	BackGround	SRelated	Positive
08-1010_1	We measure translation performance by the BLEU ( ) and METEOR ( ) scores with multiple translation references	PP VV NN NN IN DT NP ( ) CC NP ( ) NNS IN JJ NN NNS	Fundamental	Basis	Neutral
08-1010_2	In a statistical generative word alignment model ( ) , it is assumed that (i) a random variable a specifies how each target word fj is generated by (therefore aligned to) a source 1 word e aj; and (ii) the likelihood function f (f , a|e) specifies a generative procedure from the source sentence to the target sentence	IN DT JJ JJ NN NN NN ( ) , PP VBZ VVN DT NN DT JJ NN DT VVZ WRB DT NN NN NP VBZ VVN IN JJ VVN NN DT NN CD NN SYM NN CC NN DT NN NN SYM NN , NN VVZ DT JJ NN IN DT NN NN TO DT NN NN	BackGround	SRelated	Neutral
08-1010_3	The language model is a statistical trigram model estimated with Modified Kneser-Ney smoothing ( ) using only English sentences in the parallel training data	DT NN NN VBZ DT JJ NN NN VVN IN NP NP VVG ( ) VVG RB JJ NNS IN DT JJ NN NNS	BackGround	SRelated	Neutral
08-1010_4	Other methods do not depend on word alignments only , such as directly modeling phrase alignment in a joint generative way ( )? pursuing information extraction perspective ( ) , or augmenting with modelbased phrase pair posterior ( )	JJ NNS VVP RB VV IN NN NNS RB , JJ IN RB VVG NN NN IN DT JJ JJ NN ( JJ VVG NN NN NN ( ) , CC VVG IN JJ NN NN NN ( )	BackGround	GRelated	Neutral
08-1010_4	On the other hand , there are valid translation pairs in the training corpus that are not learned due to word alignment errors as shown in Deng and Byrne ( )	IN DT JJ NN , EX VBP JJ NN NNS IN DT NN NN WDT VBP RB VVN JJ TO NN NN NNS RB VVN IN NP CC NP ( )	BackGround	GRelated	Neutral
08-1010_4	The likelihood of those generative procedures can be accumulated to get the likelihood of the phrase pair ( )	DT NN IN DT JJ NNS MD VB VVN TO VV DT NN IN DT NN NN ( )	BackGround	SRelated	Neutral
08-1010_5	In the word-alignment derived phrase extraction approach , precision can be improved by filtering out most of the entries by using a statistical significance test ( )	IN DT NN VVN NN NN NN , NN MD VB VVN IN VVG IN JJS IN DT NNS IN VVG DT JJ NN NN ( )	BackGround	GRelated	Neutral
08-1010_7	This is also the place where linguistic constraints can be applied , say to avoid non-compositional phrases ( )	DT VBZ RB DT NN WRB JJ NNS MD VB VVN , VVP TO VV JJ NNS ( )	BackGround	SRelated	Neutral
08-1010_7	Second , some n-grams themselves carry no linguistic meaning; their phrase translations can be misleading , for example non-compositional phrases ( )	RB , DT NNS PP VVP DT JJ NN PP$ NN NNS MD VB VVG , IN NN JJ NNS ( )	BackGround	GRelated	Neutral
08-1010_9	In the final step 4 (line 15) , parameters {A k , t } are discriminatively trained on a development set using the downhill simplex method ( )	IN DT JJ NN CD NN JJ , NNS NN NN , NN ) VBP RB VVN IN DT NN VVD VVG DT RB JJ NN ( )	Fundamental	Basis	Neutral
08-1010_10	By combining word alignments in two directions using heuristics ( ) , a single set of static word alignments is then formed	IN VVG NN NNS IN CD NNS VVG NNS ( ) , DT JJ NN IN JJ NN NNS VBZ RB VVN	Fundamental	Basis	Neutral
08-1010_10	Two different word alignment models are trained as the baseline , one is symmetric HMM word alignment model , the other is IBM Model-4 as implemented in the GIZA++ toolkit ( )	CD JJ NN NN NNS VBP VVN IN DT NN , PP VBZ JJ NP NN NN NN , DT JJ VBZ NP NP RB VVD IN DT NP NN ( )	Fundamental	Idea	Neutral
08-1010_11	The translation probability can also be discriminatively trained such as in Tillmann and Zhang ( )	DT NN NN MD RB VB RB VVN JJ IN IN NP CC NP ( )	BackGround	GRelated	Neutral
08-1010_12	WPPCR was used as one of the scores in ( ) for phrase extraction	NP VBD VVN IN CD IN DT NNS IN ( ) IN NN NN	Fundamental	Basis	Neutral
08-1010_12	The generic phrase training algorithm follows an information retrieval perspective as in ( ) but aims to improve both precision and recall with the trainable log-linear model	DT JJ NN NN NN VVZ DT NN NN NN IN IN ( ) CC VVZ TO VV DT NN CC NN IN DT JJ JJ NN	Fundamental	Idea	Neutral
08-1010_13	Methods have been proposed , based on syntax , that take advantage of linguistic constraints and alignment of grammatical structure , such as in Yamada and Knight ( ) and Wu ( )	NNS VHP VBN VVN , VVN IN NN , WDT VVP NN IN JJ NNS CC NN IN JJ NN , JJ IN IN NP CC NP ( ) CC NP ( )	BackGround	GRelated	Neutral
08-1010_15	We then train HMM word alignment models ( ) in two directions simultaneously by merging statistics collected in the Algorithm 1 A Generic Phrase Training Procedure E-step from two directions motivated by Zens et al. ( ) with 5 iterations	PP RB VVP NP NN NN NNS ( ) IN CD NNS RB IN VVG NNS VVN IN DT NP CD NP NP NP NP NP NP IN CD NNS VVN IN NP NP NP ( ) IN CD NNS	Fundamental	Idea	Neutral
08-1010_16	Since most phrases appear only a few times in training data , a phrase pair translation is also evaluated by lexical weights ( )?r term weighting ( ) as additional features to avoid overestimation	IN JJS NNS VVP RB DT JJ NNS IN NN NNS , DT NN NN NN VBZ RB VVN IN JJ NNS ( NN NN NN ( ) IN JJ NNS TO VV NN	BackGround	GRelated	Neutral
08-1011_0	We used the SRI Language Modeling Toolkit ( ) to train a five-gram model with modified Kneser-Ney smoothing ( )	PP VVD DT NP NP NP NP ( ) TO VV DT NN NN IN JJ NP VVG ( )	Fundamental	Basis	Neutral
08-1011_1	Based on the source syntax parse tree , for each measure word , we identified its head word by using a toolkit from ( ) which can heuristically identify head words for sub-trees	VVN IN DT NN NN VVP NN , IN DT NN NN , PP VVD PP$ NN NN IN VVG DT NN IN ( ) WDT MD RB VV NN NNS IN NNS	Fundamental	Basis	Neutral
08-1011_2	We used an SMT system similar to Chiang ( ) , in which FBIS corpus is used as the bilingual training data	PP VVD DT NP NN JJ TO NP ( ) , IN WDT NP NN VBZ VVN IN DT JJ NN NNS	Fundamental	Idea	Neutral
08-1011_3	We ran GI-ZA++ ( ) on the training corpus in both directions with IBM model 4 , and then applied the refinement rule described in ( ) to obtain a many-to-many word alignment for each sentence pair	PP VVD NP ( ) IN DT NN NN IN DT NNS IN NP NN CD , CC RB VVD DT NN NN VVN IN ( ) TO VV DT NN NN NN IN DT NN NN	Fundamental	Basis	Neutral
08-1011_4	The most relevant work based on statistical methods to our research might be statistical technologies employed to model issues such as morphology generation ( )	DT RBS JJ NN VVN IN JJ NNS TO PP$ NN MD VB JJ NNS VVN TO NN NNS JJ IN NN NN ( )	BackGround	GRelated	Neutral
08-1011_6	In most statistical machine translation (SMT) models ( ) , some of measure words can be generated without modification or additional processing	IN RBS JJ NN NN NN NNS ( ) , DT IN NN NNS MD VB VVN IN NN CC JJ NN	BackGround	GRelated	Neutral
08-1011_7	In addition to precision and recall , we also evaluate the Bleu score ( ) changes before and after applying our measure word generation method to the SMT output	IN NN TO NN CC NN , PP RB VV DT NP NN ( ) NNS IN CC IN VVG PP$ NN NN NN NN TO DT NP NN	Fundamental	Basis	Neutral
08-1011_8	In our work , the Berkeley parser ( ) was employed to extract syntactic knowledge from the training corpus	IN PP$ NN , DT NP NN ( ) VBD VVN TO VV JJ NN IN DT NN NN	Fundamental	Basis	Neutral
08-1012_0	Specific to our ITG case , the M step becomes: (i+i) exp(pP(E(X ?[XX]) + ax )) p (i+i) exp(pp(E (X) + sax)) exp(pP(E(X - (X X)) + ax)) exp(pP(E (X) + sax)) ' exp(pp(E (X ?C) + ax )) P (l+ i)(e/f) exp(pp(E(X) + sax)) exp(pP(E (e/f) + ac)) exp(pp(E (C) + ma c))' where ip is the digamma function ( ) , s = 3?s the number of right-hand-sides for X , and m is the number of observed phrase pairs in the data	JJ TO PP$ NP NN , DT NP NN NN NN NP NP SYM NN NN NN NN JJ NN SYM JJ NN : NP NP SYM JJ JJ NN SYM NN POS JJ JJ NN SYM NN NN NN NN JJ NN SYM JJ JJ NN SYM JJ JJ NN SYM NN NNS WRB NN VBZ DT NN NN ( ) , NN SYM JJ DT NN IN NNS IN NP , CC NN VBZ DT NN IN JJ NN NNS IN DT NNS	BackGround	SRelated	Neutral
08-1012_1	These word-level alignments are most often obtained using Expectation Maximization on the conditional generative models of Brown et al. ( ) and Vogel et al. ( )	DT JJ NNS VBP RBS RB VVN VVG NN NN IN DT JJ JJ NNS IN NP NP NP ( ) CC NP NP NP ( )	BackGround	GRelated	Neutral
08-1012_1	The traditional estimation method for word alignment models is the EM algorithm ( ) which iteratively updates parameters to maximize the likelihood of the data	DT JJ NN NN IN NN NN NNS VBZ DT JJ NN ( ) WDT RB NNS NNS TO VV DT NN IN DT NNS	BackGround	GRelated	Neutral
08-1012_2	The heuristic method is based on the Non-Compositional Constraint of Cherry and Lin ( )	DT JJ NN VBZ VVN IN DT JJ NN IN NP CC NP ( )	Fundamental	Basis	Neutral
08-1012_3	Kurihara and Sato ( ) describe VB for PCFGs , showing the only need is to change the M step of the EM algorithm	NP CC NP ( ) VV NP IN NP , VVG DT JJ NN VBZ TO VV DT NP NN IN DT JJ NN	BackGround	SRelated	Neutral
08-1012_4	Minimum Error Rate training ( ) over BLEU was used to optimize the weights for each of these models over the development test data	JJ NP NP NN ( ) IN NP VBD VVN TO VV DT NNS IN DT IN DT NNS IN DT NN NN NNS	Fundamental	Basis	Positive
08-1012_5	Like Zhang and Gildea ( ) , it is used to prune bitext cells rather than score phrases	IN NP CC NP ( ) , PP VBZ VVN TO VV NN NNS RB IN NN NNS	Fundamental	Idea	Neutral
08-1012_5	Our pruning differs from Zhang and Gildea ( ) in two major ways	PP$ VVG VVZ IN NP CC NP ( ) IN CD JJ NNS	Compare	Compare	Neutral
08-1012_5	The tic-tac-toe pruning algorithm ( ) uses dynamic programming to compute the product of inside and outside scores for all cells in O(n 4) time	DT NN VVG NN ( ) VVZ JJ NN TO VV DT NN IN JJ CC JJ NNS IN DT NNS IN NP JJ NN	BackGround	SRelated	Neutral
08-1012_5	Figure 2 compares the speed of the fast tic-tac-toe algorithm against the algorithm in Zhang and Gildea ( )	NN CD VVZ DT NN IN DT JJ NN NN IN DT NN IN NP CC NP ( )	Compare	Compare	Neutral
08-1013_0	We applied the decompounding algorithm proposed in Adda-Decker ( ) to our corpus to extract such compounds	PP VVD DT VVG NN VVN IN NP ( ) TO PP$ NN TO VV JJ NNS	Fundamental	Basis	Neutral
08-1013_0	Our experiments are based on word lattice output from the LIMSI German broadcast news transcription system ( ) , which employs 4-gram backoff language models	PP$ NNS VBP VVN IN NN NN NN IN DT NP JJ NN NN NN NN ( ) , WDT VVZ JJ NN NN NNS	Fundamental	Basis	Neutral
08-1013_0	The evaluation scheme was taken from McTait and Adda-Decker ( )	DT NN NN VBD VVN IN NP CC NP ( )	Fundamental	Basis	Neutral
08-1013_1	Beutler et al. ( ) pursued a similar approach	NP NP NP ( ) VVN DT JJ NN	BackGround	GRelated	Neutral
08-1013_2	In order to compute the probability of a parse tree , it is transformed to a flat dependency tree similar to the syntax graph representation used in the TIGER treebank Brants et al ( )	IN NN TO VV DT NN IN DT VVP NN , PP VBZ VVN TO DT JJ NN NN JJ TO DT NN NN NN VVN IN DT NP NN NP NP NP ( )	Fundamental	Idea	Neutral
08-1013_3	Other linguistically inspired language models like Chelba and Jelinek ( ) and Roark ( ) have been applied to continuous speech recognition	JJ RB VVN NN NNS IN NP CC NP ( ) CC NP ( ) VHP VBN VVN TO JJ NN NN	BackGround	GRelated	Neutral
08-1013_4	To extract such word clusters we used suffix arrays proposed in Ya-mamoto and Church ( ) and the pointwise mutual information measure	TO VV JJ NN NNS PP VVD NN NNS VVN IN NP CC NP ( ) CC DT JJ JJ NN NN	Fundamental	Basis	Neutral
08-1013_5	More accurate statistical models of natural language have mainly been developed in the field of statistical parsing , e.g.Collins ( ) , Charniak ( ) and Ratnaparkhi ( )	RBR JJ JJ NNS IN JJ NN VHP RB VBN VVN IN DT NN IN JJ VVG , NNS ( ) , NP ( ) CC NP ( )	BackGround	GRelated	Positive
08-1013_6	Our grammar incorporates many ideas from existing linguistic work , e.g.Miiller ( ) , Muller ( ) , Crysmann ( ) , Crysmann ( )	PP$ NN VVZ JJ NNS IN VVG JJ NN , NN ( ) , NP ( ) , NP ( ) , NP ( )	Fundamental	Idea	Neutral
08-1013_8	Our main source of dictionary information was Duden ( )	PP$ JJ NN IN NN NN VBD NP ( )	Fundamental	Basis	Neutral
08-1013_9	This improvement is statistically significant on a level of < 0.1% for both the Matched Pairs Sentence-Segment Word Error test (MAPSSWE) and McNemar's test ( )	DT NN VBZ RB JJ IN DT NN IN SYM CD IN CC DT NP NP NP NP NP NN NN CC NP NN ( )	Compare	Compare	Positive
08-1013_10	Our particular variant requires that constituents (phrases) be continuous , but it provides a mechanism for dealing with discontinuities as present e.g.in the German main clause , see Kaufmann and Pfister ( )	PP$ JJ NN VVZ IN/that NNS NN VB JJ , CC PP VVZ DT NN IN VVG IN NNS IN JJ NN DT JJ JJ NN , VVP NP CC NP ( )	BackGround	SRelated	Neutral
08-1013_16	We used the Head-driven Phrase Structure Grammar (HPSG , see Pollard and Sag ( )) formalism to develop a precise large-coverage grammar for German	PP VVD DT NP NP NP NP NN , VVP NP CC NP ( JJ NN TO VV DT JJ NN NN IN JJ	Fundamental	Basis	Neutral
08-1014_0	Natural language processing research has addressed a number of these issues as individual problems: automatic punctuation ( ) , text segmentation ( ) disfluency repair ( ) and error correction ( )	JJ NN NN NN VHZ VVN DT NN IN DT NNS IN JJ JJ JJ NN ( ) , NN NN ( ) NN NN ( ) CC NN NN ( )	BackGround	GRelated	Neutral
08-1014_1	Following Strzalkowski and Brandow ( ) and Peters and Drexel ( ) we have implemented a transformation-based learning (TBL) algorithm ( )	VVG NP CC NP ( ) CC NP CC NP ( ) PP VHP VVN DT JJ VVG NN NN ( )	Fundamental	Idea	Neutral
08-1014_3	The recognition output is auto-punctuated by a method similar in spirit to the one proposed by Liu et al. ( ) before being passed to the transformation model	DT NN NN VBZ JJ IN DT NN JJ IN NN TO DT CD VVN IN NP NP NP ( ) IN VBG VVN TO DT NN NN	Fundamental	Idea	Neutral
08-1014_5	Deviating from Peters and Drexel ( ) , in the special case of an empty target sequence , i.e	VVG IN NP CC NP ( ) , IN DT JJ NN IN DT JJ NN NN , NN	Compare	Compare	Neutral
08-1014_5	Again deviating from Peters and Drexel ( ) , we consider two rules as overlapping if the left-hand-side of one is a contiguous subsequence of the other	RB VVG IN NP CC NP ( ) , PP VVP CD NNS IN VVG IN DT NN IN CD VBZ DT JJ NN IN DT JJ	Compare	Compare	Neutral
08-1014_6	The work of Ringger and Allen ( ) is similar in spirit to this method , but uses a factored source-channel model	DT NN IN NP CC NP ( ) VBZ JJ IN NN TO DT NN , CC VVZ DT VVN NN NN	Fundamental	Idea	Neutral
08-1015_0	before , during , etc.) ( )	RB , IN , JJ ( )	NULL	NULL	NULL
08-1015_1	Queries are generated artificially using a method similar to Berger and Lafferty ( ) and used in Fleischman and Roy ( )	NNS VBP VVN RB VVG DT NN JJ TO NP CC NP ( ) CC VVN IN NP CC NP ( )	Fundamental	Idea	Neutral
08-1015_2	Recent work in automatic image annotation ( ) and natural language processing ( ) , however , have demonstrated the advantages of using hierarchical Bayesian models for related tasks	JJ NN IN JJ NN NN ( ) CC JJ NN NN ( ) , RB , VHP VVN DT NNS IN VVG JJ NP NNS IN JJ NNS	BackGround	GRelated	Positive
08-1015_4	We use the system of Bouthemy et al. ( ) which computes the camera motion using the parameters of a two-dimensional affine model to fit every pair of sequential frames in a video	PP VVP DT NN IN NP NP NP ( ) WDT VVZ DT NN NN VVG DT NNS IN DT JJ JJ NN TO VV DT NN IN JJ NNS IN DT NN	Fundamental	Basis	Neutral
08-1015_5	The traditional text-only language models (which are also used below as baseline comparisons) are generated with the SRI language modeling toolkit ( ) using Chen and Goodman's modified Kneser-Ney discounting and interpolation ( )	DT JJ JJ NN NNS NN VBP RB VVN IN IN NN NN VBP VVN IN DT NP NN NN NN ( ) VVG NP CC NP JJ NP NN CC NN ( )	Fundamental	Basis	Neutral
08-1015_6	The method is based on the use of grounded language models to repre-sent the relationship between words and the non-linguistic context to which they refer ( )	DT NN VBZ VVN IN DT NN IN VVN NN NNS TO VV DT NN IN NNS CC DT JJ NN TO WDT PP VVP ( )	Fundamental	Basis	Neutral
08-1015_6	Although these correlations are not perfect , experiments have shown that baseball events can be classified using such features ( )	IN DT NNS VBP RB JJ , NNS VHP VVN IN/that NN NNS MD VB VVN VVG JJ NNS ( )	BackGround	GRelated	Neutral
08-1015_8	Thus , for a robot operating in a laboratory setting , words for colors and shapes may be grounded in the outputs of its computer vision system ( ); while for a simulated agent operating in a virtual world , words for actions and events may be mapped to representations of the agent's plans or goals ( )	RB , IN DT NN VVG IN DT NN NN , NNS IN NNS CC NNS MD VB VVN IN DT NNS IN PP$ NN NN NN ( JJ NN IN DT JJ NN VVG IN DT JJ NN , NNS IN NNS CC NNS MD VB VVN TO NNS IN DT JJ NNS CC NNS ( )	BackGround	GRelated	Neutral
08-1015_9	Previous work has examined applying models often used in MT to the paired corpus described above ( )	JJ NN VHZ VVN VVG NNS RB VVN IN NP TO DT VVN NN VVN IN ( )	BackGround	GRelated	Neutral
08-1015_11	We follow previous work in sports video processing ( ) and define an event in a baseball video as any sequence of shots starting with a pitching-scene and continuing for four subsequent shots	PP VVP JJ NN IN NNS JJ NN ( ) CC VV DT NN IN DT NN NN IN DT NN IN NNS VVG IN DT NN CC VVG IN CD JJ NNS	Fundamental	Idea	Neutral
08-1015_12	Because these transcriptions are not necessarily time synched with the audio , we use the method described in Hauptmann and Witbrock ( ) to align the closed captioning to the announcers' speech	IN DT NNS VBP RB RB NN VVN IN DT NN , PP VVP DT NN VVN IN NP CC NP ( ) TO VV DT JJ VVG TO DT NP NN	Fundamental	Basis	Neutral
08-1015_13	Recent work in video surveillance has demonstrated the benefit of representing complex events as temporal relations between lower level subevents ( )	JJ NN IN JJ NN VHZ VVN DT NN IN VVG JJ NNS IN JJ NNS IN JJR NN NNS ( )	BackGround	SRelated	Positive
08-1015_18	Recognizing speech in broadcast video is a necessary precursor to many multimodal applications such as video search and summarization ( )	VVG NN IN NN NN VBZ DT JJ NN TO JJ JJ NNS JJ IN JJ NN CC NN ( )	BackGround	GRelated	Neutral
08-1015_21	Shot detection and segmentation is a well studied problem; in this work we use the method of Tardini et al. ( )	NN NN CC NN VBZ DT RB VVN NN IN DT NN PP VVP DT NN IN NP NP NP ( )	Fundamental	Basis	Neutral
08-1015_22	Although performance is often reasonable in controlled environments (such as studio news rooms) , automatic speech recognition (ASR) systems have significant difficulty in noisier settings (such as those found in live sports broadcasts) ( )	IN NN VBZ RB JJ IN JJ NNS NN IN NN NN NN , JJ NN NN NN NNS VHP JJ NN IN JJR NNS NN IN DT VVN IN JJ NNS NN ( )	BackGround	GRelated	Negative
08-1015_22	Such video IR systems often use speech transcriptions to index segments of video in much the same way that words are used to index text documents ( )	JJ JJ NN NNS RB VVP NN NNS TO NN NNS IN NN IN RB DT JJ NN IN/that NNS VBP VVN TO NN NN NNS ( )	BackGround	GRelated	Neutral
08-1015_23	The WEKA machine learning package is used to train a boosted decision tree to classify these frames into one of three categories: pitching-scene , field-scene , other ( )	DT NP NN VVG NN VBZ VVN TO VV DT VVN NN NN TO VV DT NNS IN CD IN CD NN NN , NN , JJ ( )	Fundamental	Basis	Neutral
08-1016_0	Ando and Lee's ( ) kanji segmenter.) On the other hand , modelling only partial words helps the segmenter handle long , infrequent words	NP CC NP ( ) FW NN IN DT JJ NN , VVG RB JJ NNS VVZ DT NN VV JJ , JJ NNS	NULL	NULL	NULL
08-1016_2	Finite-state models ( ) might be more compact	JJ NNS ( ) MD VB RBR JJ	BackGround	MRelated	Positive
08-1016_3	Brent ( ) and Venkataraman ( ) present incremental splitting algorithms with BF about 82% 3 on the Bernstein-Ratner (BR87) corpus of infant-directed English with disfluencies and interjections removed ( )	NP ( ) CC NP ( ) JJ JJ NN NNS IN NP IN CD CD IN DT NP NN NN IN JJ NNS IN NNS CC NNS VVN ( )	BackGround	GRelated	Neutral
08-1016_4	Child-directed speech displays helpful features such as shorter phrases and fewer reductions ( )	JJ NN VVZ JJ NNS JJ IN JJR NNS CC JJR NNS ( )	BackGround	GRelated	Positive
08-1016_6	Learning to segment words is an old problem , with extensive prior work surveyed in ( )	VVG TO NN NNS VBZ DT JJ NN , IN JJ JJ NN VVN IN ( )	BackGround	GRelated	Neutral
08-1016_6	To build unsupervised algorithms , Brent and Cartwright suggested ( ) inferring phonotac-tic constraints from phone sequences observed at phrase boundaries	TO VV JJ NNS , NP CC NP VVD ( ) VVG JJ NNS IN NN NNS VVD IN NN NNS	BackGround	GRelated	Neutral
08-1016_7	Feature-based or gestural phonology ( ) might help model segmental variation	JJ CC JJ NN ( ) MD VV NN JJ NN	BackGround	MRelated	Positive
08-1016_8	Simple supervised algorithms perform extremely well ( ) , but don't address our main goal: learning how to segment	JJ JJ NNS VVP RB RB ( ) , CC VVD VV PP$ JJ JJ VVG WRB TO NN	BackGround	GRelated	Positive
08-1016_8	Statistics of phone trigrams provide sufficient information to segment adult conversational speech (dictionary transcriptions with simulated phonology) with about 90% precision and 93% recall ( ) , see also ( )	NNS IN NN NNS VVP JJ NN TO NN NN JJ NN NN NNS IN JJ NN IN RB CD NN CC CD NN ( ) , VVP RB ( )	BackGround	GRelated	Neutral
08-1016_8	Early results using neural nets by Cairns et al. ( ) and Christiansen et al ( ) are discouraging	JJ NNS VVG JJ NNS IN NP NP NP ( ) CC NP NP NP ( ) VBP VVG	BackGround	GRelated	Negative
08-1016_8	See also ( )	VVP RB ( )	BackGround	SRelated	Neutral
08-1016_9	Word segmentation experiments by Christiansen and Allen ( ) and Harrington et al. ( ) simulated the effects of pronunciation variation and/or recognizer error.	NN NN NNS IN NP CC NP ( ) CC NP NP NP ( ) JJ DT NNS IN NN NN NN NN NN	BackGround	GRelated	Neutral
08-1016_10	Attempts to segment transcriptions without pauses , e.g. ( ) , have worked poorly	NNS TO NN NNS IN NNS , FW ( ) , VHP VVN RB	BackGround	GRelated	Negative
08-1016_11	Disfluen-cies in conversational speech create pauses where you might not expect them , e.g.immediately following the definite article ( )	NNS IN JJ NN VV NNS WRB PP MD RB VV PP , RB VVG DT JJ NN ( )	NULL	NULL	NULL
08-1016_12	The other two English dictionary transcriptions were produced in a similar way from the Buckeye corpus ( ) and Mississippi State's corrected version of the LDC's Switchboard transcripts ( )	DT JJ CD JJ NN NNS VBD VVN IN DT JJ NN IN DT NP NN ( ) CC NP NP VVD NN IN DT NP NP NNS ( )	Fundamental	Basis	Neutral
08-1016_15	The most recent algorithm ( ) achieves a BF of 85.8% using a Dirichlet Process bigram model , estimated using a Gibbs sampling algorithm	DT RBS JJ NN ( ) VVZ DT NP IN CD VVG DT NP NP NN NN , VVD VVG DT NP NN NN	BackGround	GRelated	Neutral
08-1016_16	Supervised phonotactic methods date back at least to ( ) , see also ( )	JJ JJ NNS VVP RB IN JJS TO ( ) , VVP RB ( )	BackGround	SRelated	Neutral
08-1016_17	This issue was noted by Harrington et al. ( ) who used a list of known very short words to detect these cases	DT NN VBD VVN IN NP NP NP ( ) WP VVD DT NN IN VVN RB JJ NNS TO VV DT NNS	BackGround	SRelated	Neutral
08-1016_18	Prosody , stress , and other sub-phonemic cues might disambiguate some problem situations ( )	NN , NN , CC JJ JJ NNS MD VV DT NN NNS ( )	BackGround	MRelated	Neutral
08-1016_19	Some words are "massively" reduced ( ) , going well beyond standard phonological rules	DT NNS VBP JJ VVN ( ) , VVG RB IN JJ JJ NNS	BackGround	GRelated	Positive
08-1016_20	Claims that humans can extract words without pauses seem to be based on psychological experiments such as ( ) which conflate words and morphemes	VVZ IN/that NNS MD VV NNS IN NNS VVP TO VB VVN IN JJ NNS JJ IN ( ) WDT VVP NNS CC NNS	BackGround	GRelated	Neutral
08-1016_22	For example , Figure 1 shows a transcribed phrase from the Buckeye corpus ( ) and the automatically segmented output	IN NN , NN CD NNS DT VVN NN IN DT NP NN ( ) CC DT RB VVN NN	Fundamental	Basis	Neutral
08-1016_27	Even then , explicit boundaries seem to improve performance ( )	RB RB , JJ NNS VVP TO VV NN ( )	BackGround	GRelated	Positive
08-1016_30	Segmentation by adults is sensitive to phono-tactic constraints ( )	NN IN NNS VBZ JJ TO JJ NNS ( )	BackGround	GRelated	Neutral
08-1016_31	The Spanish corpus was produced in a similar way from the Callhome Spanish dataset ( ) , removing all accents	DT JJ NN VBD VVN IN DT JJ NN IN DT NP JJ NN ( ) , VVG DT NNS	Fundamental	Basis	Neutral
08-1017_0	Query expansion ( ) is a commonly used strategy to bridge the vocabulary gaps by expanding original queries with related terms	NP NN ( ) VBZ DT RB VVN NN TO VV DT NN NNS IN VVG JJ NNS IN JJ NNS	BackGround	GRelated	Neutral
08-1017_1	The more common words the definitions of two terms have , the more similar these terms are ( )	DT JJR JJ NNS DT NNS IN CD NNS VHP , DT RBR JJ DT NNS VBP ( )	BackGround	SRelated	Neutral
08-1017_3	Most information retrieval models ( ) compute relevance scores based on matching of terms in queries and documents	JJS NN NN NNS ( ) VV NN NNS VVN IN VVG IN NNS IN NNS CC NNS	BackGround	GRelated	Neutral
08-1017_3	Model Axiomatic approaches have recently been proposed and studied to develop retrieval functions ( )	NP NP NNS VHP RB VBN VVN CC VVN TO VV NN NNS ( )	BackGround	GRelated	Neutral
08-1017_3	In ( ) , several axiomatic retrieval functions have been derived based on a set of basic formalized retrieval constraints and an inductive definition of the retrieval function space	IN ( ) , JJ JJ NN NNS VHP VBN VVN VVN IN DT NN IN JJ VVN NN NNS CC DT JJ NN IN DT NN NN NN	BackGround	SRelated	Neutral
08-1017_3	In this paper , we use the best performing function derived in axiomatic retrieval models , i.e. , F2-EXP in ( ) with a fixed parameter value (b = 0.5)	IN DT NN , PP VVP DT JJS NN NN VVN IN JJ NN NNS , NNS , NP IN ( ) IN DT VVN NN NN NN SYM NN	Fundamental	Basis	Positive
08-1017_4	Expanded terms are often selected from either co-occurrence-based thesauri ( ) or handcrafted thesauri ( ) or both ( )	JJ NNS VBP RB VVN IN DT JJ NNS ( ) CC VVN NNS ( ) CC DT ( )	BackGround	GRelated	Neutral
08-1017_4	In this paper , we re-examine the problem of query expansion using lexical resources with the recently proposed axiomatic approaches ( )	IN DT NN , PP VV DT NN IN NN NN VVG JJ NNS IN DT RB VVN JJ NNS ( )	Fundamental	Basis	Neutral
08-1017_4	According to the retrieval performance , the proposed similarity function is significantly better than simple mutual information based similarity function , while it is comparable to the function proposed in ( )	VVG TO DT NN NN , DT VVN NN NN VBZ RB JJR IN JJ JJ NN VVN NN NN , IN PP VBZ JJ TO DT NN VVN IN ( )	Compare	Compare	Positive
08-1017_4	To overcome this limitation , in ( ) , we proposed a set of semantic term matching constraints and modified the previously derived axiomatic functions to make them satisfy these additional constraints	TO VV DT NN , RB ( ) , PP VVD DT NN IN JJ NN VVG NNS CC VVN DT RB VVN JJ NNS TO VV PP VV DT JJ NNS	BackGround	SRelated	Positive
08-1017_4	In our previous study ( ) , term similarity function s is derived based on the mutual information of terms over collections that are constructed under the guidance of a set of term semantic similarity constraints	IN PP$ JJ NN ( ) , NN NN NN NN VBZ VVN VVN IN DT JJ NN IN NNS IN NNS WDT VBP VVN IN DT NN IN DT NN IN NN JJ NN NNS	BackGround	SRelated	Neutral
08-1017_4	The parameter sensitivity is similar to the observations described in ( ) and will not be discussed in this paper	DT NN NN VBZ JJ TO DT NNS VVN IN ( ) CC MD RB VB VVN IN DT NN	Fundamental	Idea	Neutral
08-1017_4	s MIBL uses the collection itself to compute the mutual information , while s MIImp uses the working sets constructed based on several constraints ( )	JJ NP VVZ DT NN PP TO VV DT JJ NN , IN NN NP VVZ DT VVG NNS VVD VVN IN JJ NNS ( )	BackGround	SRelated	Neutral
08-1017_4	We first compare the retrieval performance of query expansion with different similarity functions using short keyword (i.e. , title-only) queries , because query expansion techniques are often more effective for shorter queries ( )	PP RB VVP DT NN NN IN NN NN IN JJ NN NNS VVG JJ NN NN , JJ NNS , IN NN NN NNS VBP RB RBR JJ IN JJR NNS ( )	BackGround	SRelated	Positive
08-1017_7	In this paper , we study several term similarity functions that exploit various information from two lexical resources , i.e. , WordNet and dependency-thesaurus constructed by Lin ( ) , and then incorporate these similarity functions into the axiomatic retrieval framework	IN DT NN , PP VVP JJ NN NN NNS WDT VVP JJ NN IN CD JJ NNS , FW , NN CC NN VVN IN NP ( ) , CC RB VV DT NN NNS IN DT JJ NN NN	Fundamental	Basis	Neutral
08-1017_7	In this section , we discuss a set of term similarity functions that exploit the information stored in two lexical resources: WordNet ( ) and dependency-based thesaurus ( )	IN DT NN , PP VVP DT NN IN NN NN NNS WDT VVP DT NN VVD IN CD JJ NN NP ( ) CC JJ NN ( )	Fundamental	Basis	Neutral
08-1017_7	Another lexical resource we study in the paper is the dependency-based thesaurus provided by Lin 1 ( )	DT JJ NN PP VVP IN DT NN VBZ DT JJ NN VVN IN NP CD ( )	Fundamental	Basis	Neutral
08-1017_11	The most commonly used lexical resource is WordNet ( ) , which is a hand-crafted lexical system developed at Princeton University	DT RBS RB VVN JJ NN VBZ NP ( ) , WDT VBZ DT NN JJ NN VVN IN NP NP	BackGround	SRelated	Neutral
08-1017_18	However , previous studies failed to show any significant gain in retrieval performance when queries are expanded with terms selected from WordNet ( )	RB , JJ NNS VVD TO VV DT JJ NN IN NN NN WRB NNS VBP VVN IN NNS VVN IN NP ( )	BackGround	GRelated	Negative
08-1017_18	By incorporating this similarity function into the axiomatic retrieval models , we show that query expansion using the information from only WordNet can lead to significant improvement of retrieval performance , which has not been shown in the previous studies ( )	IN VVG DT NN NN IN DT JJ NN NNS , PP VVP IN/that NN NN VVG DT NN IN JJ NP MD VV TO JJ NN IN NN NN , WDT VHZ RB VBN VVN IN DT JJ NNS ( )	Compare	Compare	Negative
08-1017_20	Voorhees ( ) showed that using WordNet for word sense disambiguation degrade the retrieval performance	NP ( ) VVD IN/that VVG NN IN NN NN NN VVP DT NN NN	BackGround	GRelated	Neutral
08-1018_0	Stemming is related to query expansion or query reformulation ( ) , although the latter is not limited to word variants	VVG VBZ VVN TO VV NN CC NN NN ( ) , IN DT NN VBZ RB VVN TO NN NNS	BackGround	GRelated	Neutral
08-1018_1	Because err(W) is a convex function of W , it has a global minimum and obtains its minimum when the gradient is zero ( )	IN NN VBZ DT JJ NN IN NP , PP VHZ DT JJ NN CC VVZ PP$ NN WRB DT NN VBZ CD ( )	BackGround	SRelated	Neutral
08-1018_2	Therefore , we use the forward-backward algorithm ( ) to calculate P(e ij) in a more efficient way	RB , PP VVP DT JJ NN ( ) TO VV JJ NN IN DT RBR JJ NN	Fundamental	Basis	Neutral
08-1018_2	There are several regression models , ranging from the simplest linear regression model to non-linear alternatives , such as a neural network ( ) , a Regression SVM ( )	EX VBP JJ NN NNS , VVG IN DT JJS JJ NN NN TO JJ NNS , JJ IN DT JJ NN NN , DT NP NP ( )	BackGround	GRelated	Neutral
08-1018_3	This inconsistency may result in severe problems when the scales of feature values vary dramatically ( )	DT NN MD VV IN JJ NNS WRB DT NNS IN NN NNS VVP RB ( )	BackGround	SRelated	Neutral
08-1018_6	Then we do corpus analysis to filter out the words which are clustered incorrectly , according to word distributional similarity , following ( )	RB PP VVP NN NN TO NN IN DT NNS WDT VBP VVN RB , VVG TO NN JJ NN , VVG ( )	Fundamental	Idea	Neutral
08-1018_7	UMASS: This is the result reported in ( ) using Porter stemming for both document and query terms	NP NP VBZ DT NN VVN IN ( ) VVG NP VVG IN DT NN CC NN NNS	BackGround	SRelated	Neutral
08-1018_8	Most stemmers , such as the Porter stemmer ( ) and Krovetz stemmer ( ) , deal with stemming by stripping word suffixes according to a set of morphological rules	JJS NNS , JJ IN DT NP NP ( ) CC NP NP ( ) , NN IN VVG IN VVG NN NNS VVG TO DT NN IN JJ NNS	BackGround	GRelated	Neutral
08-1018_8	Among them , the Porter stemmer ( ) is the most widely used	IN PP , DT NP NP ( ) VBZ DT RBS RB VVN	BackGround	SRelated	Neutral
08-1018_10	The second feature is an extension to point-wise mutual information ( ) , defined as follows: P(controlling...acidic...rain | window) / (P(controlling) P(acidic) P(rain)) , where P(controlling...acidic...rain | window) is the co-occurrence probability of the trigram containing acidic within a predefined window (50 words)	DT JJ NN VBZ DT NN TO JJ JJ NN ( ) , VVN IN JJ JJ NN NN NN NP NP WRB NP VBZ DT NN NN IN DT NN VVG JJ IN DT JJ NN NN NN	BackGround	SRelated	Neutral
08-1018_11	The Indri 2.5 search engine ( ) is used as our basic retrieval system	DT NP CD NN NN ( ) VBZ VVN IN PP$ JJ NN NN	Fundamental	Basis	Neutral
08-1018_13	To better determine stemming rules , Xu and Croft ( ) propose a selective stemming method based on corpus analysis	TO RBR VV VVG NNS , NP CC NP ( ) VV DT JJ VVG NN VVN IN NN NN	BackGround	GRelated	Positive
08-1018_13	Xu and Croft ( ) create equivalence clusters of words which are morphologically similar and occur in similar contexts	NP CC NP ( ) VV NN NNS IN NNS WDT VBP RB JJ CC VV IN JJ NNS	BackGround	GRelated	Neutral
08-1018_13	This approach is similar to the work of Xu and Croft ( ) , and can be considered as another state-of-the-art result	DT NN VBZ JJ TO DT NN IN NP CC NP ( ) , CC MD VB VVN IN DT JJ NN	Fundamental	Idea	Neutral
08-1019_0	Question answering ( ) relates to question search	NN NN ( ) VVZ TO VV NN	BackGround	GRelated	Neutral
08-1019_1	Conventional vector space models are used to calculate the statistical similarity and WordNet ( ) is used to estimate the semantic similarity	JJ NN NN NNS VBP VVN TO VV DT JJ NN CC NP ( ) VBZ VVN TO VV DT JJ NN	BackGround	GRelated	Neutral
08-1019_3	Note that the root node of a question tree is associated with empty string as the definition of prefix tree requires ( )	NN IN/that DT NN NN IN DT NN NN VBZ VVN IN JJ NN IN DT NN IN NN NN VVZ ( )	BackGround	SRelated	Neutral
08-1019_4	Sneiders ( ) proposed template based FAQ retrieval systems	NP ( ) VVN NN VVN NP NN NNS	BackGround	GRelated	Neutral
08-1019_6	The MDL-based tree cut model was originally introduced for handling the problem of generalizing case frames using a thesaurus ( )	DT JJ NN NN NN VBD RB VVN IN VVG DT NN IN VVG NN NNS VVG DT NN ( )	BackGround	SRelated	Neutral
08-1019_7	Jeon and Bruce ( ) proposed a mixture model for fixing the lexical chasm between questions	NP CC NP ( ) VVN DT NN NN IN VVG DT JJ NN IN NNS	BackGround	SRelated	Neutral
08-1019_8	For example , Jeon et al. ( ) compared four different retrieval methods , i.e	IN NN , NP NP NP ( ) VVN CD JJ NN NNS , NNS	BackGround	GRelated	Neutral
08-1019_10	MDL is a principle of data compression and statistical estimation from information theory ( )	NP VBZ DT NN IN NN NN CC JJ NN IN NN NN ( )	BackGround	GRelated	Neutral
08-1019_13	FAQ Finder ( ) heuristically combines statistical similarities and semantic similarities between questions to rank FAQs	NP NP ( ) RB VVZ JJ NNS CC JJ NNS IN NNS TO VV NP	BackGround	GRelated	Neutral
08-1019_14	Harabagiu et al. ( ) used a Question Answer Database (known as QUAB) to support interactive question answering	NP NP NP ( ) VVN DT NP NP NP NN IN NP TO VV JJ NN NN	BackGround	GRelated	Neutral
08-1019_15	A BaseNP is defined as a simple and non-recursive noun phrase ( )	DT NP VBZ VVN IN DT JJ CC JJ NN NN ( )	BackGround	GRelated	Neutral
08-1019_16	Lai et al. ( ) proposed an approach to automatically mine FAQs from the Web	NP NP NP ( ) VVN DT NN TO RB JJ NP IN DT NP	BackGround	GRelated	Neutral
08-1020_0	Previous work has shown that modeling the relation between personality and language is far from trivial ( ) , suggesting that the control of personality is a harder problem than the control of data-driven variation dimensions	JJ NN VHZ VVN IN/that VVG DT NN IN NN CC NN VBZ RB IN JJ ( ) , VVG IN/that DT NN IN NN VBZ DT JJR NN IN DT NN IN JJ NN NNS	BackGround	GRelated	Neutral
08-1020_1	One line of work has primarily focused on gram-maticality and naturalness , scoring the overgener-ation phase with a SLM , and evaluating against a gold-standard corpus , using string or tree-match metrics ( )	CD NN IN NN VHZ RB VVN IN NN CC NN , VVG DT NN NN IN DT NP , CC VVG IN DT NN NN , VVG NN CC NN NNS ( )	BackGround	GRelated	Neutral
08-1020_3	While handcrafted rule-based approaches are limited to variation along a small number of discrete points ( ) , we learn models that predict parameter values for any arbitrary value on the variation dimension scales	IN VVN JJ NNS VBP VVN TO NN IN DT JJ NN IN JJ NNS ( ) , PP VVP NNS WDT VVP NN NNS IN DT JJ NN IN DT NN NN NNS	BackGround	GRelated	Negative
08-1020_5	Following Mairesse and Walker ( ) , two expert judges (not the authors) familiar with the Big Five adjectives (Table 1) evaluate the personality of each utterance using the Ten-Item Personality Inventory ( ) , and also judge the utterance's naturalness	VVG NP CC NP ( ) , CD NN NNS VVP DT NN JJ IN DT JJ CD NNS JJ NP VV DT NN IN DT NN VVG DT NP NP NP ( ) , CC RB VV DT NNS NN	Fundamental	Idea	Neutral
08-1020_5	Subjects evaluate the naturalness and personality of each utterance using the TIPI ( )	NNS VVP DT NN CC NN IN DT NN VVG DT NP ( )	Fundamental	Basis	Neutral
08-1020_7	We present a new method for generating linguistic variation projecting multiple personality traits continuously , by combining and extending previous research in statistical natural language generation ( )	PP VVP DT JJ NN IN VVG JJ NN VVG JJ NN NNS RB , IN VVG CC VVG JJ NN IN JJ JJ NN NN ( )	Fundamental	Basis	Neutral
08-1020_8	Langkilde and Knight ( ) first applied SLMs to statistical natural language generation (SNLG) , showing that high quality paraphrases can be generated from an underspecified representation of meaning , by first applying a very undercon-strained , rule-based overgeneration phase , whose outputs are then ranked by an SLM scoring phase	NP CC NP ( ) RB VVN NP TO JJ JJ NN NN NN , VVG IN/that JJ NN NNS MD VB VVN IN DT JJ NN IN NN , IN RB VVG DT RB JJ , JJ NN NN , WP$ NNS VBP RB VVN IN DT NP VVG NN	BackGround	GRelated	Positive
08-1020_10	Over the last 20 years , statistical language models (SLMs) have been used successfully in many tasks in natural language processing , and the data available for modeling has steadily grown ( )	IN DT JJ CD NNS , JJ NN NNS JJ VHP VBN VVN RB IN JJ NNS IN JJ NN NN , CC DT NNS JJ IN NN VHZ RB VVN ( )	BackGround	GRelated	Positive
08-1020_12	both introverted and extraverted personality types ( )	DT VVN CC JJ NN NNS ( )	NULL	NULL	NULL
08-1020_12	See ( ) for more detail	VV ( ) IN JJR NN	BackGround	SRelated	Neutral
08-1020_12	Section 3.2 shows that humans accurately perceive the intended variation , and Section 3.3 compares Personage-PE (trained) with Personage ( )	NN CD VVZ IN/that NNS RB VVP DT JJ NN , CC NP CD VVZ NN NN NN IN NN NN ( )	Compare	Compare	Neutral
08-1020_12	We start with the Personage generator ( ) , which generates recommendations and comparisons of restaurants	PP VVP IN DT NN NN NN ( ) , WDT VVZ NNS CC NNS IN NNS	Fundamental	Basis	Neutral
08-1020_12	Mairesse and Walker ( ) show that this approach generates utterances that are perceptibly different along the extraversion dimension	NP CC NP ( ) VVP IN/that DT NN VVZ NNS WDT VBP RB JJ IN DT NN NN	BackGround	SRelated	Neutral
08-1020_12	Table 9 shows that the average naturalness is 3.98 out of 7 , which is significantly lower (p < .05) than the naturalness of handcrafted and randomly generated utterances reported by Mairesse and Walker ( )	NN CD VVZ IN/that DT JJ NN VBZ CD IN IN CD , WDT VBZ RB JJR NN SYM NN IN DT NN IN VVN CC RB VVN NNS VVN IN NP CC NP ( )	Compare	Compare	Positive
08-1020_12	Isard et al. ( ) and Mairesse and Walker ( ) also propose a personality generation method , in which a data-driven personality model selects the best utterance from a large candidate set	NP NP NP ( ) CC NP CC NP ( ) RB VV DT NN NN NN , IN WDT DT JJ NN NN VVZ DT JJS NN IN DT JJ NN NN	BackGround	SRelated	Positive
08-1020_12	Even though the parameters of Personage-PE were suggested by psychological studies ( ) , some of them are not modeled successfully by our approach , and thus omitted from Tables 3 and 4	RB IN DT NNS IN NN NN VBD VVN IN JJ NNS ( ) , DT IN PP VBP RB VVN RB IN PP$ NN , CC RB VVN IN NP CD CC CD	BackGround	SRelated	Neutral
08-1020_14	Third , there are many studies linking personality to linguistic variables ( )	JJ , EX VBP JJ NNS VVG NN TO JJ NNS ( )	BackGround	GRelated	Neutral
08-1020_14	These parameters are derived from psychological studies identifying linguistic markers of the Big Five traits ( )	DT NNS VBP VVN IN JJ NNS VVG JJ NNS IN DT JJ CD NNS ( )	Fundamental	Basis	Neutral
08-1020_14	These correlations are unexpectedly high; in corpus analyses , significant correlations as low as .05 to .10 are typically observed between personality and linguistic markers ( )	DT NNS VBP RB JJ IN NN NNS , JJ NNS IN JJ IN CD TO CD VBP RB VVN IN NN CC JJ NNS ( )	BackGround	GRelated	Positive
08-1020_15	Another thread investigates SNLG scoring models trained using higher-level linguistic features to replicate human judgments of utterance quality ( )	DT NN VVZ NP VVG NNS VVN VVG JJ JJ NNS TO VV JJ NNS IN NN NN ( )	BackGround	GRelated	Neutral
08-1020_17	A third SNLG approach eliminates the overgeneration phase ( )	DT JJ NP NN VVZ DT NN NN ( )	BackGround	GRelated	Neutral
08-1020_21	There is only one other similar evaluation of an SNLG ( )	EX VBZ RB CD JJ JJ NN IN DT NP ( )	BackGround	SRelated	Neutral
08-1020_24	We compare various learning algorithms using the Weka toolkit ( )	PP VVP JJ VVG NNS VVG DT NP NN ( )	Fundamental	Basis	Neutral
08-1021_0	Similar strategies with parse trees are pursued in ( ) , and error templates are utilized in ( ) for a word processor	JJ NNS IN VVP NNS VBP VVN IN ( ) , CC NN NNS VBP VVN IN ( ) IN DT NN NN	BackGround	GRelated	Neutral
08-1021_1	Research on automatic grammar correction has been conducted on a number of different parts-of-speech , such as articles ( ) and prepositions ( )	NN IN JJ NN NN VHZ VBN VVN IN DT NN IN JJ NN , JJ IN NNS ( ) CC NNS ( )	BackGround	GRelated	Neutral
08-1021_1	Automatic error detection has been performed on other parts-of-speech , e.g. , articles ( ) and prepositions ( )	JJ NN NN VHZ VBN VVN IN JJ NN , FW , NNS ( ) CC NNS ( )	BackGround	GRelated	Neutral
08-1021_2	For example , the sentence "My father is *work in the laboratory" is parsed ( ) as: The progressive form "working" is substituted with its bare form , which happens to be also a noun	IN NN , DT NN NN NN VBZ NN IN DT NN VBZ VVN ( ) NN DT JJ NN NN VBZ VVN IN PP$ JJ NN , WDT VVZ TO VB RB DT NN	BackGround	SRelated	Neutral
08-1021_2	After parsing the corpus ( ) , we artificially introduced verb form errors into these sentences , and observed the resulting "disturbances" to the parse trees	IN VVG DT NN ( ) , PP RB VVD NN NN NNS IN DT NNS , CC VVD DT VVG NN TO DT VVP NNS	Fundamental	Basis	Neutral
08-1021_2	Using these patterns , we introduced verb form errors into Aquaint , then re-parsed the corpus ( ) , and compiled the changes in the "disturbed" trees into a catalog	VVG DT NNS , PP VVD NN NN NNS IN NN , RB VVD DT NN ( ) , CC VVN DT NNS IN DT JJ NNS IN DT NN	Fundamental	Basis	Neutral
08-1021_4	Errors in verb forms have been covered as part of larger systems such as ( ) , but we believe that their specific research challenges warrant more detailed examination	NNS IN NN NNS VHP VBN VVN IN NN IN JJR NNS JJ IN ( ) , CC PP VVP IN/that PP$ JJ NN NNS VVP RBR JJ NN	BackGround	GRelated	Neutral
08-1021_5	For example , in the Japanese Learners of English corpus ( ) , errors related to verbs are among the most frequent categories	IN NN , IN DT JJ NP IN NP NN ( ) , NNS VVN TO NNS VBP IN DT RBS JJ NNS	BackGround	GRelated	Neutral
08-1021_5	A maximum entropy model , using lexical and POS features , is trained in ( ) to recognize a variety of errors	DT JJ NN NN , VVG JJ CC NP VVZ , VBZ VVN IN ( ) TO VV DT NN IN NNS	BackGround	GRelated	Neutral
08-1021_7	ual evaluation of HKUST is 0.76 , corresponding to "substantial agreement" between the two evalu-ators ( )	JJ NN IN NP VBZ CD , JJ TO JJ NN IN DT CD NNS ( )	BackGround	SRelated	Neutral
08-1021_8	Hand-crafted error production rules (or "mal-rules") , augmenting a context-free grammar , are designed for a writing tutor aimed at deaf students ( )	NN NN NN NNS NN NN , VVG DT JJ NN , VBP VVN IN DT VVG NN VVN IN JJ NNS ( )	BackGround	GRelated	Neutral
08-1021_9	An approach combining a hand-crafted context-free grammar and stochastic probabilities is pursued in ( ) , but it is designed for a restricted domain only	DT NN VVG DT NN JJ NN CC JJ NNS VBZ VVN IN ( ) , CC PP VBZ VVN IN DT JJ NN RB	BackGround	GRelated	Negative
08-1022_0	The OpenCCG surface realizer is based on Steedman's ( ) version of CCG elaborated with Baldridge and Kruijff's multi-modal extensions for lexically specified derivation control ( ) and hybrid logic dependency semantics ( )	DT NP NN NN VBZ VVN IN NP ( ) NN IN NP VVD IN NP CC NP JJ NNS IN RB VVN NN NN ( ) CC JJ NN NN NNS ( )	BackGround	GRelated	Neutral
08-1022_0	Internally , such graphs are represented using Hybrid Logic Dependency Semantics (HLDS) , a dependency-based approach to representing linguistic meaning developed by Baldridge and Kruijff ( )	RB , JJ NNS VBP VVN VVG NP NP NP NP NN , DT JJ NN TO VVG JJ NN VVN IN NP CC NP ( )	BackGround	GRelated	Neutral
08-1022_3	A relatively recent technique for lexical category assignment is supertagging ( ) , a preprocessing step to parsing that assigns likely categories based on word and part-of-speech (POS) contextual information	DT RB JJ NN IN JJ NN NN VBZ JJ ( ) , DT VVG NN TO VVG DT VVZ JJ NNS VVN IN NN CC NN NN JJ NN	BackGround	GRelated	Neutral
08-1022_4	We have introduced a novel type of supertagger , which we have dubbed a hypertagger , that assigns CCG category labels to elementary predications in a structured semantic representation with high accuracy at several levels of tagging ambiguity in a fashion reminiscent of ( )	PP VHP VVN DT NN NN IN NN , WDT PP VHP VVN DT NN , WDT VVZ NP NN NNS TO JJ NNS IN DT JJ JJ NN IN JJ NN IN JJ NNS IN VVG NN IN DT NN NN IN ( )	Fundamental	Basis	Neutral
08-1022_5	Assigned categories are instantiated in OpenCCG's chart realizer where , together with a treebank-derived syntactic grammar ( ) and a factored language model ( ) , they constrain the English word-strings that are chosen to express the LF	JJ NNS VBP VVN IN NP NN NN WRB , RB IN DT JJ JJ NN ( ) CC DT VVN NN NN ( ) , PP VV DT JJ NNS WDT VBP VVN TO VV DT NP	Fundamental	Basis	Neutral
08-1022_5	OpenCCG implements a symbolic-statistical chart realization algorithm ( ) combining (1) a theoretically grounded approach to syntax and semantic composition with (2) factored language models ( ) for making choices among the options left open by the grammar	NP VVZ DT JJ NN NN NN ( ) VVG NN DT RB VVN NN TO NN CC JJ NN IN JJ VVN NN NNS ( ) IN VVG NNS IN DT NNS VVD JJ IN DT NN	BackGround	GRelated	Neutral
08-1022_6	In HLDS , hybrid logic ( ) terms are used to describe dependency graphs	IN NP , JJ NN ( ) NNS VBP VVN TO VV NN NNS	BackGround	GRelated	Neutral
08-1022_7	A separate transformation then uses around two dozen generalized templates to add logical forms to the categories , in a fashion reminiscent of ( )	DT JJ NN RB VVZ RB CD NN VVD NNS TO VV JJ NNS TO DT NNS , IN DT NN NN IN ( )	BackGround	SRelated	Neutral
08-1022_8	To illustrate the input to OpenCCG , consider the semantic dependency graph in Figure 1 , which is taken from section 00 of a Propbank-enhanced version of the CCGbank ( )	TO VV DT NN TO NP , VVP DT JJ NN NN IN NP CD , WDT VBZ VVN IN NN CD IN DT JJ NN IN DT NP ( )	Fundamental	Basis	Neutral
08-1022_9	Even with the current incomplete set of semantic templates , the hypertagger brings realizer performance roughly up to state-of-the-art levels , as our overall test set BLEU score ( ) slightly exceeds that of Cahill and van Genabith ( ) , though at a coverage of 96% instead of 98%	RB IN DT JJ JJ NN IN JJ NNS , DT NN VVZ NN NN RB RB TO JJ NNS , IN PP$ JJ NN VVN NP NP NN ( ) RB VVZ IN/that IN NP CC NP NP ( ) , RB IN DT NN IN CD RB JJ	Compare	Compare	Negative
08-1022_9	In this regard , our approach is more similar to the ones pursued more recently by Carroll , Oepen and Velldal ( ) , Nakanishi et al. ( ) and Cahill and van Genabith ( ) with HPSG and LFG grammars	IN DT NN , PP$ NN VBZ RBR JJ TO DT NNS VVD RBR RB IN NP , NP CC NP ( ) , NP NP NP ( ) CC NP CC NP NP ( ) IN NP CC NP NNS	Fundamental	Idea	Neutral
08-1022_10	Our approach follows Langkilde-Geary ( ) and Callaway ( ) in aiming to leverage the Penn Treebank to develop a broad-coverage surface re-alizer for English	PP$ NN VVZ NP ( ) CC NP ( ) IN VVG TO VV DT NP NP TO VV DT NN NN NN IN NP	Fundamental	Idea	Positive
08-1022_11	One way of performing lexical assignment is simply to hypothesize all possible lexical categories and then search for the best combination thereof , as in the CCG parser in ( ) or the chart realizer in ( )	CD NN IN VVG JJ NN VBZ RB TO VV DT JJ JJ NNS CC RB NN IN DT JJS NN RB , RB IN DT NP NN IN ( ) CC DT NN NN IN ( )	BackGround	GRelated	Neutral
08-1022_13	Supertagging has been more recently extended to a multitagging paradigm in CCG ( ) , leading to extremely efficient parsing with state-of-the-art dependency recovery ( )	NP VHZ VBN RBR RB VVN TO DT VVG NN IN NP ( ) , VVG TO RB JJ VVG IN JJ NN NN ( )	BackGround	GRelated	Positive
08-1022_14	Clark ( ) notes in his parsing experiments that the POS tags of the surrounding words are highly informative	NP ( ) NNS IN PP$ VVG NNS IN/that DT NP NNS IN DT VVG NNS VBP RB JJ	BackGround	SRelated	Neutral
08-1022_16	White et al. ( ) describe an ongoing effort to engineer a grammar from the CCGbank ( ) , a corpus of CCG derivations derived from the Penn Treebank , suitable for realization with OpenCCG	NP NP NP ( ) VV DT JJ NN TO VV DT NN IN DT NP ( ) NN NN IN NP NNS VVN IN DT NP NP NN IN NN IN NP	BackGround	SRelated	Neutral
08-1022_19	While these models are considerably smaller than the ones used in ( ) , the training data does have the advantage of being in the same domain and genre (using larger n-gram models remains for future investigation)	IN DT NNS VBP RB JJR IN DT NNS VVN IN ( ) , DT NN NN VVZ VH DT NN IN VBG IN DT JJ NN CC NN VVG JJR NN NNS VVZ IN JJ NN	Compare	Compare	Neutral
08-1022_19	We caution , however , that it remains unclear how meaningful it is to directly compare these scores when the realizer inputs vary considerably in their specificity , as Langkilde-Geary's ( ) experiments dramatically illustrate	PP VVP , RB , NNS JJ WRB JJ PP VBZ TO RB VV DT NNS WRB DT NN NNS VVP RB IN PP$ NN , IN NP ( ) NNS RB VVP	BackGround	SRelated	Neutral
08-1022_20	In the two-stage mode , a packed forest of all possible realizations is created in the first stage; in the second stage , the packed representation is unpacked in bottom-up fashion , with scores assigned to the edge for each sign as it is unpacked , much as in ( )	IN DT NN NN , DT JJ NN IN DT JJ NNS VBZ VVN IN DT JJ NN IN DT JJ NN , DT JJ NN VBZ VVN IN JJ NN , IN NNS VVN TO DT NN IN DT NN IN PP VBZ VVN , RB IN IN ( )	BackGround	SRelated	Neutral
08-1022_21	Moreover , the overall Bleu ( ) and Meteor ( ) scores , as well as numbers of exact string matches (as measured against to the original sentences in the CCGbank) are higher for the hypertagger-seeded realizer than for the preexisting realizer	RB , DT JJ NP ( ) CC NP ( ) NNS , RB RB IN NNS IN JJ NN VVZ NNS VVN IN TO DT JJ NNS IN DT NP VBP JJR IN DT JJ NN IN IN DT VVG NN	BackGround	SRelated	Neutral
08-1022_22	We used Zhang Le's maximum entropy toolkit 4 for training the hypertagging model , which uses an implementation of Limited-memory BFGS , an approximate quasi-Newton optimization method from the numerical optimization literature ( )	PP VVD NP NP JJ NN NN CD IN VVG DT NN NN , WDT VVZ DT NN IN NP NP , DT JJ NN NN NN IN DT JJ NN NN ( )	Fundamental	Basis	Neutral
08-1022_24	Example (1) shows how numbered semantic roles , taken from PropBank ( ) when available , are added to the category of an active voice , past tense transitive verb , where *pred* is a placeholder for the lexical predicate; examples (2) and (3) show how more specific relations are introduced in the category for determiners and the category for the possessive's , respectively	NN NN VVZ WRB VVN JJ NNS , VVN IN NP ( ) WRB JJ , VBP VVN TO DT NN IN DT JJ NN , JJ JJ JJ NN , WRB NN VBZ DT NN IN DT JJ NN NNS JJ CC JJ NN WRB JJR JJ NNS VBP VVN IN DT NN IN NNS CC DT NN IN DT NNS , RB	Fundamental	Basis	Neutral
08-1022_26	In lexicalized grammatical formalisms such as Lexicalized Tree Adjoining Grammar ( ) , Combinatory Categorial Grammar ( ) and Head-Driven Phrase-Structure Grammar ( ) , it is possible to separate lexical category assignment , the assignment of informative syntactic categories to linguistic objects such as words or lexical predicates , from the combinatory processes that make use of such categories , such as parsing and surface realization	IN JJ JJ NNS JJ IN NP NP NP NP ( ) , NP NP NP ( ) CC NP NP NP ( ) , PP VBZ JJ TO VV JJ NN NN NN NN IN JJ JJ NNS TO JJ NNS JJ IN NNS CC JJ NNS VVG DT JJ NNS WDT VVP NN IN JJ NNS NN IN VVG CC NN NN	BackGround	GRelated	Neutral
08-1022_29	The language models were created using the SRILM toolkit ( ) on the standard training sections (2-21) of the CCGbank , with sentence-initial words (other than proper names) uncapital-ized	DT NN NNS VBD VVN VVG DT NP NN ( ) IN DT JJ NN NNS JJ IN DT NP , IN JJ NNS NN IN JJ JJ NN	Fundamental	Basis	Neutral
08-1022_32	Second , we compare a hypertagger-augmented version of OpenCCG's chart realizer with the preexisting chart realizer ( ) that simply instantiates the chart with all possible CCG categories (subject to frequency cutoffs) for each input LF predicate	CD RB , PP VVP DT JJ NN IN NP NN NN IN DT VVG NN NN ( ) WDT RB VVZ DT NN IN DT JJ NP NNS NN TO NN NN IN DT NN NP NN	Compare	Compare	Neutral
08-1022_32	Following White et al. ( ) , we use factored tri-gram models over words , part-of-speech tags and supertags to score partial and complete realizations	VVG NP NP NP ( ) , PP VVP VVN NN NNS IN NNS , NN NNS CC NNS TO NN JJ CC JJ NNS	Fundamental	Idea	Neutral
08-1022_32	In White et al.'s ( ) initial investigation of scaling up OpenCCG for broad coverage realization , all categories observed more often than a threshold frequency were instantiated for lexical predicates; for unseen words , a simple smoothing strategy based on the part of speech was employed , assigning the most frequent categories for the POS	IN NP NP NNS ( ) JJ NN IN VVG RP NN IN JJ NN NN , DT NNS VVD RBR RB IN DT NN NN VBD VVN IN JJ NN IN JJ NNS , DT JJ VVG NN VVN IN DT NN IN NN VBD VVN , VVG DT RBS JJ NNS IN DT NP	BackGround	SRelated	Neutral
08-1022_33	Additionally , to realize a wide range of paraphrases , OpenCCG implements an algorithm for efficiently generating from disjunctive logical forms ( )	RB , TO VV DT JJ NN IN NNS , NP VVZ DT NN IN RB VVG IN JJ JJ NNS ( )	BackGround	GRelated	Neutral
08-1023_0	Informally , a packed parse forest , or forest for short , is a compact representation of all the derivations (i.e. , parse trees) for a given sentence under a context-free grammar ( )	RB , DT VVN VVP NN , CC NN IN JJ , VBZ DT JJ NN IN PDT DT NNS NN , VVP NN IN DT VVN NN IN DT JJ NN ( )	BackGround	SRelated	Neutral
08-1023_1	Such a forest has a structure of a hypergraph ( ) , where items like NP 0  3 are called nodes , and deductive steps like (*) correspond to hyperedges	PDT DT NN VHZ DT NN IN DT NN ( ) , WRB NNS IN NP CD CD VBP VVN NNS , CC JJ NNS IN NN VV TO NNS	BackGround	GRelated	Neutral
08-1023_1	Both tasks can be done efficiently by forest-based algorithms based on k-best parsing ( )	DT NNS MD VB VVN RB IN JJ NNS VVN IN NN VVG ( )	BackGround	GRelated	Neutral
08-1023_1	Basically , cube pruning works bottom up in a forest , keeping at most k +LM items at each node , and uses the best-first expansion idea from the Algorithm 2 of Huang and Chiang ( ) to speed up the computation	RB , NN VVG NNS VV RP IN DT NN , VVG IN JJS NN NN NNS IN DT NN , CC VVZ DT JJ NN NN IN DT NP CD IN NP CC NP ( ) TO VV RP DT NN	Fundamental	Idea	Neutral
08-1023_1	For k-best search after getting 1-best derivation , we use the lazy Algorithm 3 of Huang and Chiang ( ) that works backwards from the root node , incrementally computing the second , third , through the kth best alternatives	IN JJ NN IN VVG JJ NN , PP VVP DT JJ NP CD IN NP CC NP ( ) WDT VVZ RB IN DT NN NN , RB VVG DT JJ , JJ , IN DT NN JJS NNS	Fundamental	Basis	Neutral
08-1023_3	However , a k-best list , with its limited scope , often has too few variations and too many redundancies; for example , a 50-best list typically encodes a combination of 5 or 6 binary ambiguities (since 2^5 < 50 < 2^6) , and many subtrees are repeated across different parses ( )	RB , DT NN NN , IN PP$ JJ NN , RB VHZ RB JJ NNS CC RB JJ NN IN NN , DT JJ NN RB VVZ DT NN IN CD CC CD JJ NNS NN CD CD JJ , CC JJ NNS VBP VVN IN JJ VVZ ( )	BackGround	GRelated	Negative
08-1023_3	We use the pruning algorithm of ( ) that is very similar to the method based on marginal probability ( ) , except that it prunes hyperedges as well as nodes	PP VVP DT VVG NN IN ( ) WDT VBZ RB JJ TO DT NN VVN IN JJ NN ( ) , IN WDT PP VVZ NNS RB RB IN NNS	Fundamental	Basis	Neutral
08-1023_3	Following Huang ( ) , we modify the parser to output a packed forest for each sentence	VVG NP ( ) , PP VV DT NN TO NN DT JJ NN IN DT NN	Fundamental	Idea	Neutral
08-1023_4	The corresponding BLEU score of Pharaoh ( ) is 0.2182 on this dataset	DT JJ NP NN IN NN ( ) VBZ CD IN DT NN	BackGround	SRelated	Neutral
08-1023_5	We use the standard minimum error-rate training ( ) to tune the feature weights to maximize the system's BLEU score on the dev set	PP VVP DT JJ JJ NN NN ( ) TO VV DT NN NNS TO VV DT JJ NP NN IN DT NP NN	Fundamental	Basis	Neutral
08-1024_0	Such approaches have been shown to be effective in log-linear word-alignment models where only a small supervised corpus is available ( )	JJ NNS VHP VBN VVN TO VB JJ IN JJ NN NNS WRB RB DT JJ JJ NN VBZ JJ ( )	BackGround	GRelated	Positive
08-1024_1	Promising features might include those over source side reordering rules ( ) or source context features ( )	JJ NNS MD VV DT IN NN NN VVG NNS ( ) CC NN NN NNS ( )	BackGround	MRelated	Positive
08-1024_3	For this reason , to our knowledge , all discriminative models proposed to date either side-step the problem by choosing simple model and feature structures , such that spurious ambiguity is lessened or removed entirely ( ) , or else ignore the problem and treat derivations as translations ( )	IN DT NN , TO PP$ NN , DT JJ NNS VVN TO VV DT NN DT NN IN VVG JJ NN CC NN NNS , PDT DT JJ NN VBZ VVN CC VVN RB ( ) , CC RB VV DT NN CC VV NNS IN NNS ( )	BackGround	GRelated	Neutral
08-1024_3	To our knowledge no systems directly address Problem 1 , instead choosing to ignore the problem by using one or a small handful of reference derivations in an n-best list ( ) , or else making local independence assumptions which side-step the issue ( )	TO PP$ NN DT NNS RB VVP NP CD , RB VVG TO VV DT NN IN VVG CD CC DT JJ NN IN NN NNS IN DT JJ NN ( ) , CC RB VVG JJ NN NNS WDT VVP DT NN ( )	BackGround	GRelated	Negative
08-1024_4	The standard solution is to approximate the maximum probability translation using a single derivation ( )	DT JJ NN VBZ TO VV DT JJ NN NN VVG DT JJ NN ( )	BackGround	SRelated	Neutral
08-1024_5	A synchronous context free grammar (SCFG) consists of paired CFG rules with co-indexed nonterminals ( )	DT JJ NN JJ NN NN VVZ IN VVN NN NNS IN JJ NNS ( )	BackGround	GRelated	Neutral
08-1024_6	Both the global models ( ) use fairly small training sets , and there is no evidence that their techniques will scale to larger data sets	CC DT JJ NNS ( ) VV RB JJ NN NNS , CC EX VBZ DT NN IN/that PP$ NNS MD VV TO JJR NN NNS	BackGround	GRelated	Negative
08-1024_7	2 This results in the following log-likelihood objective (4) and its corresponding gradient (5); in order to train the model , we maximise equation (4) using L-BFGS ( )	LS DT NNS IN DT VVG NN NN CC JJ NP NP SYM NN DT NN SYM NP NP NP NP SYM NP DT CD NN IN NN TO VV DT NN , PP VVP NN NN VVG NP ( )	Fundamental	Basis	Neutral
08-1024_9	Our findings echo those observed for latent variable log-linear models successfully used in monolingual parsing ( )	PP$ NNS VVP DT VVN IN JJ JJ JJ NNS RB VVN IN JJ VVG ( )	BackGround	SRelated	Positive
08-1024_9	This method has been demonstrated to be effective for (non-convex) log-linear models with latent variables ( )	DT NN VHZ VBN VVN TO VB JJ IN JJ JJ NNS IN JJ NNS ( )	BackGround	GRelated	Positive
08-1024_14	This is an instance of the ITG alignment algorithm ( )	DT VBZ DT NN IN DT NP NN NN ( )	BackGround	SRelated	Neutral
08-1025_0	Charniak and Johnson ( ) use a PCFG to do a pass of inside-outside parsing to reduce the state space of a subsequent lexicalized n-best parsing algorithm to produce parses that are further re-ranked by a MaxEnt model	NP CC NP ( ) VV DT NN TO VV DT NN IN NN VVG TO VV DT NN NN IN DT JJ JJ NN VVG NN TO VV VVZ WDT VBP RBR VVN IN DT NP NN	BackGround	GRelated	Neutral
08-1026_0	In the past few years , a "standard model" of scope underspecification has emerged: A range of formalisms from Underspecified DRT ( ) to dominance graphs ( ) have offered mechanisms to specify the "semantic material" of which the semantic representations are built up , plus dominance or outscoping relations between these building blocks	IN DT JJ JJ NNS , DT NN NN IN NN NN VHZ JJ DT NN IN NNS IN NP NP ( ) TO NN NNS ( ) VHP VVN NNS TO VV DT JJ NN IN WDT DT JJ NNS VBP VVN RP , CC NN CC NN NNS IN DT NN NNS	BackGround	GRelated	Neutral
08-1026_0	In this paper , we consider dominance graphs ( ) as one representative of this class	IN DT NN , PP VVP NN NNS ( ) IN CD NN IN DT NN	Fundamental	Basis	Neutral
08-1026_1	Furthermore , there are algorithms for determinizing weighted tree automata ( ) , which could be applied as preprocessing steps for wRTGs	RB , EX VBP NNS IN VVG JJ NN NN ( ) , WDT MD VB VVN IN VVG NNS IN NNS	BackGround	MRelated	Neutral
08-1026_1	The algorithms generalize easily to weights that are taken from an arbitrary ordered semiring ( ) and to computing minimal-weight rather than maximal-weight configurations	DT NNS VV RB TO NNS WDT VBP VVN IN DT JJ VVN NN ( ) CC TO VVG NN RB IN NN NNS	BackGround	SRelated	Neutral
08-1026_2	Underspecification ( ) has become the standard approach to dealing with scope ambiguity in large-scale hand-written grammars	NP ( ) VHZ VVN DT JJ NN TO VVG IN NN NN IN JJ JJ NNS	BackGround	GRelated	Neutral
08-1026_3	Redundancy elimination ( ) is the problem of deriving from an USR U another USR U ' , such that the readings of U ' are a proper subset of the readings of U , but every reading in U is semantically equivalent to some reading in U '	NN NN ( ) VBZ DT NN IN VVG IN DT NP NP DT NP NP POS , JJ IN/that DT NNS IN NP POS VBP DT JJ NN IN DT NNS IN NP , CC DT NN IN NP VBZ RB JJ TO DT NN IN NP POS	BackGround	GRelated	Neutral
08-1026_4	Regular tree grammars ( ) are a standard approach for specifying sets of trees in theoretical computer science , and are closely related to regular tree transducers as used e.g. in recent work on statistical MT ( ) and grammar formalisms ( )	JJ NN NNS ( ) VBP DT JJ NN IN VVG NNS IN NNS IN JJ NN NN , CC VBP RB VVN TO JJ NN NNS IN VVN NN JJ NN IN JJ NP ( ) CC NN NNS ( )	BackGround	GRelated	Neutral
08-1026_4	See Comon et al. ( ) for more details	VV NP NP NP ( ) IN JJR NNS	BackGround	SRelated	Neutral
08-1026_5	The Rondane treebank is a "Redwoods style" treebank ( ) containing MRS-based underspecified representations for sentences from the tourism domain , and is distributed together with the English Resource Grammar (ERG) ( )	DT NP NN VBZ DT JJ NN NN ( ) VVG JJ JJ NNS IN NNS IN DT NN NN , CC VBZ VVN RB IN DT NP NP NP NN ( )	BackGround	GRelated	Neutral
08-1026_7	On the theoretical side , Ebert ( ) has shown that none of the major underspecification formalisms are expressively complete , i.e	IN DT JJ NN , NP ( ) VHZ VVN DT NN IN DT JJ NN NNS VBP RB JJ , NN	BackGround	GRelated	Neutral
08-1026_7	Because every finite tree language is regular , RTGs constitute an expressively complete underspecification formalism in the sense of Ebert ( ): They can represent arbitrary subsets of the original set of readings	IN DT JJ NN NN VBZ JJ , NNS VVP DT RB JJ NN NN IN DT NN IN NP ( NN PP MD VV JJ NNS IN DT JJ NN IN NNS	BackGround	SRelated	Neutral
08-1026_7	Ebert ( ) proves that no expressively complete underspecification formalism can be compact , i.e	NP ( ) VVZ IN/that DT RB JJ NN NN MD VB JJ , NN	BackGround	SRelated	Neutral
08-1026_9	The precise definition of dominance nets is not important here , but note that virtually all underspecified descriptions that are produced by current grammars are nets ( )	DT JJ NN IN NN NNS VBZ RB JJ RB , CC VVP IN/that RB DT JJ NNS WDT VBP VVN IN JJ NNS VBP NNS ( )	BackGround	GRelated	Neutral
08-1026_11	A weighted regular tree grammar (wRTG) ( ) is a 5-tuple G = (S , N , Z , R , c) such that G' = (S , N , Z , R) is a regular tree grammar and c : R → R is a function that assigns each production rule a weight	DT JJ JJ NN NN NN ( ) VBZ DT NP NP SYM NP NP NP , NP , NN JJ IN/that NP SYM NNS , NP , NP , NP VBZ DT JJ NN NN CC LS : SYM NN VBZ DT NN WDT VVZ DT NN NN DT NN	BackGround	SRelated	Neutral
08-1026_12	Furthermore , we show how to define a PCFG-style cost model on RTGs and compute best readings of deterministic RTGs efficiently , and illustrate this model on a machine learning based model of scope preferences ( )	RB , PP VVP WRB TO VV DT NP NN NN IN NP CC VV JJS NNS IN JJ NNS RB , CC VVP DT NN IN DT NN VVG VVN NN IN NN NNS ( )	Fundamental	Basis	Neutral
08-1026_12	Weighted dominance graphs can be used to encode the standard models of scope preferences ( )	JJ NN NNS MD VB VVN TO VV DT JJ NNS IN NN NNS ( )	BackGround	GRelated	Neutral
08-1026_13	Tree automata are related to tree transducers as used e.g. in statistical machine translation ( ) exactly like finite-state string automata are related to finite-state string transducers , i.e	NN NN VBP VVN TO NN NNS IN VVN NN JJ NN NN ( ) RB IN JJ NN NN VBP VVN TO JJ NN NNS , NNS	BackGround	GRelated	Neutral
08-1026_13	For example , Knight and Graehl ( ) present an algorithm to extract the best derivation of a wRTG in time O(t + n log n) where n is the number of nonterminals and t is the number of rules	IN NN , NP CC NP ( ) VV DT NN TO VV DT JJS NN IN DT NN IN NN NP SYM NN NN NN WRB NN VBZ DT NN IN NN CC NN VBZ DT NN IN NNS	BackGround	GRelated	Neutral
08-1026_14	One can then strengthen the underspecified description to efficiently eliminate subsets of readings that were not intended in the given context ( ); so when the individual readings are eventually computed , the number of remaining readings is much smaller and much closer to the actual perceived ambiguity of the sentence	PP MD RB VV DT JJ NN TO RB VV NNS IN NNS WDT VBD RB VVN IN DT VVN NN ( JJ RB WRB DT JJ NNS VBP RB VVN , DT NN IN VVG NNS VBZ RB JJR CC RB RBR TO DT JJ VVN NN IN DT NN	BackGround	GRelated	Neutral
08-1026_15	Indeed , for one particular grammar formalism it has even been shown that the parse chart contains an isomorphic image of a dominance chart ( )	RB , IN CD JJ NN NN PP VHZ RB VBN VVN IN/that DT VVP NN VVZ DT JJ NN IN DT NN NN ( )	BackGround	GRelated	Neutral
08-1026_16	We show that the "dominance charts" proposed by Koller and Thater ( ) can be naturally seen as regular tree grammars; using their algorithm , classical underspecified descriptions (dominance graphs) can be translated into RTGs that describe the same sets of readings	PP VVP IN/that DT NN NN VVN IN NP CC NP ( ) MD VB RB VVN IN JJ NN NN VVG PP$ NN , JJ JJ NNS NN NN MD VB VVN IN NP WDT VVP DT JJ NNS IN NNS	Fundamental	Basis	Neutral
08-1026_16	This simplifies semantics construction , and current algorithms ( ) support the efficient enumeration of readings from an USR when it is necessary	DT VVZ NNS NN , CC JJ NNS ( ) VV DT JJ NN IN NNS IN DT NP WRB PP VBZ JJ	BackGround	GRelated	Neutral
08-1026_16	Koller and Thater ( ) demonstrate how to compute a dominance chart from a dominance graph D by tabulating how a subgraph can be decomposed into smaller subgraphs by removing what they call a "free fragment"	NP CC NP ( ) VV WRB TO VV DT NN NN IN DT NN NN NP IN VVG WRB DT NN MD VB VVN IN JJR NNS IN VVG WP PP VVP DT JJ NN	BackGround	SRelated	Neutral
08-1026_16	In the worst case , the dominance chart of a dominance graph with n fragments has O(2^n) production rules ( ) , i.e	IN DT JJS NN , DT NN NN IN DT NN NN IN NN NNS VHZ JJ JJ NN NNS ( ) , NN	BackGround	SRelated	Negative
08-1026_18	This has been a very successful approach , but recent algorithms for eliminating subsets of readings have pushed the expressive power of these formalisms to their limits; for instance , Koller and Thater ( ) speculate that further improvements over their (incomplete) redundancy elimination algorithm require a more expressive formalism than dominance graphs	DT VHZ VBN DT RB JJ NN , CC JJ NNS IN VVG NNS IN NNS VHP VVN DT JJ NN IN DT NNS TO PP$ NN IN NN , NP CC NP ( ) VV IN/that JJR NNS IN PP$ JJ NN NN NN VVP DT RBR JJ NN IN NN NNS	BackGround	GRelated	Negative
08-1026_18	We exploit this increase in expressive power in presenting a novel redundancy elimination algorithm that is simpler and more powerful than the one by Koller and Thater ( ); in our algorithm , redundancy elimination amounts to intersection of regular tree languages	PP VVP DT NN IN JJ NN IN VVG DT JJ NN NN NN WDT VBZ JJR CC RBR JJ IN DT CD IN NP CC NP ( NN IN PP$ NN , NN NN NNS TO NN IN JJ NN NNS	Compare	Compare	Negative
08-1026_18	Ebert shows that the classical dominance-based underspecification formalisms , such as MRS , Hole Semantics , and dominance graphs , are all expressively incomplete , which Koller and Thater ( ) speculate might be a practical problem for algorithms that strengthen USRs to remove unwanted readings	NP VVZ IN/that DT JJ JJ NN NNS , JJ IN NP , NP NP , CC NN NNS , VBP RB RB JJ , WDT NP CC NP ( ) VVP MD VB DT JJ NN IN NNS WDT VVP NNS TO VV JJ NNS	BackGround	GRelated	Neutral
08-1026_18	Koller and Thater ( ) define semantic equivalence in terms of a rewrite system that specifies under what conditions two quantifiers may exchange their positions without changing the meaning of the semantic representation	NP CC NP ( ) VV JJ NN IN NNS IN DT VV NN WDT VVZ IN WP NNS CD NNS MD VV PP$ NNS IN VVG DT NN IN DT JJ NN	BackGround	SRelated	Neutral
08-1026_18	 Based on this definition , Koller and Thater ( ) present an algorithm (henceforth , KT06) that deletes rules from a dominance chart and thus removes subsets of readings from the USR	VVN IN DT NN , NP CC NP ( ) VV DT NN NN , NN WDT VVZ NNS IN DT NN NN CC RB VVZ NNS IN NNS IN DT NP	BackGround	SRelated	Neutral
08-1026_18	4 of Koller and Thater ( ) completely , whereas KT06 won't	CD IN NP CC NP ( ) RB , IN NP NP	NULL	NULL	NULL
08-1026_18	We use a slightly weaker version of the rewrite system that Koller and Thater ( ) used in their evaluation	PP VVP DT RB JJR NN IN DT VV NN IN/that NP CC NP ( ) VVN IN PP$ NN	Fundamental	Basis	Neutral
08-1026_20	For instance , both Combinatory Categorial Grammars ( ) and synchronous grammars ( ) represent syntactic and semantic ambiguity as part of the same parse chart	IN NN , CC NP NP NP ( ) CC JJ NNS ( ) VV JJ CC JJ NN IN NN IN DT JJ VVP NN	BackGround	GRelated	Neutral
08-1026_21	An important class of dominance graphs are hypernormally connected dominance graphs , or dominance nets ( )	DT JJ NN IN NN NNS VBP RB VVN NN NNS , CC NN NNS ( )	BackGround	GRelated	Neutral
08-1026_24	It is also useful in applications beyond semantic construction , e.g.in discourse parsing ( )	PP VBZ RB JJ IN NNS IN JJ NN , NN NN VVG ( )	BackGround	GRelated	Positive
08-1026_27	The problem of computing the best tree is NP-complete ( )	DT NN IN VVG DT JJS NN VBZ JJ ( )	BackGround	GRelated	Neutral
08-1027_2	Since ( ) , numerous works have used patterns for discovery and identification of instances of semantic relationships (e.g. , ( ))	IN ( ) , JJ NNS VHP VVN NNS IN NN CC NN IN NNS IN JJ NNS NN , ( NN	BackGround	GRelated	Neutral
08-1027_3	To improve results , some systems utilize additional manually constructed semantic resources such as WordNet (WN) ( )	TO VV NNS , DT NNS VV JJ RB VVN JJ NNS JJ IN NP NP ( )	BackGround	GRelated	Neutral
08-1027_3	Among the 15 systems presented by the 14 SemEval teams , some utilized the manually provided WordNet tags for the dataset pairs (e.g. , ( ))	IN DT CD NNS VVN IN DT CD JJ NNS , DT VVD DT RB VVN NN NNS IN DT NN NNS NN , ( NN	BackGround	GRelated	Neutral
08-1027_3	For reference , the best results overall ( ) are also shown	IN NN , DT JJS NNS JJ ( ) VBP RB VVN	BackGround	SRelated	Positive
08-1027_4	Many other works manually develop a set of heuristic features devised with some specific relationship in mind , like a WordNet-based meronymy feature ( ) or size-of feature ( )	JJ JJ NNS RB VVP DT NN IN JJ NNS VVN IN DT JJ NN IN NN , IN DT JJ NN NN ( ) CC JJ NN ( )	BackGround	GRelated	Neutral
08-1027_5	The winning algorithms were LWL ( ) , SMO ( ) , and K* ( ) (there were 7 tasks , and different algorithms could be selected for each task)	DT JJ NNS VBD NP ( ) , NP ( ) , CC NP ( ) NN VBD CD NNS , CC JJ NNS MD VB VVN IN DT NN	BackGround	SRelated	Positive
08-1027_6	Some other systems that avoided using the labels used WN as a supporting resource for their algorithms ( )	DT JJ NNS WDT VVD VVG DT NNS VVD NN IN DT VVG NN IN PP$ NNS ( )	BackGround	GRelated	Neutral
08-1027_6	Table 1 shows our results , along with the best Task 4 result not using WordNet labels ( )	NN CD VVZ PP$ NNS , RB IN DT JJS NP CD NN RB VVG NP NNS ( )	Fundamental	Basis	Positive
08-1027_7	To specify patterns , following ( ) we classify words into high-frequency words (HFWs) and content words (CWs)	TO VV NNS , VVG ( ) PP VV NNS IN NN NNS JJ CC JJ NNS NN	Fundamental	Idea	Neutral
08-1027_7	For each nominal pair (w 1 ,w 2) in a given sentence S , we use a method similar to ( ) to extract words that have a shared meaning with w 1 or w 2	IN DT JJ NN NN CD NNS JJ IN DT VVN NN NP , PP VVP DT NN JJ TO ( ) TO VV NNS WDT VHP DT VVN NN IN NN CD CC NN LS	Fundamental	Idea	Neutral
08-1027_8	In ( ) we present an approach to extract pattern clusters from an untagged corpus	IN ( ) PP VVP DT NN TO VV NN NNS IN DT JJ NN	BackGround	SRelated	Neutral
08-1027_8	In ( ) we describe the algorithm at length , discuss its behavior and parameters in detail , and evaluate its intrinsic quality	IN ( ) PP VVP DT NN IN NN , VV PP$ NN CC NNS IN NN , CC VV PP$ JJ NN	BackGround	SRelated	Neutral
08-1027_9	This corpus was extracted from the web starting from open directory links , comprising English web pages with varied topics and styles ( )	DT NN VBD VVN IN DT NN VVG IN JJ NN NNS , VVG JJ NN NNS IN JJ NNS CC NNS ( )	Fundamental	Basis	Neutral
08-1027_10	Common choices include variations of SVM ( ) , decision trees and memory-based learners	JJ NNS VVP NNS IN NP ( ) , NN NNS CC JJ NNS	BackGround	GRelated	Neutral
08-1027_11	For example , in noun compounds many different semantic relationships are encoded by the same simple form ( ): 'dog food' denotes food consumed by dogs , while 'summer morning' denotes a morning that happens in the summer	IN NN , IN NN NNS JJ JJ JJ NNS VBP VVN IN DT JJ JJ NN ( JJ NN NP VVZ NN VVN IN NNS , IN NP NP VVZ DT NN WDT VVZ IN DT NN	BackGround	GRelated	Neutral
08-1027_13	Recently , SemEval-07 Task 4 ( ) proposed a benchmark dataset that includes a subset of 7 widely accepted nominal relationship (NR) classes , allowing consistent evaluation of different NR classification algorithms	RB , JJ NP CD ( ) VVN DT JJ NN WDT VVZ DT NN IN CD RB VVN JJ NN NN NNS , VVG JJ NN IN JJ NP NN NNS	BackGround	GRelated	Neutral
08-1027_13	The most recent dataset has been developed for SemEval 07 Task 4 ( )	DT RBS JJ NN VHZ VBN VVN IN NP CD NP CD ( )	BackGround	GRelated	Neutral
08-1027_13	In our evaluation we have selected the setup and data from SemEval-07 Task 4 ( )	IN PP$ NN PP VHP VVN DT NN CC NNS IN JJ NP CD ( )	Fundamental	Basis	Neutral
08-1027_13	Task 4 ( ) involves classification of relationships between simple nominals other than named entities	NN CD ( ) VVZ NN IN NNS IN JJ NNS JJ IN VVN NNS	BackGround	SRelated	Neutral
08-1027_14	A leading method for utilizing context information for classification and extraction of relationships is that of patterns ( )	DT VVG NN IN VVG NN NN IN NN CC NN IN NNS VBZ IN/that IN NNS ( )	BackGround	GRelated	Positive
08-1027_16	Several different relationship hierarchies have been proposed ( )	JJ JJ NN NNS VHP VBN VVN ( )	BackGround	GRelated	Neutral
08-1027_16	A wide variety of features are used by different algorithms , ranging from simple bag-of-words frequencies to WordNet-based features ( )	DT JJ NN IN NNS VBP VVN IN JJ NNS , VVG IN JJ NNS NNS TO JJ NNS ( )	BackGround	GRelated	Neutral
08-1027_16	Moldovan et al. ( ) proposed a different scheme with 35 classes	NP NP NP ( ) VVN DT JJ NN IN CD NNS	BackGround	GRelated	Neutral
08-1027_18	Since the SemEval dataset is of a very specific nature , we have also applied our classification framework to the ( ) dataset , which contains 600 pairs labeled with 5 main relationship types	IN DT JJ NN VBZ IN DT RB JJ NN , PP VHP RB VVN PP$ NN NN TO DT ( ) NN , WDT VVZ CD NNS VVN IN CD JJ NN NNS	Fundamental	Basis	Neutral
08-1027_19	We have used the exact evaluation procedure described in ( ) , achieving a class f-score average of 60.1 , as opposed to 54.6 in ( ) and 51.2 in ( )	PP VHP VVN DT JJ NN NN VVN IN ( ) , VVG DT NN NN NN IN CD , RB VVD TO CD IN ( ) CC CD IN ( )	Compare	Compare	Neutral
08-1027_20	Many relationship classification methods utilize some language-dependent preprocessing , like deep or shallow parsing , part of speech tagging and named entity annotation ( )	JJ NN NN NNS VV DT JJ NN , IN JJ CC JJ VVG , NN IN NN VVG CC VVN NN NN ( )	BackGround	GRelated	Neutral
08-1027_21	Strategies were developed for discovery of multiple patterns for some specified lexical relationship ( ) and for unsupervised pattern ranking ( )	NNS VBD VVN IN NN IN JJ NNS IN DT JJ JJ NN ( ) CC IN JJ NN NN ( )	BackGround	GRelated	Neutral
08-1027_23	Other resources used for relationship discovery include Wikipedia ( ) , thesauri or synonym sets ( ) and domain-specific semantic hierarchies like MeSH ( )	JJ NNS VVN IN NN NN VVP NP ( ) , NNS CC NN NNS ( ) CC JJ JJ NNS IN NP ( )	BackGround	GRelated	Neutral
08-1027_24	Rosenfeld and Feldman ( ) discover relationship instances by clustering entities appearing in similar contexts	NP CC NP ( ) VV NN NNS IN VVG NNS VVG IN JJ NNS	BackGround	GRelated	Neutral
08-1027_27	Relationship classification is known to improve many practical tasks , e.g. , textual entailment ( )	NN NN VBZ VVN TO VV JJ JJ NNS , FW , JJ NN ( )	BackGround	GRelated	Neutral
08-1027_30	Freely available tools like Weka ( ) allow easy experimentation with common learning algorithms ( )	RB JJ NNS IN NP ( ) VV JJ NN IN JJ NN NNS ( )	BackGround	GRelated	Positive
08-1028_0	All occurrences of these verbs with a subject noun were next extracted from a RASP parsed ( ) version of the British National Corpus (BNC)	DT NNS IN DT NNS IN DT JJ NN VBD RB VVN IN DT NP VVD ( ) NN IN DT NP NP NP NN	Fundamental	Basis	Neutral
08-1028_2	Following previous work ( ) , we optimized its parameters on a word-based semantic similarity task	VVG JJ NN ( ) , PP VVD PP$ NNS IN DT JJ JJ NN NN	Fundamental	Idea	Neutral
08-1028_2	In addition , Bullinaria and Levy ( ) found that these parameters perform well on a number of other tasks such as the synonymy task from the Test of English as a Foreign Language (TOEFL)	IN NN , NP CC NP ( ) VVD IN/that DT NNS VVP RB IN DT NN IN JJ NNS JJ IN DT NN NN IN DT NN IN NP IN DT NP NP NN	BackGround	GRelated	Neutral
08-1028_3	Examples include automatic thesaurus extraction ( ) , word sense discrimination ( ) and disambiguation ( ) , collocation extraction ( ) , text segmentation ( )  , and notably information retrieval ( )	NNS VVP JJ NN NN ( ) , NN NN NN ( ) CC NN ( ) , NN NN ( ) , NN NN ( ) , CC RB NN NN ( )	BackGround	GRelated	Neutral
08-1028_4	NLP tasks that could benefit from composition models include paraphrase identification and context-dependent language modeling ( )	NN NNS WDT MD VV IN NN NNS VVP VVP NN CC JJ NN NN ( )	BackGround	MRelated	Positive
08-1028_5	In order to establish which ones fit our data better , we examined whether the correlation coefficients achieved differ significantly using a t-test ( )	IN NN TO VV WDT NNS VVP PP$ NNS RBR , PP VVD IN DT NN NNS VVN VVP RB VVG DT NN ( )	Fundamental	Basis	Neutral
08-1028_6	While vector addition has been effective in some applications such as essay grading ( ) and coherence assessment ( ) , there is ample empirical evidence that syntactic relations across and within sentences are crucial for sentence and discourse processing ( ) and modulate cognitive behavior in sentence priming ( ) and inference tasks ( )	IN NN NN VHZ VBN JJ IN DT NNS JJ IN NN VVG ( ) CC NN NN ( ) , EX VBZ JJ JJ NN IN/that JJ NNS IN CC IN NNS VBP JJ IN NN CC NN NN ( ) CC VVP JJ NN IN NN NN ( ) CC NN NNS ( )	BackGround	GRelated	Positive
08-1028_6	For example , assuming that individual words are represented by vectors , we can compute the meaning of a sentence by taking their mean ( )	IN NN , VVG IN/that JJ NNS VBP VVN IN NNS , PP MD VV DT NN IN DT NN IN VVG PP$ NN ( )	BackGround	GRelated	Neutral
08-1028_8	Specifically , they belonged to different synsets and were maximally dissimilar as measured by the Jiang and Conrath ( ) measure	RB , PP VVD TO JJ NNS CC VBD RB JJ IN VVN IN DT NP CC NP ( ) NN	Fundamental	Basis	Neutral
08-1028_9	Vector-based models of word meaning ( ) have become increasingly popular in natural language processing (NLP) and cognitive science	JJ NNS IN NN NN ( ) VHP VVN RB JJ IN JJ NN NN NN CC JJ NN	BackGround	GRelated	Positive
08-1028_9	In cognitive science vector-based models have been successful in simulating semantic priming ( ) and text comprehension ( )	IN JJ NN JJ NNS VHP VBN JJ IN VVG JJ NN ( ) CC NN NN ( )	BackGround	GRelated	Positive
08-1028_9	This is illustrated in the example below taken from Landauer et al. ( )	DT VBZ VVN IN DT NN IN VVN IN NP NP NP ( )	Fundamental	Basis	Neutral
08-1028_9	Previous applications of vector addition to document indexing ( ) or essay grading ( ) were more concerned with modeling the gist of a document rather than the meaning of its sentences	JJ NNS IN NN NN TO NN NN ( ) CC NN VVG ( ) VBD RBR VVN IN VVG DT NN IN DT NN RB IN DT NN IN PP$ NNS	BackGround	GRelated	Neutral
08-1028_12	Moreover , the vector similarities within such semantic spaces have been shown to substantially correlate with human similarity judgments ( ) and word association norms ( )	RB , DT NN NNS IN JJ JJ NNS VHP VBN VVN TO RB VVP IN JJ NN NNS ( ) CC NN NN NNS ( )	BackGround	GRelated	Neutral
08-1028_13	Computational models of semantics which use symbolic logic representations ( ) can account naturally for the meaning of phrases or sentences	JJ NNS IN NNS WDT VVP JJ NN NNS ( ) MD VV RB IN DT NN IN NNS CC NNS	BackGround	GRelated	Neutral
08-1029_0	multi-task learning ( ) in which the task (and label set) is allowed to vary from source to target	NN NN ( ) IN WDT DT NN NN NN NN VBZ VVN TO VV IN NN TO VV	BackGround	GRelated	Neutral
08-1029_1	Most of this prior work deals with supervised transfer learning , and thus requires labeled source domain data , though there are examples of unsupervised ( ) , semi-supervised ( ) , and transductive approaches ( )	JJS IN DT JJ NN NNS IN JJ NN NN , CC RB VVZ VVN NN NN NNS , IN EX VBP NNS IN JJ ( ) , JJ ( ) , CC JJ NNS ( )	BackGround	GRelated	Neutral
08-1029_2	Some of the first formulations of the transfer learning problem were presented over 10 years ago ( )	DT IN DT JJ NNS IN DT NN VVG NN VBD VVN IN CD NNS RB ( )	BackGround	GRelated	Neutral
08-1029_3	Other techniques have tried to quantify the generalizability of certain features across domains ( ) , or tried to exploit the common structure of related problems ( )	JJ NNS VHP VVN TO VV DT NN IN JJ NNS IN NNS ( ) , CC VVD TO VV DT JJ NN IN JJ NNS ( )	BackGround	GRelated	Neutral
08-1029_5	These are: abstracts from biological journals [UT ( ) , Yapex ( )]; news articles [MUC6 ( ) , MUC7 ( )]; and personal e-mails [CSPACE ( )]	DT JJ NNS IN JJ NNS NN ( ) , NP ( JJ NN NNS NN ( ) , JJ ( JJ CC JJ NP NN ( NN	Fundamental	Basis	Neutral
08-1029_7	When the task being learned varies (say , from identifying person names to identifying protein names) , the problem is called multi-task learning ( )	WRB DT NN VBG VVN VVZ NN , IN VVG NN NNS TO VVG NN NN , DT NN VBZ VVN NN VVG ( )	BackGround	GRelated	Neutral
08-1029_8	One recently proposed method ( ) for transfer learning in Maximum Entropy models 1 involves modifying the μ's of this Gaussian prior	CD RB VVN NN ( ) IN NN VVG IN NP NP NNS CD VVZ VVG DT NNS IN DT JJ RB	BackGround	SRelated	Neutral
08-1029_9	To avoid overfitting the training data , these λ's are often further constrained by the use of a Gaussian prior ( ) with diagonal covariance , N(μ , σ²) , which tries to maximize: where β > 0 is a parameter controlling the amount of regularization , and N is the number of sentences in the training set	TO VV VVG DT NN NNS , DT NP VBP RB RBR VVN IN DT NN IN DT JJ JJ ( ) IN JJ NN , NP , DT JJ , WDT VVZ TO NN WRB NN SYM CD VBZ DT NN VVG DT NN IN NN , CC NP VBZ DT NN IN NNS IN DT NN NN	BackGround	SRelated	Neutral
08-1029_10	Representing feature spaces with this kind of tree , besides often coinciding with the explicit language used by common natural language toolkits ( ) , has the added benefit of allowing a model to easily back-off , or smooth , to decreasing levels of specificity	VVG NN NNS IN DT NN NN , IN RB VVG IN DT JJ NN VVN IN JJ JJ NN NNS ( ) , VHZ DT JJ NN IN VVG DT NN TO RB NN , CC JJ , TO VVG NNS IN NN	BackGround	GRelated	Neutral
08-1029_10	We used a standard natural language toolkit ( ) to compute tens of thousands of binary features on each of these tokens , encoding such information as capitalization patterns and contextual information from surrounding words	PP VVD DT JJ JJ NN NN ( ) TO VV NNS IN NNS IN JJ NNS IN DT IN DT NNS , VVG JJ NN IN NN NNS CC JJ NN IN VVG NNS	Fundamental	Basis	Neutral
08-1029_11	When only the type of data being examined is allowed to vary (from news articles to e-mails , for example) , the problem is called domain adaptation ( )	WRB RB DT NN IN NNS VBG VVN VBZ VVN TO VV NN NN NNS TO NP , IN NN , DT NN VBZ VVN NN NN ( )	BackGround	GRelated	Neutral
08-1029_12	Daume allows an extra degree of freedom among the features of his domains , implicitly creating a two-level feature hierarchy with one branch for general features , and another for domain specific ones , but does not extend his hierarchy further ( ))	NP VVZ DT JJ NN IN NN IN DT NNS IN PP$ NNS , RB VVG DT JJ NN NN IN CD NN IN JJ NNS , CC DT IN NN JJ NNS , CC VVZ RB VV PP$ NN RBR ( NN	BackGround	GRelated	Neutral
08-1029_18	In this work , we will base our work on Conditional Random Fields (CRF's) ( ) , which are now one of the most preferred sequential models for many natural language processing tasks	IN DT NN , PP MD VV PP$ NN IN NP NP NP NN ( ) , WDT VBP RB CD IN DT RBS JJ JJ NNS IN JJ JJ NN NN NNS	Fundamental	Basis	Positive
08-1029_19	Recent work using so-called meta-level priors to transfer information across tasks ( ) , while related , does not take into explicit account the hierarchical structure of these meta-level features often found in NLP tasks	JJ NN VVG JJ JJ NNS TO VV NN IN NNS ( ) , IN VVN , VVZ RB VV IN JJ NN DT JJ NN NN NN NNS RB VVD IN NP NNS	BackGround	GRelated	Negative
08-1029_20	It has been shown empirically that , while the significance of particular features might vary between domains and tasks , certain generalized classes of features retain their importance across domains ( )	PP VHZ VBN VVN RB IN/that , IN DT NN IN JJ NNS MD VV IN NNS CC NNS , JJ VVD NNS IN NNS VVP PP$ NN IN NNS ( )	BackGround	GRelated	Positive
08-1029_21	One common way of addressing the transfer learning problem is to use a prior which , in conjunction with a probabilistic model , allows one to specify a priori beliefs about a distribution , thus biasing the results a learning algorithm would have produced had it only been allowed to see the training data ( )	CD JJ NN IN VVG DT NN VVG NN VBZ TO VV DT JJ WDT , IN NN IN DT JJ NN , VVZ PP TO VV DT NN NNS IN DT NN , RB VVG DT NNS DT VVG NN MD VH VVN VHD PP RB VBN VVN TO VV DT NN NNS ( )	BackGround	GRelated	Neutral
08-1029_24	Similarly , work on hierarchical penalization ( ) in two-level trees tries to produce models that rely only on a relatively small number of groups of variable , as structured by the tree , as opposed to transferring knowledge between branches themselves	RB , NN IN JJ NN ( ) IN JJ NNS VVZ TO VV NNS WDT VVP RB IN DT RB JJ NN IN NNS IN NN , RB VVN IN DT NN , RB VVN TO VVG NN IN NNS PP	BackGround	GRelated	Neutral
08-1030_0	Almost all the current event extraction systems focus on processing single documents and , except for coreference resolution , operate a sentence at a time ( )	RB PDT DT JJ NN NN NNS VVP IN VVG JJ NNS CC , IN IN NN NN , VVP DT NN IN DT NN ( )	BackGround	GRelated	Neutral
08-1030_1	We use a state-of-the-art English IE system as our baseline ( )	PP VVP DT JJ NP NP NN IN PP$ NN ( )	Fundamental	Basis	Positive
08-1030_3	Mann ( ) encoded specific inference rules to improve extraction of CEO (name , start year , end year) in the MUC management succession task	NP ( ) VVN JJ NN NNS TO VV NN IN NN NNS , NN NN , NN NN IN DT NP NN NN NN	BackGround	GRelated	Positive
08-1030_5	In addition , Patwardhan and Riloff ( ) also demonstrated that selectively applying event patterns to relevant regions can improve MUC event extraction	IN NN , NP CC NP ( ) RB VVD IN/that RB VVG NN NNS TO JJ NNS MD VV NP NN NN	BackGround	GRelated	Neutral
08-1030_6	We then use the INDRI retrieval system ( ) to obtain the top N (N=25 in this paper 3) related documents	PP RB VVP DT NP NN NN ( ) TO VV DT JJ NP NN IN DT NN JJ JJ NNS	Fundamental	Basis	Neutral
08-1030_7	Yangarber et al. ( ) applied cross-document inference to correct local extraction results for disease name , location and start/end time	NP NP NP ( ) VVN NN NN TO VV JJ NN NNS IN NN NN , NN CC NN NN	BackGround	GRelated	Neutral
08-1030_8	Several recent studies involving specific event types have stressed the benefits of going beyond traditional single-document extraction; in particular , Yangarber ( ) has emphasized this potential in his work on medical information extraction	JJ JJ NNS VVG JJ NN NNS VHP VVN DT NNS IN VVG IN JJ NN NN IN JJ , NP ( ) VHZ VVN DT NN IN PP$ NN IN JJ NN NN	BackGround	GRelated	Neutral
08-1030_10	Heng Ji Ralph Grishman Computer Science Department New York University New York , NY 10003 , USA (hengji , grishman)@cs.nyu.edu Abstract We apply the hypothesis of "One Sense Per Discourse" ( ) to information extraction (IE) , and extend the scope of "discourse" from one single document to a cluster of topically-related documents	NP NP NP NP NP NP NP NP NP NP NP , NP CD , NP NP , NP NP PP VVP DT NN IN NN NN IN NP ( ) TO NN NN NN , CC VV DT NN IN NN IN CD JJ NN TO DT NN IN JJ NNS	NULL	NULL	NULL
08-1030_10	The trigger labeling task described in this paper is in part a task of word sense disambiguation (WSD) , so we have used the idea of sense consistency introduced in ( ) , extending it to operate across related documents	DT NN VVG NN VVN IN DT NN VBZ IN NN DT NN IN NN NN NN NN , RB PP VHP VVN DT NN IN NN NN VVN IN ( ) , VVG PP TO VV IN JJ NNS	Fundamental	Idea	Neutral
08-1031_0	The correlated topic model ( ) is one way to account for relationships between hidden topics; more structured representations , such as hierarchies , may also be considered	DT VVN NN NN ( ) VBZ CD NN TO VV IN NNS IN JJ NN RBR JJ NNS , JJ IN NNS , MD RB VB VVN	BackGround	GRelated	Neutral
08-1031_1	We employ Gibbs sampling , previously used in NLP by Finkel et al. ( ) and Goldwater et al. ( ) , among others	PP VVP NP NN , RB VVN IN NP IN NP NP NP ( ) CC NP NP NP ( ) , IN NNS	Fundamental	Basis	Neutral
08-1031_2	Our approach relates to previous work on property extraction from reviews ( )	PP$ NN VVZ TO JJ NN IN NN NN IN NNS ( )	BackGround	GRelated	Neutral
08-1031_3	for a discussion of similarity metrics , see Lin ( )	IN DT NN IN NN NNS , VVP NP ( )	BackGround	SRelated	Neutral
08-1031_4	For this purpose , we use the Rand Index ( ) , a measure of cluster similarity	IN DT NN , PP VVP DT NP NP ( ) , DT NN IN NN NN	Fundamental	Basis	Neutral
08-1031_6	Recent work has examined coupling topic models with explicit supervision ( )	JJ NN VHZ VVN NN NN NNS IN JJ NN ( )	BackGround	GRelated	Neutral
08-1031_8	become widely available ( )	VV RB JJ ( )	NULL	NULL	NULL
08-1032_0	These range from supervised classification ( ) to instantiations of the noisy-channel model ( ) , to clustering ( ) , and methods inspired by information retrieval ( )	DT NN IN JJ NN ( ) TO NNS IN DT NN NN ( ) , TO VVG ( ) , CC NNS VVN IN NN NN ( )	BackGround	GRelated	Neutral
08-1032_0	Barnard et al. ( ) propose a hierarchical latent model in order to account for the fact that some words are more general than others	NP NP NP ( ) VV DT JJ JJ NN IN NN TO VV IN DT NN IN/that DT NNS VBP RBR JJ IN NNS	BackGround	GRelated	Neutral
08-1032_1	More sophisticated graphical models ( ) have also been employed including Gaussian Mixture Models (GMM) and Latent Dirichlet Allocation (LDA)	RBR JJ JJ NNS ( ) VHP RB VBN VVN VVG JJ NN NP NP CC NP NP NP NN	BackGround	GRelated	Neutral
08-1032_2	Specifically , we use Latent Dirichlet Allocation (LDA) as our topic model ( )	RB , PP VVP NP NP NP NN IN PP$ NN NN ( )	Fundamental	Basis	Neutral
08-1032_4	Duygulu et al. ( ) improve on this model by treating image regions and keywords as a bi-text and using the EM algorithm to construct an image region-word dictionary	NP NP NP ( ) VV IN DT NN IN VVG NN NNS CC NNS IN DT NN CC VVG DT JJ NN TO VV DT NN NN NN	BackGround	GRelated	Positive
08-1032_4	Typically , the k-best words are taken to be the automatic annotations for a test image I ( ) where k is a small number and the same for all images	RB , DT NN NNS VBP VVN TO VB DT JJ NNS IN DT NN NN PP ( ) WRB NN VBZ DT JJ NN CC DT JJ IN DT NNS	BackGround	SRelated	Neutral
08-1032_4	Our evaluation follows the experimental methodology proposed in Duygulu et al. ( )	PP$ NN VVZ DT JJ NN VVN IN NP NP NP ( )	Fundamental	Idea	Neutral
08-1032_5	For instance , resources like WordNet ( ) can be used to expand the annotations by exploiting information about is-a relationships	IN NN , NNS IN NP ( ) MD VB VVN TO VV DT NNS IN VVG NN IN NN NNS	BackGround	MRelated	Neutral
08-1032_6	Finally , relevance models originally developed for information retrieval , have been successfully applied to image annotation ( )	RB , NN NNS RB VVN IN NN NN , VHP VBN RB VVN TO NN NN ( )	BackGround	GRelated	Positive
08-1032_6	We are more interested in modeling the presence or absence of words in the annotation and thus use the multiple-Bernoulli distribution to generate words ( )	PP VBP RBR JJ IN VVG DT NN CC NN IN NNS IN DT NN CC RB VV DT NP NN TO VV NNS ( )	Fundamental	Basis	Neutral
08-1032_6	Using a grid avoids unnecessary errors from image segmentation algorithms , reduces computation time , and simplifies parameter estimation ( )	VVG DT NN VVZ JJ NNS IN NN NN NNS , VVZ NN NN , CC VVZ NN NN ( )	NULL	NULL	NULL
08-1032_7	Standard latent semantic analysis (LSA) and its probabilistic variant (PLSA) have been applied to this task ( )	JJ JJ JJ NN NN CC PP$ JJ JJ NN VHP VBN VVN TO DT NN ( )	BackGround	GRelated	Neutral
08-1032_9	Specifically , we extend and modify Lavrenko's ( ) continuous relevance model to suit our task	RB , PP VVP CC VV NP ( ) JJ NN NN TO VV PP$ NN	Fundamental	Basis	Neutral
08-1032_9	Our work is an extension of the continuous relevance annotation model put forward in Lavrenko et al. ( )	PP$ NN VBZ DT NN IN DT JJ NN NN NN VVD RB IN NP NP NP ( )	Fundamental	Basis	Neutral
08-1032_9	The continuous relevance image annotation model ( ) generatively learns the joint probability distribution P(V , W) of words W and image regions V	DT JJ NN NN NN NN ( ) RB VVZ DT JJ NN NN NP , NP IN NNS NP CC NN NNS NN	BackGround	SRelated	Neutral
08-1032_9	When estimating P(V|s) , the probability of image regions and words , Lavrenko et al. ( ) reasonably assume a generative Gaussian kernel distribution for the image regions: where N vi is the number of regions in image I , v r the feature vector for region r in image i , n s v the number of regions in the image of latent variable s , v i the feature vector for region i in s's image , k the dimension of the image feature vectors and Σ the feature covariance matrix	WRB VVG NP NP , DT NN IN NN NNS CC NNS , NP NP NP ( ) RB VV DT JJ JJ NN NN IN DT NN NN WRB NP NP VBZ DT NN IN NNS IN NN PP , NN NN DT NN NN IN NN NN IN NN NP , NN NN NN DT NN IN NNS IN DT NN IN JJ JJ NNS , NN NP DT NN NN IN NN NP IN JJ NN , NN DT NN IN DT NN NN NNS CC NNS VVP NN NN	BackGround	SRelated	Neutral
08-1032_9	Lavrenko et al. ( ) estimate the word probabilities P(W|s) using a multinomial distribution	NP NP NP ( ) VV DT NN NNS JJ NP VVG DT NN NN	BackGround	SRelated	Neutral
08-1032_9	Our third baseline is Lavrenko et al.'s ( ) continuous relevance model	PP$ JJ NN VBZ NP NP NNS ( ) JJ NN NN	Fundamental	Basis	Neutral
08-1032_9	We compare the annotation performance of the model proposed in this paper (ExtModel) with Lavrenko et al.'s ( ) original continuous relevance model (Lavrenko03) and two other simpler models which do not take the image into account (tf* idfand Doc-Title)	PP VVP DT NN NN IN DT NN VVN IN DT NN NN IN NP NP NNS ( ) JJ JJ NN NN NN CC CD JJ JJR NNS WDT VVP RB VV DT NN IN NN NN NN NP	Compare	Compare	Neutral
08-1032_9	Incidentally , LDA can be also used to rerank the output of Lavrenko et al.'s ( ) model	RB , NP MD VB RB VVN TO VV DT NN IN NP NP NNS ( ) NN	BackGround	SRelated	Neutral
08-1032_9	4 Interestingly , the latter yields precision similar to Lavrenko et al. ( )	CD RB , DT JJ NNS NN JJ TO NP NP NP ( )	BackGround	SRelated	Neutral
08-1032_10	The co-occurrence model ( ) collects co-occurrence counts between words and image features and uses them to predict annotations for new images	DT NN NN ( ) VVZ NN NNS IN NNS CC NN NNS CC VVZ PP TO VV NNS IN JJ NNS	BackGround	GRelated	Neutral
08-1032_11	We estimate P est (w|s d ) using maximum likelihood estimation ( ): P est (w|s d ) = num wsd / num sd (7) , where num wsd denotes the frequency of w in the accompanying document of latent variable s and num sd the number of all tokens in the document	PP VVP NN NP NN NN NN VVG JJ NN NN ( JJ NN NP NP SYM ) SYM JJ NN NN WRB NN NN VVZ DT NN IN NN IN DT JJ NN IN JJ JJ NN CC NN NNS DT NN IN DT NNS IN DT NN	Fundamental	Basis	Neutral
08-1032_12	We reduce the search space , by scoring each document word with its tf * idf weight ( ) and adding the n-best candidates to our caption vocabulary	PP VVP DT NN NN , IN VVG DT NN NN IN PP$ NN SYM NN NN ( ) CC VVG DT JJ NNS TO PP$ NN NN	Fundamental	Basis	Neutral
08-1032_12	The first baseline is based on tf * idf ( )	DT JJ NN VBZ VVN IN NP SYM NN ( )	Fundamental	Basis	Neutral
08-1032_13	The documents and captions were part-of-speech tagged and lemmatized with Tree Tagger ( ). Words other than nouns , verbs , and adjectives were discarded	DT NNS CC NNS VBD NN VVN CC VVN IN NP NP ( NNS JJ IN NNS , NNS , CC NNS VBD VVN	Fundamental	Basis	Neutral
08-1032_15	The earliest approaches are closely related to image classification ( ) , where pictures are assigned a set of simple descriptions such as indoor , outdoor , landscape , people , animal	DT JJS NNS VBP RB VVN TO NN NN ( ) , WRB NNS VBP VVN DT NN IN JJ NNS JJ IN JJ , JJ , NN , NNS , NN	BackGround	GRelated	Neutral
08-1032_16	reliably ( )	RB ( )	NULL	NULL	NULL
08-1032_18	LDA represents documents as a mixture of topics and has been previously used to perform document classification ( ) and ad-hoc information retrieval ( ) with good results	NP VVZ NNS IN DT NN IN NNS CC VHZ VBN RB VVN TO VV NN NN ( ) CC NN NN NN ( ) IN JJ NNS	BackGround	GRelated	Neutral
08-1033_0	Maximum Entropy Models ( ) seek to maximise the conditional probability of classes , given certain observations (features)	JJ NP NP ( ) VV TO VV DT JJ NN IN NNS , VVN JJ NNS NN	BackGround	SRelated	Neutral
08-1033_3	This phenomenon , together with others used to express forms of authorial opinion , is often classified under the notion of subjectivity ( ) , ( )	DT NN , RB IN NNS VVN TO VV NNS IN JJ NN , VBZ RB VVN IN DT NN IN NN ( ) , ( )	BackGround	GRelated	Neutral
08-1033_3	In contrast to the findings of Wiebe et al. ( ) , who addressed the broader task of subjectivity learning and found that the density of other potentially subjective cues in the context benefits classification accuracy , we observed that the co-occurrence of speculative cues in a sentence does not help in classifying a term as speculative or not	IN NN TO DT NNS IN NP CC JJ JJ NN , WP VVD DT JJR NN IN NN NN CC VVD IN/that DT NN IN JJ RB JJ NNS IN DT NN VVZ NN NN , PP VVD IN/that DT NN IN JJ NNS IN DT NN VVZ RB VV IN VVG DT NN IN JJ CC RB	Compare	Compare	Neutral
08-1034_0	Since the benefits from combining classifiers that always make similar decisions is minimal , the two (or more) base-learners should complement each other ( )	IN DT NNS IN VVG NNS WDT RB VVP JJ NNS VBZ JJ , DT CD NN JJ NNS MD VV DT JJ ( )	BackGround	SRelated	Neutral
08-1034_2	Recent experiments assessing system portability across different domains , conducted by Aue and Gamon ( ) , demonstrated that sentiment annotation classifiers trained in one domain do not perform well on other domains	JJ NNS VVG NN NN IN JJ NNS , VVN IN NP CC NP ( ) , VVD IN/that NN NN NNS VVN IN CD NN VVP RB VV RB IN JJ NNS	BackGround	GRelated	Neutral
08-1034_2	Such approaches work well in situations where large labeled corpora are available for training and validation (e.g. , movie reviews) , but they do not perform well when training data is scarce or when it comes from a different domain ( ) , topic ( ) or time period ( )	JJ NNS VVP RB IN NNS WRB JJ VVN NNS VBP JJ IN NN CC NN NN , NN NN , CC PP VVP RB VV RB WRB NN NNS VBZ JJ CC WRB PP VVZ IN DT JJ NN ( ) , NN ( ) CC NN NN ( )	BackGround	GRelated	Neutral
08-1034_2	For instance , Aue and Gamon ( ) proposed training on a small number of labeled examples and large quantities of unlabelled in-domain data	IN NN , NP CC NP ( ) VVN NN IN DT NN NN IN VVN NNS CC JJ NNS IN JJ NN NNS	BackGround	GRelated	Neutral
08-1034_2	Research on sentiment annotation is usually conducted at the text ( ) or at the sentence levels ( )	NN IN NN NN VBZ RB VVN IN DT NN ( ) CC IN DT NN NNS ( )	BackGround	GRelated	Neutral
08-1034_2	3.3 Establishing a Baseline for a Corpus-based System (CBS) Supervised statistical methods have been very successful in sentiment tagging of texts: on movie review texts they reach accuracies of 85-90% ( )	CD NN DT NP IN DT JJ NP NN VVD JJ NNS VHP VBN RB JJ IN NN VVG IN NN IN NN NN NNS PP VVP NNS IN CD ( )	BackGround	GRelated	Neutral
08-1034_3	A number of methods has been proposed in order to overcome this system portability limitation by using out-of-domain data , unlabelled in-domain corpora or a combination of in-domain and out-of-domain examples ( )	DT NN IN NNS VHZ VBN VVN IN NN TO VV DT NN NN NN IN VVG NN NNS , JJ NN NNS CC DT NN IN NN CC NN NNS ( )	BackGround	GRelated	Positive
08-1034_4	Consistent with findings in the literature ( ) , on the large corpus of movie review texts , the in-domain-trained system based solely on unigrams had lower accuracy than the similar system trained on bigrams	JJ IN NNS IN DT NN ( ) , IN DT JJ NN IN NN NN NNS , DT JJ NN VVN RB IN NNS VHD JJR NN IN DT JJ NN VVN IN NNS	Compare	Compare	Neutral
08-1034_6	Dredze et al. ( ) applied structural correspondence learning ( ) to the task of domain adaptation for sentiment classification of product reviews	NP NP NP ( ) VVN JJ NN VVG ( ) TO DT NN IN NN NN IN NN NN IN NN NNS	BackGround	GRelated	Neutral
08-1034_6	It also strongly depends on the similarity between the domains as has been shown by ( )	PP RB RB VVZ IN DT NN IN DT NNS RB VHZ VBN VVN IN ( )	BackGround	GRelated	Neutral
08-1034_8	On other domains , such as product reviews , the performance of systems that use general word lists is comparable to the performance of supervised machine learning approaches ( )	IN JJ NNS , JJ IN NN NNS , DT NN IN NNS WDT VVP JJ NN NNS VBZ JJ TO DT NN IN JJ NN VVG NNS ( )	BackGround	GRelated	Neutral
08-1034_8	To our knowledge , the only work that describes the application of statistical classifiers (SVM) to sentence-level sentiment classification is ( ) 1	TO PP$ NN , DT JJ NN WDT VVZ DT NN IN JJ NNS JJ TO JJ NN NN VBZ ( ) CD	BackGround	SRelated	Positive
08-1034_8	In sentiment tagging and related areas , Aue and Gamon ( ) demonstrated that combining classifiers can be a valuable tool in domain adaptation for sentiment analysis	IN NN VVG CC JJ NNS , NP CC NP ( ) VVD IN/that VVG NNS MD VB DT JJ NN IN NN NN IN NN NN	BackGround	SRelated	Neutral
08-1034_9	Since the structure of WordNet glosses is fairly different from that of other types of corpora , we developed a system that used the list of human-annotated adjectives from ( ) as a seed list and then learned additional unigrams from WordNet synsets and glosses with up to 88% accuracy , when evaluated against General Inquirer ( ) (GI) on the intersection of our automatically acquired list with GI	IN DT NN IN NP NNS VBZ RB JJ IN IN/that IN JJ NNS IN NNS , PP VVD DT NN WDT VVD DT NN IN JJ NNS IN ( ) IN DT NN NN CC RB VVD JJ NNS IN NP NNS CC NNS IN IN TO CD NN , WRB VVN IN NP NP ( ) NN IN DT NN IN PP$ RB VVN NN IN NP	Fundamental	Basis	Neutral
08-1034_9	In order to assign the membership score to each word , we did 58 system runs on unique non-intersecting seed lists drawn from the manually annotated list of positive and negative adjectives from ( )	IN NN TO VV DT NN NN TO DT NN , PP VVD CD NN VVZ IN JJ JJ NN NNS VVN IN RB VVN NN IN JJ CC JJ NNS IN ( )	Fundamental	Basis	Neutral
08-1034_10	A set of 1200 product review (PR) sentences extracted from the annotated corpus made available by Bing Liu ( ) (http://www.cs.uic.edu/liub/FBS/FBS.html)	NNS VVN IN CD NN NN NN NNS VVN IN DT VVN NN VVD JJ IN NP NP ( ) NN NN NN	Fundamental	Basis	Neutral
08-1034_13	For this we used four different data sets of sentences annotated with sentiment tags: A set of movie review snippets (further: movie) from ( )	IN DT PP VVD CD JJ NNS NNS IN NNS VVN IN NN NN NN VVN IN NN NN NNS JJ NN IN ( )	Fundamental	Basis	Neutral
08-1034_14	But such general word lists were shown to perform worse than statistical models built on sufficiently large in-domain training sets of movie reviews ( )	CC JJ JJ NN NNS VBD VVN TO VV JJR IN JJ NNS VVN IN RB JJ NN NN VVZ IN NN NNS ( )	BackGround	GRelated	Neutral
08-1034_16	The results reported by ( ) for binary classification of sentences in a related domain of subjectivity tagging (i.e. , the separation of sentiment-laden from neutral sentences) suggest that statistical classifiers can perform well on this task: the authors have reached 74.9% accuracy on the MPQA corpus ( )	DT NNS VVN IN ( ) IN JJ NN IN NNS IN DT JJ NN IN NN VVG NP , DT NN IN NN IN JJ NN VVP IN/that JJ NNS MD VV RB IN DT NN DT NNS VHP VVN CD NN IN DT NP NN ( )	BackGround	SRelated	Positive
08-1034_19	Similarly , Tan et al. ( ) suggested to combine out-of-domain labeled examples with unlabelled ones from the target domain in order to solve the domain-transfer problem	RB , NP NP NP ( ) VVN TO VV JJ VVN NNS IN JJ NNS IN DT NN NN IN NN TO VV DT NN NN	BackGround	GRelated	Neutral
08-1034_19	In order to maximize the utility of the examples from the target domain , these examples were selected using Similarity Ranking and Relative Similarity Ranking algorithms ( )	IN NN TO VV DT NN IN DT NNS IN DT NN NN , DT NNS VBD VVN VVG NP NP CC NP NP NP NNS ( )	BackGround	SRelated	Neutral
08-1034_22	For example , it has been observed that texts often contain multiple opinions on different topics ( ) , which makes assignment of the overall sentiment to the whole document problematic	IN NN , PP VHZ VBN VVN IN/that NNS RB VVP JJ NNS IN JJ NNS ( ) , WDT VVZ NN IN DT JJ NN TO DT JJ NN JJ	BackGround	GRelated	Neutral
08-1034_23	The NOSs were then normalized into the interval from -1 to +1 using a sigmoid fuzzy membership function ( ) 4	DT NP VBD RB VVN IN DT NN IN CD TO JJ VVG DT JJ JJ NN NN ( ) LS	Fundamental	Basis	Neutral
08-1035_0	Despite its limited scale , prior work in sentence compression relied heavily on this particular corpus for establishing results ( )	IN PP$ JJ NN , JJ NN IN NN NN VVD RB IN DT JJ NN IN VVG NNS ( )	BackGround	GRelated	Neutral
08-1035_0	In the context of sentence compression , a linear programming based approach such as Clarke and Lapata ( ) is certainly one that deserves consideration	IN DT NN IN NN NN , DT JJ NN VVN NN JJ IN NP CC NP ( ) VBZ RB CD WDT VVZ NN	BackGround	SRelated	Neutral
08-1035_1	Thus , unlike McDonald ( ) , Clarke and Lapata ( ) and Cohn and Lapata ( ) , we do not insist on finding a globally optimal solution in the space of 2^n possible compressions for an n word long sentence	RB , IN NP ( ) , NP CC NP ( ) CC NP CC NP ( ) , PP VVP RB VV IN VVG DT RB JJ NN IN DT NN IN CD NN JJ NNS IN DT NN NN JJ NN	Compare	Compare	Neutral
08-1035_1	1 But how do we find compressions that are grammatical? To address the issue , rather than resort to statistical generation models as in the previous literature ( ) , we pursue a particular rule-based approach we call a 'dependency truncation ,' which as we will see , gives us a greater control over the form that compression takes	LS CC WRB VVP PP VV NNS WDT VBP JJ TO VV DT NN , RB IN NN TO JJ NN NNS RB IN DT JJ NN ( ) , PP VVP DT JJ JJ NN PP VVP DT NN NN NNS WDT IN PP MD VV , VVZ PP DT JJR NN IN DT NN IN/that NN VVZ	BackGround	GRelated	Neutral
08-1035_2	Our approach is broadly in line with prior work ( ) , in that we make use of some form of syntactic knowledge to constrain compressions we generate	PP$ NN VBZ RB IN NN IN JJ NN ( ) , IN WDT PP VVP NN IN DT NN IN JJ NN TO VV NNS PP VVP	BackGround	SRelated	Neutral
08-1035_3	DPM was first introduced in ( ) , later explored by a number of people ( )	NN VBD RB VVN IN ( ) , RBR VVN IN DT NN IN NNS ( )	BackGround	GRelated	Neutral
08-1035_7	For better or worse , much of prior work on sentence compression ( ) turned to a single corpus developed by Knight and Marcu ( ) (K&M , henceforth) for evaluating their approaches	IN JJR CC JJR , RB IN JJ NN IN NN NN ( ) VVN TO DT JJ NN VVN IN NP CC NP ( ) NN , NN IN VVG PP$ NNS	BackGround	GRelated	Neutral
08-1035_8	What sets this work apart from them , however , is a novel use we make of Conditional Random Fields (CRFs) to select among possible compressions ( )	WP VVZ DT NN RB IN PP , RB , VBZ DT JJ NN PP VVP IN NP NP NPS NN TO VV IN JJ NNS ( )	Fundamental	Basis	Neutral
08-1035_10	In the experiment described later , we set a = 0.1 for DPM , following Morooka et al. ( ) , who found the best performance with that setting for a	IN DT NN VVD RBR , PP VVD DT SYM CD IN NN , VVG NP NP NP ( ) , WP VVD DT JJS NN IN DT NN IN DT	Fundamental	Idea	Positive
08-1035_11	Nonetheless , there is some cost that comes with the straightforward use of CRFs as a discriminative classifier in sentence compression; its outputs are often ungrammatical and it allows no control over the length of compression they generates ( )	RB , EX VBZ DT NN WDT VVZ IN DT JJ NN IN NP IN DT JJ NN IN NN NN PP$ NNS VBP RB JJ CC PP VVZ DT NN IN DT NN IN NN PP VVZ ( )	BackGround	GRelated	Negative
08-1035_14	If it is , the whole scheme of ours would fall under what is known as 'Linear Programming CRFs' ( )	IN PP VBZ , DT JJ NN IN PP MD VV IN WP VBZ VVN IN JJ NP NP ( )	BackGround	SRelated	Positive
08-1035_15	We extracted lead sentences both from the brief and from its source article , and aligned them , using what is known as the Smith-Waterman algorithm ( ) , which produced 1,401 pairs of summary and source sentence	PP VVD JJ NNS CC IN DT NN CC IN PP$ NN NN , CC VVN PP , VVG WP VBZ VVN IN DT NP NN ( ) , WDT VVD CD CD NNS IN NN CC NN NN	Fundamental	Basis	Positive
08-1035_17	A part of our system makes use of a modeling toolkit called GRMM ( )	DT NN IN PP$ NN VVZ NN IN DT NN NN VVN NP ( )	Fundamental	Basis	Neutral
08-1035_22	In any case , we need some extra rules on G(S) to take care of language specific issues (cf. Vandeghinste and Pan ( ) for English)	IN DT NN , PP VVP DT JJ NNS IN NP TO VV NN IN NN JJ NNS NN CC NP ( ) IN NN	Fundamental	Basis	Neutral
08-1036_0	Recently , Blei and McAuliffe ( ) proposed an approach for joint sentiment and topic modeling that can be viewed as a supervised LDA (sLDA) model that tries to infer topics appropriate for use in a given classification or regression problem	RB , NP CC NP ( ) VVN DT NN IN JJ NN CC NN NN WDT MD VB VVN IN DT JJ NP NN NN WDT VVZ TO VV NNS JJ IN NN IN DT VVN NN CC NN NN	BackGround	GRelated	Neutral
08-1036_1	The Multi-Grain Latent Dirichlet Allocation model (MG-LDA) is an extension of Latent Dirichlet Allocation (LDA) ( )	DT NP NP NP NP NN NN VBZ DT NN IN NP NP NP NN ( )	BackGround	GRelated	Neutral
08-1036_2	Parallel to this study Branavan et al. ( ) also showed that joint models of text and user annotations benefit extractive summarization	JJ TO DT NN NP NP NP ( ) RB VVD IN/that JJ NNS IN NN CC NN NNS VVP JJ NN	BackGround	GRelated	Neutral
08-1036_3	In this study , we look at the problem of aspect-based sentiment summarization ( )	IN DT NN , PP VVP IN DT NN IN JJ NN NN ( )	BackGround	GRelated	Neutral
08-1036_4	Text excerpts are usually extracted through string matching ( ) , sentence clustering ( ) , or through topic models ( )	NN NNS VBP RB VVN IN NN VVG ( ) , NN VVG ( ) , CC IN NN NNS ( )	BackGround	GRelated	Neutral
08-1036_5	Gibbs sampling is an example of a Markov Chain Monte Carlo algorithm ( )	NP NN VBZ DT NN IN DT NP NP NP NP NN ( )	BackGround	SRelated	Neutral
08-1036_6	Following Titov and McDonald ( ) we use a collapsed Gibbs sampling algorithm that was derived for the MG-LDA model based on the Gibbs sampling method proposed for LDA in ( )	VVG NP CC NP ( ) PP VVP DT JJ NP NN NN WDT VBD VVN IN DT NP NN VVN IN DT NP NN NN VVN IN NP IN ( )	Fundamental	Idea	Neutral
08-1036_6	However , ( ) demonstrated that an efficient collapsed Gibbs sampler can be constructed , where only assignments z need to be sampled , whereas the dependency on distributions θ and φ can be integrated out analytically	RB , ( ) VVN IN/that DT JJ JJ NP NN MD VB VVN , WRB JJ NNS SYM VVP TO VB VVN , IN DT NN IN NNS CD CC NN MD VB VVN RP RB	BackGround	GRelated	Neutral
08-1036_7	These simple techniques are capable of modeling local topics without more expensive modeling of topic transitions used in ( )	DT JJ NNS VBP JJ IN VVG JJ NNS IN JJR JJ NN IN NN NNS VVN IN ( )	BackGround	GRelated	Neutral
08-1036_9	2 Aspect identification has also been thoroughly studied ( ) , but again , ontologies and users often provide this information negating the need for automation	CD NN NN VHZ RB VBN RB VVD ( ) , CC RB , NNS CC NNS RB VVP DT NN VVG DT NN IN NN	BackGround	GRelated	Neutral
08-1036_11	When labeled data exists , this problem can be solved effectively using a wide variety of methods available for text classification and information extraction ( )	WRB VVN NN VVZ , DT NN MD VB VVN RB VVG DT JJ NN IN NNS JJ IN NN NN CC NN NN ( )	BackGround	GRelated	Positive
08-1036_12	A closely related model to ours is that of Mei et al. ( ) which performs joint topic and sentiment modeling of collections	DT RB VVN NN TO PP VBZ IN/that IN NNS CC JJ ( ) WDT VVZ JJ NN CC NN NN IN NNS	BackGround	SRelated	Neutral
08-1036_13	For details on computing gradients for loglinear graphical models with Gibbs sampling we refer the reader to ( )	IN NNS IN VVG NNS IN JJ JJ NNS IN NP VVG PP VVP DT NN TO ( )	BackGround	SRelated	Neutral
08-1036_14	Sentiment classification is a well studied problem ( ) and in many domains users explicitly provide ratings for each aspect making automated means unnecessary	NN NN VBZ DT RB VVN NN ( ) CC IN JJ NNS NNS RB VVP NNS IN DT NN VVG JJ NNS JJ	BackGround	GRelated	Neutral
08-1036_16	However , it has been observed that ratings for different aspects can be correlated ( ) , e.g. , very negative opinion about room cleanliness is likely to result not only in a low rating for the aspect rooms , but also is very predictive of low ratings for the aspects service and dining	RB , PP VHZ VBN VVN IN/that NNS IN JJ NNS MD VB VVN ( ) , FW , RB JJ NN IN NN NN VBZ JJ TO VV RB RB IN DT JJ NN IN DT NN NNS , CC RB VBZ RB JJ IN JJ NNS IN DT NNS NN CC NN	BackGround	GRelated	Neutral
08-1036_17	The first part is based on Multi-Grain Latent Dirichlet Allocation ( ) , which has been previously shown to build topics that are representative of ratable aspects	DT JJ NN VBZ VVN IN NP NP NP NP ( ) , WDT VHZ VBN RB VVN TO VV NNS WDT VBP JJ IN JJ NNS	Fundamental	Basis	Neutral
08-1036_17	As was demonstrated in Titov and McDonald ( ) , the topics produced by LDA do not correspond to ratable aspects of entities	RB VBD VVN IN NP CC NP ( ) , DT NNS VVN IN NP VVP RB VV TO JJ NNS IN NNS	BackGround	SRelated	Neutral
08-1036_17	It was demonstrated in Titov and McDonald ( ) that ratable aspects will be captured by local topics and global topics will capture properties of reviewed items	PP VBD VVN IN NP CC NP ( ) IN/that JJ NNS MD VB VVN IN JJ NNS CC JJ NNS MD VV NNS IN VVN NNS	BackGround	SRelated	Neutral
08-1036_17	This factor is proportional to the conditional distribution used in the Gibbs sampler of the MG-LDA model ( )	DT NN VBZ JJ TO DT JJ NN VVN IN DT NP NN IN DT NP NN ( )	BackGround	SRelated	Neutral
08-1036_17	Other local topics , as for the MG-LDA model , correspond to other aspects discussed in reviews (breakfast , prices , noise) , and as it was previously shown in Titov and McDonald ( ) , aspects for global topics correspond to the types of reviewed items (hotels in Russia , Paris hotels) or background words	JJ JJ NNS , RB IN DT NP NN , VV TO JJ NNS VVN IN NNS NNS , NNS , NN , CC IN PP VBD RB VVN IN NP CC NP ( ) , NNS IN JJ NNS VV TO DT NNS IN VVN NNS NNS IN NP , NP NN CC NN NNS	BackGround	SRelated	Neutral
08-1037_0	For a broader review of WSD in NLP applications , see Resnik ( )	IN DT JJR NN IN NP IN NP NNS , VVP NP ( )	BackGround	SRelated	Neutral
08-1037_0	This problem of identifying the correct sense of a word in context is known as word sense disambiguation (WSD: Agirre and Edmonds ( ))	DT NN IN VVG DT JJ NN IN DT NN IN NN VBZ VVN IN NN NN NN NN NP CC NP ( NN	BackGround	SRelated	Neutral
08-1037_1	Following Atterer and Schutze ( ) , we wrote a script that , given a parse tree , identifies instances of PP attachment ambiguity and outputs the (v ,n1 ,p ,n2) quadruple involved and the attachment decision	VVG NP CC NP ( ) , PP VVD DT NN IN/that , VVN DT VVP NN , VVZ NNS IN NP NN NN CC NNS DT NN JJ NN NN NN VVN CC DT NN NN	Fundamental	Idea	Neutral
08-1037_1	This evaluation methodology coincides with that of Atterer and Schutze ( )	DT NN NN VVZ IN DT IN NP CC NP ( )	BackGround	SRelated	Neutral
08-1037_1	Note that Atterer and Schutze ( ) have shown that the Bikel parser performs as well as the state-of-the-art in PP attachment , which suggests our method improves over the current state-of-the-art	NN IN/that NP CC NP ( ) VHP VVN IN/that DT NP NN VVZ RB RB IN DT JJ IN NP NN , WDT VVZ PP$ NN VVZ IN DT JJ JJ	BackGround	SRelated	Positive
08-1037_2	We provide the first definitive results that word sense information can enhance Penn Treebank parser performance , building on earlier results of Bikel ( ) and Xiong et al. ( )	PP VVP DT JJ JJ NNS IN/that NN NN NN MD VV NP NP NN NN , VVG IN JJR NNS IN NP ( ) CC NP NP NP ( )	Fundamental	Basis	Neutral
08-1037_2	The most closely related research is that of Bikel ( ) , who merged the Brown portion of the Penn Treebank with SemCor (similarly to our approach in Section 4.1) , and used this as the basis for evaluation of a generative bilexical model for joint WSD and parsing	DT RBS RB VVN NN VBZ IN/that IN NP ( ) , WP VVD DT NP NN IN DT NP NP IN NP RB TO PP$ NN IN NN JJ , CC VVD DT IN DT NN IN NN IN DT JJ JJ NN IN JJ NN CC VVG	BackGround	SRelated	Neutral
08-1037_2	Note that this dataset is smaller than the one described by Bikel ( ) in a similar exercise , the reason being our simple and conservative approach taken when merging the resources	NN IN/that DT NN VBZ JJR IN DT CD VVN IN NP ( ) IN DT JJ NN , DT NN VBG PP$ JJ CC JJ NN VVN WRB VVG DT NNS	Compare	Compare	Neutral
08-1037_3	Parsing As our baseline parsers , we use two state-of-the-art lexicalised parsing models , namely the Bikel parser ( ) and Charniak parser ( )	VVG IN PP$ NN NNS , PP VVP CD JJ NNS VVG NNS , RB DT NP NN ( ) CC NP NN ( )	Fundamental	Basis	Neutral
08-1037_4	Tighter integration of semantics into the parsing models , possibly in the form of discriminative reranking models ( ) , is a promising way forward in this regard	JJR NN IN NNS IN DT VVG NNS , RB IN DT NN IN JJ JJ NNS ( ) , VBZ DT JJ NN RB IN DT NN	BackGround	MRelated	Positive
08-1037_5	For example , a number of different parsers have been shown to benefit from lexicalisation , that is , the conditioning of structural features on the lexical head of the given constituent ( )	IN NN , DT NN IN JJ NNS VHP VBN VVN TO VV IN NN , WDT VBZ , DT NN IN JJ NNS IN DT JJ NN IN DT VVN NN ( )	BackGround	GRelated	Neutral
08-1037_7	This extraction system uses Collins' rules (based on treep ( )) to locate the heads of phrases	DT NN NN VVZ NP NNS JJ IN NN ( JJ TO VV DT NNS IN NNS	Fundamental	Basis	Neutral
08-1037_11	Other notable examples of the successful incorporation of lexical semantics into parsing , not through word sense information but indirectly via selectional preferences , are Dowding et al. ( ) and Hektoen ( )	JJ JJ NNS IN DT JJ NN IN JJ NNS IN VVG , RB IN NN NN NN CC RB IN JJ NNS , VBP NP NP NP ( ) CC NP ( )	BackGround	GRelated	Positive
08-1037_12	The method we use to predict the first sense is that of McCarthy et al. ( ) , which was obtained using a thesaurus automatically created from the British National Corpus (BNC) applying the method of Lin ( ) , coupled with WordNet-based similarity measures	DT NN PP VVP TO VV DT JJ NN VBZ IN/that IN NP NP NP ( ) , WDT VBD VVN VVG DT NN RB VVN IN DT NP NP NP NN VVG DT NN IN NP ( ) , VVN IN JJ NN NNS	Fundamental	Basis	Neutral
08-1037_13	The only successful applications of word sense information to parsing that we are aware of are Xiong et al. ( ) and Fujita et al. ( )	DT JJ JJ NNS IN NN NN NN TO VVG IN/that PP VBP JJ IN VBP NP NP NP ( ) CC NP NP NP ( )	BackGround	SRelated	Positive
08-1037_14	Note also that our baseline results for the dataset are almost the same as previous work parsing the Brown corpus with similar models ( ) , which suggests that our dataset is representative of this corpus [Table 6: Parsing results with ASR; Table 7: PP attachment results with ASR (* indicates that the recall or precision is significantly better than baseline; the best performance in each column is shown in bold)]	VVP RB IN/that PP$ NN NNS IN DT JJ CD NP NNS IN NP NP VVZ IN/that DT NN CC NN VBZ RB JJR IN NN DT JJS VVG NN IN DT NN VBZ VVN IN JJ NP CD NP NN NNS IN NP NP VVZ IN/that DT NN CC NN VBZ RB JJR IN NN DT JJS NN IN DT NN VBZ VVN IN JJ NN VBP RB DT JJ IN JJ NN VVG DT NP NN IN JJ NNS ( ) , WDT VVZ IN/that PP$ NN VBZ JJ IN DT NN	Compare	Compare	Neutral
08-1037_17	The only publicly-available resource with these two characteristics at the time of this work was the subset of the Brown Corpus that is included in both SemCor ( ) and the Penn Treebank (PTB)	DT JJ JJ NN IN DT CD NNS IN DT NN IN DT NN VBD DT NN IN DT NP NP WDT VBZ VVN IN DT NP ( ) CC DT NP NP NN	BackGround	SRelated	Positive
08-1037_18	Li and Abe ( ) , McCarthy and Carroll ( ) , Xiong et al. ( ) , Fujita et al. ( )	NP CC NP ( ) , NP CC NP ( ) , NP NP NP ( ) , NP NP NP ( NN	NULL	NULL	NULL
08-1037_21	Traditionally , the two parsers have been trained and evaluated over the WSJ portion of the Penn Treebank (PTB: Marcus et al. ( ))	RB , DT CD NNS VHP VBN VVN CC VVN IN DT NP NN IN DT NP NP NP NP NP NP ( NN	BackGround	GRelated	Neutral
08-1037_25	Prepositional phrase attachment (PP attachment) is the problem of determining the correct attachment site for a PP , conventionally in the form of the noun or verb in a V NP PP structure ( )	JJ NN NN NP NN VBZ DT NN IN VVG DT JJ NN NN IN DT NP , RB IN DT NN IN DT NN CC NN IN DT NN NP NP NN ( )	BackGround	GRelated	Neutral
08-1037_26	Disambiguating each word relative to its context of use becomes increasingly difficult for fine-grained representations ( )	VVG DT NN JJ TO PP$ NN IN NN VVZ RB JJ IN JJ NNS ( )	BackGround	GRelated	Neutral
08-1037_29	The best published results over RRR are those of Stetina and Nagao ( ) , who employ WordNet sense predictions from an unsupervised WSD method within a decision tree classifier	DT RBS VVN NNS IN NP VBP DT IN NP CC NP ( ) , WP VVP NP NN NNS IN DT JJ NN NN IN DT NN NN NN	BackGround	GRelated	Positive
08-1037_29	The fact that the improvement is larger for PP attachment than for full parsing is suggestive of PP attachment being a parsing subtask where lexical semantic information is particularly important , supporting the findings of Stetina and Nagao ( ) over a standalone PP attachment task	DT NN IN/that DT NN VBZ JJR IN NP NN IN IN JJ VVG VBZ JJ IN NP NN VBG DT VVG NN WRB JJ JJ NN VBZ RB JJ , VVG DT NNS IN NP CC NP ( ) IN DT JJ NP NN NN	BackGround	GRelated	Neutral
08-1038_0	However , it has been argued that Spanish causative verbs do not in fact take objects ( )	RB , PP VHZ VBN VVN IN/that JJ JJ NNS VVP RB IN NN VVP NNS ( )	BackGround	SRelated	Neutral
08-1038_1	The schema in (1) is also found in the widely-studied Romance causative construction ( ) , illustrated in (22): (22) Nos hizo leer El Señor de los Anillos	SYM DT NN IN NN VBZ RB VVN IN DT JJ JJ JJ NN ( ) , VVN IN JJ NNS NN NN NP NP NP NP NP	BackGround	SRelated	Neutral
08-1038_2	The logic also ensures that the new rules are subject to modalities consistent with those defined by Baldridge and Kruijff ( )	DT NN RB VVZ IN/that DT JJ NNS VBP JJ TO NNS JJ IN DT VVN IN NP CC NP ( )	BackGround	SRelated	Neutral
08-1038_2	This treats CCG as a compilation of CTL proofs , providing a principled , grammar-internal basis for restrictions on the CCG rules , transferring language-particular restrictions on rule application to the lexicon , and allowing the CCG rules to be viewed as grammatical universals ( )	DT VVZ NN IN DT NN IN NP NNS , VVG DT JJ , JJ NN IN NNS IN DT NP NNS , VVG JJ NNS IN NN NN TO DT NN , CC VVG DT NP NNS TO VB VVN IN JJ NNS ( )	BackGround	SRelated	Neutral
08-1038_2	The rules of this multimodal version of CCG ( ) are derived as theorems of a Categorial Type Logic (CTL , Moortgat ( ))	DT NNS IN DT JJ NN IN NP ( ) VBP VVN IN NNS IN DT NP NP NP NP , NP ( NN	BackGround	SRelated	Neutral
08-1038_2	The most commonly used are Karttunen's chart subsumption check ( ) and Eisner's normal-form constraints ( )	DT RBS RB VVN VBP NP NN NN NN ( ) CC NP NN NNS ( )	BackGround	GRelated	Neutral
08-1038_2	Furthermore , CCG augmented with  D is compatible with Eisner NF ( ) , a standard technique for controlling derivational ambiguity in CCG-parsers , and also with the modalized version of CCG ( )	RB , NP VVD IN NP VBZ JJ IN NP NP ( ) , DT JJ NN IN VVG JJ NN IN NNS , CC RB IN DT JJ NN IN NP ( )	BackGround	GRelated	Neutral
08-1038_3	It has been used for a variety of tasks , such as wide-coverage parsing ( ) , sentence realization ( ) , learning semantic parsers ( ) , dialog systems ( ) , grammar engineering ( ) , and modeling syntactic priming ( )	PP VHZ VBN VVN IN DT NN IN NNS , JJ IN NN VVG ( ) , NN NN ( ) , VVG JJ NNS ( ) , NN NNS ( ) , NN NN ( ) , CC VVG JJ NN ( )	BackGround	GRelated	Neutral
08-1038_4	We show two ways to derive the D rules: one based on unary composition and the other based on a logical characterization of CCG's rule base ( )	PP VVP CD NNS TO VV DT NP NN CD VVN IN JJ NN CC DT JJ VVN IN DT JJ NN IN NP NN NN ( )	NULL	NULL	NULL
08-1038_4	The  D rules are well-behaved; we show this by deriving them both from unary composition and from the logic defined by Baldridge ( )	DT NP NNS VBP JJ PP VVP DT IN VVG PP CC IN JJ NN CC IN DT NN VVN IN NP ( )	Fundamental	Basis	Neutral
08-1038_4	(11) [CCG derivation of "base your verdict on what you can and what you must not"] The category for and is marked for non-associativity with * , and thus combines with other expressions only by function application ( )	JJ NN NN NN PP$ NN IN JJ JJ NN WP PP MD NN NN NN CC JJ NN WP PP MD RB DT NN IN CC VBZ VVN IN NN IN SYM , CC RB VVZ IN JJ NNS RB IN NN NN ( )	BackGround	SRelated	Neutral
08-1038_4	The implication is that outputs of  B 1+ rules are inert , using the terminology of Baldridge ( )	DT NN VBZ IN/that NNS IN NP JJ NNS VBP JJ , VVG DT NN IN NP ( )	BackGround	SRelated	Neutral
08-1038_4	In this section , we present an alternate formulation of Eisner NF with Baldridge's ( ) CTL basis for CCG	IN DT NN , PP VVP DT JJ NN IN NP NP IN NP ( ) NP NN IN NP	Fundamental	Basis	Neutral
08-1038_4	In Baldridge's ( ) system , only proofs involving the ARP and ALP rules produce inert categories	IN NP ( ) NN , RB VVZ VVG DT NN CC NN NNS VVP JJ NNS	BackGround	SRelated	Neutral
08-1038_6	These are instead determined by syntactic , semantic , and pragmatic factors , such as transitivity , word order , animacy , gender , social prestige , and referential specificity ( )	DT VBP RB VVN IN JJ , JJ , CC JJ NNS , JJ IN NN , NN NN , NN , NN , JJ NN , CC JJ NN ( )	BackGround	SRelated	Neutral
08-1038_7	The standard CCG analysis for English auxiliary verbs is the type exemplified in (16) ( ) , interpreted as a unary operator over sentence meanings ( ): (16) can : (s\np)/(s\np) However , this type is empirically underdetermined , given a widely-noted set of generalizations suggesting that auxiliaries and raising verbs take no subject argument at all ( )	DT JJ NP NN IN JJ JJ NNS VBZ DT NN VVN IN JJ ( ) , VVN IN DT JJ NN IN NN NNS ( JJ NN NN NN : NP NN NN RB , DT NN VBZ RB JJ , VVN DT JJ NN IN NNS VVG IN/that NNS CC VVG NNS VV DT JJ NN IN DT ( )	BackGround	GRelated	Neutral
08-1038_8	Combining (20) with a type-raised subject presents another instance of the structure in (1) , where question words are represented as variable-binding operators ( )	VVG NN IN DT JJ NN VVZ DT NN IN DT NN IN NN , WRB DT NN NNS VBP VVN IN JJ NNS ( NN	BackGround	SRelated	Neutral
08-1038_9	Several techniques have been proposed for the problem ( )	JJ NNS VHP VBN VVN IN DT NN ( )	BackGround	GRelated	Neutral
08-1038_11	Following Jacobson ( ) , a more empirically-motivated assignment is (20): can : s/s	VVG NP ( ) , DT RBR JJ NN VBZ JJ NN NN NN NNS : NN NN	Fundamental	Idea	Neutral
08-1038_12	Applying D to an argument sequence is equivalent to compound application of binary B (37-38). Syntactically , binary B is equivalent to application of unary B to the primary functor , followed by applying the secondary functor to the output of B by means of function application (39-40). The rules for D correspond to application of B to both the primary and secondary functors , followed by function application (41). As with B^n , D^(n-1) can be derived by iterative application of B to both primary and secondary functors	VVG NP TO DT NN NN VBZ JJ TO VV NN IN JJ NP NP NP SYM JJ NN NN SYM NN SYM NP NP , JJ NN VBZ JJ TO NN IN JJ NN TO DT JJ NN NP , VVN IN VVG DT JJ NN SYM TO DT NN IN NP IN NNS IN NN NN ( JJ JJ NN NN JJ NP NN NN NN DT NNS IN NP VV TO NN IN NN TO CC DT JJ CC JJ NNS , VVN IN NN NN JJ JJ NNS ( JJ NN NN ( NP ) IN IN NP NN , NP NP CD MD VB VVN IN JJ NN IN NP TO DT JJ CC JJ NNS	BackGround	SRelated	Neutral
08-1038_15	For applications that call for increased incrementality (e.g. , aligning visual and spoken input incrementally ( )) , CCG rules that do not produce inert categories can be derived from a CTL basis that does not require ◊ant for associativity and permutation	IN NNS WDT VVP IN VVN NN NN , VVG JJ CC VVN NN RB ( NP , NP NNS WDT VVP RB VV JJ NNS MD VB VVN DT NP NN WDT VVZ RB VV SENT NN IN NN CC NN	BackGround	GRelated	Neutral
08-1038_19	It was noted by Pickering and Barry ( ) for English , but to the best of our knowledge it has not been treated in the CCG literature , nor noted in other languages	PP VBD VVN IN NP CC NP ( ) IN NP , CC TO DT JJS IN PP$ NN PP VHZ RB VBN VVN IN DT JJ NP NN , CC VVD IN JJ NNS	BackGround	GRelated	Neutral
08-1038_22	Combinatory Categorial Grammar (CCG , Steedman ( )) is a compositional , semantically transparent formalism that is both linguistically expressive and computationally tractable	JJ NP NP NN , NP ( NN VBZ DT JJ , RB JJ NN WDT VBZ DT RB JJ CC RB JJ	BackGround	GRelated	Positive
08-1038_22	This supports elegant analyses of several phenomena (e.g. , coordination , long-distance extraction , and intonation) and allows incremental parsing with the competence grammar ( )	DT VVZ JJ NNS IN JJ NNS JJ , NN , JJ NN , CC NN CC VVZ JJ VVG IN DT NN NN ( )	BackGround	GRelated	Neutral
08-1038_22	Inert slashes are Baldridge's ( ) encoding in OpenCCG of his CTL interpretation of Steedman's ( ) antecedent-government feature	JJ NNS VBP NP ( ) VVG IN NP CD IN PP$ NP NN IN NP ( ) NN NN	BackGround	GRelated	Neutral
08-1038_25	Following Wittenburg ( ) , we remedy this by adding a set of rules based on the D combinator of combinatory logic ( )	VVG NP ( ) , PP VV DT IN VVG DT NN IN NNS VVN IN DT NP NN IN JJ NN ( )	Fundamental	Idea	Neutral
08-1038_25	CCG's flexibility is useful for linguistic analyses , but leads to spurious ambiguity ( ) due to the associativity introduced by the  B and  T rules	NP NN VBZ JJ IN JJ NNS , CC VVZ TO JJ NN ( ) JJ TO DT NN VVN IN DT NN CC NN NNS	BackGround	GRelated	Neutral
08-1038_25	Wittenburg ( ) originally proposed using rules based on  D as a way to reduce spurious ambiguity , which he achieved by eliminating  B rules entirely and replacing them with variations on  D	NP ( ) RB VVD VVG NNS VVN IN NP IN DT NN TO VV JJ NN , WDT PP VVD IN VVG NP NNS RB CC VVG PP IN NNS IN NP	BackGround	GRelated	Neutral
08-1039_0	The parser uses a two-stage system , first employing a supertagger ( ) to propose lexical categories for each word , and then applying the cky chart parsing algorithm	DT NN VVZ DT NN NN , RB VVG DT NN ( ) TO VV JJ NNS IN DT NN , CC RB VVG DT JJ NN VVG NN	BackGround	SRelated	Neutral
08-1039_1	To remove this variable , we carry out a second evaluation against the Briscoe and Carroll ( ) reannotation of DepBank ( ) , as described in Clark and Curran ( )	TO VV DT NN , PP VVP RP DT JJ NN IN DT NP CC NP ( ) NN NN ( ) , RB VVN IN NP CC NP ( )	Fundamental	Idea	Neutral
08-1039_1	This evaluation is particularly relevant for nps , as the Briscoe and Carroll ( ) corpus has been annotated for internal np structure	DT NN VBZ RB JJ IN NNS , IN DT NP CC NP ( ) NN VHZ VBN VVN IN JJ NN NN	BackGround	GRelated	Neutral
08-1039_2	Lewin ( ) experiments with detecting base-nps using ner information , while Buyko et al. ( ) use a crf to identify coordinate structure in biological named entities [Figure 4: CCGbank derivations for apposition with dt]	NP ( ) NNS IN VVG NN NNS VVG JJ NN , IN NP NP NP ( ) VV DT NN TO VV DT NN NN NP NP NP NP NP NP NP NP NP NP NP NP NN NN NP NP NP NP NP NP NP NP NP NP NP NN NNS JJ NP NN NN NP NP NN NP CD NP NNS IN NN IN NP VV NN IN JJ VVN NNS	BackGround	GRelated	Neutral
08-1039_3	[Figure 2: (a) Original right-branching CCGbank (b) Left-branching (c) Left-branching with new supertags] 2.2 CCG parsing The C&C ccg parser ( ) is used to perform our experiments , and to evaluate the effect of the changes to CCGbank	NP NP NP NN NP NP JJ JJ NN SYM NP NP JJ NN NNS NNS NN NN NP NP NP NP NNS NN NN ) NN NNS NN NN NN NP CD NN JJ NN NP NP NP NP NP IN JJ NNS CD NN VVG DT NP NN NN ( ) VBZ VVN TO VV PP$ NNS , CC TO VV DT NN IN DT NNS TO NP	NULL	NULL	NULL
08-1039_3	Clark and Curran ( ) has a full description of the C&C parser's pre-existing features , to which we have added a number of novel ner-based features	NP CC NP ( ) VHZ DT JJ NN IN DT NP NNS VVG NNS , TO WDT PP VHP VVN DT NN IN JJ JJ NNS	Fundamental	Basis	Neutral
08-1039_3	Our experiments are run with the C&C ccg parser ( ) , and will evaluate the changes made to CCGbank , as well as the effectiveness of the ner features	PP$ NNS VBP VVN IN DT NP NN NN ( ) , CC MD VV DT NNS VVN TO NP , RB RB IN DT NN IN DT JJ NNS	Fundamental	Basis	Neutral
08-1039_5	Vadas and Curran ( ) carry out supervised experiments using this data set of 36,584 NPs , outperforming the Collins ( ) parser	NP CC NP ( ) VV RP JJ NNS VVG DT NNS VVN IN CD CD NP , VVG DT NP ( ) NN	BackGround	GRelated	Positive
08-1039_6	This is unexpected , because possessives were already bracketed properly when CCGbank was originally created ( )	DT VBZ JJ , IN NNS VBD RB VVN RB WRB NP VBD RB VVN ( )	BackGround	SRelated	Positive
08-1039_7	We generate the two forms of output that CCGbank contains: AUTO files , which represent the tree structure of each sentence; and PARG files , which list the word-word dependencies ( )	PP VVP DT CD NNS IN NN IN/that NP NP NP NNS , WDT VVP DT NN NN IN DT NN CC NP NNS , WDT VVP DT NN NNS ( )	Fundamental	Basis	Neutral
08-1039_8	The flat structure described by the Penn Treebank can be seen in this example: (NP (NN lung)  (NN cancer)  (NNS deaths)) CCGbank ( ) is the primary English corpus for Combinatory Cate-gorial Grammar (ccg) ( ) and was created by a semi-automatic conversion from the Penn Treebank	DT JJ NN VVN IN DT NP NP MD VB VVN IN DT JJ NN NP NP NP NN NNS JJ NP ( ) VBZ DT JJ JJ NN IN NP NP NP NN ( ) CC VBD VVN IN DT JJ NN IN DT NP NP	BackGround	SRelated	Positive
08-1039_8	[Figure 1: (a) Incorrect ccg derivation from Hockenmaier and Steedman ( ) (b) The correct derivation] Parsing of nps is typically framed as np bracketing , where the task is limited to discriminating between left and right-branching NPs of three nouns only: (crude oil) prices - left-branching; world (oil prices) - right-branching. Lauer ( ) presents two models to solve this problem: the adjacency model , which compares the association strength between words 1-2 to words 2-3; and the dependency model , which compares words 1-2 to words 1-3	NP NP NP NP SYM NN NP NP NP SYM CC NP NP SENT NP NP NP NP NP NP SYM NP NNS NN NP NP SYM NP NN NP NN NP CD NN NN NN NN IN NP CC NP ( ) NN DT JJ NN NN IN NNS VBZ RB VVN IN NN NN , WRB DT NN VBZ VVN TO VVG IN NN CC VVG NP IN CD NNS JJ NN NN NNS : VVG NN NN NN : VVG NP ( ) VVZ CD NNS TO VV DT NN DT NN NN , WDT VVZ DT NN NN IN NNS CD TO NNS JJ CC DT NN NN , WDT VVZ NNS CD TO NNS JJ	NULL	NULL	NULL
08-1039_8	Heads are then assigned using heuristics adapted from Hockenmaier and Steedman ( )	NNS VBP RB VVN VVG NNS VVN IN NP CC NP ( )	Fundamental	Basis	Neutral
08-1039_9	Honnibal and Curran ( ) have also made changes to CCGbank , aimed at better differentiating between complements and adjuncts	NP CC NP ( ) VHP RB VVN NNS TO NP , VVN IN JJR VVG IN VVZ CC NNS	BackGround	GRelated	Positive
08-1039_10	Finally , we evaluate against DepBank ( )	RB , PP VVP IN NP ( )	Fundamental	Basis	Neutral
08-1039_13	This is because their training data , the Penn Treebank ( ) , does not fully annotate np structure	DT VBZ IN PP$ NN NNS , DT NP NP ( ) , VVZ RB RB VV NN NN	BackGround	SRelated	Neutral
08-1039_14	Nakov and Hearst ( ) use search engine hit counts and extend the query set with typographical markers	NP CC NP ( ) NN NN NN VVD NNS CC VV DT NN VVN IN JJ NNS	BackGround	GRelated	Neutral
08-1039_15	PropBank ( ) is used as a gold-standard to inform these decisions , similar to the way that we use the Vadas and Curran ( ) data	NP ( ) VBZ VVN IN DT NN TO VV DT NNS , JJ TO DT NN IN/that PP VVP DT NP CC NP ( ) NNS	Fundamental	Basis	Neutral
08-1039_16	Combinatory Categorial Grammar (ccg) ( ) is a type-driven , lexicalised theory of grammar	JJ NP NP NN ( ) VBZ DT JJ , JJ NN IN NN	BackGround	SRelated	Neutral
08-1039_17	We apply an automatic conversion process using the gold-standard np data annotated by Vadas and Curran ( )	PP VVP DT JJ NN NN VVG DT NN NN NNS VVN IN NP CC NP ( )	Fundamental	Basis	Neutral
08-1039_17	Recently , Vadas and Curran ( ) annotated internal NP structure for the entire Penn Treebank , providing a large gold-standard corpus for np bracketing	RB , NP CC NP ( ) VVN JJ NP NN IN DT JJ NP NP , VVG DT JJ NN NN IN NN NN	BackGround	GRelated	Neutral
08-1039_17	The Vadas and Curran ( ) annotation scheme inserts NML and JJP brackets to describe the correct np structure , as shown below: We use these brackets to determine new goldstandard CCG derivations in Section 3	DT NP CC NP ( ) NN NN NNS NP CC NP VVZ TO VV DT JJ NN NN , IN VVN NN PP VVP DT NNS TO VV JJ NN NN NNS IN NP CD	Fundamental	Basis	Neutral
08-1039_17	This section describes the process of converting the Vadas and Curran ( ) data to ccg derivations	DT NN VVZ DT NN IN VVG DT NP CC NP ( ) NNS TO NN NNS	Fundamental	Basis	Neutral
08-1039_17	For example , we would insert the NML bracket shown below: This simple heuristic captures np structure not explicitly annotated by Vadas and Curran ( )	IN NN , PP MD VV DT NP NN VVN NN DT JJ JJ NNS NN NN RB RB VVN IN NP CC NP ( )	BackGround	SRelated	Neutral
08-1039_17	Vadas and Curran ( ) describe using ne tags during the annotation process , suggesting that ner-based features will be helpful in a statistical model	NP CC NP ( ) VV VVG RB NNS IN DT NN NN , VVG IN/that JJ NNS MD VB JJ IN DT JJ NN	BackGround	SRelated	Positive
08-1039_17	Vadas and Curran ( ) experienced a similar drop in performance on Penn Treebank data , and noted that the F-score for nml and jjp brackets was about 20% lower than the overall figure	NP CC NP ( ) VVD DT JJ NN IN NN IN NP NP NNS , CC VVD IN/that DT NN IN NN CC NN NNS VBD RB CD JJR IN DT JJ NN	BackGround	SRelated	Neutral
08-1039_17	The first contribution of this paper is the application of the Vadas and Curran ( ) data to Combinatory Categorial Grammar	DT JJ NN IN DT NN VBZ DT NN IN DT NP CC NP ( ) NNS TO NP NP NP	Fundamental	Basis	Neutral
08-1039_19	In particular , we implement new features using ner tags from the BBN Entity Type Corpus ( )	IN JJ , PP VV JJ NNS VVG JJ NNS IN DT NP NP NP NP ( )	Fundamental	Basis	Neutral
08-1039_19	We draw ne tags from the BBN Entity Type Corpus ( ) , which describes 28 different entity types	PP VVP RB NNS IN DT NP NP NP NP ( ) , WDT VVZ CD JJ NN NNS	Fundamental	Basis	Neutral
08-1040_0	Another group of related work focuses on summarizing sentences through a series of deletions ( )	DT NN IN JJ NN VVZ IN VVG NNS IN DT NN IN NNS ( )	BackGround	GRelated	Neutral
08-1040_2	A relatively straight-forward extension of the inside-outside algorithm for chart-parses allows us to learn and perform inference in our compact representation (a similar algorithm is presented in ( ))	DT RB JJ NN IN DT NN NN IN NNS VVZ PP TO VV CC VV NN IN PP$ JJ NN NN JJ NN VBZ VVN IN ( NN	Fundamental	Idea	Neutral
08-1040_3	Features derived from a syntactic parse of the sentence have proven particularly useful ( )	NNS VVN IN DT JJ VVP IN DT NN VHP VVN RB JJ ( )	BackGround	GRelated	Positive
08-1040_5	Another area of related work in the semantic role labeling literature is that on tree kernels ( )	DT NN IN JJ NN IN DT JJ NN VVG NN VBZ IN/that IN NN NNS ( )	BackGround	GRelated	Neutral
08-1040_6	We compared to a strong Baseline SRL system that learns a logistic regression model using the features of Pradhan et al. ( )	PP VVD TO DT JJ NP NP NN WDT VVZ DT JJ NN NN VVG DT NNS IN NP NP NP ( )	Compare	Compare	Neutral
08-1040_7	Table 2 shows results of these three systems on the Conll-2005 task , plus the top-performing system ( ) for reference	NN CD VVZ NNS IN DT CD NNS IN DT NP NN , CC DT JJ NN ( ) IN NN	Fundamental	Basis	Positive
08-1040_8	Approaches include incorporating a subcategorization feature ( ) , such as the one used in our baseline; and building a model which jointly classifies all arguments of a verb ( )	NNS VVP VVG DT NN NN ( ) , JJ IN DT CD VVN IN PP$ NN CC VVG DT NN WDT RB VVZ DT NNS IN DT NN ( )	BackGround	GRelated	Neutral
08-1041_1	Graph-ranking algorithms , e.g. , PageRank ( ) , are then applied to rank those sentences	JJ NNS , FW , NP ( ) , VBP RB VVN TO VV DT NNS	BackGround	GRelated	Neutral
08-1041_2	This problem and its influence on email summarization were studied in ( ) and ( )	DT NN CC PP$ NN IN NN NN VBD VVN IN ( ) CC ( )	BackGround	GRelated	Neutral
08-1041_3	As a second contribution of this paper , we study several ways to measure the cohesion between parent and child sentences in the quotation graph: clue words (re-occurring words in the reply) ( ) , semantic similarity and cosine similarity	IN DT JJ NN IN DT NN , PP VVP JJ NNS TO VV DT NN IN NN CC NN NNS IN DT NN NN NN NNS VVG NNS IN DT JJ ( ) , JJ NN CC NN NN	Fundamental	Basis	Neutral
08-1041_3	In our recent study ( ) , we built a fragment quotation graph to represent an email conversation and developed a ClueWordSummarizer (CWS) based on the concept of clue words	IN PP$ JJ NN ( ) , PP VVD DT NN NN NN TO VV DT NN NN CC VVD DT NN NN VVN IN DT NN IN NN NNS	BackGround	SRelated	Neutral
08-1041_3	One is the generalization of the CWS algorithm in ( ) and one is the well-known PageRank algorithm	PP VBZ DT NN IN DT JJ NN IN ( ) CC PP VBZ DT JJ NN NN	Fundamental	Basis	Neutral
08-1041_3	Particularly , when the weight of the edge is based on clue words as in Equation 1 , this method is equivalent to Algorithm CWS in ( )	RB , WRB DT NN IN DT NN VBZ VVN IN NN NNS IN IN NP CD , DT NN VBZ JJ TO NP NP IN ( )	BackGround	SRelated	Neutral
08-1041_5	Those discussions can be viewed as conversations via emails and are valuable for the user as a personal information repository ( )	DT NNS MD VB VVN IN NNS IN NNS CC VBP JJ IN DT NN IN DT JJ NN NN )	BackGround	GRelated	Neutral
08-1041_6	Other than for email summarization , other document summarization methods have adopted graph-ranking algorithms for summarization , e.g. , ( ) , ( ) and ( )	JJ IN IN NN NN , JJ NN NN NNS VHP VVN JJ NNS IN NN , FW , ( ) , ( ) CC ( )	BackGround	GRelated	Neutral
08-1041_7	In many applications , it has been shown that sentences with subjective meanings are paid more attention than factual ones ( ) ( )	IN JJ NNS , PP VHZ VBN VVN IN/that NNS IN JJ NNS VBP VVN JJR NN IN JJ JJ NN )	BackGround	GRelated	Neutral
08-1041_8	With the ever increasing popularity of emails , it is very common nowadays that people discuss specific issues , events or tasks among a group of people by emails ( )	NN IN DT RB VVG NN IN NNS , PP VBZ RB JJ RB IN/that NNS VV JJ NNS , NNS CC NNS IN DT NN IN NNS IN NN )	NULL	NULL	NULL
08-1041_10	OpBear: The list of opinion bearing words in ( )	NN DT NN IN NN VVG NNS IN ( )	Fundamental	Basis	Neutral
08-1041_12	Since this idea is borrowed from the pyramid metric by Nenkova et al. ( ) , we call it the sentence pyramid precision	IN DT NN VBZ VVN IN DT NN JJ IN NP NP NP ( ) , PP VVP PP DT NN NN NN	Fundamental	Idea	Neutral
08-1041_14	Specifically , we use the package by ( ) , which includes several methods to compute the semantic similarity	RB , PP VVP DT NN IN ( ) , WDT VVZ JJ NNS TO VV DT JJ NN	Fundamental	Basis	Neutral
08-1041_15	We use the MEAD package to segment the text into 1,394 sentences ( )	PP VVP DT JJ NN TO NN DT NN IN CD CD NNS ( )	Fundamental	Basis	Neutral
08-1041_16	Meanwhile , most existing email summarization approaches use quantitative features to describe the conversation structure , e.g. , number of recipients and responses , and apply some general multi-document summarization methods to extract some sentences as the summary ( ) ( )	RB , RBS JJ NN NN NNS VVP JJ NNS TO VV DT NN NN , FW , NN IN NNS CC NNS , CC VV DT JJ NN NN NNS TO VV DT NNS IN DT NN ( ) ( )	BackGround	GRelated	Neutral
08-1041_16	Our experiments showed that CWS had a higher accuracy than the email summarization approach in ( ) and the generic multi-document summarization approach MEAD ( )	PP$ NNS VVD IN/that NP VHD DT JJR NN IN DT NN NN NN IN ( ) CC DT JJ NN NN NN NN ( )	Compare	Compare	Negative
08-1041_17	The major source of this list is from ( ) with additional words from other sources	DT JJ NN IN DT NN VBZ IN ( ) IN JJ NNS IN JJ NNS	Fundamental	Basis	Neutral
08-1041_18	A large amount of work has been done on determining the level of subjectivity of text ( )	DT JJ NN IN NN VHZ VBN VVN IN VVG DT NN IN NN IN NN ( )	BackGround	GRelated	Neutral
08-1041_19	Similar to the issue-response relationship , Shrestha et al. ( ) proposed methods to identify the question-answer pairs from an email thread	JJ TO DT NN NN , NP NP NP ( ) VVN NNS TO VV DT NN NNS IN DT NN NN	BackGround	GRelated	Neutral
08-1041_22	OpFind: The list of subjective words in ( )	NN DT NN IN JJ NNS IN ( )	Fundamental	Basis	Neutral
08-1041_23	Most of the existing methods dealing with email conversations use the email thread to represent the email conversation structure , which is not accurate in many cases ( )	JJS IN DT JJ NNS VVG IN NN NNS VVP DT NN NN TO VV DT NN NN NN , WDT VBZ RB JJ IN JJ NNS ( )	BackGround	GRelated	Neutral
08-1042_0	For example , in (2) , the daughters list RB TO JJ NNS is a daughters list with no correlates in the treebank; it is erroneous because "close to wholesale" needs another layer of structure , namely adjective phrase (ADJP) ( )	IN NN , IN NN , DT NNS NN NN TO NN NP VBZ DT NNS NN IN DT NNS IN DT NN PP VBZ JJ IN NN TO JJ NNS DT NN IN NN , RB NN NN NN ( )	BackGround	SRelated	Neutral
08-1042_0	The description for tagging titles in the guidelines (Bies et al. , 1995 , p	DT NN IN VVG NNS IN DT NNS NNS CC JJ , CD , NN	NULL	NULL	NULL
08-1042_0	To understand this , we have to realize that most modifiers are adjoined at the sentence level when there is any doubt about their attachment  ( )	TO VV DT , PP VHP TO VV IN/that JJS NNS VBP VVN IN DT NN NN WRB EX VBZ DT NN IN PP$ NN ( )	BackGround	SRelated	Neutral
08-1042_0	QP is "used for multiword numerical expressions that occur within NP (and sometimes ADJP) , where the QP corresponds frequently to some kind of complex determiner phrase"  ( )	NP VBZ VVN IN NN JJ NNS WDT VVP IN NP NN RB NP , WRB DT NP VVZ RB TO DT NN IN JJ NN NN ( )	BackGround	SRelated	Neutral
08-1042_1	When extracting rules from constituency-based tree-banks employing flat structures , grammars often limit the set of rules ( ) , due to the large number of rules ( ) and "leaky" rules that can lead to mis-analysis ( )	WRB VVG NNS IN JJ NNS VVG JJ NNS , NNS RB VVP DT NN IN NNS ( ) , JJ TO DT JJ NN IN NNS ( ) CC JJ NNS WDT MD VV TO NNS ( )	BackGround	GRelated	Neutral
08-1042_2	Parse reranking techniques , for instance , rely on knowledge about features other than those found in the core parsing model in order to determine the best parse ( )	JJ NN NNS , IN NN , VVP IN NN IN NNS JJ IN DT VVN IN DT NN VVG NN IN NN TO VV DT JJS VVP ( )	BackGround	GRelated	Neutral
08-1042_3	Instead of examining and comparing rules in their entirety , this method abstracts a rule to its component parts , similar to features using information about n-grams of daughter nodes in parse reranking models ( )	RB IN VVG CC VVG NNS IN PP$ NN , DT NN NNS DT NN TO PP$ NN NNS , JJ TO NNS VVG NN IN NNS IN NN NNS RB VVP JJ NNS ( )	BackGround	SRelated	Neutral
08-1042_4	Although frequency-based criteria are often used , these are not without problems because low-frequency rules can be valid and potentially useful rules ( ) , and high-frequency rules can be erroneous ( )	IN JJ NNS VBP RB VVN , DT VBP RB IN NNS IN NN NNS MD VB JJ CC RB JJ NNS ( ) , CC NN NNS MD VB JJ ( )	BackGround	GRelated	Neutral
08-1042_5	This method can be extended to increase recall , by treating similar daughters lists as equivalent ( )	DT NN MD VB VVN TO VV NN , IN VVG JJ NNS NNS IN JJ ( )	BackGround	SRelated	Neutral
08-1042_6	Using this strict equivalence to identify ad hoc rules is quite successful ( ) , but it misses a significant number of generalizations	VVG DT JJ NN TO VV FW FW NNS VBZ RB JJ ( ) , CC PP VVZ DT JJ NN IN NNS	BackGround	GRelated	Neutral
08-1042_7	Not only are errors inherently undesirable for obtaining an accurate grammar , but training on data with erroneous rules can be detrimental to parsing performance ( ) 	RB RB VBP NNS RB JJ IN VVG DT JJ NN , CC NN IN NNS IN JJ NNS MD VB JJ TO VVG NN ( )	BackGround	GRelated	Neutral
08-1042_7	To define dissimilarity , we need a notion of similarity , and a starting point for this is the error detection method outlined in Dickinson and Meurers ( )	TO VV NN , PP VVP DT NN IN NN , CC , DT VVG NN IN DT VBZ DT NN NN NN VVN IN NP CC NP ( )	BackGround	SRelated	Neutral
08-1042_7	This captures the property that identical daughters lists with different mothers are distinct ( )	DT VVZ DT NN IN/that JJ NNS NNS IN JJ NNS VBP JJ ( )	BackGround	SRelated	Neutral
08-1042_8	Although statistical techniques have been employed to detect anomalous annotation ( ) , these methods do not account for linguistically-motivated generalizations across rules , and no full evaluation has been done on a treebank	IN JJ NNS VHP VBN VVN TO VV JJ NN ( ) , DT NNS VVP RB VV IN JJ NNS IN NNS , CC DT JJ NN VHZ VBN VVN IN DT NN	BackGround	SRelated	Neutral
08-1042_9	Infrequent rules in one genre may be quite frequent in another ( ) and their frequency may be unrelated to their usefulness for parsing ( )	JJ NNS IN CD NN MD VB RB JJ IN DT ( ) CC PP$ NN MD VB JJ TO PP$ NN IN VVG ( )	BackGround	GRelated	Neutral
08-1042_10	This issue is of even more importance when considering the task of porting a parser trained on one genre to another genre ( )	DT NN VBZ IN RB JJR NN WRB VVG DT NN IN VVG DT NN VVN IN CD NN TO DT NN ( )	BackGround	GRelated	Neutral
08-1042_12	Active learning techniques also require a scoring function for parser confidence ( ) , and often use uncertainty scores of parse trees in order to select representative samples for learning ( )	JJ VVG NNS RB VVP DT VVG NN IN NN NN ( ) , CC RB VV NN NNS IN VVP NNS IN NN TO VV JJ NNS IN VVG ( )	BackGround	GRelated	Neutral
08-1042_13	Since most natural language expressions are endocentric , i.e. , a category projects to a phrase of the same category ( ) , daughters lists with more than one possible mother are flagged as potentially containing an error	IN RBS JJ NN NNS VBP JJ , FW , DT NN NNS TO DT NN IN DT JJ NN ( ) , NNS NNS IN JJR IN CD JJ NN VBP VVN IN RB VVG DT NN	BackGround	GRelated	Neutral
08-1042_15	This is in the spirit of Kveton and Oliva ( ) , who define invalid bigrams for POS annotation sequences in order to detect annotation errors.	DT VBZ IN DT NN IN NP CC NP ( ) , WP VV JJ NNS IN NP NN NNS IN NN TO VV NN NN	Fundamental	Idea	Neutral
08-1042_16	For example , IN NP 1 has nine different mothers in the Wall Street Journal (WSJ) portion of the Penn Treebank ( ) , six of which are errors	IN NN , IN NP CD VHZ CD JJ NNS IN DT NP NP NP JJ NN IN DT NP NP ( ) , CD IN WDT VBP NNS	BackGround	GRelated	Neutral
08-1042_17	If a treebank grammar is used ( ) , then one needs to isolate rules for ungrammatical data , to be able to distinguish grammatical from ungrammatical input	IN DT NN NN VBZ VVN ( ) , RB CD NNS TO VV NNS IN JJ NNS , TO VB JJ TO VV JJ IN JJ NN	BackGround	GRelated	Neutral
08-1042_22	Thus , identifying ad hoc rules can also provide feedback on annotation schemes , an especially important step if one is to use the treebank for specific applications ( ) , or if one is in the process of developing a treebank	RB , VVG NN FW NNS MD RB VV NN IN NN NNS , DT RB JJ NN IN CD VBZ TO VV DT NN IN JJ NNS ( ) , CC IN PP VBZ IN DT NN IN VVG DT NN	BackGround	GRelated	Neutral
08-1042_23	This is true of precision grammars , where analyses can be more or less preferred ( ) , and in applications like intelligent computer-aided language learning , where learner input is parsed to detect what is correct or not (see , e.g. , Vandeventer Faltin , 2003 , ch	DT VBZ JJ IN NN NNS , WRB NNS MD VB JJR CC JJR JJ ( ) , CC IN NNS IN JJ JJ NN NN , WRB NN NN VBZ VVN TO VV WP VBZ JJ CC RB NN , FW , NP NP , CD , NN	BackGround	GRelated	Neutral
08-1043_0	The input for the segmentation task is however highly ambiguous for Semitic languages , and surface forms (tokens) may admit multiple possible analyses as in ( )	DT NN IN DT NN NN VBZ RB RB JJ IN JJ NNS , CC NN NNS NN MD VV JJ JJ NNS IN IN ( )	BackGround	GRelated	Neutral
08-1043_0	Morphological disambiguators that consider a token in context (an utterance) and propose the most likely morphological analysis of an utterance (including segmentation) were presented by Bar-Haim et al. ( ) , Adler and Elhadad ( ) , Shacham and Wintner ( ) , and achieved good results (the best segmentation result so far is around 98%)	JJ NNS WDT VVP DT JJ IN NN NN NN CC VV DT RBS JJ JJ NN IN DT NN VVG NN VBD VVN IN NP NP NP ( ) , NP CC NP ( ) , NP CC NP ( ) , CC VVD JJ NNS NNS JJS NN NN RB RB VBZ RB JJ	BackGround	GRelated	Neutral
08-1043_0	A possible probabilistic model for assigning probabilities to complex analyses of a surface form may be P (REL ,  VB fmnh , context) = P (RELf)P (VB|mnh , REL)P (REL , VB| context) and indeed recent sequential disambiguation models for Hebrew ( ) and Arabic ( ) present similar models	DT JJ JJ NN IN VVG NNS TO JJ NNS IN DT NN NN MD VB NN NN , NP NP , NN SYM NN NN NP , NP NP , NP NN CC RB JJ JJ NN NNS IN JJ ( ) CC NP ( ) JJ JJ NNS	BackGround	GRelated	Neutral
08-1043_0	In sequential tagging models such as ( ) weights are assigned according to a language model based on linear context	IN JJ VVG NNS JJ IN ( ) NNS VBP VVN VVG TO DT NN NN VVN IN JJ NN	BackGround	GRelated	Neutral
08-1043_1	Using a wide-coverage morphological analyzer based on ( ) should cater for a better coverage , and incorporating lexical probabilities learned from a big (unannotated) corpus (cf. ( )) will make the parser more robust and suitable for use in more realistic scenarios	VVG DT NN JJ NN VVN IN ( ) MD VV IN DT JJR NN , CC VVG JJ NNS VVN IN DT JJ JJ NN NN ( NN MD VV DT NN RBR JJ CC JJ IN NN IN JJR JJ NNS	BackGround	MRelated	Positive
08-1043_2	This is by now a fairly standard representation for multiple morphological segmentation of Hebrew utterances ( )	DT VBZ IN RB DT RB JJ NN IN JJ JJ NN IN JJ NNS ( )	BackGround	GRelated	Neutral
08-1043_4	Tsarfaty ( ) used a morphological analyzer ( ) , a PoS tagger ( ) , and a general purpose parser ( ) in an integrated framework in which morphological and syntactic components interact to share information , leading to improved performance on the joint task	NP ( ) VVN DT JJ NN ( ) , DT NP NN ( ) , CC DT JJ NN NN ( ) IN DT JJ NN IN WDT JJ CC JJ NNS VVP TO VV NN , VVG TO VVN NN IN DT JJ NN	BackGround	GRelated	Neutral
08-1043_4	To evaluate the performance on the segmentation task , we report SEG , the standard harmonic mean of segmentation Precision and Recall F1 (as defined in Bar-Haim et al. ( ); Tsarfaty ( )) as well as the segmentation accuracy SEG Tok measure indicating the percentage of input tokens assigned the correct exact segmentation (as reported by Cohen and Smith ( ))	TO VV DT NN IN DT NN NN , PP VVP NP , DT JJ JJ NN IN NN NP CC NP NN NNS VVN IN NP NP NP ( NP NP ( NN RB RB IN DT NN NN NP NP NN VVG DT NN IN NN NNS VVN DT JJ JJ NN NNS VVN IN NP CC NP ( NN	Fundamental	Basis	Neutral
08-1043_5	REL+VB) (cf. ( )) and probabilities are assigned to different analyses in accordance with the likelihood of their tags (e.g. , "fmnh is 30% likely to be tagged NN and 70% likely to be tagged REL+VB")	NP NP ( NN CC NNS VBP VVN TO JJ NNS IN NN IN DT NN IN PP$ NNS NN , NP VBZ CD JJ TO VB VVN NP CC CD JJ TO VB VVN NP	NULL	NULL	NULL
08-1043_7	This is akin to PoS tag sequences induced by different parses in the setup familiar from English and explored in e.g. ( )	DT VBZ JJ TO NP NN NNS VVN IN JJ VVZ IN DT NN JJ IN NP CC VVN IN FW ( )	BackGround	SRelated	Neutral
08-1043_8	One way to approach this discrepancy is to assume a preceding phase of morphological segmentation for extracting the different lexical items that exist at the token level (as is done , to the best of our knowledge , in all parsing related work on Arabic and its dialects ( ))	CD NN TO VV DT NN VBZ TO VV DT JJ NN IN JJ NN IN VVG DT JJ JJ NNS WDT VVP IN DT JJ NN NNS VBZ VVN , TO DT JJS IN PP$ NN , IN DT VVG JJ NN IN NP CC PP$ NNS ( NN	BackGround	GRelated	Neutral
08-1043_9	Cohen and Smith ( ) followed up on these results and proposed a system for joint inference of morphological and syntactic structures using factored models each designed and trained on its own	NP CC NP ( ) VVN RP IN DT NNS CC JJ DT NN IN JJ NN IN JJ CC JJ NNS VVG VVN NNS DT VVN CC VVN IN PP$ JJ	BackGround	GRelated	Neutral
08-1043_9	Morphological segmentation decisions in our model are delegated to a lexeme-based PCFG and we show that using a simple treebank grammar , a data-driven lexicon , and a linguistically motivated unknown-tokens handling our model outperforms ( ) and ( ) on the joint task and achieves state-of-the-art results on a par with current respective standalone models	JJ NN NNS IN PP$ NN VBP VVN TO DT JJ NN CC PP VVP IN/that VVG DT JJ NN NN , DT JJ NN , CC DT RB VVN NNS VVG PP$ NN VVZ ( ) CC ( ) IN DT JJ NN CC VVZ JJ NNS IN DT NN IN JJ JJ JJ NNS	Compare	Compare	Negative
08-1043_9	Cohen and Smith ( ) later based a system for joint inference on factored , independent , morphological and syntactic components of which scores are combined to cater for the joint inference task	NP CC NP ( ) RBR VVN DT NN IN JJ NN IN VVN , JJ , JJ CC JJ NNS IN WDT NNS VBP VVN TO VV IN DT JJ NN NN	BackGround	GRelated	Neutral
08-1043_9	To facilitate the comparison of our results to those reported by ( ) we use their data set in which 177 empty and "malformed" sentences were removed	TO VV DT NN IN PP$ NNS TO DT VVN IN ( ) PP VVP PP$ NNS VVP IN WDT CD JJ CC JJ NNS VBD VVN	Compare	Compare	Neutral
08-1043_9	We used BitPar ( ) , an efficient general purpose parser , together with various treebank grammars to parse the input sentences and propose compatible morphological segmentation and syntactic analysis	PP VVD NP ( ) , DT JJ JJ NN NN , RB IN JJ NN NNS TO VV DT NN NNS CC VV JJ JJ NN CC JJ NN	Fundamental	Basis	Neutral
08-1043_9	Finally , model GT v = 2 includes parent annotation on top of the various state-splits , as is done also in ( )	RB , JJ NP NN SYM CD VVZ NN NN IN NN IN DT JJ NNS , RB VBZ VVN RB IN ( )	Fundamental	Idea	Neutral
08-1043_9	Table 2 compares the performance of our system on the setup of Cohen and Smith ( ) to the best results reported by them for the same tasks	NN CD VVZ DT NN IN PP$ NN IN DT NN IN NP CC NP ( ) TO DT JJS NNS VVD IN PP IN DT JJ NNS	Compare	Compare	Neutral
08-1043_10	In Modern Hebrew (Hebrew) , a Semitic language with very rich morphology , particles marking conjunctions , prepositions , complementizers and relativizers are bound elements prefixed to the word ( )	IN NP NP NN , DT JJ NN IN RB JJ NN , NNS VVG NNS , NNS , NNS CC NNS VBP VVN NNS VVN TO DT NN ( )	BackGround	GRelated	Neutral
08-1043_11	In our third model GT ppp we also add the distinction between general PPs and possessive PPs following Goldberg and Elhadad ( )	IN PP$ JJ NN NP NP PP RB VVP DT NN IN JJ NNS CC JJ NNS VVG NP CC NP ( )	Fundamental	Idea	Neutral
08-1043_12	The current work treats both segmental and super-segmental phenomena , yet we note that there may be more adequate ways to treat super-segmental phenomena assuming Word-Based morphology as we explore in ( )	DT JJ NN VVZ DT JJ CC JJ NNS , RB PP VVP IN/that EX MD VB RBR JJ NNS TO VV JJ NNS VVG JJ NN IN PP VVP IN ( )	Fundamental	Idea	Positive
08-1043_15	We use the HSPELL ( ) wordlist as a lexeme-based lexicon for pruning segmentations involving invalid segments	PP VVP DT NP ( ) NN IN DT JJ NN IN VVG NNS VVG JJ NNS	Fundamental	Basis	Neutral
08-1043_16	Morphological analyzers for Hebrew that analyze a surface form in isolation have been proposed by Segal ( ) , Yona and Wintner ( ) , and recently by the knowledge center for processing Hebrew ( )	JJ NNS IN NP WDT VVP DT NN NN IN NN VHP VBN VVN IN NP ( ) , NP CC NP ( ) , CC RB IN DT NN NN IN VVG JJ ( )	BackGround	GRelated	Neutral
08-1043_16	Such resources exist for Hebrew ( ) , but unfortunately use a tagging scheme which is incompatible with the one of the Hebrew Treebank	JJ NNS VVP IN JJ ( ) , CC RB VV DT VVG NN WDT VBZ JJ IN DT CD IN DT NP NP	BackGround	GRelated	Negative
08-1043_22	The development of the very first Hebrew Treebank ( ) called for the exploration of general statistical parsing methods , but the application was at first limited	DT NN IN DT RB JJ NP NP ( ) VVN IN DT NN IN JJ JJ VVG NNS , CC DT NN VBD IN RB VVN	BackGround	GRelated	Negative
08-1043_22	We use the Hebrew Treebank , ( ) , provided by the knowledge center for processing Hebrew , in which sentences from the daily newspaper "Ha'aretz" are morphologically segmented and syntactically annotated	PP VVP DT NP NP , ( ) , VVN IN DT NN NN IN NN NN , IN WDT NNS IN DT JJ NN NN VBP RB VVN CC RB VVN	Fundamental	Basis	Neutral
08-1043_25	The joint morphological and syntactic hypothesis was first discussed in ( ) and empirically explored in ( )	DT JJ JJ CC JJ NN VBD RB VVN IN ( ) CC RB VVN IN ( )	BackGround	GRelated	Neutral
08-1043_26	Tsarfaty ( ) was the first to demonstrate that fully automatic Hebrew parsing is feasible using the newly available 5,000-sentence treebank	NP ( ) VBD DT JJ TO VV IN/that RB JJ NN VVG VBZ JJ VVG DT RB JJ JJ NN	BackGround	GRelated	Neutral
08-1043_26	Tsarfaty and Sima'an ( ) have reported state-of-the-art results on Hebrew unlexicalized parsing (74.41%) albeit assuming oracle morphological segmentation	NP CC NP ( ) VHP VVN JJ NNS IN NP VVD VVG NN IN VVG NN JJ NN	BackGround	GRelated	Neutral
08-1043_26	In our fourth model GT nph we add the definiteness status of constituents following Tsarfaty and Sima'an ( )	IN PP$ JJ NN NP NN PP VVP DT NN NN IN NNS VVG NP CC NP ( )	Fundamental	Idea	Neutral
08-1043_26	We conjecture that this trend may continue by incorporating additional information , e.g. , three-dimensional models as proposed by Tsarfaty and Sima'an ( )	PP VVP IN/that DT NN MD VV IN VVG JJ NN , FW , JJ NNS IN VVN IN NP CC NP ( )	BackGround	MRelated	Neutral
08-1043_27	Tsarfaty ( ) argues that for Semitic languages determining the correct morphological segmentation is dependent on syntactic context and shows that increasing information sharing between the morphological and the syntactic components leads to improved performance on the joint task	NP ( ) VVZ IN/that IN JJ NNS VVG DT JJ JJ NN VBZ JJ IN JJ NN CC VVZ IN/that VVG NN VVG IN DT JJ CC DT JJ NNS VVZ TO VVN NN IN DT JJ NN	BackGround	GRelated	Neutral
08-1043_27	In our second model GT vpi we also distinguished finite and non-finite verbs and VPs as proposed in ( )	IN PP$ JJ NN NP NP PP RB VVD JJ CC JJ NNS CC NNS RB VVN IN ( )	Fundamental	Idea	Neutral
08-1043_27	Both ( ) have shown that a single integrated framework outperforms a completely streamlined implementation , yet neither has shown a single generative model which handles both tasks	CC ( ) VHP VVN IN/that DT JJ JJ NN VVZ DT RB JJ NN , RB CC VHZ VVN DT JJ JJ NN WDT VVZ DT NNS	BackGround	GRelated	Neutral
08-1044_0	Finally , Adda-Decker and Lamel ( ) demonstrated that both French and English ASR systems had more trouble with male speakers than female speakers , and found several possible explanations , including higher rates of disfluencies and more reduction	RB , NP CC NP ( ) VVN IN/that DT JJ CC JJ NP NNS VHD JJR NN IN JJ NNS IN JJ NNS , CC VVD JJ JJ NNS , VVG JJR NNS IN NNS CC JJR NN	BackGround	GRelated	Neutral
08-1044_0	Also , like Adda-Decker and Lamel ( ) , we find that male speakers have higher error rates than females , though in our data set the difference is more striking (3.6% absolute , compared to their 2.0%)	RB , IN NP CC NP ( ) , PP VVP IN/that JJ NNS VHP JJR NN NNS IN NNS , RB IN PP$ NNS VVD DT NN VBZ RBR JJ NN JJ , VVN TO PP$ JJ	Compare	Compare	Neutral
08-1044_0	This last result sheds some light on the work of Adda-Decker and Lamel ( ) , who suggested several factors that could explain males' higher error rates	DT JJ NN VVZ DT NN IN DT NN IN NP CC NP ( ) , WP VVD JJ NNS WDT MD VV NP JJR NN NNS	BackGround	SRelated	Neutral
08-1044_1	We fit our models using the lme4 package ( ) of R ( )	PP VVP PP$ NNS VVG DT JJ NN ( ) IN NN ( )	Fundamental	Basis	Neutral
08-1044_2	One possibility is that female speech is more easily recognized because females tend to have expanded vowel spaces ( ) , a factor that is associated with greater intelligibility ( ) and is characteristic of genres with lower ASR error rates ( )	CD NN VBZ IN/that JJ NN VBZ RBR RB VVN IN NNS VVP TO VH VVN NN NNS ( ) , DT NN WDT VBZ VVN IN JJR NN ( ) CC VBZ JJ IN NNS IN JJR NP NN NNS ( )	BackGround	SRelated	Neutral
08-1044_3	Previous work on recognition of spontaneous monologues and dialogues has shown that infrequent words are more likely to be misrecognized ( ) and that fast speech increases error rates ( )	JJ NN IN NN IN JJ NNS CC NNS VHZ VVN IN/that JJ NNS VBP RBR JJ TO VB JJ ( ) CC DT JJ NN VVZ NN NNS ( )	BackGround	GRelated	Neutral
08-1044_3	In the word-level analyses of Fosler-Lussier and Morgan ( ) and Shinozaki and Furui ( ) , only substitution and deletion errors were considered , so we do not know how including insertions might affect the results	IN DT JJ NNS IN NP CC NP ( ) CC NP CC NP ( ) , JJ NN CC NN NNS VBD VVN , RB PP VVP RB VV WRB VVG NNS MD VV DT NNS	BackGround	GRelated	Negative
08-1044_4	In Hirschberg et al.'s ( ) analysis of two human-computer dialogue systems , misrecognized turns were found to have (on average) higher maximum pitch and energy than correctly recognized turns	IN NP NP NNS ( ) NN IN CD NN NN NNS , JJ NNS VBD VVN TO VH NN NN JJR NN NN CC NN IN RB VVN NNS	BackGround	GRelated	Neutral
08-1044_4	Hirschberg et al.'s ( ) work suggests that prosodic factors can impact error rates , but leaves open the question of which factors are important at the word level and how they influence recognition of natural conversational speech	NP FW NNS ( ) NN VVZ IN/that JJ NNS MD VV NN NNS , CC VVZ VV DT NN IN WDT NNS VBP JJ IN DT NN NN CC WRB PP VVP NN IN JJ JJ NN	BackGround	GRelated	Neutral
08-1044_6	Classes were identified using a POS tagger ( ) trained on the tagged Switchboard corpus	NNS VBD VVN VVG DT NP NN ( ) VVN IN DT VVN NN NN	Fundamental	Basis	Neutral
08-1044_7	We also see a tendency towards higher IWER for very slow speech , consistent with Shinozaki and Furui ( ) and Siegler and Stern ( )	PP RB VVP DT NN IN JJR NN IN RB JJ NN , JJ IN NP CC NP ( ) CC NP CC NP ( )	Compare	Compare	Positive
08-1044_8	For our analysis , we used the output from the SRI/ICSI/UW RT-04 CTS system ( ) on the NIST RT-03 development set	IN PP$ NN , PP VVD DT NN IN DT NP NP NP NN ( ) IN DT NP NP NN NN	Fundamental	Basis	Neutral
08-1045_0	The task of transliterating names (independent of end-to-end MT) has received a significant amount of research , e.g. , ( )	DT NN IN VVG NNS JJ IN NN NP VHZ VVN DT JJ NN IN NN , FW , ( )	BackGround	GRelated	Neutral
08-1045_5	Unlike various generative approaches ( ) , we do not synthesize an English spelling from scratch , but rather find a translation in very large lists of English words (3.4 million) and phrases (47 million)	IN JJ JJ NNS ( ) , PP VVP RB VV DT JJ NN IN NN , CC RB VV DT NN IN RB JJ NNS IN JJ NNS JJ NN CC NNS JJ NN	Compare	Compare	Neutral
08-1046_0	We follow previous work in using the Brent corpus , which consists of 9790 transcribed utterances (33,399 words) of child-directed speech from the Bernstein-Ratner corpus ( ) in the CHILDES database ( )	PP VVP JJ NN IN VVG DT NP NN , WDT VVZ IN CD VVN NNS CD NN IN JJ NN IN DT NP NN ( ) IN DT NP NN ( )	Fundamental	Basis	Neutral
08-1046_1	Second , we can generalize over arbitrary subtrees rather than local trees in much the way done in DOP or tree substitution grammar ( ) , which leads to adaptor grammars	RB , PP MD VV IN JJ NNS RB IN JJ NNS IN RB DT NN VVN IN NP CC NN NN NN ( ) , WDT VVZ TO NN NNS	BackGround	SRelated	Neutral
08-1046_2	PCFG estimation procedures have been used to model the supervised and unsupervised acquisition of syllable structure ( ); and the best performance in unsupervised acquisition is obtained using a grammar that encodes linguistically detailed properties of syllables whose rules are inferred using a fairly complex algorithm ( )	NP NN NNS VHP VBN VVN TO VV DT JJ CC JJ NN IN NN NN ( NN CC DT JJS NN IN JJ NN VBZ VVN VVG DT NN WDT VVZ RB JJ NNS IN NNS WP$ NNS VBP VVN VVG DT RB JJ NN ( )	BackGround	GRelated	Neutral
08-1046_2	Following Goldwater and Johnson ( ) , the grammar differentiates between OnsetI , which expands to word-initial onsets , and Onset (Figure 6: a parse of "what's this" produced by the unigram syllable adaptor grammar of Figure 7)	VVG NP CC NP ( ) , DT NN VVZ IN NP , WDT VVZ TO JJ NNS , CC NN NP CD DT NN IN JJ NN VVN IN DT NN NN NN NN IN NP CD	Fundamental	Idea	Neutral
08-1046_2	For example , the adaptor grammars for syllable structure presented in sections 3.3 and 3.6 learn more information about syllable onsets and codas than the PCFGs presented in Goldwater and Johnson ( )	IN NN , DT NN NNS IN NN NN VVN IN NNS CD CC CD VVP JJR NN IN NN NNS CC NNS IN DT NNS VVN IN NP CC NP ( )	BackGround	SRelated	Negative
08-1046_3	We evaluated the f-score of the recovered word constituents ( )	PP VVD DT NN IN DT VVN NN NNS ( )	Fundamental	Basis	Neutral
08-1046_3	Johnson et al. ( ) presented an adaptor grammar that defines a unigram model of word segmentation and showed that it performs as well as the unigram DP word segmentation model presented by ( )	NP NP NP ( ) VVD DT NN NN WDT VVZ DT NN NN IN NN NN CC VVD IN/that PP VVZ RB RB IN DT NN JJ NN NN NN VVN IN ( )	BackGround	GRelated	Neutral
08-1046_3	As reported in Goldwater et al. ( ) and Goldwater et al. ( ) , a unigram word segmentation model tends to undersegment and misanalyse collocations as individual words	IN VVN IN NP NP NP ( ) CC NP NP NP ( ) , DT NN NN NN NN VVZ TO NN CC NN NNS IN JJ NNS	BackGround	SRelated	Neutral
08-1046_3	based unsupervised morphological analysis model presented by Goldwater et al. ( )	VVN JJ JJ NN NN VVN IN NP NP NP ( )	NULL	NULL	NULL
08-1046_3	Goldwater et al. ( ) showed that modeling dependencies between adjacent words dramatically improves word segmentation accuracy	NP NP NP ( ) VVD IN/that VVG NNS IN JJ NNS RB VVZ NN NN NN	BackGround	GRelated	Positive
08-1046_3	It's straightforward to design an adaptor grammar that can capture a finite number of concatenative paradigm classes ( )	NP JJ TO VV DT NN NN WDT MD VV DT JJ NN IN JJ NN NNS ( )	BackGround	GRelated	Neutral
08-1046_6	It may be possible to adapt efficient split-merge samplers ( ) and Variational Bayes methods ( ) for DPs to adaptor grammars and other linguistic applications of HDPs	PP MD VB JJ TO VV JJ JJ NNS ( ) CC NP NP NNS ( ) IN NP TO NN NNS CC JJ JJ NNS IN NP	BackGround	MRelated	Positive
08-1046_7	(In fact , the inference procedure for adaptor grammars described in Johnson et al. ( ) relies on a PCFG approximation that contains a rule for each subtree generalization in the adaptor grammar)	NN NN , DT NN NN IN NN NNS VVN IN NP NP NP ( ) VVZ IN DT NP NN WDT VVZ DT NN IN DT NN NN IN DT NN NN	BackGround	SRelated	Neutral
08-1046_7	Adaptor grammars ( ) are a non-parametric Bayesian extension of Probabilistic Context-Free Grammars (PCFGs) which in effect learn the probabilities of entire subtrees	NN NNS ( ) VBP DT JJ NP NN IN NP NP NP NN WDT IN NN VVP DT NNS IN JJ NNS	NULL	NULL	NULL
08-1046_7	This section introduces adaptor grammars as an extension of PCFGs; for a more detailed exposition see Johnson et al. ( )	DT NN VVZ NN NNS IN DT NN IN NP IN DT RBR JJ NN VVP NP NP NP ( )	BackGround	SRelated	Neutral
08-1046_7	Johnson et al. ( ) describe an MCMC procedure for inferring the adapted tree distributions Ga , and Johnson et al. ( ) describe a Bayesian inference procedure for the PCFG rule parameters θ using a Metropolis-Hastings MCMC procedure; implementations are available from the author's web site	NP NP NP ( ) VV DT NP NN IN VVG DT VVN NN NNS NP , CC NP NP NP ( ) VV DT NP NN NN IN DT NP NN NNS SYM VVG DT NP NP NN NNS VBP JJ IN DT JJ NN NN	BackGround	GRelated	Neutral
08-1046_7	Johnson et al. ( ) presented an adaptor grammar for segmenting verbs into stems and suffixes that implements the DP-Sentence	NP NP NP ( ) VVD DT NN NN IN VVG NNS IN VVZ CC VVZ WDT VVZ DT NN	BackGround	GRelated	Neutral
08-1046_7	The MCMC sampler of Johnson et al. ( ) used here is satisfactory for small and medium-sized problems , but it would be very useful to have more efficient inference procedures	DT NP NN IN NP NP NP ( ) VVN RB VBZ JJ IN JJ CC JJ NNS , CC PP MD VB RB JJ TO VH JJR JJ NN NNS	BackGround	MRelated	Positive
08-1046_13	For example , the Bayesian unsupervised PCFG estimation procedure devised by Stolcke ( ) uses a model-merging procedure to propose new sets of PCFG rules and a Bayesian version of the EM procedure to estimate their weights	IN NN , DT NP JJ NP NN NN VVN IN NP ( ) VVZ DT VVG NN TO VV JJ NNS IN NP NNS CC DT NP NN IN DT JJ NN TO VV PP$ NNS	BackGround	GRelated	Neutral
08-1046_14	There are several different ways to define DPs; one of the most useful is the characterization of the conditional or sampling distribution of a draw from DP(a , H) in terms of the Polya urn or Chinese Restaurant Process ( )	EX VBP JJ JJ NNS TO VV NP CD IN DT RBS JJ VBZ DT NN IN DT JJ CC VVG NN IN DT NN IN NP , NP IN NNS IN DT NP NN CC JJ NP NP ( )	BackGround	GRelated	Positive
08-1046_14	Technically this grammar implements a Hierarchical Dirichlet Process (HDP) ( ) because the base distribution for the Word DP is itself constructed from the Stem and Suffix distributions , which are themselves generated by DPs	RB DT NN VVZ DT NP NP NP NN ( ) IN DT JJ NN IN DT NP NN VBZ PP VVN IN DT NP CC NP NNS , WDT VBP PP VVN IN NP	BackGround	SRelated	Neutral
08-1047_0	Asahara and Matsumoto ( ) proposed using characters instead of morphemes as the unit to alleviate the effect of segmentation errors in morphological analysis and we also used their character-based method	NP CC NP ( ) VVN VVG NNS RB IN NNS IN DT NN TO VV DT NN IN NN NNS IN JJ NN CC PP RB VVD PP$ JJ NN	Fundamental	Basis	Neutral
08-1047_0	Though there may be slight differences , these features are based on the standard ones proposed and used in previous studies on Japanese NER such as those by Asahara and Matsumoto ( ) , Nakano and Hirai ( ) , and Yamada ( )	IN EX MD VB JJ NNS , DT NNS VBP VVN IN DT JJ NNS VVN CC VVN IN JJ NNS IN JJ NN JJ IN DT IN NP CC NP ( ) , NP CC NP ( ) , CC NP ( )	BackGround	GRelated	Neutral
08-1047_2	In addition , the clustering methods used , such as HMMs and Brown's algorithm ( ) , seem unable to adequately capture the semantics of MNs since they are based only on the information of adjacent words	IN NN , DT VVG NNS VVN , JJ IN NP CC NP NN ( ) , VVP JJ TO RB VV DT NNS IN NP IN PP VBP VVN RB IN DT NN IN JJ NNS	BackGround	GRelated	Negative
08-1047_2	They constructed word clusters by using HMMs or Brown's clustering algorithm ( ) , which utilize only information from neighboring words	PP VVN NN NNS IN VVG NP CC NP VVG NN ( ) , WDT VV JJ NN IN JJ NNS	BackGround	GRelated	Neutral
08-1047_3	Chu et al. ( ) presented the MapReduce framework for a wide range of machine learning algorithms , including the EM algorithm	NP NP NP ( ) VVD DT NN NN IN DT JJ NN IN NN VVG NNS , VVG DT JJ NN	BackGround	GRelated	Neutral
08-1047_4	Since building and maintaining high-quality gazetteers by hand is very expensive , many methods have been proposed for automatic extraction of gazetteers from texts ( )	IN VVG CC VVG JJ NNS IN NN VBZ RB JJ , JJ NNS VHP VBN VVN IN JJ NN IN NNS IN NNS ( )	BackGround	GRelated	Neutral
08-1047_5	For example , we can use automatically extracted hyponymy relations ( ) , or automatically induced MN clusters ( )	IN NN , PP MD VV RB VVN JJ NNS ( ) , CC RB VVN JJ NNS ( )	BackGround	GRelated	Neutral
08-1047_6	Defining sentences in a dictionary or an encyclopedia have long been used as a source of hyponymy relations ( )	VVG NNS IN DT NN CC DT NN VHP RB VBN VVN IN DT NN IN JJ NNS ( )	BackGround	GRelated	Neutral
08-1047_7	Recently , Inui et al. ( ) investigated the relation between the size and the quality of a gazetteer and its effect	RB , NP NP NP ( ) VVD DT NN IN DT NN CC DT NN IN DT NN CC PP$ NN	BackGround	GRelated	Neutral
08-1047_8	For instance , Kazama and Torisawa ( ) used the hyponymy relations extracted from Wikipedia for the English NER , and reported improved accuracies with such a gazetteer	IN NN , NP CC NP ( ) VVN DT NN NNS VVN IN NP IN DT JJ NN , CC VVD VVN NNS IN PDT DT NN	BackGround	GRelated	Neutral
08-1047_8	We also compared the cluster gazetteers with the Wikipedia gazetteer constructed by following the method of ( )	PP RB VVD DT NN NNS IN DT NP NN VVN IN VVG DT NN IN ( )	Fundamental	Idea	Neutral
08-1047_8	Kazama and Torisawa ( ) extracted hyponymy relations from the first sentences (i.e. , defining sentences) of Wikipedia articles and then used them as a gazetteer for NER	NP CC NP ( ) VVN NN NNS IN DT JJ NNS JJ , VVG NN IN NP NNS CC RB VVD PP IN DT NN IN NP	BackGround	GRelated	Neutral
08-1047_8	Although this Wikipedia gazetteer is much smaller than the English version used by Kazama and Torisawa ( ) that has over 2,000,000 entries , it is the largest gazetteer that can be freely used for Japanese NER	IN DT NP NN VBZ RB JJR IN DT JJ NN VVN IN NP CC NP ( ) WDT VHZ RB CD NNS , PP VBZ DT JJS NN WDT MD VB RB VVN IN JJ NN	Compare	Compare	Positive
08-1047_8	We follow the method used by Kazama and Torisawa ( ) , which encodes the matching with a gazetteer entity using IOB tags , with the modification for Japanese	PP VVP DT NN VVN IN NP CC NP ( ) , WDT VVZ DT NN IN DT NN NN VVG NP NNS , IN DT NN IN JJ	Fundamental	Idea	Neutral
08-1047_9	In the context of tagging , there are several studies that utilized word clusters to prevent the data sparseness problem ( )	IN DT NN IN VVG , EX VBP JJ NNS WDT VVD NN NNS TO VV DT NNS NN NN ( )	BackGround	GRelated	Neutral
08-1047_9	Inducing features for taggers by clustering has been tried by several researchers ( )	VVG NNS IN NN IN VVG VHZ VBN VVN IN JJ NNS ( )	BackGround	GRelated	Neutral
08-1047_10	We used MeCab as a morphological analyzer and CaboCha 14 ( ) as the dependency parser to find the boundaries of the bunsetsu	PP VVD NP IN DT JJ NN CC NP CD ( ) IN DT NN NN TO VV DT NNS IN DT NN	Fundamental	Basis	Neutral
08-1047_11	The corpus we used for collecting dependencies was a large set (76 million) of Web documents that were processed by a dependency parser , KNP ( )	DT NN PP VVD IN VVG NNS VBD DT JJ NN NN NN IN NP NNS WDT VBD VVN IN DT NN NN , NP ( )	Fundamental	Basis	Neutral
08-1047_12	We use Conditional Random Fields (CRFs) ( ) to perform this tagging	PP VVP NP NP NP NN ( ) TO VV DT VVG	Fundamental	Basis	Neutral
08-1047_14	There are several studies that used automatically extracted gazetteers for NER ( )	EX VBP JJ NNS WDT VVD RB VVN NNS IN JJ ( )	BackGround	GRelated	Neutral
08-1047_14	Shinzato et al. ( ) constructed gazetteers with about 100,000 entries in total for the "restaurant" domain; Talukdar et al. ( ) used gazetteers with about 120,000 entries in total , and Nadeau et al. ( ) used gazetteers with about 85,000 entries in total	NP NP NP ( ) VVN NNS IN RB CD NNS IN NN IN DT JJ NN NP NP NP ( ) VVN NNS IN RB CD NNS IN NN , CC NP NP NP ( ) VVN NNS IN RB CD NNS IN NN	BackGround	GRelated	Neutral
08-1047_16	Newman et al. ( ) presented parallelized Latent Dirichlet Allocation (LDA)	NP NP NP ( ) VVN JJ NP NP NP NN	BackGround	GRelated	Neutral
08-1047_19	Rooth et al. ( ) and Torisawa ( ) showed that the EM-based clustering using verb-MN dependencies can produce semantically clean MN clusters	NP NP NP ( ) CC NP ( ) VVD IN/that DT JJ VVG VVG NN NNS MD VV RB VV JJ NNS	BackGround	GRelated	Neutral
08-1047_19	This study , on the other hand , utilized MN clustering based on verb-MN dependencies ( )	DT NN , IN DT JJ NN , VVD JJ VVG VVN IN JJ NNS ( )	Fundamental	Basis	Neutral
08-1047_20	Using models such as Semi-Markov CRFs ( ) , which handle the features on overlapping regions , is one possible direction	VVG NNS JJ IN NP NP ( ) , WDT VVP DT NNS IN JJ NNS , VBZ CD JJ NN	BackGround	MRelated	Neutral
08-1047_21	The system recently proposed by Sasano and Kurohashi ( ) is currently the best system for the IREX dataset	DT NN RB VVN IN NP CC NP ( ) VBZ RB DT JJS NN IN DT NP NN	BackGround	SRelated	Positive
08-1047_22	In our experiments , we used the IREX dataset ( ) to demonstrate the usefulness of cluster gazetteers	IN PP$ NNS , PP VVD DT NP NN ( ) TO VV DT NN IN NN NNS	Fundamental	Basis	Neutral
08-1047_27	We parallelized the algorithm of ( ) using the Message Passing Interface (MPI) , with the prime goal being to distribute parameters and thus enable clustering with a large vocabulary	PP VVD DT NN IN ( ) VVG DT NP NP NP NN , IN DT JJ NN VBG TO VV NNS CC RB VV VVG IN DT JJ NN	Fundamental	Basis	Neutral
08-1047_27	To learn p(n\c) and p(c) for Japanese , we use the EM-based clustering method presented by Torisawa ( )	TO VV NN CC NN IN JJ , PP VVP DT JJ VVG NN VVN IN NP ( )	Fundamental	Basis	Neutral
08-1047_29	The exception , which we noticed recently , is a study by Wolfe et al. ( ) , which describes how each node stores only those parameters relevant to the training data on each node	DT NN , WDT PP VVD RB , VBZ DT NN IN NP NP NP ( ) , WDT VVZ WRB DT NN NNS RB DT NNS JJ TO DT NN NNS IN DT NN	BackGround	SRelated	Neutral
08-1048_0	We consider 10 measures , noted in the table as J&C ( ) , Resnik ( ) , Lin ( ) , W&P ( ) , L&C ( ) , H&SO ( ) , Path (counts edges between synsets) , Lesk ( ) , and finally Vector and Vector Pair ( )	PP VVP CD NNS , VVD IN DT NN IN NP ( ) , NP ( ) , NP ( ) , NP ( ) , NP ( ) , NP ( ) , NP NNS NNS IN NN , NP ( ) , CC RB NP CC NP NP ( )	Fundamental	Basis	Neutral
08-1048_2	These kinds of measurements can help with problems such as identifying relevant sentences for extractive text summarization , or possibly paraphrase identification ( )	DT NNS IN NNS MD VV IN NNS JJ IN VVG JJ NNS IN JJ NN NN , CC RB VVP NN ( )	BackGround	MRelated	Neutral
08-1048_4	We use WordNet 3.0 , the latest version ( )	PP VVP NP CD , DT JJS NN ( )	Fundamental	Basis	Neutral
08-1048_5	The 1911 and 1987 Thesauri were compared with WordNet 3.0 on the three data sets containing pairs of words with manually assigned similarity scores: 30 pairs ( ) , 65 pairs ( ) and 353 pairs 3 ( )	DT CD CC CD NP VBD VVN IN NP CD IN DT CD NNS VVZ VVG NNS IN NNS IN RB VVN NN NN CD NNS ( ) , CD NNS ( ) CC CD NNS CD ( )	Compare	Compare	Neutral
08-1048_5	Even on the largest set ( ) , however , the differences between Roget's Thesaurus and the Vector method are not statistically significant at the p < 0.05 level for either thesaurus on a two-tailed test 4	RB IN DT JJS NN ( ) , RB , DT NNS IN NP NP CC DT NP NN VBP RB RB JJ IN DT NN SYM CD NN IN DT NN IN DT JJ NN CD	Compare	Compare	Neutral
08-1048_7	Other methods of determining sentence semantic relatedness expand term relatedness functions to create a sentence relatedness function ( )	JJ NNS IN VVG NN JJ NN VV NN NN NNS TO VV DT NN NN NN ( )	BackGround	GRelated	Neutral
08-1048_7	In ( ) , an even better system was proposed , with a correlation of 0.853	IN ( ) , DT RB JJR NN VBD VVN , IN DT NN IN CD	BackGround	GRelated	Positive
08-1048_8	Lexical chains have also been developed using the 1987 Roget's Thesaurus ( )	JJ NNS VHP RB VBN VVN VVG DT CD NP NP ( )	BackGround	GRelated	Neutral
08-1048_9	Kennedy and Szpakowicz ( ) show how disambiguating one of these relations , hypernymy , can help improve the semantic similarity functions in ( )	NP CC NP ( ) VV WRB VVG CD IN DT NNS , NN , MD VV VV DT JJ NN NNS IN ( )	BackGround	GRelated	Neutral
08-1048_9	Two terms in the same semicolon group score 16 , in the same paragraph - 14 , and so on ( )	CD NNS IN DT JJ NN NN NN CD , IN DT JJ NN : CD , CC RB IN ( )	BackGround	SRelated	Neutral
08-1048_9	We used the system from ( ) for identifying synonyms with Roget's	PP VVD DT NN IN ( ) IN VVG NNS IN NP	Fundamental	Basis	Neutral
08-1048_12	The 1987 data come from Penguin's Roget's Thesaurus ( )	DT CD NNS VVN IN NP NP NP ( )	Fundamental	Basis	Neutral
08-1048_13	We used three data sets for this application: 80 questions taken from the Test of English as a Foreign Language (TOEFL) ( ) , 50 questions - from the English as a Second Language test (ESL) ( ) and 300 questions - from the Reader's Digest Word Power Game (RDWP) ( )	PP VVD CD NN NNS IN DT JJ CD NNS VVN IN DT NN IN NP IN DT NP NP NN ( ) , CD NNS : IN DT NP IN DT NP NP NN NN ( ) CC CD NNS : IN DT NP NP NP NP NP NN ( )	Fundamental	Basis	Neutral
08-1048_16	We also proposed a new method of representing the meaning of sentences or other short texts using either WordNet or Roget's Thesaurus , and tested it on the data set provided by Li et al. ( )	PP RB VVD DT JJ NN IN VVG DT NN IN NNS CC JJ JJ NNS VVG DT NP CC NP NP , CC VVD PP IN DT NNS VVD VVN IN NP NP NP ( )	Fundamental	Basis	Neutral
08-1048_16	We worked with a data set from ( ) 	PP VVD IN DT NNS VVN IN ( )	Fundamental	Basis	Neutral
08-1048_16	For the system in ( ) , where this data set was first introduced , a correlation of 0.816 with the human annotators was achieved	IN DT NN IN ( ) , WRB DT NNS VVN VBD RB VVN , DT NN IN CD IN DT JJ NNS VBD VVN	BackGround	SRelated	Neutral
08-1048_19	On the ( ) and ( ) data sets the best system did not show a statistically significant improvement over the 1911 or 1987 Roget's Thesauri , even at p < 0.1 for a two-tailed test	IN DT ( ) CC ( ) NN VVZ DT JJS NN VVD RB VV DT RB JJ NN IN DT CD CC CD NP NP , RB IN NN SYM CD IN DT JJ NN	Compare	Compare	Neutral
08-1048_19	Much like ( ) , the data set used here is not large enough to determine if any system's improvement is statistically significant	RB IN ( ) , DT NNS VVD VVN RB VBZ RB JJ RB TO VV IN DT JJ NN VBZ RB JJ	Fundamental	Idea	Neutral
08-1048_20	Some work has been done on adding new terms and relations to WordNet ( ) and FACTOTUM ( )	DT NN VHZ VBN VVN IN VVG JJ NNS CC NNS TO NP ( ) CC NP ( )	BackGround	GRelated	Neutral
08-1048_22	We used Pedersen's Semantic Distance software package ( )	PP VVD NP NP NP NN NN ( )	Fundamental	Basis	Neutral
08-1048_25	They took a subset of the term pairs from ( ) and chose sentences to represent these terms; the sentences are definitions from the Collins Cobuild dictionary ( )	PP VVD DT NN IN DT NN NNS IN ( ) CC VVD NNS TO VV DT NN DT NNS VBP NNS IN DT NP NP NN ( )	BackGround	SRelated	Neutral
08-1049_0	The idea of using a bridge (i.e. , full-form) to obtain translation entries for unseen words (i.e. , abbreviation) is similar to the idea of using paraphrases in MT (see Callison-Burch et al. ( ) and references therein) as both are trying to introduce generalization into MT	DT NN IN VVG DT NN NN , JJ TO VV NN NNS IN JJ NNS JJ , NN VBZ JJ TO DT NN IN VVG NNS IN NP NP NP NP NP ( ) CC NNS RB IN DT VBP VVG TO VV NN IN NP	Fundamental	Idea	Neutral
08-1049_1	On the other hand , integrating an additional component into a baseline SMT system is notoriously tricky as evident in the research on integrating word sense disambiguation (WSD) into SMT systems: different ways of integration lead to conflicting conclusions on whether WSD helps MT performance ( )	IN DT JJ NN , VVG DT JJ NN IN DT JJ NP NN VBZ RB JJ IN JJ IN DT NN IN VVG NN NN NN NN IN NP JJ JJ NNS IN NN NN TO JJ NNS IN IN NP VVZ NP NN ( )	BackGround	GRelated	Neutral
08-1049_3	According to Chang and Lai ( ) , approximately 20% of sentences in a typical news article have abbreviated words in them	VVG TO NP CC NP ( ) , RB CD IN NNS IN DT JJ NN NN VHP VVN NNS IN PP	BackGround	GRelated	Neutral
08-1049_3	To create the baseline , we make use of the dominant abbreviation patterns shown in Table 5 , which have been reported in Chang and Lai ( )	TO VV DT NN , PP VVP NN IN DT JJ NN NNS VVN IN NP CD , WDT VHP VBN VVN IN NP CC NP ( )	Fundamental	Basis	Neutral
08-1049_3	For the statistics on manually collected examples , please refer to Chang and Lai ( )	IN DT NNS IN RB VVN NNS , VV VV TO NP CC NP ( )	BackGround	SRelated	Neutral
08-1049_3	Recently , Chang and Lai ( ) , Chang and Teng ( ) , and Lee ( ) have investigated this task	RB , NP CC NP ( ) , NP CC NP ( ) , CC NP ( ) VHP VVN DT NN	BackGround	GRelated	Neutral
08-1049_4	To identify their abbreviations , one can employ an HMM model ( )	TO VV PP$ NNS , PP MD VV DT NP NN ( )	BackGround	GRelated	Neutral
08-1049_4	In comparison , Chang and Teng ( ) reports a precision of 50% over relations between single-word full-forms and single-character abbreviations	IN NN , NP CC NP ( ) VVZ DT NN IN CD IN NNS IN NN NNS CC NN NNS	Compare	Compare	Neutral
08-1049_5	To handle different directions of translation between Chinese and English , we built two tri-gram language models with modified Kneser-Ney smoothing ( ) using the SRILM toolkit ( )	TO VV JJ NNS IN NN IN NP CC NP , PP VVD CD NN NN NNS IN JJ NP VVG ( ) VVG DT NP NN ( )	Fundamental	Basis	Neutral
08-1049_6	While the research in statistical machine translation (SMT) has made significant progress , most SMT systems ( ) rely on parallel corpora to extract translation entries	IN DT NN IN JJ NN NN NN VHZ VVN JJ NN , JJS NP NNS ( ) VV IN JJ NNS TO VV NN NNS	BackGround	GRelated	Negative
08-1049_6	However , since most of statistical translation models ( ) are symmetrical , it is relatively easy to train a translation system to translate from English to Chinese , except that we need to train a Chinese language model from the Chinese monolingual data	RB , IN JJS IN JJ NN NNS ( ) VBP JJ , PP VBZ RB JJ TO VV DT NN NN TO VV IN NP TO NP , IN WDT PP VVP TO VV DT JJ NN NN IN DT JJ JJ NNS	BackGround	GRelated	Neutral
08-1049_8	We carry out experiments on a state-of-the-art SMT system , i.e. , Moses ( ) , and show that the abbreviation translations consistently improve the translation performance (in terms of BLEU ( )) on various NIST MT test sets	PP VVP RP NNS IN DT JJ NP NN , FW , NP ( ) , CC VVP IN/that DT NN NNS RB VV DT NN NN NN NNS IN NP ( NN IN JJ NP NP NN NNS	Fundamental	Basis	Positive
08-1049_8	We integrate our method into a state-of-the-art phrase-based baseline translation system , i.e. , Moses ( ) , and show that the integrated system consistently improves the performance of the baseline system on various NIST machine translation test sets	PP VV PP$ NN IN DT JJ JJ NN NN NN , FW , NP ( ) , CC VVP IN/that DT JJ NN RB VVZ DT NN IN DT JJ NN IN JJ NP NN NN NN NNS	Fundamental	Basis	Positive
08-1049_10	In general , Chinese abbreviations are formed based on three major methods: reduction , elimination and generalization ( )	IN JJ , JJ NNS VBP VVN VVN IN CD JJ NN NN , NN CC NN ( )	BackGround	GRelated	Neutral
08-1049_10	Lee ( ) gives a summary about how Chinese abbreviations are formed and presents many examples	NP ( ) VVZ DT NN IN WRB JJ NNS VBP VVN CC VVZ JJ NNS	BackGround	GRelated	Neutral
08-1049_11	At last , the goal that we aim to exploit monolingual corpora to help MT is in-spirit similar to the goal of using non-parallel corpora to help MT as aimed in a large amount of work (see Munteanu and Marcu ( ) and references therein)	IN JJ , DT NN IN/that PP VVP TO VV JJ NNS TO VV NP VBZ NN JJ TO DT NN IN VVG NN NNS TO VV NP IN VVN IN DT JJ NN IN NN NN NP CC NP ( ) CC NNS RB	Fundamental	Idea	Neutral
08-1049_12	Moreover , our approach integrates the abbreviation translation component into the baseline system in a natural way , and thus is able to make use of the minimum-error-rate training ( ) to automatically adjust the model parameters to reflect the change of the integrated system over the baseline system	RB , PP$ NN VVZ DT NN NN NN IN DT JJ NN IN DT JJ NN , CC RB VBZ JJ TO VV NN IN DT NN NN ( ) TO RB VV DT NN NNS TO VV DT NN IN DT JJ NN IN DT NN NN	Fundamental	Basis	Neutral
08-1049_12	Once we obtain the augmented phrase table , we should run the minimum-error-rate training ( ) with the augmented phrase table such that the model parameters are properly adjusted	RB PP VV DT JJ NN NN , PP MD VV DT NN NN ( ) IN DT JJ NN NN JJ IN/that DT NN NNS VBP RB VVN	Fundamental	Basis	Neutral
08-1049_12	The feature functions are combined under a log-linear framework , and the weights are tuned by the minimum-error-rate training ( ) using BLEU ( ) as the optimization metric	DT NN NNS VBP VVN IN DT JJ NN , CC DT NNS VBP VVN IN DT NN NN ( ) VVG NP ( ) IN DT NN JJ	Fundamental	Basis	Neutral
08-1049_12	We use MT02 as the development set 4 for minimum error rate training (MERT) ( )	PP VVP NP IN DT NN VVD CD IN JJ NN NN NN NN ( )	Fundamental	Basis	Neutral
08-1049_13	Using the toolkit Moses ( ) , we built a phrase-based baseline system by following the standard procedure: running GIZA++ ( ) in both directions , applying refinement rules to obtain a many-to-many word alignment , and then extracting and scoring phrases using heuristics ( )	VVG DT NN NN ( ) , PP VVD DT JJ NN NN IN VVG DT JJ NN VVG NP ( ) IN DT NNS , VVG NN NNS TO VV DT NN NN NN , CC RB VVG CC VVG NNS VVG NNS ( )	Fundamental	Basis	Neutral
08-1049_15	The MT performance is measured by lower-case 4-gram BLEU ( )	DT NP NN VBZ VVN IN JJ NP NP ( )	Fundamental	Basis	Neutral
08-1050_0	Following studies on automatic SCF extraction ( ) , we apply a statistical test (Binomial Hypothesis Test) to the unfiltered-Levin-SCF to filter out noisy SCFs , and denote the resulting SCF set as filtered-Levin-SCF	VVG NNS IN JJ NP NN ( ) , PP VVP DT JJ NN JJ NP NP TO DT NN TO NN IN JJ NNS , CC VV DT VVG NP VVN IN NN	Fundamental	Idea	Neutral
08-1050_1	It is therefore unsurprising that much work on verb classification has adopted them as features ( )	PP VBZ RB VVG DT JJ NN IN NN NN VHZ VVN PP IN NNS ( )	BackGround	GRelated	Neutral
08-1050_1	However , some of the functions words , prepositions in particular , are known to carry great amount of syntactic information that is related to lexical meanings of verbs ( )	RB , DT IN DT NNS NNS , NNS IN JJ , VBP VVN TO VV JJ NN IN JJ NN WDT VBZ VVN TO JJ NNS IN NNS ( )	BackGround	GRelated	Neutral
08-1050_2	One way to avoid these high-dimensional spaces is to assume that most of the features are irrelevant , an assumption adopted by many of the previous studies working with high-dimensional semantic spaces ( )	CD NN TO VV DT JJ NNS VBZ TO VV IN/that JJS IN DT NNS VBP JJ , DT NN VVN IN JJ IN DT JJ NNS VVG IN JJ JJ NNS ( )	BackGround	GRelated	Neutral
08-1050_4	SCF and DR: These more linguistically informed features are constructed based on the grammatical relations generated by the C&C CCG parser ( )	NP CC NP DT RBR RB VVN NNS VBP VVN VVN IN DT JJ NNS VVN IN DT NP NP NN ( )	Fundamental	Basis	Neutral
08-1050_5	Many scholars hypothesize that the behavior of a verb , particularly with respect to the expression of arguments and the assignment of semantic roles is to a large extent driven by deep semantic regularities ( )	JJ NNS VVP IN/that DT NN IN DT NN , RB IN NN TO DT NN IN NNS CC DT NN IN JJ NNS VBZ TO DT JJ NN VVN IN JJ JJ NNS ( )	BackGround	GRelated	Neutral
08-1050_6	When the information about a verb type is not available or sufficient for us to draw firm conclusions about its usage , the information about the class to which the verb type belongs can compensate for it , addressing the pervasive problem of data sparsity in a wide range of NLP tasks , such as automatic extraction of subcategorization frames ( ) , semantic role labeling ( ) , natural language generation for machine translation ( ) , and deriving predominant verb senses from unlabeled data ( )	WRB DT NN IN DT NN NN VBZ RB JJ CC JJ IN PP TO VV NN NNS IN PP$ NN , DT NN IN DT NN TO WDT DT NN NN VVZ MD VV IN PP , VVG DT JJ NN IN NN NN IN DT JJ NN IN NP NNS , JJ IN JJ NN IN NN NNS ( ) , JJ NN VVG ( ) , JJ NN NN IN NN NN ( ) , CC VVG JJ NN NNS IN JJ NNS ( )	BackGround	GRelated	Neutral
08-1050_10	Although the problem of data sparsity is alleviated to certain extent (3) , these features do not generally improve classification performance ( )	IN DT NN IN NN NN VBZ VVN TO JJ NN NN , DT NNS VVP RB RB VV NN NN ( )	BackGround	GRelated	Neutral
08-1050_11	Other methods for combining syntactic information with lexical information have also been attempted ( )	JJ NNS IN VVG JJ NN IN JJ NN VHP RB VBN VVN ( )	BackGround	GRelated	Neutral
08-1050_11	Joanis et al. ( ) demonstrates that the general feature space they devise achieves a rate of error reduction ranging from 48% to 88% over a chance baseline accuracy , across classification tasks of varying difficulty	NP NP NP ( ) VVZ IN/that DT JJ NN NN PP VVP VVZ DT NN IN NN NN VVG IN CD TO CD IN DT NN NN NN , IN NN NNS IN VVG NN	BackGround	SRelated	Neutral
08-1050_11	JOANIS07: We use the feature set proposed in Joanis et al. ( ) , which consists of 224 features	NP PP VVP DT NN NN VVN IN NP NP NP ( ) , WDT VVZ IN CD NNS	Fundamental	Basis	Neutral
08-1050_11	For example , Schulte im Walde ( ) uses 153 verbs in 30 classes , and Joanis et al. ( ) takes on 835 verbs and 15 verb classes	IN NN , NP NP NP ( ) VVZ CD NNS IN CD NNS , CC NP NP NP ( ) VVZ IN CD NNS CC CD NN NNS	BackGround	SRelated	Neutral
08-1050_13	Although there exist several manually-created verb lexicons or ontologies , including Levin's verb taxonomy , VerbNet , and FrameNet , automatic verb classification (AVC) is still necessary for extending existing lexicons ( ) , building and tuning lexical information specific to different domains ( ) , and bootstrapping verb lexicons for new languages ( )	IN EX VV JJ JJ NN NNS CC NNS , VVG NP NN NN , NP , CC NP , JJ NN NN NN VBZ RB JJ IN VVG JJ NNS ( ) , VVG CC VVG JJ NN JJ TO JJ NNS ( ) , CC VVG NN NNS IN JJ NNS ( )	BackGround	GRelated	Neutral
08-1050_18	Although dependency relations have been widely used in automatic acquisition of lexical information , such as detection of polysemy ( ) and WSD ( ) , their utility in AVC still remains untested	IN NN NNS VHP VBN RB VVN IN JJ NN IN JJ NN , JJ IN NN IN NN ( ) CC NP ( ) , PP$ NN IN NP RB VVZ JJ	BackGround	GRelated	Negative
08-1050_19	The software performs the so-called 1-of-k classification ( )	DT NN VVZ DT JJ NN NN ( )	Fundamental	Basis	Neutral
08-1050_22	We also lemmatize each word using the English lemmatizer as described in Minnen et al. ( ) , and use lemmas as features instead of words	PP RB VVP DT NN VVG DT JJ NN IN VVN IN NP NP NP ( ) , CC VV NNS IN NNS RB IN NNS	Fundamental	Basis	Neutral
08-1050_24	Co-occurrence (CO): CO features mostly convey lexical information only and are generally considered not particularly sensitive to argument structures ( )	NP NP NP VVZ RB VV JJ NN RB CC VBP RB VVN RB RB JJ TO NN NNS ( )	BackGround	SRelated	Neutral
08-1050_24	In order to reduce undue influence of outlier features , we employ the four normalization strategies in table 4 , which help reduce the range of extreme values while having little effect on others ( )	IN NN TO VV JJ NN IN NN NNS , PP VVP DT CD NN NNS IN NN CD , WDT VVP VV DT NN IN JJ NNS IN VHG JJ NN IN NNS ( )	Fundamental	Basis	Neutral
08-1050_25	Trying to overcome the problem of data sparsity , Schulte im Walde ( ) explores the additional use of selectional preference features by augmenting each syntactic slot with the concept to which its head noun belongs in an ontology (e.g.WordNet)	VVG TO VV DT NN IN NN NN , NP NP NP ( ) VVZ DT JJ NN IN JJ NN NNS IN VVG DT JJ NN IN DT NN TO WDT PP$ NN NN VVZ IN DT NN NN	BackGround	GRelated	Neutral