Takehito Utsuro


2022

pdf
Detecting Causes of Stock Price Rise and Decline by Machine Reading Comprehension with BERT
Gakuto Tsutsumi | Takehito Utsuro
Proceedings of the 4th Financial Narrative Processing Workshop @LREC2022

In this paper, we focus on news reported when stock prices fluctuate significantly. Such news is a very useful source of information on what factors cause stock prices to change. However, because news is manually produced, not all events that cause stock prices to change are necessarily reported. Thus, in order to provide investors with information on the causes of stock price changes, it is necessary to develop a system that collects information from the Internet on events that could be closely related to the stock price changes of certain companies. As a first step towards developing such a system, this paper employs a BERT-based machine reading comprehension model that extracts causes of stock price rise and decline from news reports on stock price changes. In the evaluation, the approach of using the title of the article as the question for machine reading comprehension performs well. We show that the fine-tuned machine reading comprehension model successfully detects additional causes of stock price rise and decline beyond those stated in the title of the article.

pdf
Speaker Identification of Quotes in Japanese Novels based on Gender Classification Model by BERT
Yuki Zenimoto | Takehito Utsuro
Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation

pdf
Developing and Evaluating a Dataset for How-to Tip Machine Reading at Scale
Fuzhu Zhu | Shuting Bai | Tingxuan Li | Takehito Utsuro
Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation

pdf
Tweet Review Mining focusing on Celebrities by Machine Reading Comprehension based on BERT
Yuta Nozaki | Kotoe Sugawara | Yuki Zenimoto | Takehito Utsuro
Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation

2020

pdf
Automatic Annotation of Werewolf Game Corpus with Players Revealing Oneselves as Seer/Medium and Divination/Medium Results
Youchao Lin | Miho Kasamatsu | Tengyang Chen | Takuya Fujita | Huanjin Deng | Takehito Utsuro
Workshop on Games and Natural Language Processing

While playing the communication game “Are You a Werewolf”, a player constantly guesses other players’ roles through discussion, based on his or her own role and other players’ crucial utterances. The underlying goal of this paper is to construct an agent that can analyze the participating players’ utterances and play the werewolf game as if it were a human. As a step towards this goal, this paper studies how to accumulate werewolf game log data annotated with players revealing themselves as seer/medium, the acts of divination and medium, and declarations of the divination and medium results. We divide the whole task into four subtasks, apply CNN/SVM classifiers to each subtask, and evaluate their performance.

pdf
MRC Examples Answerable by BERT without a Question Are Less Effective in MRC Model Training
Hongyu Li | Tengyang Chen | Shuting Bai | Takehito Utsuro | Yasuhide Kawada
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: Student Research Workshop

Models developed for Machine Reading Comprehension (MRC) are asked to predict an answer from a question and its related context. However, there exist cases that can be correctly answered by a BERT-based MRC model when only the context is provided, without the question. In this paper, such examples are referred to as “easy to answer”, while the others are referred to as “hard to answer”, i.e., unanswerable by a BERT-based MRC model without the question. Based on this classification, we propose a BERT-based method that splits the training examples of the MRC dataset SQuAD1.1 into those that are “easy to answer” and those that are “hard to answer”. Experimental evaluation comparing two models, one trained only with “easy to answer” examples and the other only with “hard to answer” examples, demonstrates that the latter outperforms the former.

pdf
Developing a How-to Tip Machine Comprehension Dataset and its Evaluation in Machine Comprehension by BERT
Tengyang Chen | Hongyu Li | Miho Kasamatsu | Takehito Utsuro | Yasuhide Kawada
Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER)

In the field of factoid question answering (QA), it is known that state-of-the-art technology has achieved accuracy comparable to that of humans on certain benchmarks. In the area of non-factoid QA, on the other hand, there is still only a limited number of datasets for training QA models, i.e., machine comprehension models. Considering this situation, this paper develops a dataset for training Japanese how-to tip QA models. We apply one of the state-of-the-art machine comprehension models to the Japanese how-to tip QA dataset, and compare the trained how-to tip QA model with a factoid QA model trained on a Japanese factoid QA dataset. Evaluation results reveal that how-to tip machine comprehension performance was almost comparable to that of factoid machine comprehension, even with the training data reduced to around 4% of the size used for factoid machine comprehension. Thus, the how-to tip machine comprehension task requires much less training data than the factoid machine comprehension task.

pdf
Integrating Disfluency-based and Prosodic Features with Acoustics in Automatic Fluency Evaluation of Spontaneous Speech
Huaijin Deng | Youchao Lin | Takehito Utsuro | Akio Kobayashi | Hiromitsu Nishizaki | Junichi Hoshino
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper describes automatic fluency evaluation of spontaneous speech. In this task, we integrate diverse features: acoustic, prosodic, and disfluency-based. We then attempt to reveal the contribution of each of these features to the task. Although a variety of disfluencies are observed regularly in spontaneous speech, we focus on two types of phenomena: filled pauses and word fragments. The experimental results demonstrate that the disfluency-based features derived from word fragments and filled pauses are effective for evaluating fluent/disfluent speech, especially when combined with prosodic features such as speech rate and pauses/silence. Next, we employ an LSTM-based framework to integrate the disfluency-based and prosodic features with time-sequential acoustic features. The evaluation results for these integrated features indicate that time-sequential acoustic features improve the model with disfluency-based and prosodic features when detecting fluent speech, but not when detecting disfluent speech. Furthermore, when detecting disfluent speech, the model without time-sequential acoustic features performs best even without word fragment features, using only filled pauses and prosodic features.

pdf
University of Tsukuba’s Machine Translation System for IWSLT20 Open Domain Translation Task
Hongyi Cui | Yizhen Wei | Shohei Iida | Takehito Utsuro | Masaaki Nagata
Proceedings of the 17th International Conference on Spoken Language Translation

In this paper, we introduce the University of Tsukuba’s submission to the IWSLT20 Open Domain Translation Task. We participated in both the Chinese→Japanese and Japanese→Chinese directions. For both directions, our machine translation systems are based on the Transformer architecture, and several techniques are integrated to boost performance: data filtering, large-scale noised training, model ensembling, reranking, and postprocessing. Consequently, our systems achieve BLEU scores of 33.0 for Chinese→Japanese translation and 32.3 for Japanese→Chinese translation.

pdf
Text Mining of Evidence on Infants’ Developmental Stages for Developmental Order Acquisition from Picture Book Reviews
Miho Kasamatsu | Takehito Utsuro | Yu Saito | Yumiko Ishikawa
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation

2019

pdf
Mixed Multi-Head Self-Attention for Neural Machine Translation
Hongyi Cui | Shohei Iida | Po-Hsuan Hung | Takehito Utsuro | Masaaki Nagata
Proceedings of the 3rd Workshop on Neural Generation and Translation

Recently, the Transformer has become a state-of-the-art architecture in the field of neural machine translation (NMT). A key to its high performance is multi-head self-attention, which is supposed to allow the model to independently attend to information from different representation subspaces. However, there is no explicit mechanism to ensure that different attention heads indeed capture different features, and in practice, redundancy occurs across heads. In this paper, we argue that using the same global attention in multiple heads limits multi-head self-attention’s capacity for learning distinct features. In order to improve the expressiveness of multi-head self-attention, we propose a novel Mixed Multi-Head Self-Attention (MMA) that models not only global and local attention but also forward and backward attention in different attention heads. This enables the model to learn distinct representations explicitly across heads. In experiments on both the WAT17 English-Japanese and the IWSLT14 German-English translation tasks, we show that, without increasing the number of parameters, our models yield consistent and significant improvements (0.9 BLEU points on average) over the strong Transformer baseline.
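The head-specific attention patterns described above (global, local, forward, backward) can be illustrated with additive attention masks applied to an otherwise identical scaled dot-product attention. The following is a minimal NumPy sketch, not the authors' implementation; the window size and the exact mask construction are assumptions for illustration:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def masked_self_attention(q, k, v, mask):
    # Scaled dot-product attention; a large negative mask entry blocks a position.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d) + mask
    return softmax(scores) @ v

def make_masks(n, window=1):
    """Return [global, local, forward, backward] additive masks for length n."""
    neg = -1e9
    idx = np.arange(n)
    global_m = np.zeros((n, n))  # every position may attend everywhere
    local_m = np.where(np.abs(idx[:, None] - idx[None, :]) <= window, 0.0, neg)
    forward_m = np.where(idx[None, :] >= idx[:, None], 0.0, neg)   # self and later tokens
    backward_m = np.where(idx[None, :] <= idx[:, None], 0.0, neg)  # self and earlier tokens
    return [global_m, local_m, forward_m, backward_m]
```

In a mixed multi-head layer, each head would simply be assigned one of these masks, so that distinct heads are forced toward distinct attention patterns without any additional parameters.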

pdf
Attention over Heads: A Multi-Hop Attention for Neural Machine Translation
Shohei Iida | Ryuichiro Kimura | Hongyi Cui | Po-Hsuan Hung | Takehito Utsuro | Masaaki Nagata
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

In this paper, we propose a multi-hop attention for the Transformer. It refines the attention for an output symbol by integrating that of each head, and consists of two hops. The first hop attention is the scaled dot-product attention, the same attention mechanism used in the original Transformer. The second hop attention is a combination of multi-layer perceptron (MLP) attention and a head gate, which efficiently increases the complexity of the model by adding dependencies between heads. We demonstrate that the translation accuracy of the proposed multi-hop attention significantly outperforms the baseline Transformer, by +0.85 BLEU points on the IWSLT-2017 German-to-English task and +2.58 BLEU points on the WMT-2017 German-to-English task. We also find that multi-hop attention requires fewer parameters than stacking another self-attention layer, and that the proposed model converges significantly faster than the original Transformer.

pdf
Selecting Informative Context Sentence by Forced Back-Translation
Ryuichiro Kimura | Shohei Iida | Hongyi Cui | Po-Hsuan Hung | Takehito Utsuro | Masaaki Nagata
Proceedings of Machine Translation Summit XVII: Research Track

pdf bib
Proceedings of the 8th Workshop on Patent and Scientific Literature Translation
Takehito Utsuro | Katsuhito Sudoh | Takashi Tsunakawa
Proceedings of the 8th Workshop on Patent and Scientific Literature Translation

pdf
A Multi-Hop Attention for RNN based Neural Machine Translation
Shohei Iida | Ryuichiro Kimura | Hongyi Cui | Po-Hsuan Hung | Takehito Utsuro | Masaaki Nagata
Proceedings of the 8th Workshop on Patent and Scientific Literature Translation

2018

pdf
Measuring Beginner Friendliness of Japanese Web Pages explaining Academic Concepts by Integrating Neural Image Feature and Text Features
Hayato Shiokawa | Kota Kawaguchi | Bingcai Han | Takehito Utsuro | Yasuhide Kawada | Masaharu Yoshioka | Noriko Kando
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications

Search engines are an important tool for modern academic study, but their results lack any measure of beginner friendliness. In order to improve the efficiency of using search engines for academic study, it is necessary to develop a technique for measuring the beginner friendliness of a Web page explaining academic concepts and to build an automatic measurement system. This paper studies how to integrate heterogeneous features, such as a neural image feature generated from an image of the Web page by a variant of CNN (convolutional neural network), as well as text features extracted from the body text of the page's HTML file. Integration is performed within the framework of SVM classifier learning. Evaluation results show that the heterogeneous features perform better than each individual type of feature.

2017

pdf
Neural Machine Translation Model with a Large Vocabulary Selected by Branching Entropy
Zi Long | Ryuichiro Kimura | Takehito Utsuro | Tomoharu Mitsuhashi | Mikio Yamamoto
Proceedings of Machine Translation Summit XVI: Research Track

pdf
Patent NMT integrated with Large Vocabulary Phrase Translation by SMT at WAT 2017
Zi Long | Ryuichiro Kimura | Takehito Utsuro | Tomoharu Mitsuhashi | Mikio Yamamoto
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

Neural machine translation (NMT) cannot handle a large vocabulary because training complexity and decoding complexity increase proportionally with the number of target words. This problem becomes even more serious when translating patent documents, which contain many technical terms that are observed infrequently. Long et al. (2017) proposed selecting phrases that contain out-of-vocabulary words using the statistical measure of branching entropy. The selected phrases are replaced with tokens during training and post-translated using the phrase translation table of SMT. In this paper, we apply the method of Long et al. (2017) to the WAT 2017 Japanese-Chinese and Japanese-English patent datasets. Evaluation on Japanese-to-Chinese, Chinese-to-Japanese, Japanese-to-English, and English-to-Japanese patent sentence translation proves the effectiveness of phrases selected with branching entropy: the NMT model of Long et al. (2017) achieves a substantial improvement over a baseline NMT model without the proposed technique.
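Branching entropy, the selection measure referred to above, can be computed from successor-token counts. The following is a minimal sketch under simple assumptions (a pre-tokenized corpus and right-branching entropy only); it is illustrative and not the exact formulation used by Long et al.:

```python
import math
from collections import Counter

def branching_entropy(corpus, context):
    """Right branching entropy: entropy of the token that follows `context`.

    A high value means many different tokens can follow the context,
    suggesting a phrase boundary; a low value suggests the phrase continues.
    """
    context = tuple(context)
    n = len(context)
    successors = Counter()
    for sentence in corpus:
        for i in range(len(sentence) - n):
            if tuple(sentence[i:i + n]) == context:
                successors[sentence[i + n]] += 1
    total = sum(successors.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in successors.values())
```

For example, a context followed by two equally frequent successors has a branching entropy of 1.0 bit, while a context with a single deterministic successor has an entropy of 0.0.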

2016

pdf
Analyzing Time Series Changes of Correlation between Market Share and Concerns on Companies measured through Search Engine Suggests
Takakazu Imada | Yusuke Inoue | Lei Chen | Syunya Doi | Tian Nie | Chen Zhao | Takehito Utsuro | Yasuhide Kawada
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper proposes how to utilize a search engine in order to predict market shares. Given a specific product domain, we propose to compare the rates of Web searchers' concerns among the several companies that supply products in that domain, measuring those concerns through search engine suggests. We then analyze whether the rates of searchers' concerns correlate with actual market share, and show that these statistics do exhibit certain correlations. We finally propose how to predict the market share of a specific product genre based on the rates of searchers' concerns.

pdf bib
Translation of Patent Sentences with a Large Vocabulary of Technical Terms Using Neural Machine Translation
Zi Long | Takehito Utsuro | Tomoharu Mitsuhashi | Mikio Yamamoto
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

Neural machine translation (NMT), a new approach to machine translation, has achieved promising results comparable to those of traditional approaches such as statistical machine translation (SMT). Despite its recent success, NMT cannot handle a larger vocabulary because training complexity and decoding complexity proportionally increase with the number of target words. This problem becomes even more serious when translating patent documents, which contain many technical terms that are observed infrequently. In NMTs, words that are out of vocabulary are represented by a single unknown token. In this paper, we propose a method that enables NMT to translate patent sentences comprising a large vocabulary of technical terms. We train an NMT system on bilingual data wherein technical terms are replaced with technical term tokens; this allows it to translate most of the source sentences except technical terms. Further, we use it as a decoder to translate source sentences with technical term tokens and replace the tokens with technical term translations using SMT. We also use it to rerank the 1,000-best SMT translations on the basis of the average of the SMT score and that of the NMT rescoring of the translated sentences with technical term tokens. Our experiments on Japanese-Chinese patent sentences show that the proposed NMT system achieves a substantial improvement of up to 3.1 BLEU points and 2.3 RIBES points over traditional SMT systems and an improvement of approximately 0.6 BLEU points and 0.8 RIBES points over an equivalent NMT system without our proposed technique.
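The technical-term token scheme described above can be sketched as a simple mask/unmask pipeline around the NMT system. This is an illustrative sketch, not the authors' code; matching terms by plain substring replacement and the `TT{i}` placeholder format are assumptions made here:

```python
def mask_terms(sentence, terms):
    """Replace known technical terms with indexed placeholder tokens,
    longest terms first so that overlapping terms are handled greedily."""
    mapping = {}
    counter = 1
    for term in sorted(terms, key=len, reverse=True):
        if term in sentence:
            token = f"TT{counter}"
            sentence = sentence.replace(term, token)
            mapping[token] = term
            counter += 1
    return sentence, mapping

def unmask_translation(translated, mapping, term_table):
    """Restore placeholders in the NMT output using an external
    term translation table (e.g., one built with SMT)."""
    for token, source_term in mapping.items():
        translated = translated.replace(token, term_table[source_term])
    return translated
```

An NMT model trained on masked bilingual data then only ever sees placeholder tokens in place of rare technical terms, so the effective vocabulary stays small while term translations are supplied externally after decoding.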

2015

pdf
Collecting bilingual technical terms from patent families of character-segmented Chinese sentences and morpheme-segmented Japanese sentences
Zi Long | Takehito Utsuro | Tomoharu Mitsuhashi | Mikio Yamamoto
Proceedings of the 6th Workshop on Patent and Scientific Literature Translation

pdf
Evaluating Features for Identifying Japanese-Chinese Bilingual Synonymous Technical Terms from Patent Families
Zi Long | Takehito Utsuro | Tomoharu Mitsuhashi | Mikio Yamamoto
Proceedings of the Eighth Workshop on Building and Using Comparable Corpora

pdf
Detecting an Infant’s Developmental Reactions in Reviews on Picture Books
Hiroshi Uehara | Mizuho Baba | Takehito Utsuro
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation: Posters

2013

pdf bib
Compositional translation of technical terms by integrating patent families as a parallel corpus and a comparable corpus
Itsuki Toyota | Zi Long | Lijuan Dong | Takehito Utsuro | Mikio Yamamoto
Proceedings of the 5th Workshop on Patent Translation

pdf
Time Series Topic Modeling and Bursty Topic Detection of Correlated News and Twitter
Daichi Koike | Yusuke Takahashi | Takehito Utsuro | Masaharu Yoshioka | Noriko Kando
Proceedings of the Sixth International Joint Conference on Natural Language Processing

2012

pdf
Detecting Japanese Compound Functional Expressions using Canonical/Derivational Relation
Takafumi Suzuki | Yusuke Abe | Itsuki Toyota | Takehito Utsuro | Suguru Matsuyoshi | Masatoshi Tsuchiya
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The Japanese language has various types of functional expressions. In order to organize Japanese functional expressions with various surface forms, a lexicon of Japanese functional expressions with a hierarchical organization was compiled. This paper proposes a framework for identifying more than 16,000 functional expressions in Japanese texts by utilizing the hierarchical organization of the lexicon. In our framework, these functional expressions are roughly divided into canonical and derived functional expressions. Each derived functional expression is identified by referring to the most similar occurrence of its canonical expression: contextual occurrence information for the much smaller set of canonical expressions is expanded to the whole set of derived forms and utilized when identifying those derived expressions. We also empirically show that the proposed method can correctly identify more than 80% of the functional/content usages with fewer than 38,000 training instances of manually identified canonical expressions.

pdf
Cross-Lingual Topic Alignment in Time Series Japanese / Chinese News
Shuo Hu | Yusuke Takahashi | Liyi Zheng | Takehito Utsuro | Masaharu Yoshioka | Noriko Kando | Tomohiro Fukuhara | Hiroshi Nakagawa | Yoji Kiyota
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation

2011

pdf
Semi-Automatic Identification of Bilingual Synonymous Technical Terms from Phrase Tables and Parallel Patent Sentences
Bing Liang | Takehito Utsuro | Mikio Yamamoto
Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation

pdf
Example-based Translation of Japanese Functional Expressions utilizing Semantic Equivalence Classes
Yusuke Abe | Takafumi Suzuki | Bing Liang | Takehito Utsuro | Mikio Yamamoto | Suguru Matsuyoshi | Yasuhide Kawada
Proceedings of the 4th Workshop on Patent Translation

2010

pdf bib
Proceedings of the Second Workshop on NLP Challenges in the Information Explosion Era (NLPIX 2010)
Sadao Kurohashi | Takehito Utsuro
Proceedings of the Second Workshop on NLP Challenges in the Information Explosion Era (NLPIX 2010)

pdf
Utilizing Semantic Equivalence Classes of Japanese Functional Expressions in Translation Rule Acquisition from Parallel Patent Sentences
Taiji Nagasaka | Ran Shimanouchi | Akiko Sakamoto | Takafumi Suzuki | Yohei Morishita | Takehito Utsuro | Suguru Matsuyoshi
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In the “Sandglass” MT architecture, we identify the class of monosemous Japanese functional expressions and utilize it in the task of translating Japanese functional expressions into English. We employ the semantic equivalence classes of a recently compiled large-scale hierarchical lexicon of Japanese functional expressions, and study whether the functional expressions within a class can be translated into a single canonical English expression. Based on the results of identifying monosemous semantic equivalence classes, this paper studies how to extract rules for translating functional expressions in Japanese patent documents into English. In this study, we use about 1.8M Japanese-English parallel sentences automatically extracted from Japanese-English patent families, which are distributed through the Patent Translation Task at the NTCIR-7 Workshop. Then, Moses, a toolkit for phrase-based SMT (Statistical Machine Translation), is applied, and Japanese-English translation pairs are obtained in the form of a phrase translation table. Finally, we extract translation pairs of Japanese functional expressions from the phrase translation table. Through this study, we found that most of the semantic equivalence classes judged as monosemous based on manual translation into English have only one translation rule, even in the patent domain.

2009

pdf bib
Exploiting Patent Information for the Evaluation of Machine Translation
Atsushi Fujii | Masao Utiyama | Mikio Yamamoto | Takehito Utsuro
Proceedings of the Third Workshop on Patent Translation

pdf bib
Meta-evaluation of Automatic Evaluation Methods for Machine Translation using Patent Translation Data in NTCIR-7
Hiroshi Echizen-ya | Terumasa Ehara | Sayori Shimohata | Atsushi Fujii | Masao Utiyama | Mikio Yamamoto | Takehito Utsuro | Noriko Kando
Proceedings of the Third Workshop on Patent Translation

pdf
Towards Conceptual Indexing of the Blogosphere through Wikipedia Topic Hierarchy
Mariko Kawaba | Daisuke Yokomoto | Hiroyuki Nakasaki | Takehito Utsuro | Tomohiro Fukuhara
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2

pdf
Identifying and Utilizing the Class of Monosemous Japanese Functional Expressions in Machine Translation
Akiko Sakamoto | Taiji Nagasaka | Takehito Utsuro | Suguru Matsuyoshi
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2

2008

pdf
Producing a Test Collection for Patent Machine Translation in the Seventh NTCIR Workshop
Atsushi Fujii | Masao Utiyama | Mikio Yamamoto | Takehito Utsuro
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Aiming at research and development on machine translation, we produced a test collection for Japanese-English machine translation in the seventh NTCIR Workshop. This paper describes the details of our test collection. From patent documents published in Japan and the United States, we extracted patent families as a parallel corpus. A patent family is a set of patent documents for the same or related invention, usually filed in more than one country in different languages. In the parallel corpus, we aligned Japanese sentences with their counterpart English sentences. Our test collection, which includes approximately 2,000,000 sentence pairs, can be used to train and test machine translation systems. It also includes search topics for cross-lingual patent retrieval, so the contribution of machine translation to a patent retrieval task can also be evaluated. Our test collection will be available to the public for research purposes after the NTCIR final meeting.

pdf
Toward the Evaluation of Machine Translation Using Patent Information
Atsushi Fujii | Masao Utiyama | Mikio Yamamoto | Takehito Utsuro
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers

To aid research and development in machine translation, we have produced a test collection for Japanese/English machine translation. To obtain a parallel corpus, we extracted patent documents for the same or related inventions published in Japan and the United States. Our test collection includes approximately 2,000,000 sentence pairs in Japanese and English, which were extracted automatically from our parallel corpus. These sentence pairs can be used to train and evaluate machine translation systems. Our test collection also includes search topics for cross-lingual patent retrieval, which can be used to evaluate the contribution of machine translation to retrieving patent documents across languages. This paper describes our test collection, methods for evaluating machine translation, and preliminary experiments.

pdf
Integrating a Phrase-based SMT Model and a Bilingual Lexicon for Semi-Automatic Acquisition of Technical Term Translation Lexicons
Yohei Morishita | Takehito Utsuro | Mikio Yamamoto
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers

This paper presents an attempt at developing a technique for acquiring translation pairs of technical terms with sufficiently high precision from parallel patent documents. The proposed technique integrates the phrase translation table of a state-of-the-art statistical phrase-based machine translation model with compositional translation generation based on an existing bilingual lexicon for human use. Our evaluation results clearly show that agreement between the two individual techniques contributes to improving the precision of translation candidates. We then apply Support Vector Machines (SVMs) to the task of automatically validating translation candidates in the phrase translation table. Experimental evaluation results again show that the SVM-based approach to translation candidate validation can improve the precision of translation candidates in the phrase translation table.

2007

pdf
Learning Dependency Relations of Japanese Compound Functional Expressions
Takehito Utsuro | Takao Shime | Masatoshi Tsuchiya | Suguru Matsuyoshi | Satoshi Sato
Proceedings of the Workshop on A Broader Perspective on Multiword Expressions

2006

pdf
Japanese Idiom Recognition: Drawing a Line between Literal and Idiomatic Meanings
Chikara Hashimoto | Satoshi Sato | Takehito Utsuro
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf
Adjective-to-Verb Paraphrasing in Japanese Based on Lexical Constraints of Verbs
Atsushi Fujita | Naruaki Masuno | Satoshi Sato | Takehito Utsuro
Proceedings of the Fourth International Natural Language Generation Conference

pdf
A comparative study on compositional translation estimation using a domain/topic-specific corpus collected from the Web
Masatsugu Tonoike | Mitsuhiro Kida | Toshihiro Takagi | Yasuhiro Sasaki | Takehito Utsuro | S. Sato
Proceedings of the 2nd International Workshop on Web as Corpus

pdf
Chunking Japanese Compound Functional Expressions by Machine Learning
Masatoshi Tsuchiya | Takao Shime | Toshihiro Takagi | Takehito Utsuro | Kiyotaka Uchimoto | Suguru Matsuyoshi | Satoshi Sato | Seiichi Nakagawa
Proceedings of the Workshop on Multi-word-expressions in a multilingual context

pdf
Compiling French-Japanese Terminologies from the Web
Xavier Robitaille | Yasuhiro Sasaki | Masatsugu Tonoike | Satoshi Sato | Takehito Utsuro
11th Conference of the European Chapter of the Association for Computational Linguistics

2005

pdf
Effect of Domain-Specific Corpus in Compositional Translation Estimation for Technical Terms
Masatsugu Tonoike | Mitsuhiro Kida | Toshihiro Takagi | Yasuhiro Sasaki | Takehito Utsuro | Satoshi Sato
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts

2004

pdf
Integrating Cross-Lingually Relevant News Articles and Monolingual Web Documents in Bilingual Lexicon Acquisition
Takehito Utsuro | Kohei Hino | Mitsuhiro Kida | Seiichi Nakagawa | Satoshi Sato
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

pdf
An Empirical Study on Multiple LVCSR Model Combination by Machine Learning
Takehito Utsuro | Yasuhiro Kodama | Tomohiro Watanabe | Hiromitsu Nishizaki | Seiichi Nakagawa
Proceedings of HLT-NAACL 2004: Short Papers

pdf
Answer validation by keyword association
Masatsugu Tonoike | Takehito Utsuro | Satoshi Sato
Proceedings of the 3rd workshop on RObust Methods in Analysis of Natural Language Data (ROMAND 2004)

2003

pdf
Effect of Cross-Language IR in Bilingual Lexicon Acquisition from Comparable Corpora
Takehito Utsuro | Takashi Horiuchi | Kohei Hino | Takeshi Hamamoto | Takeaki Nakayama
10th Conference of the European Chapter of the Association for Computational Linguistics

2002

pdf
Combining Outputs of Multiple Japanese Named Entity Chunkers by Stacking
Takehito Utsuro | Manabu Sassano | Kiyotaka Uchimoto
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

pdf
A Web-based English Abstract Writing Tool Using a Tagged E-J Parallel Corpus
Masumi Narita | Kazuya Kurokawa | Takehito Utsuro
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf
Semi-automatic compilation of bilingual lexicon entries from cross-lingually relevant news articles on WWW news sites
Takehito Utsuro | Takashi Horiuchi | Yasunobu Chiba | Takeshi Hamamoto
Proceedings of the 5th Conference of the Association for Machine Translation in the Americas: Technical Papers

For the purpose of overcoming the resource scarcity bottleneck in corpus-based translation knowledge acquisition research, this paper takes the approach of semi-automatically acquiring domain-specific translation knowledge from collections of bilingual news articles on WWW news sites. It presents the results of applying standard co-occurrence-frequency-based techniques for estimating bilingual term correspondences from parallel corpora to relevant article pairs automatically collected from WWW news sites. The experimental evaluation results are very encouraging, and they prove that many useful bilingual term correspondences can be efficiently discovered with little human intervention from relevant article pairs on WWW news sites.

2000

pdf
Named Entity Chunking Techniques in Supervised Learning for Japanese Named Entity Recognition
Manabu Sassano | Takehito Utsuro
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

pdf
Learning Preference of Dependency between Japanese Subordinate Clauses and its Evaluation in Parsing
Takehito Utsuro
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf
Minimally Supervised Japanese Named Entity Recognition: Resources and Evaluation
Takehito Utsuro | Manabu Sassano
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf
IPA Japanese Dictation Free Software Project
Katsunobu Itou | Kiyohiro Shikano | Tatsuya Kawahara | Kazuya Takeda | Atsushi Yamada | Akinori Itou | Takehito Utsuro | Tetsunori Kobayashi | Nobuaki Minematsu | Mikio Yamamoto | Shigeki Sagayama | Akinobu Lee
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf
Analyzing Dependencies of Japanese Subordinate Clauses based on Statistics of Scope Embedding Preference
Takehito Utsuro | Shigeyuki Nishiokayama | Masakazu Fujio | Yuji Matsumoto
1st Meeting of the North American Chapter of the Association for Computational Linguistics

1998

pdf
General-to-Specific Model Selection for Subcategorization Preference
Takehito Utsuro | Takashi Miyata | Yuji Matsumoto
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

pdf
General-to-Specific Model Selection for Subcategorization Preference
Takehito Utsuro | Takashi Miyata | Yuji Matsumoto
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

1997

pdf
Learning Probabilistic Subcategorization Preference by Identifying Case Dependencies and Optimal Noun Class Generalization Level
Takehito Utsuro | Yuji Matsumoto
Fifth Conference on Applied Natural Language Processing

pdf
Maximum Entropy Model Learning of Subcategorization Preference
Takehito Utsuro | Takashi Miyata
Fifth Workshop on Very Large Corpora

1996

pdf
Sense Classification of Verbal Polysemy based-on Bilingual Class/Class Association
Takehito Utsuro
COLING 1996 Volume 2: The 16th International Conference on Computational Linguistics

1994

pdf
Thesaurus-based Efficient Example Retrieval by Generating Retrieval Queries from Similarities
Takehito Utsuro | Kiyotaka Uchimoto | Mitsutaka Matsumoto | Makoto Nagao
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics

pdf
Bilingual Text Matching using Bilingual Dictionary and Statistics
Takehito Utsuro | Hiroshi Ikeda | Masaya Yamane | Yuji Matsumoto | Makoto Nagao
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics

1993

pdf
Structural Matching of Parallel Texts
Yuji Matsumoto | Takehito Utsuro | Hiroyuki Ishimoto
31st Annual Meeting of the Association for Computational Linguistics

1992

pdf
Lexical Knowledge Acquisition from Bilingual Corpora
Takehito Utsuro | Yuji Matsumoto | Makoto Nagao
COLING 1992 Volume 2: The 14th International Conference on Computational Linguistics
