Tomoaki Kono



2016

Original-Transcribed Text Alignment for Manyosyu Written by Old Japanese Language
Teruaki Oka | Tomoaki Kono
Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH)

We are constructing an annotated diachronic corpus of the Japanese language. As part of this work, we construct a corpus of Manyosyu, which is an old Japanese poetry anthology. In this paper, we describe how to align the transcribed text and its original text semiautomatically to be able to cross-reference them in our Manyosyu corpus. Although we align the original characters to the transcribed words manually, we preliminarily align the transcribed and original characters by using an unsupervised automatic alignment technique from statistical machine translation to alleviate the work. We found that automatic alignment achieves an F1-measure of 0.83; thus, each poem has 1–2 alignment errors. However, finding these errors and modifying them is less work-intensive and more efficient than fully manual annotation. The alignment probabilities can be utilized in this modification. Moreover, we found that we can locate the uncertain transcriptions in our corpus and compare them to other transcriptions by using the alignment probabilities.