Sun-Mee Bae


2004

A Statistical Model for Hangeul-Hanja Conversion in Terminology Domain
Jin-Xia Huang | Sun-Mee Bae | Key-Sun Choi
Proceedings of the Third SIGHAN Workshop on Chinese Language Processing

Lexical Analysis of Agglutinative Languages Using a Dictionary of Lemmas and Lexical Transducers
Sun-Mee Bae | Key-Sun Choi
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

This paper presents a simple method for the lexical analysis of agglutinative languages such as Korean, which have a heavy morphology. In particular, for nouns and adverbs with regular morphological modifications and/or high productivity, we do not need to artificially construct huge dictionaries of all inflected forms of lemmas. To construct a dictionary of lemmas and lexical transducers, we first automatically construct a dictionary of all inflected forms from the KAIST POS-Tagged Corpus. Second, we separate the lemma part from the sequences of inflectional suffixes. Third, we describe lexical transducers (i.e., morphological rules) that recognize all inflected forms of noun and adverb lemmas according to the combinatorial restrictions between lemmas and their inflectional suffixes. Finally, we evaluate the advantages of this method.
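The steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy entries in `TAGGED_FORMS`, the function names `build_resources` and `analyze`, and the reduction of the "combinatorial restrictions" to a per-POS set of allowed suffix sequences are all assumptions made for illustration, and none of the data comes from the KAIST POS-Tagged Corpus.

```python
from collections import defaultdict

# Hypothetical toy data standing in for a POS-tagged corpus:
# (inflected form, lemma, POS tag, sequence of inflectional suffixes).
TAGGED_FORMS = [
    ("학교에서", "학교", "noun", ("에서",)),
    ("학교에서는", "학교", "noun", ("에서", "는")),
    ("사람은", "사람", "noun", ("은",)),
    ("빨리도", "빨리", "adverb", ("도",)),
]


def build_resources(tagged_forms):
    """Separate the lemma part from the suffix-sequence part.

    Returns a lemma dictionary (lemma -> POS) and, per POS, the set of
    suffix sequences observed with lemmas of that POS.
    """
    lemmas = {}
    suffixes = defaultdict(set)
    for _surface, lemma, pos, seq in tagged_forms:
        lemmas[lemma] = pos  # simplification: one POS per lemma
        suffixes[pos].add("".join(seq))
    return lemmas, suffixes


def analyze(word, lemmas, suffixes):
    """Return every (lemma, POS, suffix string) reading of `word`.

    A split is accepted only if the left part is a known lemma and the
    right part is empty or a suffix sequence allowed for that lemma's POS.
    """
    readings = []
    for i in range(1, len(word) + 1):
        stem, rest = word[:i], word[i:]
        pos = lemmas.get(stem)
        if pos is None:
            continue
        if rest == "" or rest in suffixes[pos]:
            readings.append((stem, pos, rest))
    return readings


lemmas, suffixes = build_resources(TAGGED_FORMS)
print(analyze("사람에서", lemmas, suffixes))  # form not listed in TAGGED_FORMS
print(analyze("빨리에서", lemmas, suffixes))  # noun suffix after an adverb
```

In this sketch the form 사람에서 is recognized even though it never appears in the toy data, because the lemma and the suffix sequence are stored separately, while 빨리에서 is rejected because the suffix was only observed with nouns. The sketch ignores finer restrictions, such as the choice between 은 and 는 depending on the lemma's final sound, which a real set of morphological rules would need to encode.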