This is an internal, incomplete preview of a proposed change to the ACL Anthology.
For efficiency reasons, we don't generate MODS or Endnote formats, and the preview may be incomplete in other ways, or contain mistakes.
Do not treat this content as an official publication.
ViolettaCavalli-Sforza
Fixing paper assignments
Please select all papers that belong to the same person.
Indicate below which author they should be assigned to.
Advances in automatic readability assessment can impact the way people consume information in a number of domains. Arabic, being a low-resource and morphologically complex language, presents numerous challenges to the task of automatic readability assessment. In this paper, we present the largest and most in-depth computational readability study for Arabic to date. We study a large set of features with varying depths, from shallow words to syntactic trees, for both L1 and L2 readability tasks. Our best L1 readability accuracy result is 94.8% (75% error reduction from a commonly used baseline). The comparable results for L2 are 72.4% (45% error reduction). We also demonstrate the added value of leveraging L1 features for L2 readability prediction.
IMORPHĒ is a significantly extended version of MORPHE, a morphology description compiler. MORPHEs morphology description language is based on two constructs: 1) a morphological form hierarchy, whose nodes relate and differentiate surface forms in terms of the common and distinguishing inflectional features of lexical items; and 2) transformational rules, attached to leaf nodes of the hierarchy, which generate the surface form of an item from the base form stored in the lexicon. While MORPHEs approach to morphology description is intuitively appealing and was successfully used for generating the morphology of several European languages, its application to Modern Standard Arabic yielded morphological descriptions that were highly complex and redundant. Previous modifications and enhancements attempted to capture more elegantly and concisely different aspects of the complex morphology of Arabic, finding theoretical grounding in Lexeme-Based Morphology. Those extensions are being incorporated in a more flexible and less ad hoc fashion in IMORPHE, which retains the unique features of our previous work but embeds them in an inheritance-based framework in order to achieve even more concise and modular morphology descriptions and greater runtime efficiency, and lays the groundwork for IMORPHE to become an analyzer as well as a generator.
We describe our experience in adapting an existing high- quality, interlingual, unidirectional machine translation system to a new domain and bidirectional translation for a new language pair (English and Italian). We focus on the interlingua design changes which were necessary to achieve high quality output in view of the language mismatches between English and Italian. The representation we propose contains features that are interpreted differently, depending on the translation direction. This decision simplified the process of creating the interlingua for individual sentences, and allows the system to defer mapping of language-specific features (such as tense and aspect), which are realized when the target syntactic feature structure is created. We also describe a set of problems we encountered in translating modal verbs, and discuss the representation of modality in our interlingua.