The PAISÀ Corpus of Italian Web Texts
Verena Lyding | Egon Stemle | Claudia Borghetti | Marco Brunello | Sara Castagnoli | Felice Dell’Orletta | Henrik Dittmann | Alessandro Lenci | Vito Pirrelli
Proceedings of the 9th Web as Corpus Workshop (WaC-9)
The LOIS (Lexical Ontologies for legal Information Sharing) project The legal knowledge base resulting from the LOIS (Lexical Ontologies for legal Information Sharing) (Lexical Ontologies for legal Information Sharing) project consists of legal WordNets in six languages (Italian, Dutch, Portuguese, German, Czech, English). Its architecture is based on the EuroWordNet (EWN) framework (Vossen et al, 1997). Using the EWN framework assures compatibility of the LOIS WordNets with EWN, allowing them to function as an extension of EWN for the legal domain. For each legal system, the document-derived legal concepts are integrated into a taxonomy, which links into existing formal ontologies. These give the legal wordnets a first formal backbone, which can, in future, be further extended. The database consists of 33,000 synsets, and is aimed to be used in information retrieval, where it provides mono- and multi-lingual access to European legal databases for legal experts as well as for laymen. The LOIS knowledge base also provides a flexible, modular architecture that allows integration of multiple classification schemes, and enables the comparison of legal systems by exploring translation, equivalence and structure across the different legal wordnets.