Corrado Seidenari

Also published as: C. Seidenari




2006

POS tagset design for Italian
Raffaella Bernardi | Andrea Bolognesi | Corrado Seidenari | Fabio Tamburini
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

We aim to automatically induce a PoS tagset for Italian by analysing the distributional behaviour of Italian words. To this end, we propose an algorithm that (a) extracts information from loosely labelled dependency structures that encode only basic and broadly accepted syntactic relations, namely Head/Dependent and the distinction of dependents into Argument vs. Adjunct, and (b) derives a possible set of word classes. The paper reports on some preliminary experiments carried out using the induced tagset in conjunction with state-of-the-art PoS taggers. The proposed method for designing a tagset exploits little, if any, language-specific knowledge; hence it is in principle applicable to any language.

The DiaCORIS project: a diachronic corpus of written Italian
C. Onelli | D. Proietti | C. Seidenari | F. Tamburini
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

The DiaCORIS project aims to construct a diachronic corpus of written Italian texts produced between 1861 and 1945, extending the structure and the research possibilities of the synchronic 100-million-word corpus CORIS/CODIS. A preliminary in-depth study has been performed to design a representative and well-balanced sample of the Italian language over a period that contains all the main events of contemporary Italian history, from the National Unification to the end of the Second World War. The paper describes in detail the design process: the definition of the main subcorpora and their proportions, the types of documents included in each part of the corpus, the document annotation schema, the technological infrastructure designed to manage corpus access, and the web interface to the corpus data.

2005

Automatic Induction of a POS Tagset for Italian
Raffaella Bernardi | Andrea Bolognesi | Corrado Seidenari | Fabio Tamburini
Proceedings of the Australasian Language Technology Workshop 2005