ItalWordNet

ItalWordNet (IWN) is a lexical-semantic database developed within two research projects: EuroWordNet (EWN)¹ and Sistema Integrato per il Trattamento Automatico del Linguaggio (SI-TAL), a national project devoted to creating large linguistic resources and software tools for processing written and spoken Italian.

Among the resources developed in SI-TAL, IWN has been built as the reference semantic database, by extending the Italian wordnet developed within the EWN project.

In the framework of EWN, a linguistic model providing a rich set of semantic relations was designed [Alonge et al. 1998] and the first nucleus of data (verbs and nouns) was encoded [Roventini et al. 1998].

The wordnet is structured in the same way as the Princeton WordNet², namely around the notion of a synset, a set of synonymous word meanings. Synonymy is understood very broadly: two meanings count as synonymous if they are interchangeable in at least one context.

In addition to the language-internal relations, equivalence relations were encoded between Italian synsets and the closest concepts in an Inter-Lingual Index (ILI), a separate language-independent module containing all WN1.5 synsets but not the relations among them.
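The structure described above can be pictured with a minimal sketch. This is not the IWN storage format: the class names, identifiers (`it-n-0001`, `ili-dog`), the gloss and the relation labels (`has_hyperonym`, `has_hyponym`, `eq_synonym`) are illustrative assumptions, chosen only to show a synset carrying language-internal relations plus an equivalence link to an ILI record that has no relations of its own.

```python
from dataclasses import dataclass, field


@dataclass
class IliRecord:
    """One ILI entry: a concept with no relations of its own (hypothetical shape)."""
    ili_id: str   # hypothetical identifier
    gloss: str


@dataclass
class Synset:
    """A set of word meanings that are interchangeable in at least one context."""
    synset_id: str                                  # hypothetical identifier
    pos: str                                        # part of speech, e.g. "n" or "v"
    variants: list                                  # the synonymous word meanings
    relations: dict = field(default_factory=dict)   # relation label -> target synset ids
    ili_links: list = field(default_factory=list)   # (equivalence label, ILI id) pairs

    def add_relation(self, label, target_id):
        """Record a language-internal relation to another Italian synset."""
        self.relations.setdefault(label, []).append(target_id)

    def link_to_ili(self, label, record):
        """Record an equivalence relation to the closest ILI concept."""
        self.ili_links.append((label, record.ili_id))


# Illustrative data only: two Italian noun synsets and one ILI record.
animale = Synset("it-n-0001", "n", ["animale"])
cane = Synset("it-n-0002", "n", ["cane"])
cane.add_relation("has_hyperonym", animale.synset_id)
animale.add_relation("has_hyponym", cane.synset_id)

dog = IliRecord("ili-dog", "a domesticated canine")
cane.link_to_ili("eq_synonym", dog)
```

Keeping the ILI record free of relations mirrors the description of the ILI as an unstructured list of concepts: all structure lives in the language-specific wordnet, and the ILI serves only as a meeting point between languages.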

During the SI-TAL project, this wordnet was improved and extended by adding nouns and verbs not yet encoded in EWN and by encoding adjectives, adverbs and proper names. Some additional relations were also identified, mainly to encode data about adjectives (see [Alonge et al. 2000], [Roventini et al. 2000]³, [Marinelli and Roventini 2002]⁴ and [Roventini et al. 2003]).

In its generic version, the IWN database now consists of:

  • a wordnet containing about 47,000 lemmas, 50,000 synsets and 130,000 semantic relations (the most important relations encoded include hyperonymy/hyponymy, antonymy, meronymy, cause and role);
  • an Inter-Lingual Index (ILI), which is an unstructured version of WN1.5:
    this module, used in EWN to link wordnets of different languages, was also maintained in IWN to make the resource usable in multilingual applications;
  • the Top Ontology (TO), a hierarchy of language-independent concepts, reflecting fundamental semantic distinctions, built within EWN and partially modified in IWN to account for adjectives (not dealt with in EWN):
    the TO is formed of language-independent features, which may (or may not) be lexicalised in various ways, or according to different patterns, in different languages [Rodriguez et al. 1998]; through the ILI, all the concepts in the wordnet are directly or indirectly linked to the TO.
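The statement that every concept is linked to the TO "directly or indirectly" through the ILI can be sketched as a lookup. The sketch assumes one plausible reading: a synset is directly linked when its ILI record carries TO features, and indirectly linked when such features are found by walking up its hyperonym chain. The function name, the dictionaries and all identifiers and feature names below are hypothetical and do not reflect the actual IWN database layout.

```python
def top_concepts(synset_id, hyperonym_of, ili_of, to_features):
    """Return the TO features reachable from a synset, or an empty set.

    hyperonym_of: synset id -> id of its hyperonym (assumed single, for brevity)
    ili_of:       synset id -> id of the equivalent ILI record
    to_features:  ILI id -> frozenset of Top Ontology feature names
    """
    seen = set()
    current = synset_id
    while current is not None and current not in seen:
        seen.add(current)                      # guard against cyclic data
        ili_id = ili_of.get(current)
        if ili_id in to_features:              # direct link found at this level
            return to_features[ili_id]
        current = hyperonym_of.get(current)    # otherwise climb one level
    return frozenset()


# Illustrative data only: "bassotto" (dachshund) -> "cane" (dog) -> "animale" (animal).
hyperonym_of = {"bassotto": "cane", "cane": "animale"}
ili_of = {"bassotto": "ili-dachshund", "cane": "ili-dog", "animale": "ili-animal"}
to_features = {"ili-animal": frozenset({"Animal", "Living", "Natural"})}

direct = top_concepts("animale", hyperonym_of, ili_of, to_features)
indirect = top_concepts("bassotto", hyperonym_of, ili_of, to_features)
```

In this toy data only `animale` has an ILI record with TO features, so `bassotto` and `cane` obtain the same features by inheritance along the hyperonymy relation.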

Since 2003, a terminological wordnet for the domain of navigation and sea transportation, connected to the IWN generic wordnet, has been under development [Marinelli et al. 2004]⁵.

The IWN database is continuously updated and improved at ILC. In particular, proper names and their extended (metaphorical and metonymical) uses, as observed in the reference corpus of Italian, have been studied [Marinelli et al. 2005]⁶.

For further information, do not hesitate to contact Adriana Roventini or Rita Marinelli.


1 For further information about EWN, see the project Web site (URL: http://www.illc.uva.nl/EuroWordNet/) and the special issue of the journal Computers and the Humanities devoted to the project (Vol. 32, Nos. 2-3, 1998).

2 Miller G., Beckwith R., Fellbaum C., Gross D., Miller K. (1993), "Introduction to WordNet: an On-Line Lexical Database", ms. (a revised version of the paper appeared in Fellbaum C. (ed.), Wordnet: a Lexical Reference System and its Applications, Cambridge, Mass., MIT Press, 1998).

3 Roventini A., Alonge A., Calzolari N., Magnini B., Bertagna F. (2000), “ItalWordNet: a Large Semantic Database for Italian”, in Proceedings of the 2nd International Conference on Language Resources and Evaluation (LREC 2000), Athens, Greece, 31 May – 2 June 2000, Volume II, Paris, The European Language Resources Association (ELRA), 783-790.

4 Marinelli R. and Roventini A. (2002), “Proper Names in a Semantic Database”, in Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC 2002), Las Palmas de Gran Canaria, Spain, 29-31 May 2002, Volume II, Paris, The European Language Resources Association (ELRA), 993-997.

5 Marinelli R., Roventini A., Enea A. (2004), “Building a Maritime Domain Lexicon: a Few Considerations on the Database Structure and the Semantic Coding”, in Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), held in Memory of Antonio Zampolli, Lisbon, Portugal, 26-28 May 2004, Volume II, Paris, The European Language Resources Association (ELRA), 465-468.

6 Marinelli R., Bindi R., Roventini A. (2005), “Metonymic and Metaphorical Uses of Proper Names”, in Proceedings of the IX Simposio Internacional de Comunicación Social, Santiago de Cuba, 24-28 January 2005, 630-634.

Further Information
Project Web Site (currently being redesigned)
IWN Database (currently being updated)

Project Documents
ItalWordNet - Manuale Operativo (ItalWordNet Operating Manual)
Software di Gestione per ItalWordNet - Manuale per l'Utente (ItalWordNet Management Software: User Manual)
Risultati della Validazione della Risorsa e del Software ItalWordNet (Results of the Validation of the ItalWordNet Resource and Software)

Project Staff
Adriana Roventini
Rita Marinelli
Francesca Bertagna
Alessandro Enea

16/02/2007