University of Helsinki
CLT231: Introduction to Natural Language Processing - 2010-2011


Department of Modern Languages


12. WordNet and Word Senses.

  • Lecture notes
  • Further reading
  • Practical work
    • In IDLE, import NLTK and the WordNet corpus reader:
      >>> import nltk
      >>> from nltk.corpus import wordnet as wn
    • List the synonym sets (synsets) of dog and count them:
      >>> wn.synsets('dog')
      >>> len(wn.synsets('dog'))
    • Do the same for hypernyms and hyponyms of dog at different levels of the hierarchy:
      >>> dog_levels = ['mammal', 'animal', 'dog', 'collie']
      >>> for word in dog_levels:
              print(wn.synsets(word))
              print("Number of senses:", len(wn.synsets(word)))
    • The polysemy of a word is the number of different senses it has. Define a function that returns the number of senses:
      >>> def polysemy(word):
              return len(wn.synsets(word))
    • Use the function to compare polysemy at different levels:
      >>> for word in dog_levels:
              print(word, polysemy(word))
    • Repeat the comparison for some other words:
      >>> horse_levels = ['mammal', 'animal', 'horse', 'stallion']
      >>> for word in horse_levels:
              print(word, polysemy(word))
      >>> pig_levels = ['mammal', 'animal', 'pig', 'sow']
      >>> for word in pig_levels:
              print(word, polysemy(word))
    • This suggests that basic-level categories such as dog, horse and pig have more senses than the words above and below them in the hierarchy. Do you agree?
    • The pattern is clearer if only noun senses are counted (sow is also a verb). Define a new function that counts only the noun senses:
      >>> def noun_polysemy(word):
              return len(wn.synsets(word, pos='n'))
      >>> for word in pig_levels:
              print(word, noun_polysemy(word))

Assignment 4.

© 2006-2010 Graham Wilcock


Copyright © 2003-2005 University of Helsinki. All rights reserved.