Language Technology

Open Source Methods: Ctl257 2004k

- KIT network | Department of General Linguistics | Language Technology teaching | University of Helsinki -

Course material

1. Installing open source software.

2. Reusing published examples.

3. FreeTTS: An open source speech synthesizer.

4. Building with Ant.

5. Version control: RCS and CVS.

  • Lecture notes:
  • Practical work
    • Make a sandbox in your own directory: mkdir sandbox; cd sandbox
    • Temporarily set CVSROOT: export CVSROOT=~gwilcock/CVS
    • Get this Ant/JUnit source tree: cvs checkout ctl257/junit
      See the Ant buildfile history: cvs log ctl257/junit/build.xml
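The practical steps above can be run as one shell session. This is a minimal sketch, assuming a bash-like shell and that the repository path from the exercise is readable from your account. The `command -v` guard and the error messages are additions for this sketch, so that it still runs on a machine where cvs is missing.

```shell
# Make a sandbox in the current directory and work inside it.
mkdir -p sandbox
cd sandbox || exit 1

# Temporarily point CVS at the course repository (path as given in the exercise).
export CVSROOT=~gwilcock/CVS

# Guard added for this sketch: only call cvs if it is installed.
if command -v cvs >/dev/null 2>&1; then
    # Get the Ant/JUnit source tree, then show the buildfile history.
    cvs checkout ctl257/junit || echo "checkout failed: is $CVSROOT readable?" >&2
    cvs log ctl257/junit/build.xml || echo "log failed" >&2
else
    echo "cvs not found on PATH; skipping checkout and log" >&2
fi
```

Because CVSROOT is only exported in the current shell, the setting disappears when you log out, which is what "temporarily" means here.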

6. Testing with JUnit.


7. Developing with IDEs.

8. Open source NLP tools.


9. Open source databases.

10. Java and SQL.

11. WordNet: A lexical database.

  • Lecture notes: WordNet
  • Practical work (on venus)
    • WordNet is already installed. Start the browser: venus$ wnb &
    • Query hyponyms of "student": what kind of student are you?
      Compare hypernyms of "student" and "professor": what is the same and what is different?
      Query "professor", "lecturer", "docent": is WordNet only US English?
    • Compare "big", "large", "great". What are their antonyms?
      Which combinations of "big/large/great sister/uncle/toe" are collocations?
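The same queries can also be made from the command line instead of the browser. This is a rough sketch, assuming the WordNet command-line tool wn is installed alongside wnb. The helper `run_wn` and the file names student.txt and professor.txt are hypothetical, introduced only for this illustration.

```shell
# Hypothetical helper for this sketch: echo the query, then run it if wn exists.
run_wn() {
    echo "+ wn $*"
    if command -v wn >/dev/null 2>&1; then
        # wn may exit non-zero even on success (it reports a sense count),
        # so the exit status is ignored here.
        wn "$@" || true
    fi
}

# What kinds of student are there?
run_wn student -hypon

# Save the hypernym chains of the two nouns, then compare them.
run_wn student -hypen > student.txt
run_wn professor -hypen > professor.txt
diff student.txt professor.txt || true

# Antonyms of the three adjectives.
for adj in big large great; do
    run_wn "$adj" -antsa
done
```

The collocation question is not something wn answers directly; it still needs your own judgement as a speaker.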

12. Java, SQL and WordNet.

  • Lecture notes: WordNet related projects
  • Practical work (on venus)
    • Install JWNL in a subdirectory jwnl-1.3 in your own directory.
      Add jwnl.jar, utilities.jar, commons-logging.jar to your CLASSPATH.
    • Edit file_properties.xml to specify the location of WordNet:
      <param name="dictionary_path" value="/usr/share/wordnet"/>
    • Compile and run the Examples program:
      java net.didion.jwnl.utilities.Examples file_properties.xml
  • Assignment 5: Java and WordNet.
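The CLASSPATH step in the practical work above might look like the following. This is only a sketch: it assumes JWNL was unpacked into jwnl-1.3 under your home directory and that the three jar files sit directly in that directory, which may not match the actual layout of the distribution. The variable `JWNL_HOME` is hypothetical.

```shell
# Assumed install location; adjust if the jars live elsewhere.
JWNL_HOME="${JWNL_HOME:-$HOME/jwnl-1.3}"

# Append each jar to CLASSPATH, keeping anything that was already set.
for jar in jwnl.jar utilities.jar commons-logging.jar; do
    CLASSPATH="${CLASSPATH:+$CLASSPATH:}$JWNL_HOME/$jar"
done
export CLASSPATH

# Run the examples only if Java and the library are actually present.
# file_properties.xml is expected in the current directory.
if command -v java >/dev/null 2>&1 && [ -f "$JWNL_HOME/jwnl.jar" ]; then
    java net.didion.jwnl.utilities.Examples file_properties.xml
else
    echo "java or $JWNL_HOME/jwnl.jar not found; skipping run" >&2
fi
```

Run `echo $CLASSPATH` afterwards to check that all three jars are listed.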

13. GATE: An IDE for natural language processing.


14. GATE and WordNet.

  • Lecture notes:
  • Practical work (on venus)
    • Run GATE and create a serial data store in your directory. Load Sonnet130 and save it as XML in the data store. Run ANNIE and save Sonnet130 as XML again (with a new name).
    • Problem solving: When I ran Sentence Splitter on Sonnet130, it went on and on without finishing. How did I find out what was wrong? (Hint and Answer)
    • Use WordNet to get the hypernyms and hyponyms of Shakespeare, sonnet, heaven.
      Save the results in a file. You may prefer the shell command wn for this (see man wn).
  • Assignment 6: GATE and WordNet (and Shakespeare).
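For the WordNet step above, a small loop around wn can collect all the results in one file. This is a minimal sketch, assuming wn is on your PATH. The output file name wordnet-results.txt and the "###" header lines are assumptions made for this illustration.

```shell
OUT="${OUT:-wordnet-results.txt}"   # hypothetical output file name
: > "$OUT"                          # start with an empty file

for word in Shakespeare sonnet heaven; do
    # -hypen asks for noun hypernyms, -hypon for noun hyponyms.
    for search in -hypen -hypon; do
        echo "### $word $search" >> "$OUT"
        if command -v wn >/dev/null 2>&1; then
            # wn may exit non-zero even on success, so ignore its status.
            wn "$word" "$search" >> "$OUT" || true
        fi
    done
done
```

Each header line in the file shows which word and which search produced the output below it.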

© Graham Wilcock 2004.