Contact information
Department of General Linguistics (Yleisen kielitieteen laitos)
P.O. Box 9 (Siltavuorenpenger 20 A)
00014 University of Helsinki
Phone +358 (09) 1911 (switchboard)
Fax +358 (09) 191 29307
7. Further Topics: Named Entity Recognition
- Lecture notes
- Further reading
5.1. OpenNLP Name Finder
- Practical work
- The OpenNLP name finder recognizes several different types of
  entities. It uses a separate maximum entropy model for
  each type. There are 7 ready-made models: person,
  location, organization, date, time, money, and percentage.
- Copy the script clt350-opennlp-namefinder to your directory
  and make it executable.
  This script runs the OpenNLP sentence detector followed by
  the OpenNLP name finder.
  It reads input from stdin and writes output to stdout.
- Use it like this to find named entities in Sonnet 130:
  ./clt350-opennlp-namefinder <sonnet130.txt >names.txt &
- Try named entity recognition with bigger texts and corpora:
  Jane Austen's Northanger Abbey, or half a million words in
  Jane Austen's six main novels.
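With bigger texts it can help to summarize the output instead of reading it line by line. The following is a minimal sketch, not part of the course materials. It assumes that the name finder marks each entity inline with a tag named after its model, for example <person>Jane Austen</person>. That format is an assumption, so check a few lines of names.txt first and adjust the pattern if the tags look different. The function name count_entity_tags and the sample text are hypothetical.

```shell
#!/bin/sh
# Assumption: entities are marked inline as <type>...</type>, where type is
# one of the 7 model names. Adjust the pattern if your output differs.

# count_entity_tags (hypothetical helper): read tagged text on stdin and
# print one line per entity type as "<count> <type>", most frequent first.
count_entity_tags() {
    grep -o -E '<(person|location|organization|date|time|money|percentage)>' |
        tr -d '<>' |
        sort |
        uniq -c |
        sort -k1,1nr -k2 |
        awk '{ print $1, $2 }'
}

# Made-up sample standing in for real name finder output.
sample='<person>Catherine Morland</person> left <location>Bath</location> in <date>February</date>.
<person>Henry Tilney</person> stayed in <location>Bath</location>.'

printf '%s\n' "$sample" | count_entity_tags
```

To apply the same count to real output, wait for the background job to finish and then run count_entity_tags <names.txt in the same shell session.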
5.2. Training New Models
Assignment 3
© 2007-2008 Graham Wilcock