Course material
On-line Resources.
- Organizations
- International Semantic Web Conference
- 1st: ISWC2002 (Sardinia), with proceedings abstracts
- 2nd: ISWC2003 (Florida), with proceedings abstracts
- 3rd: ISWC2004 (Hiroshima)
- Language Technology Workshops
- Finland
1. The Semantic Web.
- Lecture notes
- Further reading
2. XML, metadata and ontologies.
- Lecture notes
- Practical work with GATE
- First, look at Shakespeare's Sonnet 130 in a web browser as plain text and as HTML.
View the HTML markup (view page source) in the browser: this is in-line annotation.
- To run GATE, copy the shell script semw-gate to your directory and make it executable.
If you're new to GATE, see Introduction to GATE.
Start GATE by: venus$ semw-gate &
- Load sonnet130.html as GATE document Sonnet130.html (see Creating documents).
View the HTML markup in GATE: this is stand-off annotation (see Inspecting the processing results).
- Create a GATE corpus Sonnets with Sonnet130.html in it (see Creating corpora).
Load ANNIE (see ANNIE - a ready-made information extraction system for English).
Run ANNIE on the Sonnets corpus and view the annotations (see Inspecting the processing results).
Any surprises? For example, <JobTitle>: mistress; <Unknown>: Shakespeare, sonnet, heaven.
- Use WordNet to get hypernyms and hyponyms of mistress, Shakespeare, sonnet, heaven.
If you're new to WordNet, see Introduction to WordNet.
Start WordNet by: venus$ wnb &
- Copy sonnet130.xml and sonnet.dtd to your directory.
Load sonnet130.xml into GATE.
More sonnet file formats: SVG, FO, PDF, JSML.
Which of these can you load and analyse in GATE?
- Assignment 1:
Information Extraction
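The difference between in-line annotation (markup embedded in the text) and stand-off annotation (plain text plus separate offset-based annotations) can be sketched in a few lines of Python. This is only an illustration of the idea, not how GATE is implemented: the class name StandoffConverter, the function to_standoff and the sample markup are assumptions made for this example.

```python
from html.parser import HTMLParser


class StandoffConverter(HTMLParser):
    """Collect plain text plus (start, end, tag) offsets for each element."""

    def __init__(self):
        super().__init__()
        self.text = ""          # document text with all markup removed
        self.annotations = []   # finished stand-off annotations
        self._open = []         # stack of (tag, start offset) for open elements

    def handle_starttag(self, tag, attrs):
        self._open.append((tag, len(self.text)))

    def handle_endtag(self, tag):
        # Close the most recently opened element with this tag name.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                _, start = self._open.pop(i)
                self.annotations.append((start, len(self.text), tag))
                break

    def handle_data(self, data):
        self.text += data


def to_standoff(markup):
    """Return (plain text, list of stand-off annotations) for in-line markup."""
    parser = StandoffConverter()
    parser.feed(markup)
    parser.close()
    return parser.text, parser.annotations


# Hypothetical in-line markup for the first line of Sonnet 130.
inline = "<p>My <b>mistress</b>' eyes are nothing like the sun;</p>"
text, annotations = to_standoff(inline)
print(text)
for start, end, tag in annotations:
    print(start, end, tag, repr(text[start:end]))
```

The text itself is left untouched, and each annotation only points into it by character offsets, so several independent annotation sets can refer to the same text.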
3. RDF: Resource Description Framework.
- Lecture notes
- Further reading
- Practical work with Jena
4. Jena: RDF in Java. More RDF: RDF/XML.
- Lecture notes
- Further reading
- Practical work continued
- Assignment 2A:
Jena and RDF/XML
5. RDF in SQL databases.
- Lecture notes
- Further reading
- Practical work with MySQL
- Use the test database:
venus$ mysql ...
mysql> USE test;
If you're new to MySQL, see MySQL Tutorial.
- Look at the tables used by Jena:
mysql> SHOW TABLES;
Format of RDF statements in SQL:
mysql> DESCRIBE jena_sys_stmt;
Index of "named" Jena models:
mysql> SELECT * FROM jena_graph;
- Try:
mysql> SELECT Prop, Obj FROM jena_g5t1_stmt
WHERE Subj LIKE "%winecork%";
- Assignment 2B:
Jena and RDF in SQL
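The idea behind the practical work, RDF statements stored as rows with subject, property and object columns, can be tried without a MySQL server. The following is a minimal sketch under stated assumptions: an in-memory SQLite database stands in for MySQL, the table name stmt is hypothetical, the triples are made up, and the real Jena tables may have additional columns.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server,
# and the table name "stmt" is hypothetical (not a real Jena table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stmt (Subj TEXT, Prop TEXT, Obj TEXT)")

# Made-up example triples; the URIs are illustrative only.
triples = [
    ("http://example.org/shop#winecork", "http://example.org/shop#price", "2.50"),
    ("http://example.org/shop#winecork", "http://example.org/shop#material", "cork"),
    ("http://example.org/shop#bottle", "http://example.org/shop#material", "glass"),
]
conn.executemany("INSERT INTO stmt VALUES (?, ?, ?)", triples)

# Same shape as the practical-work query: properties and objects of every
# statement whose subject contains "winecork".
rows = conn.execute(
    "SELECT Prop, Obj FROM stmt WHERE Subj LIKE ? ORDER BY Prop",
    ("%winecork%",),
).fetchall()
for prop, obj in rows:
    print(prop, obj)
conn.close()
```

Each row is one RDF statement, so a pattern such as "all properties of a given subject" becomes an ordinary WHERE clause.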
6. RDF query languages: RDQL.
- Lecture notes
- Further reading
- Practical work with RDQL
- Copy the shell script semw-rdql to your directory and make it executable.
Run queries by:
venus$ semw-rdql --data ... --query ...
- Do the examples in the RDQL tutorial (vc-db-1.rdf, vc-q1, vc-q2, ...).
- See WordNet in RDF/XML by Sergey Melnik and Stefan Decker.
Write some interesting RDQL queries for WordNet. For example, what are the hyponyms of "sonnet"?
Use the local WordNet RDF/XML.
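An RDQL query is essentially a list of triple patterns with variables that are matched against the RDF statements. The following Python sketch shows that matching step for a "hyponyms of sonnet" question. It is not Jena's implementation, and everything in it is an assumption for illustration: the property names ex:hyponymOf and ex:wordForm, the wn: identifiers and the toy triples are hypothetical and are not copied from the WordNet RDF/XML files.

```python
def is_var(term):
    """Variables are written RDQL-style with a leading question mark."""
    return term.startswith("?")


def match(pattern, triple, bindings):
    """Try to extend bindings so that pattern equals triple; None on failure."""
    result = dict(bindings)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def query(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical toy data (subject, property, object).
triples = [
    ("wn:sonnet", "ex:wordForm", "sonnet"),
    ("wn:sonnet", "ex:hyponymOf", "wn:poem"),
    ("wn:limerick", "ex:hyponymOf", "wn:poem"),
    ("wn:shakespearean_sonnet", "ex:hyponymOf", "wn:sonnet"),
    ("wn:petrarchan_sonnet", "ex:hyponymOf", "wn:sonnet"),
]

# Roughly the patterns an RDQL WHERE clause would contain:
#   (?concept, ex:wordForm, "sonnet"), (?hypo, ex:hyponymOf, ?concept)
patterns = [
    ("?concept", "ex:wordForm", "sonnet"),
    ("?hypo", "ex:hyponymOf", "?concept"),
]
hyponyms = sorted(s["?hypo"] for s in query(patterns, triples))
print(hyponyms)
```

The shared variable ?concept joins the two patterns, which is the same mechanism that lets one RDQL query follow several links in the WordNet data.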
7. RDF Schema (RDFS).
- Lecture notes
- Further reading
8. Ontologies. Protégé and RDFS.
- Lecture notes
- Further reading
- Practical work with Protégé and RDFS
9. Building an ontology.
10. Ontology languages: RDFS, DAML+OIL, OWL.
- Lecture notes
- Tutorial on OWL (ISWC2003) by Sean Bechhofer, Ian Horrocks, Peter Patel-Schneider.
- Further reading
- Practical work with Protégé and OWL
- Copy the shell script semw-protege-beta to your directory and make it executable.
Start Protégé by: venus$ semw-protege-beta &
- Import(*) the example koala ontology and work through the Protégé OWL Tutorial.
(*) How to do imports:
"Import from format" only handles CLIPS format, not OWL.
The correct way to import an ontology (e.g. koala.owl) is to create a new ontology (e.g. mykoala.owl) and use the Metadata tab in Protégé to specify the ontology to be imported.
This adds an <owl:imports> statement to the new ontology.
See OWL Plugin FAQs and Protégé+OWL+Imports.
There is also an incorrect way, which takes the following 3 steps:
1. Start a new project (Project -> New).
Choose "OWL Files" in "Select Format".
2. Save the project immediately with the required name (e.g. koala.pprj).
This also creates a new "empty" OWL file (koala.owl).
Close the project.
3. Outside Protégé, overwrite the "empty" OWL file (koala.owl) with the "real" OWL file (koala.owl) from the existing ontology.
Then reopen the project in Protégé.
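To see what an <owl:imports> statement looks like in RDF/XML, the following Python sketch builds an otherwise empty ontology that imports another one. It only illustrates the resulting markup and is not what Protégé runs internally; the function name importing_ontology and the URI http://example.org/koala.owl are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Standard RDF and OWL namespace URIs.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
ET.register_namespace("rdf", RDF)
ET.register_namespace("owl", OWL)


def importing_ontology(imported_uri):
    """Build RDF/XML for an empty ontology that imports imported_uri."""
    root = ET.Element(f"{{{RDF}}}RDF")
    ontology = ET.SubElement(
        root, f"{{{OWL}}}Ontology", {f"{{{RDF}}}about": ""}
    )
    ET.SubElement(
        ontology, f"{{{OWL}}}imports", {f"{{{RDF}}}resource": imported_uri}
    )
    return ET.tostring(root, encoding="unicode")


# Hypothetical location of the ontology to be imported.
xml_text = importing_ontology("http://example.org/koala.owl")
print(xml_text)
```

The imported ontology is referenced by URI only, so the new file stays small and the original koala.owl is not modified.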
11. OWL: Web Ontology Language.
12. Jena and OWL: Jena ontology API.
- Lecture notes
- Further reading
- Practical work with Jena OWL syntax checker
- Copy the shell script semw-owl-checker to your directory and make it executable.
Run the Jena OWL syntax checker by:
venus$ semw-owl-checker fileURL
Examples:
semw-owl-checker file:food.rdf (file in same directory)
semw-owl-checker http://www.ling.helsinki.fi/kit/2004k/ctl310semw/OWL/camera.owl
- Assignment 5:
Jena and OWL: Verbalizing an ontology.
13. Linguistic ontologies: WordNet, OntoWordNet, FrameNet.
- Lecture notes
- Further reading
- Practical work: learn more ways to use WordNet
14. Semantic Web information extraction.
- Lecture notes
- Further reading
- Practical work with GATE
- Study GATE's DAML+OIL Exporter.
Copy the DAML+OIL ontology to your directory.
- Start GATE by: venus$ semw-gate &
Then repeat the first week's practical work.
Make a corpus containing Shakespeare's sonnet130.html and run ANNIE on it.
- Load the exporter (Processing resources -> New -> DAML+OIL Exporter).
Specify the ontology (file:Exporter.daml) and leave exportFilePath blank.
- Create a new corpus pipeline (Applications -> New -> Corpus Pipeline).
Select the DAML+OIL Exporter as the only processing resource, and run it (-> Run).
A mini-ontology sonnet130.html.daml is automatically populated with Persons.
- Assignment 6:
Semantic Web information extraction
© Graham Wilcock 2004.