University of Helsinki
clt232: Structured Documents Processing (Rakenteisten dokumenttien käsittely) - autumn 2006

Contact information

Department of General Linguistics
P.O. Box 9 (Siltavuorenpenger 20 A)
00014 University of Helsinki

Switchboard +358 (09) 1911
Fax +358 (09) 191 29307

7. Introduction to XML.

  • Lecture notes
  • Further reading
  • Practical work: Xerces and Emacs on venus
    • On Windows, start XSession and log in to venus via PuTTY (make sure "Enable X11 forwarding" is ticked in the PuTTY configuration).
    • Copy the shell script clt232-xerces to your own directory.
      Make it executable (chmod +x clt232-xerces).
      This script runs the open-source Xerces Java XML parser. It also counts the elements, attributes and characters in the XML file.
    • Copy the example XML file memory.xml to your directory. Parse it with
      ./clt232-xerces memory.xml
      to check that it is syntactically well-formed XML.
    • On venus, we'll use GNU Emacs (via XSession) to edit XML files and the Xerces shell script (via PuTTY) to parse and validate them.
    • Deliberate mistake 1.
      Edit memory.xml in Emacs by changing the first <memory> to <memoir>.
      Parse it again with Xerces to check that it is now syntactically ill-formed.
    • Deliberate mistake 2.
      Edit the file further by also changing the first </memory> to </memoir>.
      Parse it again. Is it now well-formed or ill-formed?
  • Practical work: Xerces and jEdit on Windows
    • On Windows, we'll use jEdit to edit XML files and jEdit's Xerces plugin, which runs the same Xerces Java parser inside jEdit, to parse and validate them.
    • On Windows, start jEdit and open Plugins -> Plugin Manager. In the Plugin Manager window, select the Install tab. In the Install tab, tick SideKick, XercesPlugin and XML, then click the Install button to install them.
    • Use jEdit's menus to open SideKick and XMLInsert, and use the docking options to dock them both on the right-hand side.
    • Copy the example XML file memory.xml again, this time to your temporary Windows folder. Open it in jEdit. Use SideKick to view the document structure. Use SideKick's Parse button to check that it is syntactically well-formed.
    • Deliberate mistake 1.
      Edit memory.xml in jEdit by changing the first <memory> to <memoir>.
      Parse it again to check that it is now syntactically ill-formed.
    • Deliberate mistake 2.
      Edit the file further by also changing the first </memory> to </memoir>.
      Parse it again. Is it now well-formed or ill-formed?
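
The well-formedness checks above can also be tried outside Xerces. The following is a minimal sketch in Python using the standard-library SAX parser, not the course's clt232-xerces script. The sample document, the class name CountingHandler and the function parse_and_count are assumptions made for illustration, since the real content of memory.xml is not shown here. It parses a document, reports whether it is well-formed, and tallies elements, attributes and text characters in roughly the way the script is described as doing.

```python
import xml.sax

# Hypothetical stand-in for memory.xml; the real file's content is not shown here.
ORIGINAL = '<memories owner="me"><memory>first</memory><memory>second</memory></memories>'


class CountingHandler(xml.sax.ContentHandler):
    """Tally elements, attributes and text characters during a SAX parse."""

    def __init__(self):
        super().__init__()
        self.n_elements = 0
        self.n_attributes = 0
        self.n_characters = 0

    def startElement(self, name, attrs):
        self.n_elements += 1
        self.n_attributes += len(attrs)

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate the lengths.
        self.n_characters += len(content)


def parse_and_count(text):
    """Return (elements, attributes, characters), or None if text is not well-formed."""
    handler = CountingHandler()
    try:
        xml.sax.parseString(text.encode("utf-8"), handler)
    except xml.sax.SAXParseException:
        return None
    return handler.n_elements, handler.n_attributes, handler.n_characters


# Deliberate mistake 1: rename only the first start tag.
mistake1 = ORIGINAL.replace("<memory>", "<memoir>", 1)
# Deliberate mistake 2: also rename the first end tag.
mistake2 = mistake1.replace("</memory>", "</memoir>", 1)

for label, doc in [("original", ORIGINAL), ("mistake 1", mistake1), ("mistake 2", mistake2)]:
    counts = parse_and_count(doc)
    print(label, "->", "ill-formed" if counts is None else f"well-formed {counts}")
```

In this sketch a start tag without a matching end tag makes the parse fail, while renaming both tags of the same element leaves the document parseable. Whether the same holds for the real memory.xml depends on how its first <memory> and first </memory> tags pair up.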

8. Validating XML.

  • Lecture notes
  • Further reading
  • Practical work: Xerces and Emacs on venus (continued)
    • Copy the DTD memory.dtd and the file memory1.xml to your own directory.
      memory1.xml refers to the DTD. Validate it against the DTD with
      ./clt232-xerces -v memory1.xml
    • Edit memory1.xml to make deliberate mistake 1 and validate it again.
      Edit it further to make deliberate mistake 2. Is it now valid or invalid?
    • Copy the Schema memory.xsd and the file memory2.xml to your directory.
      memory2.xml refers to the Schema. Validate it against the Schema with
      ./clt232-xerces -v -s memory2.xml
    • Edit memory2.xml to make deliberate mistake 1 and validate it again.
      Edit it further to make deliberate mistake 2. Is it now valid or invalid?
  • Practical work: Xerces and jEdit on Windows (continued)
    • Copy memory.dtd and memory1.xml to your temporary Windows folder.
      Open memory1.xml in jEdit and view the document structure in SideKick.
      Validate the file against the DTD by clicking the Parse button.
    • Edit memory1.xml to make deliberate mistake 1 and validate it again.
      Edit it further to make deliberate mistake 2. Is it now valid or invalid?
    • Copy memory.xsd and memory2.xml to your temporary Windows folder.
      Open memory2.xml in jEdit and view the document structure in SideKick.
      Validate the file against the Schema by clicking the Parse button.
    • Edit memory2.xml to make deliberate mistake 1 and validate it again.
      Edit it further to make deliberate mistake 2. Is it now valid or invalid?
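
The difference between well-formed and valid can be illustrated with a toy check. The sketch below is not a DTD or Schema validator and does not use Xerces. It uses a hand-written table of allowed child elements as a stand-in for a content model. The sample document, the table ALLOWED_CHILDREN and the function classify are all assumptions made for illustration; the real memory.dtd and memory.xsd are not shown here and may declare different elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for memory1.xml (the real file's content is not shown here).
ORIGINAL = "<memories><memory>first</memory><memory>second</memory></memories>"

# Hypothetical content model standing in for memory.dtd:
# element name -> set of child element names allowed inside it.
ALLOWED_CHILDREN = {
    "memories": {"memory"},
    "memory": set(),
}


def classify(text):
    """Return 'ill-formed', 'invalid' or 'valid' with respect to the toy model above."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return "ill-formed"  # not even parseable as XML
    for element in root.iter():
        if element.tag not in ALLOWED_CHILDREN:
            return "invalid"  # element is not declared in the model
        for child in element:
            if child.tag not in ALLOWED_CHILDREN[element.tag]:
                return "invalid"  # child is not permitted inside this element
    return "valid"


# Deliberate mistake 1: rename only the first start tag.
mistake1 = ORIGINAL.replace("<memory>", "<memoir>", 1)
# Deliberate mistake 2: also rename the first end tag.
mistake2 = mistake1.replace("</memory>", "</memoir>", 1)

for label, doc in [("original", ORIGINAL), ("mistake 1", mistake1), ("mistake 2", mistake2)]:
    print(label, "->", classify(doc))
```

A real validating parser checks much more than this table does, for example the declared root element, the order and number of children, and attribute declarations. The sketch only shows that a document can parse cleanly and still use an element that the content model does not declare.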

Assignment 3.

© 2001-2006 Graham Wilcock