University of Helsinki
clt361: Grammar Engineering - academic year 2009-2010


Department of Modern Languages


Difference Lists

  • Make a new subdirectory lkb-cfg-4 for this exercise. Copy all the files from ~gwilcock/delphin/lkb/src/data/itfs/g4diff into it.
  • Start LKB, use Load->Complete grammar, and select the script file from your lkb-cfg-4 directory to load the grammar. Parse the sentence "these dogs sleep".
  • This grammar is based on g2agr, but the use of ORTH is changed. In g2agr, words had an ORTH feature showing the orthography of the word as a string, but phrases did not have ORTH. In this grammar, phrases also have ORTH.
  • View the file types.tdl in your lkb-cfg-4 directory. Both word and phrase are subtypes of sign. A linguistic sign, according to Saussure, associates a form with a meaning. The ORTH feature gives the form as a list of strings. Check that the ORTH of the phrase "these dogs sleep" is the list < these, dogs, sleep >.
  • In this sign-based kind of grammar, every time two daughter phrases are combined into a mother phrase by a grammar rule, the ORTH lists of the daughter phrases are concatenated to make the ORTH list of the mother. This could be done with a list append operation, but it is much more efficient to use difference lists, because concatenating two difference lists only requires identifying the end of the first list with the start of the second, however long the lists are.

Open and closed lists

  • A list with a fixed number of items is called a closed list, and is terminated by REST *null*. For example a list of exactly two items aaa, bbb is [ FIRST aaa, REST [ FIRST bbb, REST *null* ]]. This can be written as <aaa, bbb>.
  • A list with an unknown number of items is called an open list, and is terminated by REST *list*. For example a list of two or more items aaa, bbb, ... is [ FIRST aaa, REST [ FIRST bbb, REST *list* ]]. This can be written as <aaa, bbb, ...>.
  • Because an open list is terminated by REST *list*, it is not really terminated. The length of the list in the final REST is unspecified. It might be an empty list (*null*) or a non-empty list (*ne-list*).
  • A closed sublist at the start of an open or closed list is called a prefix. For example the closed sublist <aaa, bbb> is a prefix of the open list <aaa, bbb, ...>.
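
The FIRST/REST encoding above can be mimicked outside LKB. The following is a minimal Python sketch, not LKB code: nested dictionaries stand in for feature structures, the strings "*null*" and "*list*" stand in for the list types, and the helper names make_list, is_closed and has_prefix are invented for this illustration.

```python
NULL = "*null*"   # terminator of a closed list
LIST = "*list*"   # underspecified tail of an open list


def make_list(items, open_ended=False):
    """Build a FIRST/REST structure; an open list ends in *list*."""
    tail = LIST if open_ended else NULL
    for item in reversed(items):
        tail = {"FIRST": item, "REST": tail}
    return tail


def is_closed(fs):
    """A list is closed if following REST eventually reaches *null*."""
    while isinstance(fs, dict):
        fs = fs["REST"]
    return fs == NULL


def has_prefix(fs, prefix):
    """Check that the items of prefix are the first items of the list fs."""
    for item in prefix:
        if not isinstance(fs, dict) or fs["FIRST"] != item:
            return False
        fs = fs["REST"]
    return True


closed_two = make_list(["aaa", "bbb"])                  # <aaa, bbb>
open_two = make_list(["aaa", "bbb"], open_ended=True)   # <aaa, bbb, ...>
```

In this sketch, has_prefix(open_two, ["aaa", "bbb"]) holds, which corresponds to <aaa, bbb> being a prefix of <aaa, bbb, ...>.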

Difference lists

  • A closed list can be represented as the difference between two other lists. For example <aaa> is the difference between <aaa, bbb, ...> and <bbb, ...>. In this case, the list <bbb, ...> serves as a kind of pointer to the end of the prefix <aaa> inside the list <aaa, bbb, ...>.
  • In LKB, difference lists are feature structures of type *diff-list* with two attributes LIST and LAST. The value of LIST is an open list, whose final REST has the value *list*. The value of LAST is co-indexed with the value of that final REST in LIST.
  • For example <aaa, bbb> can be represented by the difference list
    [ LIST [ FIRST aaa, REST [ FIRST bbb, REST #last ]], LAST #last ].
    In the short notation, this can be written as <! aaa, bbb !>.
  • View the definition of word in types.tdl in your lkb-cfg-4 directory. The ORTH feature is a difference list. It contains exactly one string, the value of LIST FIRST. The value of LIST REST is #end, and the value of LAST is also #end.
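
The co-indexing of LAST with the final REST can be pictured as two references to one and the same object. Here is a rough Python sketch under that assumption; it is not LKB code, and the Tag class and the helpers diff_list and contents are hypothetical names introduced only for this illustration.

```python
class Tag:
    """Stands in for a reentrancy tag such as #last or #end."""

    def __init__(self, name):
        self.name = name


def diff_list(items, tag_name="last"):
    """Build [ LIST <items ending in #tag>, LAST #tag ]."""
    tag = Tag(tag_name)
    lst = tag
    for item in reversed(items):
        lst = {"FIRST": item, "REST": lst}
    # LAST is the very same object as the final REST of LIST.
    return {"LIST": lst, "LAST": tag}


def contents(dl):
    """Read off the items between LIST and LAST."""
    out = []
    node = dl["LIST"]
    while node is not dl["LAST"]:
        out.append(node["FIRST"])
        node = node["REST"]
    return out


two = diff_list(["aaa", "bbb"])          # <! aaa, bbb !>
word_orth = diff_list(["dogs"], "end")   # shaped like the ORTH of a word
```

Here two["LIST"]["REST"]["REST"] and two["LAST"] are the same object, which is what the shared tag #last expresses in the feature structure.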

Concatenation with difference lists

  • View the definition of phrase in types.tdl. This definition ensures that the ORTH values of the two ARGS daughters are concatenated to give the ORTH value of the mother. The first daughter has ORTH LIST #first and ORTH LAST #middle, and the second daughter has ORTH LIST #middle and ORTH LAST #last. The mother has ORTH LIST #first and ORTH LAST #last. Because #middle is shared, the second daughter's list becomes the tail of the first daughter's list, so no separate append step is needed.
  • Using a text editor, edit lexicon.tdl. Add lexical entries for "cats" and "walk". Reload the grammar and parse "these cats walk". Check that the ORTH value of the subject NP is < these, cats > and that the final ORTH value of the parsed sentence is < these, cats, walk >.
  • The phrases in this grammar always have two daughters. Can you define a type phrase-3 with three daughters which ensures that the mother's ORTH value is the concatenation of the three daughters' ORTH values? Can you also define a type phrase-1 with only one daughter so that the mother's ORTH value is the same as the daughter's ORTH value?
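
The two-daughter concatenation described above can be simulated in Python to see why it takes a single step. This is only a sketch under simplifying assumptions, not how LKB implements unification: the Var class plays the role of a tag such as #middle, tuples of (FIRST, REST) stand in for list nodes, and all function names are invented for the illustration.

```python
class Var:
    """A reentrancy tag such as #middle; ref stays None while unbound."""

    def __init__(self):
        self.ref = None


def deref(x):
    """Follow bound tags until reaching a list node or an unbound tag."""
    while isinstance(x, Var) and x.ref is not None:
        x = x.ref
    return x


def diff_list(words):
    """Build a difference list whose LAST is the unbound end of LIST."""
    last = Var()
    lst = last
    for word in reversed(words):
        lst = (word, lst)          # (FIRST, REST)
    return {"LIST": lst, "LAST": last}


def concat(left, right):
    """Mother ORTH: identify left's LAST with right's LIST (one step)."""
    tail = deref(left["LAST"])
    assert isinstance(tail, Var), "left daughter's LAST is already closed"
    tail.ref = right["LIST"]       # the shared #middle tag
    return {"LIST": left["LIST"], "LAST": right["LAST"]}


def items(dl):
    """Read off the strings between LIST and LAST."""
    out = []
    node, end = deref(dl["LIST"]), deref(dl["LAST"])
    while node is not end:
        first, rest = node
        out.append(first)
        node = deref(rest)
    return out


these, cats, walk = diff_list(["these"]), diff_list(["cats"]), diff_list(["walk"])
np = concat(these, cats)           # subject NP
s = concat(np, walk)               # whole sentence
```

Note that concat never walks along the lists; it binds one tag, whatever the lengths of the daughters' lists. The daughters also keep their own ORTH values, since their LAST still marks where their part of the shared list ends.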
© 2006-2010 Graham Wilcock
