[Up] [Next]
Go up to The Basics of Prolog: List Processing
Go forward to List Processing

Lists

The previous chapters have stressed the importance of the arity of a predicate: for a goal to match a head, not only must their predicates have the same name, but they must also have the same number of terms. For many purposes, however, the input data does not consist of a fixed number of units. For us, this is particular relevant when writing grammars for natural languages: we need to deal with sentences of arbitrary length.

Prolog provides a built-in mechanism for dealing with this issue. The Prolog structure is called a LIST, and it is an implementation of a mathematica object called a SEQUENCE. A sequence is a single object which is made up by an arbitrary number of components. The following is an inductive definition of SEQUENCE: SEQUENCE (an inductive definiton is one in which we give a small number of rules from which all appropriate object can be generated, such as in the definitions of the syntax of propositional logic and predicate logic).

Given a set S of sequence elements, we can characterise the set S' of finite sequences as the smallest set satisfying the following conditions:
  1. The empty sequence <> is a sequence.
  2. If s is a sequence, and e is an element of S, then the pair (e,s) is a sequence. e is called the HEAD of the sequence, and s is called its TAIL.
(1,(2,(3,<>))) is an example of a sequence. The head of this list is the element 1. The tail is the sequence (2,(3,<>)). As all sequences, the innermost compound sequence of this one ends with an empty sequence.

The Prolog representation of a sequence is, as noted above, called a LIST. The figure below shows a schematic representation of a list.

Figure 1 : A schematic representation of a Prolog list.
  In Prolog, the empty list is represented as [], and a list consisting of the element X followed by the list L is represented as [X|L]. In general, a sequence such as
(a, (b, (c, (d ... (n,S)...))))
corresponds to the Prolog list
[a|[b|[c|[d|[...|[n|S]...]]]]]
The notation with all these brackets is difficult to read, so Prolog programmers tend to use a simplified notation, in which the sequence above is written as
[a, b, c, d,...,n|S]
(This is also how SICStus itself displays lists.) When S in the example above is the empty sequence, the expression can be further abbreviated to
[a, b, c, d,...,n]
For most purposes, these last two ways of expressing lists are most practical.

While we will mostly use the simplified list notation, it is important to remember that even though a list may look like a linear structure, it is a pair of entities--a binary structure--as illustrated in the figure. Only the first element is accessible as a unit; the second one is only accessible through the next list. We use this structure to effectively hang one element on each head. In the [a, b, c, d,...,n] list, we hang a on the first head, b on the second head, etc.


<a.von.klopp@bangor.ac.uk>

[Up] [Next]