[Previous] [Up] [Next]
Go backward to Lists
Go up to The Basics of Prolog: List Processing
Go forward to Exercises

List Processing

A Simple List Processing Program

The inductive definition of SEQUENCE consists of a base case (the empty list is a list) and a recursive case (a pair starting with an element and ending in a list is a list). This means that we can define what a list is with the help of a recursive Prolog procedure! The following program defines a procedure which can be used to test whether an argument is a list or not:

list([]).
list([_Head|Tail]):- 
         list(Tail).
This program checks whether a term is a list by first checking whether it is an empty list. If it is not an empty list, it tries to match the term to a list with a head and a tail. The head of a list can have any structure--for instance, it can be a constant, a variable or another list--so all the program tests is that there is a head at all. Then it verifies that the tail of the list is a list using the same procedure.

Procedures that manipulate list are typically recursive: they have a base case which specifies what happens either when the end of the list is reached, or when a specific element is found. The recursive clause (or clauses) removes the head of the list and applies the same predicate to the tail.

Testing for Membership

One of the most commonly used list processing program is the member/2 procedure. It checks if a specified element is a member of a list.

member(Element,[Element|_]).
member(Element,[_|Tail]) :-
        member(Element,Tail).
The base case says that the procedure succeeds when the specified element is the head of the list (remember that element always sit on the head of the binary structure). In that case, the tail of the list can be ignored--hence the underscore, the anonymous variable.

The recursive case, is invoked when Element does not match the head the list. In that case, we remove the head and concentrate on the tail of the list instead. member/2 is called again, now with the tail of the first list as its second argument. Since the interpreter will already have rejected the base case if the recursive clause is used, we can ignore the head of the list in this case.

The fact that can (and should) use underscores in the two clauses is a typical example of something which follows from Prolog's control rather than from the logic of the program.

Let us look at the member/2 procedure in action. The first example shows how member finds an element in a list. When it fails to match the element to the head of the list, it strips off the head and proceeds to check the tail. When the element is found, Prolog steps back to the original query, putting each head back on the list as it goes back through the proof tree.

| ?- member(c,[a,b,c]).
member(c,[a,b,c]).
   1  1  Call: member(c,[a,b,c]) ? 
   2  2  Call: member(c,[b,c]) ? 
   3  3  Call: member(c,[c]) ? 
   3  3  Exit: member(c,[c]) ? 
   2  2  Exit: member(c,[b,c]) ? 
   1  1  Exit: member(c,[a,b,c]) ? 
yes
{trace}
If no element of the list matches the specified element, Prolog strips all heads of until it arrives at the end of the list, which is an empty list. The empty list is special, because it is the only list that does not consist of a pair of elements (a head and a tail). Since both the clauses of the member/2 procedure applies to lists which consist of two elements the fourth call of member does not match any clause in the database, and hence fails, so the entire goal fails.
| ?- member(d,[a,b,c]).
member(d,[a,b,c]).
   1  1  Call: member(d,[a,b,c]) ? 
   2  2  Call: member(d,[b,c]) ? 
   3  3  Call: member(d,[c]) ? 
   4  4  Call: member(d,[]) ? 
   4  4  Fail: member(d,[]) ? 
   3  3  Fail: member(d,[c]) ? 
   2  2  Fail: member(d,[b,c]) ? 
   1  1  Fail: member(d,[a,b,c]) ? 
no
{trace}
Whether a predicate succeeds or not is a matter of unification--whether the current goal matches any of the heads of the facts and rules in the database. Hence, if a list contains a variable in some position, the specified element will unfify with it and the member predicate succeeds:
| ?- member(c,[a,b,X,d]).
member(c,[a,b,X,d]).
X = c ? 
 
yes

Open and Closed Lists

Unification is a very important property when it comes to list processing. So far, all the queries were made with CLOSED LISTS, i.e. lists where the last element is an empty list (which is what the notation where the list appears to end with an element really means). In an OPEN LIST the last tail of the list is unspecified (represented as ...|Tail])--it can point to anything. This is what happens if you call the member predicate with an OPEN LIST:

| ?- member(d,[a,b,c|Tail]).
member(d,[a,b,c|Tail]).
   1  1  Call: member(d,[a,b,c|_91]) ? 
   2  2  Call: member(d,[b,c|_91]) ? 
   3  3  Call: member(d,[c|_91]) ? 
   4  4  Call: member(d,_91) ? 
   4  4  Exit: member(d,[d|_1015]) ? 
   3  3  Exit: member(d,[c,d|_1015]) ? 
   2  2  Exit: member(d,[b,c,d|_1015]) ? 
   1  1  Exit: member(d,[a,b,c,d|_1015]) ? 
 
Tail = [d|_A] ? ; 
 
    1  1  Redo: member(d,[a,b,c,d|_1015]) ? n
 
Tail = [_A,d|_B] ? ;
 
Tail = [_A,_B,d|_C] ? ;
 
Tail = [_A,_B,_C,d|_D] ? ;
 
Tail = [_A,_B,_C,_D,d|_E] ? 
 
yes
{trace}
(The n in the redo line switched trace off, so that only the resulting bindings are shown. You can trace the rest for yourself.)

As can be seen, when the tail is reached, and Prolog calls the member predicate with an unspecified list, this list is unified with one consisting of the specified element as head and a variable tail according to the first clause of the member/2 procedure. And not only this, if we ask Prolog to redo the last goal, it can be unified with the second clause as well, giving further bindings.

This shows that whether a list has a tail that is a variable or a tail that is an empty list is very important. You must always bear in mind what type of list you are using and write your predicates accordingly.

Concatenating Lists

Lists are very important when it comes to natural language processing. We need to represent strings of arbitrary length both in say morphology and syntax. In computational morphology, we represent words as lists of letters, and in computational syntax, we represent sentences as lists of words.

It will be useful to consider how parsers deal with sentences. This is a common way: say that we have the sentence Otters like fish. We will represent that as a list, i.e.

[otters,like,fish]
The parser proceeds by trying to prove that this list is a sentence. Its method is to try and prove that the list can be divided into two parts: one that corresponds to an NP ([otters]) and one which corresponds to a VP ([likes,fish]). When a list corresponding to an NP is concatenated with a list corresponding to a VP, we obtain a sentence.

This means that concatenation of lists is a very important aspect of parsing. The procedure conc/3 below is a simple program for concatenating lists.

% conc(X,Y,Z) if X concatenated to Y is Z
% X,Y,Z lists
 
conc([],Result,Result).
conc([Head|Tail_First],List,[Head|Tail_Result]) :-
              conc(Tail_First,List,Tail_Result).
Once again, there is a base case, which in this procedure says that the procedure should stop when the emtpy list is reached. The recursive clause is the one that does the work. If we call conc/3 with two lists which we want concatenated to one another, it will take of the head of the first list and then try to concatenate the tail of the first list to the second list. At the same time, it says that the result of concatenating the two lists should have the head element on it.

This means that Prolog will step through the conc/3 predicate trying to concatenate successively shorter first list with a second list, to eventually try with an empty list. If the empty list is concatenated with another list, the result should be a copy of the latter, and this is what the base case sees to.

As we step back through the proof tree, each element of the first list is put back on the resulting list, and we end up with the two lists concatenated:

| ?- conc([a,b,c],[d,e,f],Result).
conc([a,b,c],[d,e,f],Result).
   1  1  Call: conc([a,b,c],[d,e,f],_109) ? 
   2  2  Call: conc([b,c],[d,e,f],_413) ? 
   3  3  Call: conc([c],[d,e,f],_635) ? 
   4  4  Call: conc([],[d,e,f],_856) ? 
   4  4  Exit: conc([],[d,e,f],[d,e,f]) ? 
   3  3  Exit: conc([c],[d,e,f],[c,d,e,f]) ? 
   2  2  Exit: conc([b,c],[d,e,f],[b,c,d,e,f]) ? 
   1  1  Exit: conc([a,b,c],[d,e,f],[a,b,c,d,e,f]) ? 
 
Result = [a,b,c,d,e,f] ? 
 
yes
{trace}
(Of course, conc/3 does not have to be called with the first two arguments instantiated. You can try it with the only the third argument instantiated for instance.)

Finally, reverse is a procedure for reversing the elements of a list. Try to figure out how it works.

% reverse(X,Y), X,Y lists in reverse order
 
reverse([],[]).
reverse([H|T],L) :-
      reverse(T,T1),
      conc(T1,[H],L).

<a.von.klopp@bangor.ac.uk>

[Previous] [Up] [Next]