[Previous] [Up] [Next]
Go backward to Introduction
Go up to Difference Lists and Definite Clause Grammars
Go forward to Definite Clause Grammars

Difference Lists


The implementation of the production rules in Context Free Grammars conformed to the following format:

The rules state that a list corresponds to a certain constituent provided it can be proven that the parts of the list correspond to various daughter constituents. The implementation uses the textttconc/3 procedure introduced in List Processing.

Context Free Grammars contains a trace of a query for these rules. Although the sentence in the query is very short, the trace shows that conc/3 is used a lot. With the short sentence, the process is relatively swift, but as conc/3 goes through one step for each individual element of the first list, it is clear that the number of calls to it would increase significantly with a longer sentence. Since a call to conc/3 is made at least once for every constituent (more if the interpreter pursues a blind alley), this means a lot of computation. In fact, the time it takes to prove a query increases exponentially as the sentences become longer.

It is however possible to avoid all this extra computation, by using a method for representing a list which is known as a DIFFERENCE LIST. With difference lists, it is possible to implement concatenation in a single call no matter how long the lists are. The operation is performed as a unification in one step, and so does not require the computation involved when the elements are stripped of one by one.

Using Variable Tails to Add Items to the End of a List

The difference list notation exploites the Prolog interpreter's behaviour when processing open-ended lists (lists whose tails are variables). Remember what happens when we call member with an open-ended list:

| ?- member(d,[a,b,c|Tail]).
Tail = [d|_A] ? 
To verify that the d really has been added to the list, the query can be reformulated as follows: call the goal again as follows:
| ?- member(d,List), List =  [a,b,c|Tail].
List = [a,b,c,d|_A],
Tail = [d|_A] ? 
This shows clearly that the d is now part of List. In fact, we have used member/2 to append an item at the end, which is possible because the list ended in a variable tail. It is precisely this property that will be exploited.

As it happens, member/2 is just as inefficient as conc/3, because it also strips off all individual items before it adds something (and moreover, it is only possible to add individual elements). But we do not have to go via member/2 to achieve this behaviour. Consider the following:

| ?- List = [a,b,c|X], X = [d].
List = [a,b,c,d],
X = [d] ? ;
Here, we add the list of d to the end of List. Since the tail is always a list, an arbitrary number of elements, i.e. a list of any length, can be added in this way:
| ?- List = [a,b,c|X], X = [d,e,f].
List = [a,b,c,d,e,f],
X = [d,e,f] ? ;
It is worth noting that when we start with an open-ended list, and bind its tail to another list, there is only one solution. This is clear from the examples above--in both cases, the interpreter replied no when we asked if there were more solutions.


As the last example in the previous section shows, unifying a list tail with another list is way of concatenating lists. To see that there is a good reason to use the latter method if possible, compare the query made with conc/3 and the query with a tail unification below:

| ?- conc([a,b,c],[d,e,f],X).
   1  1  Call: conc([a,b,c],[d,e,f],_109) ? 
   2  2  Call: conc([b,c],[d,e,f],_403) ? 
   3  3  Call: conc([c],[d,e,f],_625) ? 
   4  4  Call: conc([],[d,e,f],_846) ? 
   4  4  Exit: conc([],[d,e,f],[d,e,f]) ? 
   3  3  Exit: conc([c],[d,e,f],[c,d,e,f]) ? 
   2  2  Exit: conc([b,c],[d,e,f],[b,c,d,e,f]) ? 
   1  1  Exit: conc([a,b,c],[d,e,f],[a,b,c,d,e,f]) ? 
X = [a,b,c,d,e,f] ? 
| ?- List = [a,b,c|X], X = [d,e,f].
   1  1  Call: _59=[a,b,c|_103] ? 
   1  1  Exit: [a,b,c|_103]=[a,b,c|_103] ? 
   2  1  Call: _103=[d,e,f] ? 
   2  1  Exit: [d,e,f]=[d,e,f] ? 
List = [a,b,c,d,e,f],
X = [d,e,f] ? 
There are two things to note here: firstly, it is possible to produce behaviour similar to that of conc/3 with unification only. Secondly, the second method is much more efficient: two goals, as opposed to four. The difference would have been even more noticable if the lists had been longer, because the length of the call to conc/3 increases with every element on the list, whereas the latter procedure remains fixed at two goals.

Using Tail Concatenation in Practice

The query consisting of the two goals List = [a,b,c|X], X = [d,e,f] performs the task of concatenating the two lists [a,b,c] and [d,e,f] but what we need is to replace the call to conc/3 in

But we cannot just write
X = [Y|Z]
instead of conc(Y,Z,X) because that means that Y the element at the head of the list. Instead, we need to incorporate the information that Y is an open-ended list and provide access to the tail of Y. This is where we introduce the concept of a DIFFERENCE LIST. A difference list represents a single list as the "difference" between two lists. The general idea is that we want to be able to say that the list [john] is the difference between the list [john,loves,mary] and [loves,mary]. The code below shows how we can do this.
It is not entirely obvious what these rules do, so let us concentrate on the s/2 rule to explain. It says that part of the list Sentence (the part from the first element to the tail that is identical to Hole) is a sentence provided we can prove that: We can try this grammar out if we add a few dictionary rules such as the following:
Here is a trace for [john,loves,mary] (compare this with the trace at the end of Context Free Grammars.
| ?- s([john,loves,mary],[]).
   1  1  Call: s([john,loves,mary],[]) ? 
   2  2  Call: np([john,loves,mary],_360) ? 
   3  3  Call: noun([john,loves,mary],_360) ? 
   3  3  Exit: noun([john,loves,mary],[loves,mary]) ? 
   2  2  Exit: np([john,loves,mary],[loves,mary]) ? 
   4  2  Call: vp([loves,mary],[]) ? 
   5  3  Call: verb([loves,mary],_1359) ? 
   5  3  Exit: verb([loves,mary],[mary]) ? 
   6  3  Call: np([mary],[]) ? 
   7  4  Call: noun([mary],[]) ? 
   7  4  Exit: noun([mary],[]) ? 
   6  3  Exit: np([mary],[]) ? 
   4  2  Exit: vp([loves,mary],[]) ? 
   1  1  Exit: s([john,loves,mary],[]) ? 
The second argument to s/2 was given as []. This is because the tail part of [john,loves,mary] that follows after the part that is the sentence is the empty list. In general, when we make a query about a list of words, the second argument should be an empty list. This is because the phrase will normally be represent as a closed list, and the query should only succeed if it can prove that the entire list corresponds to the constituent (there should be no words left unaccounted for).

The grammar described in this section is available as grammar2.pl in the directory ~avk/Temp/.


[Previous] [Up] [Next]