[Previous] [Up] [Next]
Go backward to Introduction
Go up to The Basics of Prolog: Logic and Control
Go forward to Control

The Logic of Prolog

Horn Clauses: the Logical Language of Prolog

One property one would like to be able to expect from Prolog, given the remarks above, is that if there is a proof of a theorem given the premises in a program, the interpreter should be able to find it. Also, if there are different proofs, the interpreter should find all of them. There is a term for this: we say that the search strategy should be COMPLETE.

It is however difficult to devise automated methods that operate on all types of predicate logic sentences. For this reason, the logical language Prolog uses is a restricted form of predicate logic, which only allows sentences of a certain format. These are known as HORN CLAUSES (see chapter 1 in Pereira and Shieber (1987) for a detailed explanation).

You may be surprised to hear that despite the fact that predicate logic is the paradigm of scientific research (reasoning which does not conform to the rules of predicate logic is considered invalid), in the general case, predicate logic is not complete. It can however be made complete with fewer restrictions than is the case with Horn Clauses.
There are three types of Horn clause: the FACT, the RULE and the QUERY (DIRECTIVE). The premises that programs consist of are facts and rules. The theorems which the interpreter tries to prove are queries.


Facts Represent Simple Statements

The simplest type of premise is the FACT, which corresponds to a simple sentence in predicate logic. The natural language sentences in (1) below could be represented in Prolog (or to be correct, in Edinburgh Prologs--Computer languages, like natural languages, have dialects. This book follows the conventions of the Edinburgh family of Prologs--other dialects have different ones) by the sentences in 5--8

  1. Lesley is Welsh.
  2. Lesley is Lee's teacher.
  3. Lesley loves Sandy.
  4. Stirling is between Edinburgh and Glasgow.
teacher(lesley, lee).
loves(lesley, sandy).
between(stirling, edinburgh, glasgow).
Facts consist of a PREDICATE followed by and a fixed number of ARGUMENTS inside brackets. In Prolog, the arguments are usually referred to as TERMS.

The predicate corresponds to a property or relation, and it is usually given a mnemonic label (as opposed to the single letters common in logical analysis), while the terms represent some type of data--in this case references to individuals. All the labels (the names of the predicate and of the terms arguments) start with a lowercase letter. This means that they stand for a CONSTANTS--the property of being Welsh is only true of Lesley. The combination of predicate and its terms is called a goal. A fact is a Prolog sentence consisting of a single goal (Prolog sentences are delimited by full stops).

The number of terms a predicate takes is called its ARITY, and it is as important as the label. The predicate teacher in the example above has two terms: we say that it has arity two, and refer to it as teacher/2. teacher/2 would be distinct from, say, teacher/1, and as far as the Prolog interpreter is concerned they are completely unrelated.

When a program is loaded by the Prolog interpreter, the set of facts that make up the program represent all that is known to be true. However, as in formal logic, the facts have a meaning only in the sense that they have a truth value; they do not have an interpretation in terms of an external reality.

The Order of Terms

As with any formal language, the order of the arguments of a predicate is up to the person who translates the facts into the language, as long as they are consistent. Above, Lesley is Lee's teacher was translated as teacher(lesley,lee). If the programmer also has to translate the sentence Sandy's teacher is Lou, the order of the terms must be consistent with the roles, and not with the superficial order of the words in the sentence. Hence, it would be expressed as

In order to remember arbitrary decisions such as the order of terms, programmers add COMMENTS to programs (all programming language interpreters and compilers provide some means to mark parts of a file as not belonging to the actual program). For instance, in Prolog, a percentage sign (%) means that the remaining characters on the line should be ignored. In order to remember what each of the terms in the teacher/2 relation stands for, the program could contain the following comment:
% teacher(Teacher,Pupil): Teacher is the teacher of Pupil.
teacher(lesley, lee).
teacher(lou, sandy).

Inferring Knowledge

It is important to remember that the labels chosen in the translation mean nothing to the Prolog interpreter. It will be useful to consider an example such as the sentence Stirling is between Edinburgh and Glasgow which we translated into Prolog as

between(stirling, edinburgh, glasgow).
Somebody who understands English knows what spatial relation between denotes. With this knowledge, they would readily accept that if one of the sentences below is an accurate description of the positions of Glasgow, Stirling and Edinburgh, so is the other.
Stirling is between Edinburgh and Glasgow.
Stirling is between Glasgow and Edinburgh.
However, the Prolog interpreter knows nothing about English--all it does is to compare meaningless strings of symbols. Unless the programmer explicitly specifies that an inference can be made, the Prolog facts below would be treated as completely unrelated.
between(stirling, edinburgh, glasgow).
between(stirling, glasgow, edinburgh).


Suppose that we want to use the four facts in example 5--8 as a program. The first thing we need to do is to write them to a file. Then this file is loaded into the interpreter. If the file is called facts.pl, we can use the command consult to load it, like this:

?- consult('facts.pl').
Press return and the file will be loaded Running Prolog Under Emacs explains the process in more details).

You must remember to type a full stop--if there is no full stop the interpreter assumes that you have not finished your query--though you can type it after the carriage return if you forget.

When a program consisting of a set of facts has been loaded into the Prolog interpreter, it is possible to ask questions which refer to these facts. There are two types of questions that can be asked. You can ask


As an example of a yes/no-question, we could ask if it is true that Lesley is Welsh using the following query:

| ?-  welsh(lesley).
Likewise, we can ask if it is true that Sandy is a Welsh:
| ?-  welsh(sandy).
The latter query illustrates an important principle in Prolog. The database did not include any information about Sandy's nationality, but even so, the interpreter replies to the query whether Sandy is Welsh with no. This is because Prolog conforms to what is called the CLOSED WORLD ASSUMPTION about the contents of the database. This means that if a statement cannot be shown to be true, it is assumed to be false. For this reason, it is better to treat Prolog's no (and its negation in general) as corresponding to "not provable".


Queries corresponding to wh-questions are asked by filling one or more of the argument positions of a predicate with VARIABLEs. We indicate that a term is a variable by using a label that starts with an uppercase letter. When a query containing a variable is made, Prolog tries to find a BINDING for it. As an example, the following query corresponds roughly to the question Who is Welsh? (or perhaps more accurately to the command Name somebody who is Welsh).

| ?- welsh(Who).
Who = lesley
When a query requesting a binding of a variable is made, the prompt does not return. This is because the interpreter only shows one binding at the time to queries of this type. It is waiting for the user to state whether to try to find further bindings or whether to exit the query at this point. Typing a semi-colon causes the interpreter to search for further bindings of Who, whereas typing a carriage return exits the query. In the former case, since there are no more bindings in the database, the interpreter would return no (this time meaning "no more answers"), whereas in the latter case the response is yes.

It is possible to ask for more than one variable binding in a single query. For instance, we could ask a question equivalent to Who loves whom? by typing the following query:

| ?- loves(Who, Whom).
Who = lesley
Whom = sandy <cr>
The use of Who and Whom in the query above does not reflect an attempt at preaching prescriptive grammar. If the query is to allow for the possibility that the agent and the patient are different individuals, it is necessary to use different variable names. The query loves(Who,Who) would not find any bindings for Who, because the database does not contain any facts about a person who loves him or herself. The property that one variable name must have the same binding in every instance holds within single queries (or more generally, holds within single sentences). The same variable name can however be used in different queries, with different bindings.

In addition to asking simple questions such as the ones above, it is also possible to ask complex questions, such as Who is Welsh and a teacher? This is illustrated by the following query:

| ?- welsh(Who), teacher(Who, _).
Who = lesley <cr>
The comma between the two parts of the query corresponds to the AND of formal logic. The second part of the query is written as teacher(Who,_), which raises two questions: why an underscore, and why teacher(Who,_) as opposed to teacher(Who)?

To begin with the first question, the underscore is called the anonymous variable--it signifies that the term in the corresponding position may have any value (or be of any format), but that the value is not interesting. If an ordinary variable had been used, e.g. Whom, the query would have been more like asking the question Name somebody who is both Welsh and a teacher, and name one of their pupils.

The answer to the second question is that predicates are defined not by their names, but by their names and their arity. As far as the Prolog interpreter is concerned, there are no facts in the current database which relate the property of being a teacher to a single object. In fact, if teacher(Who) was posed as a query, the following would happen:

| ?- teacher(Who).
{EXISTENCE ERROR: teacher(_28): procedure user:teacher/1 does not exist}
Hence the format of the query.


The last query in the previous section, corresponding to Name somebody who is both Welsh and a teacher!, effectively defines a new class of objects: teachers who are Welsh. If, for the purposes of the program, it would turn out that this is a class that needs to be referred to over and over again, the programmer may wish to define the class in the program. This is done by using a RULE. Here is a rule which defines a new predicate welsh_teacher/1 in terms of the welsh/1 and teacher/2 predicates from Fact.

A rule has two parts, separated by :-. The first part is called the HEAD of the rule, and consists of a single term. The second part is called the BODY, and consists of one or more terms, separated by commas. As before, commas stand for logical conjunction, whereas the :- operator stands for logical implication, i.e. the rule above roughly corresponds to the statement If a person is a teacher (of someone) and the person is Welsh, then the person is a teacher who is Welsh. A Prolog rule hence corresponds to a conditional (with the consequent before the antecedent).

If this rule is added to the file containing the facts from before, and the file is loaded into the interpreter again, it will now be possible to pose queries with the unary predicate welsh_teacher/1 to the same effect as the combined query in Queries.

| ?- welsh_teacher(Who).
Who = lesley ? 
What happens if this query is typed is that the interpreter tries to match the query to a fact or a rule. In this case, the query matches the head of a rule. The terms in the body of the rule effectively become the new queries, which in turn match two known facts.

A Prolog Database

A Prolog program consists of a mixture of facts and rules, often called the DATABASE. Queries are used to access the database. They are all Horn clauses, and the three types of clause are related. A fact is a rule without a body, i.e. a rule without conditions--it is always true. A query is a rule without a head. It is an antecedent without a consequent, and what happens when the query is typed at the prompt is that the interpreter tries to find a consequent to the antecedent.


[Previous] [Up] [Next]