Term Objects

Prolog does not feature the kind of variables and data structures that are usually found in imperative programming languages. The value of a variable is tight to the progress of the search procedure and as a data structure we usually find the term model. The Prolog API allows the creation and inspection of these structures.

There are various choices for the internal representation of the term model and of the variables. The choice must at least support efficiently the following operations:

We follow the same approach that has already been followed in the DEC-10 Prolog implementation [1]. In this approach a dynamically instantiated clause is represented by two pointers. One pointer referring to the so called skeleton of the clause and the other pointer referring to the instantiation frame of the clause. The frame will hold the fresh variables of the clause. Each new instantiation of the clause requires a new frame. The skeleton contains indexes into the frame. The pair of the two pointers is usually called a molecule.

In the original DEC-10 Prolog implementation such a molecule could be represented in one machine word. We do not bother with machine representation in our Prolog interpreter. Since machine representation would violate the security provided by our object oriented host languages. We also do not try to generate byte code for the Java virtual machine. Byte code can only capture the surface of a clause and will not help when a real unification is required or when heavy backtracking is involved. We have evidence of this fact by our test scenarios which show this trend.

This is also not to say that we are against local or global program analysis. But we also believe that this can be done inside an object model without regress to some byte code. And even eventually such an analysis can be done dynamically and varying during runtime which invalidates the idea of static byte code to some extent. Proof of concept that an object model can easily be instrumented is our development Prolog which has been built on top of the runtime Prolog. Further the runtime Prolog itself shows some optimizations such as tail recursion and indexing without the use of any byte code.

Nevertheless we have also to be careful when using an object oriented programming language for implementation. For example it would not be feasible to use little pair objects to represent the molecules, instead we pass around two arguments. Further to avoid stack overflow we have to manually convert most of the needed internal routines so that they take advantage of tail recursion. Many of these details will be hidden from the application programmer by the Prolog API. So that externally we will provide objects representing a pair of a frame and a skeleton for the convenience of the application programmer.

In the following we see frames and skeletons in action. We show the object configuration before and after the successful unification of f(X) = f(g(Y)). Each frame is basically an array of molecule containers. A un-instantiated variable corresponds to an empty molecule which we represent by a null (⊥) frame and a null (⊥) skeleton:

Picture 8: Example Unification

The result of the unification is the instantiation X = g(Y). The instantiation is stored in the frame. Both the frame and the skeleton of the instantiation source are stored in the instantiation target molecule. On the other hand the skeletons themselves are left untouched. The skeletons are only referenced completely or partially by eventual instantiations.

Unification is an operation that is heavily used by the interpreter during defined predicate execution. Each successful logical inference is based on the unification between a given goal and a clause head. Nevertheless we will less often see unification in the application programming interface. Instead we will provide the application programmer explicit construction and deconstruction operations. The reason is that these operations are easier to deal with than using a detour over unification based on patterns. Also it is not clear where the needed patterns come from in the first place.

Constructors solve the chicken egg problem for the existence of patterns. They will be particular useful in the call-in interface of the Prolog interpreter. They will allow the application programmer to create goals on the fly as he wishes and to submit them to the Prolog interpreter. The goals can contain built-in predicates and/or user defined predicates. So that constructors provide a uniform access to all predicates of the Prolog interpreter. The constructors should not only allow the construction of atomic values, it should also be possible to construct variables and compounds so that that all kinds of predicates can be invoked.

Deconstructors can simulate many steps of unification. A basic operation that will be useful for the call-in interface is the retrieval of the instantiation of a variable. This way the application programmer can retrieve terms after the successful execution of goals. But the application programmer might also be interested in analysing and dissecting the resulting terms. Therefore we will further find type checking and various accessors among the deconstructing operations. Accessors are needed to retrieve the Java values from atomic terms and to pick the arguments of a compound.

Picture 9: Terms during Execution

Whereby the usage of the call-in interface starts with construction and ends with deconstruction, the roles are reversed for the call-out interface. When custom code from the application programmer is called by the Prolog interpreter, the first step to do is to analyse and dissect the provided arguments. When this phase was successful the custom code will perform the desired task and produce some results. To return these results to the Prolog interpreter a repackaging as terms will be necessary again and thus constructors will be invoked. Finally the custom code can decide by itself which arguments are output. Output arguments can be instantiated by unification.

An alternative to manually constructing terms is the use of the Prolog syntax. The Prolog API should provide methods to parse and un-parse terms. The Prolog syntax is an umbrella for canonical written terms and terms written by respecting operator definitions. Therefore one parse method is sufficient. On the other hand multiple un-parse methods seem appropriate. One un-parse method that writes terms canonical might be directly defined for terms. On the other hand writing terms by respecting operator definitions must come from inside an interpreter that has an associated knowledgebase.

The unification schema allows a high degree of sharing. We are not forced to copy terms during unification since we can re-use the same skeleton by multiple unifications. On the other when a clause is asserted we need to recreate the given skeletons. We have arranged that ground skeletons are not copied but shared. Further methods are known such as sharing common sub-terms [3]. We did not implement a detection of common sub-terms since this also slows down the system. There exist optimizations to share common sub-terms across a find all invocation [3]. We did also not implement such detection.