Clause Indexing

The Jekejeke Minlog extension is based on a notion of clause Java objects. We will explain how these objects come into being and how these objects are associated with the knowledge base. The knowledge base is divided into predicates. The predicates hold clause sets for all their clauses. But they also hold a recursive index structure. The index structure first says which arguments are currently indexed, and then provides further clause sets and index structures for each index entry.


Picture 1: Clause Indexing

The Java virtual machine treats clauses as normal Java objects in the heap. They are automatically reclaimed during the various garbage collection passes of the Java virtual machine. When a clause is removed it will not be any more referenced by the knowledge base. If an active thread does not anymore use the clause, it will then automatically be removed by the garbage collection sooner or later. A thread might use a clause via its clause reference without having it currently associated to the knowledge base.

The Jekejeke Minlog toolbox provides clause reference based operations that allow boot-strapping hypothetical and counter factual reasoning. These clause reference based operations are also used in the forward chaining component. Their main characteristic is that they are automatically undone during backtracking. This is archived by installing unbind handlers in the variable trail of the interpreter. Therefore a thread might reference a clause from the variable trail even if it does not reference it anymore in the current execution.

We did not yet spend too much time on optimizing away references from the variable trail. Instead we tried to make the toolbox operations itself as performing as possible. This means that programs that use the Jekejeke Minlog toolbox might use quite an amount of memory. This could be an issue for programs that perform a great number of operations without backtracking. On the other hand for problems where the number of operations is rather limited and a lot of backtracking occurs memory is not so much an issue. This is for example the case for our CLP(FD) solver.

A first attack vector for performing operations is to assure that the compilation of a clause doesn’t take too much time. In the Jekejeke Runtime the compilation of a clause is a two-phase process. In the first phase the clause is copied and will have its own variables. In the second phase the variables are classified and the code for the head and the body is generated. These two steps have been optimized toward ground facts. Already in the first phase during copy, if a sub-term is found where some extra data structure in the compound indicates that it is ground, no traversal for copying is needed. Finally since a ground fact doesn’t have variables the second phase doesn’t need some work to do.

A second attack vector for performing operations is to assure that the association and deassociation of a clause with the knowledge base doesn’t take too much time. Here we work with custom data structures that are optimized towards representing clause sets and index structures. For clause sets we switch between arrays and a custom hash set depending on the size of the clause set. For the index structure we use a custom hash map. The custom hash set and hash map avoids some bottlenecks of the Java default implementation. For example we do not need to create an iterator object to traverse these data structures.

Comments