Type Inference

We show how a sound type inference algorithm can be implemented in using a subject to occurs check constraint. The occurs-check is a check that is needed for a sound unification. It verifies that a variable is not bound to a term which contains the variable itself. Since this check is costly it is often omitted from Prolog systems. As result a Prolog system which omits the occurs-check might produce wrong results.

Consider the following problem of inferring a simple type for a lambda expression. Assume lambda expressions are built from applications app/2, abstractions lam/1 and variables. Simple types are either variables or functions spaces ->/2. Consider contexts T1 : X1, .., Tn : Xn that enumerate the variables of a lambda expression and their types. Let Γ |- T : E denote that the lambda expression E provides context Γ and has the simple type T. We can readily write down the following inference rules:

    -------------- (Id)
    A : X |- A : X

    Γ |- A -> B : S       Δ |- A : T               Γ, A : X |- B : T
    -------------------------------- (MP)        ---------------------- (->R)
           Γ, Δ |- B : app(S,T)            Γ |- A -> B : lam(X,T)

It is straight forward to produce a predicate typed/3 that takes a context Γ, a lambda expression E and then derives the type T if a smaller context Δ subset Γ exists such that Δ |- T : E. Unfortunately without a sound unification, we might assign types to lambda expressions we do not want them to have a simple type. For example we might get an erroneous affirmative for the self-application app(F,F):

/* SWI-Prolog */
?- typed(app(F,F), [F-A], B).
A = (A->B).

Or it might even happen that the Prolog system crashes:

/* Jekejeke Prolog */
?- typed(app(F,F), [F-A], B).
java.lang.StackOverflowError

There are a number of options to avoid either behaviour. Some Prolog systems provide a global flag that switches on the occurs-check for all unifications. Other Prolog systems implement the ISO predicate unify_with_occurs_check/2. Both approaches have their disadvantage. The former might slowdown unifications that don’t need the occurs-check, and the latter might need some cumbersome code rewriting. We present an alternative here by means of the Jekejeke Minlog attribute variables.

The idea is to use the sto/1 constraint that allows imposing an occurs check on the supplied variables. The sto/1 constraint is provided as part of the module herbrand. The detailed source code of this module can be found in the open source code corner of the Jekejeke Minlog web site. The sto/1 constraint is implemented with the help of attribute variables and eagerly checks the condition before the attribute variable is even instantiated. Here are some example uses of the sto/1 constraint:

?- use_module(library(term/herbrand)).

?- sto(X), X=f(Y).
X = f(Y),
sto(Y)

?- sto(X), X=f(Y), Y=g(X).
No

The sto/1 constraint is now deployed like a variable declaration. Whenever one finds a fresh variable that needs an occurs check the sto/1 constraint should be used. In the case of our typed/3 predicate we only find the (MP) rule that needs an additional sto/1 constraint. The (MP) rule is non-analytic it introduces a new formula A in backward chaining. We therefore define a revised typed2/3 predicate with the following modified clause:

typed2(app(E,F), C, T) :- 
   sto(S),
   typed2(E, C, (S->T)),
   typed2(F, C, S).

The sto/1 constraint needs also to be deployed for Prolog queries in the top-level. The general rule is the same here as for clauses. Whenever one finds a fresh variable that needs an occurs check the sto/1 constraint should be used. It is important that the sto/1 constraint is used before the argument receives a cyclic term. Since Jekejeke Prolog doesn’t support cy-clic terms, the sto/1 constraint will hang if it receives an already cyclic term.

Here are some examples for an accordingly modified predicate typed2/3:

?- sto((A,B,C)), typed2(app(E,F), [E-A,F-B], C).
A = (B->C),
sto(C),
sto(B)

?- sto((A,B)), typed2(app(F,F), [F-A], B).
No

?- sto(A), typed2(lam(X,lam(Y,app(Y,X))), [], A).
A = (_A->(_A->_B)->_B),
sto(_B),
sto(_A)

The predicate sto/1 might be a viable alternative to the other two approaches of introducing an occurs check into a Prolog text. The sto/1 predicate has the advantage that it can be selectively applied to variables where necessary. On the other hand it should be noted that the sto/1 predicate introduces an additional memory footprint in the form of allocated attribute variables and hooks.

Comments