Selective Memoization and Goal-weakening

In the head-corner parser it turns out that the `parse/5` predicate
is a very good candidate for memoization. The other predicates are
not. This implies that each *maximal projection* is computed only
once; partial projections of a head can be constructed during a parse
any number of times, as can sequences of categories (considered as
sisters to a head). Active chart parsers `memo' everything (including
sequences of categories); inactive
chart parsers only memo categories, but not sequences of
categories. In our proposal, we memo only those categories that are
`maximal projections', i.e. projections of a head which unify with the
top category (start symbol) or with a non-head daughter of a rule.

The implementation of memoization uses Prolog's
internal database to store the tables. The advantage of this technique
is that we use Prolog's first argument indexing for such
tables. Moreover, during the consultation of the table we need not
worry about modifications to it (in contrast to an approach
in which the table would be maintained as the value of a Prolog
variable). On the other hand, the use of the internal database brings
about a certain overhead. Therefore, it may be worthwhile to
experiment with a meta-interpreter along the lines of the XOLDT system
[32] in which the table is maintained dynamically.

Memoization is implemented by two different tables. The first table
encodes which goals have already been searched. Items in the first table
are called *goal items*. The second table
represents all solved (i.e. instantiated) goals. Items in this second
table are called *result items*. One may be tempted
to use only the second table. But in that case we would not be able to
tell the difference between a goal which has already been searched,
but did not result in a solution (`fail-goal') and a goal which has
not been searched at all. If we have two tables then we can also
immediately stop working on branches in the search space for which it
has already been shown that there is no solution. The distinction
between these two kinds of item is inherited from BUP [11]. The
memoized version of the parse predicate can be defined as
in (3.1).
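The interaction of the two tables can be illustrated with a small Python sketch. The names `solve` and `search` are ours, and goals are compared by exact equality for brevity; the actual implementation stores Prolog unit clauses and uses subsumption, as described below.

```python
# Illustrative sketch of the two-table memoization scheme: the goal
# table records which goals have been searched, the result table
# records the solutions found.  Names and representation are ours.

goal_table = set()     # goals already searched ("goal items")
result_table = set()   # instantiated goals, i.e. solutions ("result items")

def solve(goal, search):
    """Memoized solve: search(goal) enumerates the solutions of goal."""
    if goal in goal_table:
        # Already searched: reuse result items only.  If none matches,
        # this was a fail-goal and we stop immediately.
        return [r for r in result_table if r[0] == goal]
    goal_table.add(goal)             # record the goal before searching
    solutions = [(goal, s) for s in search(goal)]
    result_table.update(solutions)   # record every solution found
    return solutions

calls = []
def search(goal):
    calls.append(goal)                        # count the real searches
    return {"s": ["np vp"]}.get(goal, [])     # toy "grammar"

solve("s", search)
solve("s", search)    # answered from the tables, no new search
solve("x", search)    # fail-goal: searched once, yields no result items
solve("x", search)    # recognized as a fail-goal, not searched again
assert calls == ["s", "x"]
```

With only the result table, the two calls for `"x"` would be indistinguishable from an unsearched goal, and the search would be repeated.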

The first table is represented by the predicate `'GOAL_ITEM'`.
This predicate simply consists of a number of unit-clauses indicating
all goals that have been searched completely. Thus, before we attempt
to solve `Goal` we first check whether a goal item for that goal
already exists. Since `Goal` may contain variables, we must be careful
here. Unification is not appropriate for this check, since it could
lead to a situation in which a more general goal is not searched
because a more specific variant of that goal had already been
solved. We want exactly the opposite: if a more general
version of `Goal` is included in the goal table,
then we can continue to look for a solution in the result table.
Note, for example, that if we had previously solved the goal
`parse(s,3,X,3,12)` and later encounter the goal
`parse(s,3,Y,3,10)`, we can again use the second
table immediately: the way in which the extreme positions are used
ensures that the former goal is more general than the latter. The
predicates for the maintenance of the goal table are defined in (3.1).
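The required check is thus one-way matching (subsumption) rather than unification: a table entry covers the current goal only if the entry can be instantiated to that goal without binding any of the goal's own variables. A Python sketch, with terms represented as nested tuples and variables as `Var` objects (our representation; the implementation operates on Prolog terms directly), is:

```python
# One-way matching (subsumption) sketch: `general` subsumes `specific`
# if it can be instantiated to `specific` by binding variables of
# `general` only.  Representation and names are illustrative.

class Var:
    pass

def subsumes(general, specific, bindings=None):
    if bindings is None:
        bindings = {}
    if isinstance(general, Var):
        if general in bindings:                # variable seen before:
            return bindings[general] == specific   # must match consistently
        bindings[general] = specific           # bind table variable only
        return True
    if isinstance(specific, Var):
        return False       # would require binding a variable of the goal
    if isinstance(general, tuple) and isinstance(specific, tuple):
        return (len(general) == len(specific)
                and general[0] == specific[0]          # same functor
                and all(subsumes(g, s, bindings)
                        for g, s in zip(general[1:], specific[1:])))
    return general == specific                 # atoms must be identical

X, Y = Var(), Var()
# a tabled goal with a variable covers a later variant of it ...
assert subsumes(("parse", "s", 3, X, 3, 12), ("parse", "s", 3, Y, 3, 12))
# ... but a more specific tabled goal does not cover a more general one
assert not subsumes(("parse", "s", 3, ("np",), 3, 12), ("parse", "s", 3, Y, 3, 12))
```

The additional relaxation on the extreme positions (the `parse(s,3,Y,3,10)` case above) is a refinement of this test that is not shown in the sketch.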

The second table is represented by the predicate `'RESULT_ITEM'`.
It is defined by unit-clauses which each represent an instantiated
goal (i.e. a solution). Each time a result is found, it is checked
whether that result is already available in the table. If so, the
newer result is ignored. If no (more general version of) the result
existed, then the result is added to the table. Moreover, more
specific results which may have been put on the table previously are
marked; these results need not be used anymore. This is not
strictly necessary but is often useful because it decreases the size
of the tables; in this approach tables are redundancy-free and hence
minimal. Moreover, such more specific results will not be used again,
and no further work will be based on them. Note that
RESULT_ITEMs do not keep track of the extreme positions. This implies
that in order to see whether a RESULT_ITEM is applicable we check
whether the interval covered by the RESULT_ITEM lies within the
extreme positions of the current goal. The predicates dealing with
the result table are defined in (3.1).

In practice we use a faster implementation of memoization in which both goal items and result items are indexed by the functor of the category and by the string positions.

In the head-corner parser, parse goals are memoized. Note that
nothing would prevent us from memoing other predicates as well. Experience
suggests, however, that the cost of maintaining tables for e.g. the `head_corner` relation is (much) higher than the associated benefit.
The use of memoization for only the `parse/5` goals implies that
the memory requirements of the head-corner parser, in terms of the
number of items recorded, are much smaller than in
ordinary chart parsers. Not only do we refrain from asserting
so-called *active* items, but we also refrain from asserting *inactive* items for non-maximal projections of heads. In practice
the difference in space requirements can be enormous. This difference
is a significant reason for the practical efficiency of
the head-corner parser.

It turns out that the use of tables defined in the previous subsection can lead to a problem with cyclic unifications. If we assume that Prolog's unification includes the occur check then no problem would arise. But since most versions of Prolog do not implement the occur check it is worthwhile investigating this potential problem.

The problem arises because cyclic solutions can be constructed that would not have been constructed by ordinary SLD-resolution. Furthermore, these cyclic structures lead to practical problems because items containing such a cyclic structure may have to be put in the table. In SICStus Prolog this results in a crash.

An example may clarify the problem. Suppose we have a very simple program containing the following unit clause:

    x(A,B).

Furthermore suppose that in the course of the computation a goal of the form

    ?- x(f(X),X)

is attempted. This clearly succeeds. Furthermore, an item of that form is added to the table. Later on it may be the case that a goal of the form

    ?- x(Y,Y)

is attempted. Clearly this is not a more specific goal than we solved before, so we need to solve this goal afresh. This succeeds too. Now we can continue by picking up a solution from the second table. However, if we pick the first solution then a cyclic term results.
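The effect can be reproduced with a minimal unifier sketch in Python (our representation; Prolog's unification works analogously). With the occur check, matching the tabled solution `x(f(X),X)` against the goal `x(Y,Y)` fails; without it, a cyclic binding `X = f(X)` is silently constructed.

```python
# Minimal unification sketch with an optional occur check, showing how
# omitting the check lets a cyclic binding arise.  Illustrative only.

class Var:
    pass

def walk(t, s):
    # follow variable bindings in the substitution s
    while isinstance(t, Var) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    if t is v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(a, b, s, occur_check=True):
    a, b = walk(a, s), walk(b, s)
    if isinstance(a, Var) and a is b:
        return s
    if isinstance(a, Var):
        if occur_check and occurs(a, b, s):
            return None                  # refuse to build X = f(X)
        return {**s, a: b}
    if isinstance(b, Var):
        return unify(b, a, s, occur_check)
    if isinstance(a, tuple) and isinstance(b, tuple):
        if len(a) != len(b) or a[0] != b[0]:
            return None
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s, occur_check)
            if s is None:
                return None
        return s
    return s if a == b else None

# the goal x(Y,Y) against the tabled solution x(f(X),X):
X, Y = Var(), Var()
s = unify(("x", ("f", X), X), ("x", Y, Y), {}, occur_check=False)
assert walk(X, s) == ("f", X)      # cyclic: X is bound to f(X)
assert unify(("x", ("f", X), X), ("x", Y, Y), {}) is None   # rejected
```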

A possible approach to deal with this situation is to index the items
of the second table with the item of the first table from which the
solution was obtained. In other words: if you want to select a
solution from the second table, it must not only be the case that the
solution matches your goal, but also that
the corresponding goal of the solution is more general than your
current goal. This strategy works, but turns out to be considerably
slower than the original version given above. The reason seems to be
that the size of the second table is increased quite drastically,
because solutions may now be added to the table more than once (for
all goals that could give rise to that solution).

It turns out that an improvement of the head-corner parser using a
*goal weakening* technique often eliminates this occur check
problem. Goal weakening is discussed in the following subsection.

The insight behind `goal weakening' (or *abstraction*
[7]) in the
context of memoization is that we may combine a number of slightly
different goals into a single more general goal. Very often it is much
cheaper to solve this single (but more general) goal, than to solve
each of the specific goals in turn. Moreover, the goal table will be
smaller (both in terms of number of items, and the size of individual
items), which can have a very good effect on the amount of memory and
CPU-time required for the administration of the table. Clearly, one
must be careful not to remove essential information from the goal (in
the worst case this may even lead to non-termination of otherwise
well-behaved programs).

Depending on the properties of a particular grammar, it may
for example be worthwhile to *restrict* a given category to its
syntactic features before we attempt to solve the parse goal of that
category. Shieber's restriction operator [19] can be
used here. Thus we essentially throw some information away before an
attempt is made to solve a (memoized) goal. For example, the category

    x(A,B,f(A,B),g(A,h(B,i(C))))

may be weakened into:

    x(A,B,f(_,_),g(_,_))
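A simple family of such operators replaces every subterm below a given depth by a fresh variable. The following Python sketch (our representation: terms as nested tuples, variables as `Var` objects) reproduces the example above with a depth bound of 2:

```python
# Sketch of a depth-bounded goal-weakening operator: every subterm at
# or below the given depth is replaced by a fresh variable.
# Representation is illustrative; the implementation works on Prolog terms.

class Var:
    pass

def weaken(term, depth):
    if depth <= 0:
        return Var()                 # fresh variable: information dropped
    if isinstance(term, tuple):      # compound term: functor + arguments
        return (term[0],) + tuple(weaken(a, depth - 1) for a in term[1:])
    return term                      # atom or variable: kept as-is

A, B, C = Var(), Var(), Var()
# x(A,B,f(A,B),g(A,h(B,i(C))))  weakened at depth 2  ->  x(A,B,f(_,_),g(_,_))
t = ("x", A, B, ("f", A, B), ("g", A, ("h", B, ("i", C))))
w = weaken(t, 2)
assert w[1] is A and w[2] is B                     # shallow variables kept
assert w[3][0] == "f" and isinstance(w[3][1], Var) and w[3][1] is not A
assert w[4][0] == "g" and isinstance(w[4][2], Var)
```

Note that this operator keeps the variables that survive the depth bound, so the weakened goal may still contain shared variables; a variant that also renames the surviving variables apart would additionally rule out the cyclic-term problem discussed below.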

Note that goal weakening is sound: an answer *a* to the weakening of a
goal *g* is only used if *a* unifies with the original goal *g*. Also note
that goal-weakening is complete in the sense that for an answer *a* to
a goal *g* there will always be an answer *a'* to the weakening of *g*
such that *a'* subsumes *a*.

For practical implementations the use of goal weakening can be
extremely important. It is my experience that a well-chosen goal
weakening operator may reduce parsing times by an order of
magnitude.

The goal weakening technique can also be used to eliminate typical instances of the occur check problems discussed in the previous subsection. Coming back to the example given there, if our first goal

    x(f(X),X)

were weakened into

    x(f(_),_)

then the problem would not occur. If we want to guarantee that no cyclic structures can be formed, then we would need to define goal-weakening in such a way that no variable sharing occurs in the weakened goal.

An important question is how to come up with a good goal
weakening operator. For the experiments discussed in the final section
all goal weakening operators were chosen by hand, based on small
experiments and inspection of the goal table and item table. Although
goal-weakening is reminiscent of Shieber's restriction operator
[19], the rules of the game are quite different:
in the former case as much information as possible is removed
without risking non-termination of the parser, whereas in the latter case
information is removed *until* the resulting parser terminates.
For the current version of the grammar of OVIS, it turned out that
weakening the goal category in such a way that all information below a
depth of 6 is replaced by fresh variables eliminated the problem caused
by the absence of the occur check; moreover this goal weakening
operator reduced parsing times substantially. In the latest version we
use different goal weakening operators for each different functor.

An interesting special case of goal-weakening is constituted by a goal-weakening operator which ignores all feature constraints, and hence only leaves the functor for each goal category. In this case the administration of the goal table can be simplified considerably (the table consists of ground facts, hence no subsumption checks are required). This technique is used in the MiMo2 grammar and the Alvey NL Tools grammar (both discussed in section 7).
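In this special case the goal table reduces to a set of ground keys, so a membership test replaces the subsumption check. A Python sketch (names and key layout are ours; following the functor-and-positions indexing mentioned earlier):

```python
# Sketch of the functor-only special case: each goal is recorded under
# its category functor plus the string positions, so goal-table entries
# are ground and a set lookup replaces the subsumption check.

goal_table = set()

def searched(cat, left, right):
    """Record/test a goal by category functor and string positions only."""
    functor = cat[0] if isinstance(cat, tuple) else cat
    key = (functor, left, right)
    if key in goal_table:
        return True                # ground key: a set lookup suffices
    goal_table.add(key)
    return False

assert searched(("np", "sg"), 0, 3) is False   # first np goal over 0-3
assert searched(("np", "pl"), 0, 3) is True    # features ignored: table hit
assert searched(("np", "sg"), 2, 5) is False   # different span: new search
```

The price is that all feature constraints are ignored when deciding whether a goal has been searched, which is only a good trade-off for grammars where the functor carries most of the relevant information.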

1998-09-24