The basic structure of BUG that will be explained in a moment can be summarized by the set of Prolog clauses <7>.
This simple Prolog program, that bears a close connection to the lc parser described in Pereira & Shieber 1987, represents the basic structure of the generator I will develop in the remainder of this paper. It is assumed that the grammar consists of clauses 'rule( Head, Mother, Lefts, Rights)' as a result of compiling PATR rules into Prolog, where Head is the feature structure of the semantic head of this rule, Mother the feature structure of the mother node, Lefts is a list of feature structures that describe the daughters that precede the head and Rights the feature structures that are preceded by the head. The lexicon simply consists of clauses 'word(Word,F)' where Word is the terminal string and F its feature structure.
In the program the clause 'link' is used as a top-down oracle. A basic link between A and B exists if and only if there is a rule whose mother's semantics unifies with A's semantics and whose semantic head's semantics unifies with B's semantics. All possible basic links are precompiled. A link between A and B exists if there is a basic link between A and B or if there is a basic link between A and C, and a link between C and B. Usually the number of different basic links is very small. The link predicate is comparable to the reachability tables in parse theory (cf. Kay 1980) and the link predicate within BUP (Matsumoto et al. 1983).
Viewed procedurally, the generation process defined in <7> proceeds bottom-up from semantic head to semantic head. The surface string grows to the left and to the right. First a 'seed' of the semantics is computed by the 'link' predicate. From this seed the predicate 'up' builds a feature structure that can be unified with the original feature structure. A rule is selected if a seed unifies with the semantic head of that rule. Other daughters of this rule are then generated recursively; their resulting strings are concatenated to the left and to the right of the string dominated by the seed. The feature structure of the mother becomes the new seed. It dominates the modified string. Now the 'up' predicate is called with this new seed. Eventually a seed can be unified with the original feature structure, yielding the final string; this case is defined in the first clause of 'up'.
An example will clarify this strategy. Suppose we start with semantic structure <4>. Suppose moreover that we have the following (abbreviated) rules, where the H categories represent the semantic heads <8>.
This simple grammar only allows one type of link: a link exists between
feature structures that share their semantics.
The generation process is started by a call to the generate predicate,
where Top is a feature structure whose semantics is instantiated; the
second argument of 'generate' represents the string that is to be found
in difference list notation.
The predicate 'link' relates feature structures whose semantics are
shared; therefore Down will be a feature structure with the same semantics
The predicate 'word' tries to find a lexical item whose feature structure
unifies with Down. In this example the lexical item 'kisses' is the only
candidate. Now the syntax of Down will be instantiated as well.
The predicate 'up' will now be called with the two feature structures
Down and Top, and two strings in difference list notation. The first
string represents the string that is found up to now (the word 'kisses');
the second string is the resulting string.
The predicate 'up' will select a rule whose semantic head unifies
with Down. In this example the rule <8b> is a possible candidate.
A link is computed between the mother of this rule Mother
and the feature structure Top.
The predicate 'generate_list' generates a string for daughters of this
rule to the left of the semantic head, and a string for daughters to the right
of the head. The first call to 'generate_list' succeeds trivially as there
are no daughters to the left of the head. The second call to 'generate_list'
yields the string 'mary'.
The resulting string 'kisses mary' will be the third argument to
the recursive call to 'up'; the feature structures Mother and Top will be
the first and second argument.
In this next call to 'up' rule <8a> is a possible candidate.
Now the first call to 'generate_list' will yield 'john', whereas the
second succeeds trivially.
Finally the predicate 'up' is called with the instantiation of the mother of <8a>, Top,
and the string 'john kisses mary'. As the two feature structures can unify the
recursion bottoms out by unifying the strings. The string 'john kisses mary' will
therefore be the final result.
Note that words, rules and links are selected
nondeterministically; this nondeterminism is handled by
Prolog's built-in backtrack mechanism.