
Head-corner parsing

In chapter 4 I discuss some proposals for string operations beyond concatenation, such as Pollard's proposal to incorporate several head-wrapping operations in a GPSG; Joshi's Tree Adjoining Grammars; and Reape's proposal to incorporate sequence-union constraints in an HPSG. I define a head-driven parsing algorithm, called `head-corner' parsing, for a class of grammars in which strings are constructed in a linear and non-erasing fashion. Linearity (or `non-copying') requires that a given rule introduce only some constant number of terminal symbols. A grammar rule is non-erasing if the terminal symbols associated with the mother node are a superset of those associated with the daughter nodes. The proposals mentioned above are in this class. The head-corner parser is thus applicable to a superset of concatenative grammars, whereas most other parsers are applicable only to concatenative grammars.
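To make the two conditions concrete, the following Python fragment checks them for a rule, given the multisets of terminal symbols associated with the mother and the daughters. The representation and function names are invented for this sketch; the thesis states the conditions declaratively, not computationally.

```python
# Illustrative check of the non-erasing and linearity conditions on a rule.
# Terminal symbols are represented as multisets (collections.Counter);
# this representation is an assumption of the sketch.
from collections import Counter

def non_erasing(mother_terminals, daughter_terminals):
    """Non-erasing: every terminal occurrence of every daughter
    also occurs in the mother (the mother is a multiset superset)."""
    needed = Counter()
    for d in daughter_terminals:
        needed += d
    return all(mother_terminals[t] >= n for t, n in needed.items())

def introduced(mother_terminals, daughter_terminals):
    """Linearity: the rule itself introduces only the (constant) residue
    of terminals left over after accounting for all daughters."""
    needed = Counter()
    for d in daughter_terminals:
        needed += d
    return mother_terminals - needed
```

For example, a rule whose mother carries the terminals {a, b, c} and whose daughters carry {a} and {b} is non-erasing and introduces only the single terminal c.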

Head-corner parsing is a parsing technique that is both head-driven and bidirectional: the parser proceeds neither from left to right, nor strictly bottom-up or top-down. Because the parser is head-driven, the usual percolation of syntactic features between the head daughter and the mother node can be exploited to improve the goal-directedness of the algorithm. Furthermore, such an order of processing exploits the fact that heads determine what other categories may occur.

In order for the head-corner parser to be generally applicable to linear and non-erasing grammars, the input string is used as a guide during the parsing process. In contrast to parsers for concatenative grammars, the elements of this guide are not necessarily removed from left to right, but can be removed in any order. In other words, the guide functions as a set rather than as a stack. To understand the basics of this algorithm, assume that rules are represented as follows.

For simplicity, assume that all terminal symbols are introduced on rules without daughters (lexical entries), and that all rules with daughters have a head (in chapter 4 no such simplification is assumed). A clause with a head daughter is represented as:

\begin{displaymath}\mbox{\it cr}(\mbox{\rm Head},\mbox{\rm Mother},\langle \mbox{\rm D}_{1} \dots \mbox{\rm D}_{n}\rangle) \mbox{\tt :-} \phi.\end{displaymath}
Lexical entries dominating the terminal symbol Word are represented as:

\begin{displaymath}\mbox{\it lex}(\mbox{\rm Word},\mbox{\rm Mother})\mbox{\tt :-} \phi.\end{displaymath}
Finally assume that the predicate head/2 defines the information which is shared between a syntactic head and its mother node. This information might for example be defined as HPSG's Head Feature Principle. Given these assumptions, a simple version of the head-corner parser can be defined as in figure 5.9.

Figure 5.9: A simple version of the head-corner parser. In chapter 4 I discuss a more general version, and some improvements.
[The figure gives the Prolog definitions of the parse and head-corner predicates; the code is not reproduced here.]
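As an illustration of the control structure only, the following is a much-simplified head-corner recognizer in Python, restricted to a concatenative context-free grammar. The toy grammar, function names, and representation are invented for this sketch and differ from the Prolog code of figure 5.9.

```python
# A much-simplified head-corner recognizer for a concatenative CFG.
# Everything here (grammar, names, spans-as-indices) is an assumption
# of the sketch, not the thesis's definition.

LEXICON = {"mary": "np", "sleeps": "v", "the": "det", "cat": "n"}
# Rules: (mother, daughters, index of the head daughter).
RULES = [
    ("s",  ("np", "vp"), 1),   # vp is the head of s
    ("vp", ("v",),       0),
    ("np", ("det", "n"), 1),
]

def head_link(goal, cat):
    """True if `cat` can be a head-corner of `goal`: the reflexive,
    transitive closure of the head relation over the rules."""
    return cat == goal or any(
        head_link(goal, m) for (m, ds, h) in RULES if ds[h] == cat)

def parse(goal, words, left, right):
    """Can `goal` derive exactly words[left:right]?  A potential head
    word is picked anywhere in the span, not necessarily leftmost."""
    return any(
        head_corner(LEXICON[words[i]], i, i + 1, goal, words, left, right)
        for i in range(left, right)
        if words[i] in LEXICON and head_link(goal, LEXICON[words[i]]))

def head_corner(cat, l, r, goal, words, left, right):
    """Grow the island words[l:r] of category `cat` upward, parsing the
    head's sisters on both sides, until it covers the goal's span."""
    if cat == goal and (l, r) == (left, right):
        return True
    for (mother, daughters, h) in RULES:
        if daughters[h] != cat or not head_link(goal, mother):
            continue
        for nl in range(left, l + 1):          # sisters left of the head
            if not seq(daughters[:h], words, nl, l):
                continue
            for nr in range(r, right + 1):     # sisters right of the head
                if (seq(daughters[h + 1:], words, r, nr)
                        and head_corner(mother, nl, nr, goal,
                                        words, left, right)):
                    return True
    return False

def seq(cats, words, i, j):
    """Can the sequence of categories derive exactly words[i:j]?"""
    if not cats:
        return i == j
    return any(parse(cats[0], words, i, k) and seq(cats[1:], words, k, j)
               for k in range(i, j + 1))
```

For instance, parse("s", "the cat sleeps".split(), 0, 3) succeeds by first finding the head "sleeps", growing it to a vp, and only then parsing the np to its left.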

The remainder of chapter 4 discusses extensions and variations of this general scheme. As an example, it is shown how constraint-based versions of Lexicalized Tree Adjoining Grammars can be parsed by a variant of the head-corner parser. An important reduction of the search space for head-corner parsing can be achieved for grammars in which the operations on strings are monotonic with respect to the ordering of the terminal symbols they define. Such a monotonicity property is exhibited by TAGs. It is shown how a simple check improves the efficiency of the parser.
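The flavour of such a check can be sketched as follows; this is a deliberately crude illustration with an invented representation, not the check defined in chapter 4. If the string operations are monotonic, the input positions covered by any partial result must appear in increasing order, so hypotheses that violate that order can be discarded early.

```python
# Crude illustration of pruning by monotonicity.  A partial result is
# represented by the list of input positions its terminal yield covers,
# in the order the grammar combines them (an assumption of this sketch).

def monotone(positions):
    """Under a monotonic string operation the covered input positions
    may be non-adjacent, but must never be out of order."""
    return all(a < b for a, b in zip(positions, positions[1:]))
```

Here monotone([0, 2, 5]) holds, since a discontinuous but order-preserving yield is admissible, whereas monotone([2, 1]) fails and the corresponding hypothesis can be pruned.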

Noord G.J.M. van