next up previous
Next: Head-corner Parsing and Robustness Up: Robust Parsing with the Previous: Introduction


Head-corner Parsing for Natural Language Understanding


I assume that grammars are defined in the Definite Clause Grammar formalism [5]. Without any loss of generality I assume that no external Prolog calls (the ones that are defined within { and }) are used. Moreover I will assume that such a grammar is represented in a somewhat different manner to make the definition of the parser easier, and to make sure that rules are indexed appropriately. This representation will in practice be compiled from a representation in user-friendly notation.

More specifically, I will assume that grammar rules are represented by the predicate headed_rule/4 in which the first argument is the head of the rule, the second argument is the mother node of the rule, the third argument is the reversed list of daughters left of the head, and the fourth argument is a list of the daughters right of the head.

As an example, the DCG rule

x(A,E) --> a(A), b(B,A), x(C,B), d(C,D), e(D,E).

of which the third daughter constitutes the head, is represented now as:

headed_rule( x(C,B), x(A,E), [b(B,A), a(A)], [d(C,D), e(D,E)]).

I will be assuming furthermore that lexical lookup has been performed already by another module. This module has asserted clauses for the predicate lexical_analysis/3 where the first two arguments are the string positions and the third argument is the (lexical) category. For an input sentence `Time flies like an arrow' this module may produce the following set of clauses:

lexical_analysis(0,1,verb).  lexical_analysis(0,1,noun).
lexical_analysis(0,2,noun).  lexical_analysis(1,2,noun).
lexical_analysis(1,2,verb).  lexical_analysis(2,3,prep).
lexical_analysis(2,3,verb).  lexical_analysis(3,4, det).

The Basics

A simple non-deterministic version of the head-corner parser is now given as follows. The predicate visible to the rest of the world will be the predicate parse/3. This predicate is defined in terms of the parse/5 predicate. The extra arguments introduce a pair of indices which represent the extreme positions between which a parse should be found. This is explained in more detail in [8].

% parse(?Cat,+P0,+P)
parse(Cat,P0,P) :- parse(Cat,P0,P,P0,P).

A goal category can be parsed if a predicted lexical category can be shown to be a head-corner of that goal.

% parse(?Cat,?P0,?P,+E0,+E) 
% there is a category Cat from P0-P within the interval E0-E
parse(Cat,P0,P,E0,E) :-

The head-corner predicate constructs in a bottom-up fashion larger and larger head-corners.

% head_corner(?Small,+P0,+P,?Cat,?Q0,?Q,+E0,+E)
% Small from P0-P is a head-corner of Cat from Q0-Q 
% where Q0-Q occurs within E0-E
head_corner(Small,Q0,Q,Cat,P0,P,E0,E) :-
    parse_left_ds(RevLeftDs,QL,Q0,E0),  parse_right_ds(RightDs,Q,QR,E),

To parse a list of daughter categories we have to parse each daughter category in turn.

% parse_left_ds(+RevLeftDs,-Qleft,+Qright,+ExtremeLeft) 
% there are categories LeftDs from Qleft to Qright 
% s.t. RevLeftDs is reverse of LeftDs, and ExtremeLeft < Qleft. 
parse_left_ds([H|T],Q0,Q,E) :-
    parse(H,Q1,Q,E,Q), parse_left_ds(T,Q0,Q1,E).

% parse_right_ds(+RightDs,+Qleft,-Qright,+ExtremeRight) 
% there are categories RightDs from Qleft to Qright s.t. 
% Qright < ExtremeRight. 
parse_right_ds([H|T],Q0,Q,E) :-
    parse(H,Q0,Q1,Q0,E),  parse_right_ds(T,Q1,Q,E).

A predicted category must be a lexical category that lies somewhere between the extreme positions.

% predict(?Cat,-P0,-P,+E0,+E,-SmallCat,-Q0,-Q) 
% SmallCat from Q0-Q is a lexical category and possible 
% head-corner of Cat from P0-P, where P0-P is within E0-E.
predict(_Cat,_P0,_P,E0,E,Small,Q0,Q) :-
    lexical_analysis(Q0,Q,Small),  between(Q0,Q,E0,E).
% between(+P0,+P,+E0,+E) P0-P occurs within E0-E
between(P0,P,E0,E) :-  E0 =< P0, P =< E.


The simple and non-deterministic version of the head-corner parser given above is turned into a very efficient parser by making use of the following techniques. Each of these techniques is explained in [8] in more detail:

Parsing Word-graphs

The head-corner parser is one of the parsers developed within the NWO Priority Programme on Language and Speech Technology. In this program a spoken dialog system is developed for public transportation information [2]

In this system the input for the parser is not a simple list of words, but rather a word-graph: a directed, acyclic graph where the states are points in time, and the edges are labelled with word hypotheses and their corresponding probability. Thus, such word-graphs are acyclic weighted finite-state automata.

In certain approaches toward ill-formed input processing the desire to generalize from input strings to input finite-state automata is also clearly present. E.g., in [3] a framework for ill-formed input processing is described in which certain common errors are modelled as (weighted) finite-state transducers. The composition of an input sentence with these transducers produces a (weighted) finite state automaton which is then input for the parser.

The generalization from strings to weighted automata introduces essentially two complications. Firstly, we cannot use string indices anymore. Secondly we need to keep track of the probabilities of the words used in a certain derivation.

Parsing on the basis of a finite-state automaton can be seen as the computation of the intersection of that automaton with the grammar. It can be shown that if the definite-clause grammar is off-line parsable, and if the finite-state automaton is acyclic, then this computation can be guaranteed to terminate [7]. Moverover, existing techniques for parsing based on strings can be generalized easily by using the names of states in the automaton instead of the usual string indices.

In the head-corner parser, this leads to a change in the definition of the predicate between/4. Rather than a simple integer comparison, we now need to check that a derivation from P0 to P can be extended to a derivation from E0 to E by checking that there are paths in the word-graph from E0 to P0 and from P to E.

The predicate between/4 is implemented using memo-ization as follows. It is assumed that state names are integers; to rule out cyclic word-graphs we also require that for all transitions from P0 to P it is the case that P0 < P. Transitions in the word-graph are represented by clauses of the form wordgraph:trans(P0,Sym,P,Score) which indicate that there is a transition from state P0 to P with symbol Sym and probability Score.

between(L0,R0,L,R) :-  connection(L,L0), connection(R0,R).

        (  var(A)          -> true
        ;  var(B)          -> true
        ;  A=:=B           -> true
        ;  B < A           -> fail % word-graphs are acyclic
        ;  ok_conn(A,B)    -> true
        ;  fail_conn(A,B)  -> fail
        ;  wordgraph:trans(A,_,X,_),
           connection(X,B) -> assertz(ok_conn(A,B))
        ;  assertz(fail_conn(A,B)),

Accounting for Word-graph Probabilities.

In order to account for the probabilities of a derivation (defined as the product of the probabilites associated with all transitions from the word-graph involved in the derivation) we assume that the predicate lexical_analysis represents the probability of the piece of the word-graph that it covers by an extra argument. During the first phase probabilities are ignored. During the second phase (when a particular derivation is constructed) the probabilities are combined.

next up previous
Next: Head-corner Parsing and Robustness Up: Robust Parsing with the Previous: Introduction
Noord G.J.M. van