Parsing Word-graphs with Probabilities

In this system the input for the parser is not a simple list of words, as we have assumed up to now, but rather a word-graph: a directed, acyclic graph where the states are points in time, and the edges are labelled with word hypotheses and their corresponding acoustic score. Thus, such word-graphs are acyclic weighted finite-state automata.

In [9] a framework for processing ill-formed input is described in which certain common errors are modelled as (weighted) finite-state transducers. The composition of an input sentence with these transducers produces a (weighted) finite state automaton which is then input for the parser. In such an approach the need to generalize from input strings to input finite-state automata is also clear.

The generalization from strings to weighted acyclic finite-state automata introduces essentially two complications. Firstly, we cannot use string indices anymore. Secondly we need to keep track of the acoustic scores of the words used in a certain derivation.

In the head-corner parser, this leads to an alternative to
the predicate `smaller_equal/2`. Rather than a simple integer
comparison, we now need to check that a derivation from `P0` to
`P` can be extended to a derivation from `E0` to `E` by
checking that there are paths in the word-graph from `E0` to ` P0` and from `P` to `E`.

The predicate `connection/2` is true if there is a path in the
word-graph from the first argument to the second argument. It is
assumed that state names are integers; to rule out cyclic word-graphs
we also require that for all transitions from `P0` to `P` it
is the case that `P0` < `P`. Transitions in the word-graph
are represented by clauses of the form ` wordgraph:trans(P0,Sym,P,Score)` which indicate that there is a
transition from state `P0` to `P` with symbol `Sym` and
acoustic score `Score`. The connection predicate can be specified
simply as the reflexive and transitive closure of the transition
relation between states:

A somewhat different approach that may turn out to be more efficient is to use the ordinary comparison operator that we used in the original definition of the head-corner parser. The possible extra cost of allowing impossible partial analyses is worthwhile if the more precise check would be more expensive. If for typical input word-graphs the number of transitions per state is high (such that almost all pairs of states are connected), then this may be an option.

1998-09-24