On the Intersection of Finite State
Automata and Definite Clause Grammars
Gertjan van Noord
An elegant framework for robust parsing was presented by Bernard
Lang on last year's TWLT. In Lang's approach parsing is viewed as
the calculation of the intersection of a FSA (the input) and a CFG.
Viewing the input for parsing as a FSA rather than as a string
combines well with standard approaches in speech understanding
systems in which typically a word lattice is produced as an
intermediate result. Furthermore, certain techniques for robust
parsing can be modelled as finite state transducers.
In this paper we investigate how we can generalize this approach
for unification grammars. In particular we will concentrate how we
might calculate the intersection of a FSA and a DCG. It is shown
that existing parsing algorithms can be easily extended for FSA
inputs. However, we also show that the termination properties change
drastically: we show that it is undecidable whether the intersection
of a FSA and an off-line parsable DCG is empty.
Furthermore we discuss techniques to cope with the problem.
Noord G.J.M. van