In this paper we are concerned with the syntactic analysis phase of a speech analysis system. The input for the syntactic analysis phase in such a system is a word lattice, and the output is a packed representation of all possible parse trees. Syntactic analysis is performed on the basis of some grammar G. Following Bernard Lang, we can thus characterize the syntactic analysis phase as the computation of the intersection of a finite state automaton (the word lattice) and the grammar G. This intersection is itself a grammar, one that derives exactly the parse trees for the input.
Note that we allow the input to be an arbitrary FSA (possibly containing cycles, etc.) so that certain techniques for handling ill-formed input can be applied. Whereas an ordinary word graph always defines a finite language, an FSA can of course easily define an infinite set of sentences.
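To illustrate the difference, the following sketch (our own encoding, not taken from the paper) shows a two-state FSA whose single cycle already yields infinitely many accepted sentences:

```python
# Illustrative sketch: an FSA with a cycle accepts infinitely many
# sentences, unlike an acyclic word graph.  States are integers;
# transitions map (state, word) -> next state (deterministic here).

transitions = {
    (0, "very"): 0,   # the cycle: "very" loops on state 0
    (0, "nice"): 1,
}
start, finals = 0, {1}

def accepts(words):
    """Run the FSA over a word sequence; True iff it ends in a final state."""
    state = start
    for w in words:
        state = transitions.get((state, w))
        if state is None:
            return False
    return state in finals

# "nice", "very nice", "very very nice", ... are all accepted.
assert accepts(["nice"])
assert accepts(["very", "very", "very", "nice"])
assert not accepts(["very"])
```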
It can be shown that computing the intersection of an FSA and a CFG requires only a minimal generalization of existing parsing algorithms: we simply replace the usual string positions with the names of the states in the FSA. It is then straightforward to show that, assuming the grammar is in bilinear format, the complexity of this process is cubic in the number of states of the FSA (in the case of ordinary parsing the number of states equals n + 1) [4,2].
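The construction can be sketched concretely as follows (the encoding and names are ours, not the paper's): items of the intersection grammar are triples (q, A, r), meaning that nonterminal A derives some FSA path from state q to state r. For a bilinear CFG the derivable triples can be computed by a simple fixpoint, which is cubic in the number of FSA states.

```python
# Illustrative sketch of FSA/CFG intersection for a CFG in bilinear form.
# lexical:  set of (A, word) rules
# binary:   set of (A, B, C) rules, i.e. A -> B C
# fsa_arcs: set of (q, word, r) transitions
# An item (q, A, r) means: A derives a path from state q to state r.

def intersect_derivable(lexical, binary, fsa_arcs):
    # Seed items from lexical rules matching FSA transitions.
    items = {(q, A, r) for (A, w) in lexical
                       for (q, x, r) in fsa_arcs if x == w}
    changed = True
    while changed:                      # naive fixpoint computation
        changed = False
        for (A, B, C) in binary:
            for (q, X, r) in list(items):
                if X != B:
                    continue
                for (r2, Y, s) in list(items):
                    if Y == C and r2 == r and (q, A, s) not in items:
                        items.add((q, A, s))
                        changed = True
    return items

# Tiny grammar: S -> NP VP; NP -> "john"; VP -> "sleeps".
lexical = {("NP", "john"), ("VP", "sleeps")}
binary = {("S", "NP", "VP")}
arcs = {(0, "john", 1), (1, "sleeps", 2)}

items = intersect_derivable(lexical, binary, arcs)
assert (0, "S", 2) in items   # an S spans the FSA path from state 0 to 2
```

In ordinary string parsing the states are simply the positions 0..n, which recovers the familiar chart-parsing invariant.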
In this paper we investigate whether the same techniques can be applied when the grammar G is a constraint-based grammar rather than a CFG. For concreteness we take G to be a Definite Clause Grammar (DCG). A DCG is a simple example of a family of grammar formalisms that are widely used in natural language analysis (and generation). The main findings of this paper carry over to the other members of that family of constraint-based grammar formalisms.
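The essential difference from a CFG is that DCG nonterminals carry arguments, and rule application unifies them. The following minimal sketch (our own encoding, purely illustrative) shows the idea with a single agreement feature, where a variable is any string starting with '?':

```python
# Minimal sketch of the DCG idea: nonterminal arguments are unified
# when a rule applies.  A variable is any string starting with '?'.

def unify(a, b, subst):
    """Unify two argument values under a substitution; None on failure."""
    a, b = subst.get(a, a), subst.get(b, b)
    if a == b:
        return subst
    if isinstance(a, str) and a.startswith("?"):
        return {**subst, a: b}
    if isinstance(b, str) and b.startswith("?"):
        return {**subst, b: a}
    return None

# DCG-style rule s(Num) --> np(Num), vp(Num): agreement is enforced
# by unifying the shared variable ?Num across both daughters.
s1 = unify("?Num", "sg", {})        # np contributes Num = sg
s2 = unify("?Num", "sg", s1)        # vp must agree
assert s2 == {"?Num": "sg"}
assert unify("?Num", "pl", s1) is None  # mismatched agreement fails
```

Because such arguments may contain variables ranging over infinitely many instantiations, the intersection construction for CFGs does not transfer trivially, which is exactly the question the paper addresses.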