The NWO Priority Programme Language and Speech Technology is a research programme aiming at the development of spoken language information systems. Its immediate goal is to develop a demonstrator of a public transport information system, which operates over ordinary telephone lines. This demonstrator is called OVIS, Openbaar Vervoer Informatie Systeem (Public Transport Information System). The language of the system is Dutch.
At present, a prototype is in operation which is a Dutch adaptation of a German system developed by Philips Dialogue Systems in Aachen.
This German system processes spoken input using "concept spotting", which means that the smallest information-carrying units in the input are extracted, such as names of train stations and expressions of time, and these are translated more or less individually into updates of the internal database representing the dialogue state. The words between the concepts thus perceived are ignored.
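To make the idea concrete, the following is a minimal sketch of concept spotting in Python. It is not the Philips implementation: the station list, the patterns, the slot names (`origin`, `destination`, `time`) and the functions `spot_concepts` and `apply_updates` are all assumptions introduced for illustration only.

```python
import re

# Hypothetical concept inventory (illustrative only, not taken from OVIS).
STATIONS = ("amsterdam", "utrecht", "groningen")
STATION_PATTERN = re.compile(r"\b(van|naar)\s+(\w+)")
TIME_PATTERN = re.compile(r"\b(\d{1,2}) uur\b")


def spot_concepts(utterance):
    """Return the (slot, value) updates found in the utterance, in input order.

    Every word that is not part of a recognised concept is simply ignored.
    """
    text = utterance.lower()
    found = []
    # "van X" is read as the origin, "naar X" as the destination.
    for match in STATION_PATTERN.finditer(text):
        preposition, name = match.group(1), match.group(2)
        if name in STATIONS:
            slot = "origin" if preposition == "van" else "destination"
            found.append((match.start(), slot, name))
    # "N uur" is read as an expression of time.
    for match in TIME_PATTERN.finditer(text):
        found.append((match.start(), "time", int(match.group(1))))
    found.sort(key=lambda item: item[0])
    return [(slot, value) for _, slot, value in found]


def apply_updates(state, updates):
    """Apply each update individually to a copy of the dialogue state."""
    new_state = dict(state)
    for slot, value in updates:
        new_state[slot] = value
    return new_state
```

For an input such as "ik wil eh van Amsterdam naar Groningen om 10 uur", this sketch extracts an origin, a destination and a time, and skips the hesitation and all other intervening words.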
The use of concept spotting is common in spoken-language information systems [12,5,3,1]. Arguments in favour of this kind of shallow parsing are that it is relatively easy to develop the NLP component, since larger sentence constructs do not have to be taken into account, and that the robustness of the parser is enhanced, since sources of ungrammaticality occurring between concepts are skipped and therefore do not hinder the translation of the utterance to updates.
The prototype presently under construction departs from the use of concept spotting. The grammar for OVIS describes grammatical user utterances, i.e. whole sentences are described. As part of this, however, it also describes phrases such as expressions of time and prepositional phrases involving e.g. train stations, in other words, the former concepts. With an appropriate parsing algorithm, one can thus combine the robustness that can be achieved using concept spotting with the flexibility of a sophisticated language model.
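The combination of the two approaches can be sketched as follows. This is a toy illustration under strong assumptions, not the robust parsing algorithm of Section 4: the two-rule grammar, the lexicon and the function names (`parse_pp`, `parse_sentence`, `robust_parse`) are hypothetical. The point is only that the same phrase rule serves both as a constituent of a full sentence and as a stand-alone unit when no full analysis exists.

```python
# Hypothetical lexicon (illustrative only, not taken from OVIS).
STATIONS = {"amsterdam", "utrecht", "groningen"}


def parse_pp(tokens, i):
    """pp -> ('van' | 'naar') station.

    Return ((slot, value), next_index) on success, or None.
    """
    if (i + 1 < len(tokens)
            and tokens[i] in ("van", "naar")
            and tokens[i + 1] in STATIONS):
        slot = "origin" if tokens[i] == "van" else "destination"
        return (slot, tokens[i + 1]), i + 2
    return None


def parse_sentence(tokens):
    """sentence -> 'ik' 'wil' pp+.

    Return the list of updates, or None if the input is not a full sentence.
    """
    if tokens[:2] != ["ik", "wil"]:
        return None
    i, updates = 2, []
    while i < len(tokens):
        result = parse_pp(tokens, i)
        if result is None:
            return None
        update, i = result
        updates.append(update)
    return updates or None


def robust_parse(utterance):
    """Prefer a full-sentence analysis; otherwise fall back to phrases."""
    tokens = utterance.lower().split()
    full = parse_sentence(tokens)
    if full is not None:
        return "sentence", full
    # Fallback: keep every phrase the grammar accepts and skip other words.
    i, updates = 0, []
    while i < len(tokens):
        result = parse_pp(tokens, i)
        if result is None:
            i += 1
        else:
            update, i = result
            updates.append(update)
    return "phrases", updates
```

A grammatical input such as "ik wil van Utrecht naar Groningen" receives a full-sentence analysis, while a fragmentary input such as "eh naar Amsterdam graag" still yields the destination through the phrase-level fallback.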
The main objective of this paper is to show that our grammatical approach is feasible in terms of accuracy and computational resources, and thus is a viable alternative to pure concept spotting.
Although the added benefit of grammatical analysis over concept spotting is not clear for our relatively simple application, the grammatical approach may become essential as soon as the application is extended in such a way that more complicated grammatical constructions need to be recognized. In that case, simple concept spotting may not be able to correctly process all constructions, whereas the capabilities of the grammatical approach extend much further.
Whereas some argue that grammatical analysis may improve recognition accuracy, our current experiments have not yet revealed a clear advantage in this respect.
As the basis for our implementation we have chosen definite-clause grammars (DCGs), a flexible formalism which is related to various kinds of common linguistic description, and which allows application of various parsing algorithms. DCGs can be translated directly into Prolog, for which interpreters and compilers exist that are fast enough to handle real-time processing of spoken input. The grammar for OVIS is in turn written in a way that allows an easy translation to pure DCGs.
The structure of this paper is as follows. In Section 2 we describe the grammar for OVIS, and in Section 3 we describe the output of the NLP module. The robust parsing algorithm is described in Section 4. Section 5 reports test results, showing that grammatical analysis allows fast and accurate processing of spoken input.