The NLP components were compared on the following two tasks. In both tasks, analysis proceeds in isolation from the dialogue context. In the first task, the input is the test sentence and the component must provide an update for it (in this report we refer to this update as the `best update'). In the second task, the input is the word graph and the component must provide both an update and a sentence for it (`best update' and `best sentence'). The quality of the NLP components is expressed in terms of three criteria: string accuracy (comparison of the best sentences with the test sentences), semantic accuracy (comparison of the best updates with the test updates) and computational resources. Each of these criteria is now explained in more detail.
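
The two tasks and the comparisons they feed can be sketched in a few lines of Python. This is only an illustrative sketch, not the evaluation software used for this report: the names `TestItem`, `evaluate`, `analyse_sentence` and `analyse_word_graph` are hypothetical, updates are represented as plain strings, and both accuracies are simplified to the fraction of exact matches (the measures actually used are defined in the following sections). Computational resources are not measured here.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class TestItem:
    """One test utterance (hypothetical record layout)."""
    test_sentence: str  # the sentence that was actually uttered
    test_update: str    # the reference update for that sentence
    word_graph: Any     # recognizer output; its representation is left open


def evaluate(
    items: List[TestItem],
    analyse_sentence: Callable[[str], str],
    analyse_word_graph: Callable[[Any], Tuple[str, str]],
) -> Dict[str, float]:
    """Run both tasks on every item and return exact-match fractions.

    Exact match is an assumed simplification of string accuracy and
    semantic accuracy, used only to show which outputs are compared
    with which references.
    """
    if not items:
        raise ValueError("no test items")

    task1_semantic = 0
    task2_string = 0
    task2_semantic = 0

    for item in items:
        # Task 1: test sentence in, best update out.
        if analyse_sentence(item.test_sentence) == item.test_update:
            task1_semantic += 1

        # Task 2: word graph in, best sentence and best update out.
        best_sentence, best_update = analyse_word_graph(item.word_graph)
        if best_sentence == item.test_sentence:
            task2_string += 1
        if best_update == item.test_update:
            task2_semantic += 1

    n = len(items)
    return {
        "task1_semantic": task1_semantic / n,
        "task2_string": task2_string / n,
        "task2_semantic": task2_semantic / n,
    }
```

Each item is analysed on its own, with no state carried over between items, which mirrors the requirement that analysis proceeds in isolation from the dialogue context.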