For instance, we can obtain a modest increase in efficiency by exploiting Dijkstra's algorithm: some of the work the parser has to do for a category q can be delayed until the robustness component actually needs that category. Since Dijkstra's algorithm will not visit every q, the total amount of work is reduced. We exploited this in our implementation as follows. The parser works in two phases. In the first phase (recognition), no semantic constraints are taken into account (in order to pack all ambiguities). In the second phase, the semantic constraints are applied. This second phase can be delayed for a category q until Dijkstra's algorithm uses an edge based on q. For a number of categories, therefore, the second phase can be skipped completely.
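The delayed second phase can be sketched as follows; this is a minimal illustration under our own assumptions (the class and category names are hypothetical, not the actual implementation):

```python
# Sketch of delaying phase two (semantic constraints) for a category
# until the shortest-path search actually uses an edge of that
# category. All names here are illustrative.
class LazyParser:
    def __init__(self, categories):
        # Phase 1 (recognition): runs for every category; ambiguities
        # are packed, semantic constraints are ignored.
        self.packed = {q: self.recognize(q) for q in categories}
        # Phase 2 results are computed on demand and cached.
        self.analysed = {}

    def recognize(self, q):
        return f"packed({q})"

    def apply_semantics(self, q):
        # Phase 2: triggered only when the search uses an edge
        # based on category q; runs at most once per category.
        if q not in self.analysed:
            self.analysed[q] = f"semantics({self.packed[q]})"
        return self.analysed[q]

parser = LazyParser(["np", "vp", "pp"])
# Suppose the search only ever uses an edge based on "np":
parser.apply_semantics("np")
# "vp" and "pp" never undergo phase two.
print(sorted(parser.analysed))  # ['np']
```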
However, we had three reasons for preferring the DAG-SHORTEST-PATH algorithm given above. Firstly, it is simpler than Dijkstra's algorithm. Secondly, negative weights do show up in a number of circumstances. Thirdly, the expected efficiency gain was not observed in practice.
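The standard DAG-SHORTEST-PATH scheme relaxes edges in topological order, which keeps it correct in the presence of negative weights; the following is a generic sketch of that scheme (the graph and weights are invented for illustration, not taken from our word-graphs):

```python
# Shortest paths in a DAG by relaxation in topological order.
# Unlike Dijkstra's algorithm, this remains correct when some
# edge weights are negative.
from collections import defaultdict

def dag_shortest_path(edges, source):
    """edges: list of (u, v, weight); returns dict of distances from source."""
    graph = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for u, v, w in edges:
        graph[u].append((v, w))
        indeg[v] += 1
        nodes.update((u, v))
    # Kahn's algorithm for a topological order of the DAG.
    order = []
    stack = [n for n in nodes if indeg[n] == 0]
    while stack:
        u = stack.pop()
        order.append(u)
        for v, _ in graph[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    dist = {n: float('inf') for n in nodes}
    dist[source] = 0
    # Relax every outgoing edge of each node, in topological order.
    for u in order:
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

edges = [(0, 1, 2), (0, 2, 5), (1, 2, -4), (2, 3, 1)]
print(dag_shortest_path(edges, 0)[3])  # -1, via the negative edge 1->2
```

Note that Dijkstra's algorithm, run on the same graph, could settle node 2 at distance 5 before discovering the cheaper route through the negative edge.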
An example where negative weights may show up is the following. Suppose we define a method which takes into account N-gram scores but nothing else, i.e. all other sources of information, such as acoustic scores, are ignored. It turns out that a straightforward implementation of such a method is non-optimal, since it favours paths in the word-graph consisting of a small number of long words over paths (of the same duration) consisting of a larger number of shorter words, merely because fewer scores have to be added on the former. A simple and effective way to eliminate this bias is to subtract a constant from each score. However, this subtraction may yield negative numbers.
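The effect, and the fix, can be shown with a small numerical sketch (the scores and the constant below are invented for illustration):

```python
# Two word-graph paths of the same duration, scored only by per-word
# N-gram costs: path A uses two long words, path B three short words.
costs_a = [3.0, 3.0]
costs_b = [2.2, 2.2, 2.2]

# Summing raw costs favours path A merely because fewer scores
# are added along it.
assert sum(costs_a) < sum(costs_b)   # 6.0 < 6.6 (approximately)

# Subtracting a constant from each score removes the bias toward
# paths with fewer words; with this constant the preference flips.
c = 2.5
adj_a = [x - c for x in costs_a]     # [0.5, 0.5]
adj_b = [x - c for x in costs_b]     # roughly [-0.3, -0.3, -0.3]
assert sum(adj_b) < sum(adj_a)       # now path B is preferred

# The price of the fix: some edge weights are now negative,
# which rules out Dijkstra's algorithm.
assert min(adj_b) < 0
print("bias removed; negative weights introduced")
```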