next up previous
Next: Experiments Up: Treatment of -Moves in Previous: Subset Construction

Subsections



Three Variants for $ \epsilon$-Moves

The algorithm presented in the previous section does not treat $ \epsilon$-moves. In this section three possible extensions of the algorithm are identified to treat $ \epsilon$-moves.

Per graph

This variant can be seen as a straightforward implementation of the constructive proof that for any given automaton with $ \epsilon$-moves there is an equivalent one without $ \epsilon$-moves [6][page 26-27].

For a given M = (Q,$ \Sigma$,$ \delta$, S, F) this variant first computes M' = (Q,$ \Sigma$,$ \delta{^\prime}$, S', F), where S' = $ \epsilon$-CLOSURE(S), and $ \delta{^\prime}$(q, a) = $ \epsilon$-CLOSURE($ \delta$(q, a)). The function $ \epsilon$-CLOSURE is computed by using a standard transitive closure algorithm for directed graphs: this algorithm is applied to the directed graph consisting of all $ \epsilon$-moves of M. Such an algorithm can be found in several textbooks (see, for instance, [3]).

The advantage of this approach is that the subset construction algorithm does not need to be modified at all. Moreover, the transitive closure algorithm is fired only once (for the full graph), whereas the following two variants call a specialised transitive closure algorithm possibly many times.

Per subset and per state

Next we discuss two variants (per subset and per state) in which the treatment of $ \epsilon$-moves is integrated with the subset construction algorithm. We will show later that such an integrated approach is in practice often more efficient than the per graph approach if there are many $ \epsilon$-moves. The per subset and per state approaches are also more suitable for a lazy implementation of the subset construction algorithm (in such a lazy implementation subsets are only computed with respect to a given input string).

The per subset and the per state algorithms use a simplified variant of the transitive closure algorithm for graphs. Instead of computing the transitive closure of a given graph, this algorithm only computes the closure for a given set of states. Such an algorithm is given in figure2.

Figure 2: Epsilon-closure Algorithm
\begin{program}
\FUNCT \vert closure\vert(T)
D =: \emptyset
\FOREACH t\in T \DO ...
...d } q \mbox{ unmarked to } D \FI
\OD
\OD
\keyword{return} ~D
\END
\end{program}

In either of the two integrated approaches, the subset construction algorithm is initialised with an agenda containing a single subset which is the $ \epsilon$-CLOSURE of the set of start-states of the input; furthermore, the way in which new transitions are computed also takes the effect of $ \epsilon$-moves into account. Both differences are accounted for by an alternative definition of the epsilon_closure function.

The approach in which the transitive closure is computed for one state at a time is defined by the following definition of the epsilon_closure function. Note that we make sure that the transitive closure computation is only performed once for each input state, by memorising the closure function.


\begin{program}
\FUNCT \vert epsilon_closure\vert(U) \rcomment{variant 2: per st...
...}~ \bigcup_{u\in U}\vert memo\vert(\vert closure\vert(\{u\}))
\END
\end{program}

In the case of the per subset approach the closure algorithm is applied to each subset. We also memorise the closure function, in order to ensure that the closure computation is performed only once for each subset. This can be useful since the same subset can be generated many times during subset construction. The definition simply is:


\begin{program}
\FUNCT \vert epsilon_closure\vert(U) \rcomment{variant 3: per subset}
\keyword{return}~ \vert memo\vert(\vert closure\vert(U))
\END
\end{program}

The motivation for per state approach may be the insight that in this case the closure algorithm is called at most | Q| times. In contrast, in the per subset approach the transitive closure algorithm may need to be called 2| Q| times. On the other hand, in the per state approach some overhead must be accepted for computing the union of the results for each state. Moreover, in practice the number of subsets is often much smaller than 2| Q|. In some cases, the number of reachable subsets is smaller than the number of states encountered in those subsets.


next up previous
Next: Experiments Up: Treatment of -Moves in Previous: Subset Construction
Noord G.J.M. van
1998-09-24