next up previous
Next: Conclusion Up: Treatment of -Moves in Previous: Three Variants for -Moves

Subsections



Experiments

Two sets of experiments have been performed. In the first set of experiments a number of random automata is generated according to a number of criteria (based on [8]) . In the second set of experiments, results are provided for a number of (much larger) automata that surfaced during actual development work on finite-state approximation techniques.

Random automata.

Firstly, consider a number of experiments for randomly generated automata. Following [8], the absolute transition density of an automaton is defined as the number of transitions divided by the square of the number of states times the number of symbols (i.e. the number of transitions divided by the number of possible transitions). Deterministic transition density is the number of transitions divided by the number of states times the number of symbols (i.e. the ratio of the number of transitions and the number of possible transitions in a deterministic machine). [8] shows that deterministic transition density is a reliable measure for the difficulty of subset construction. Exponential blow-up can be expected for input automata with deterministic transition density of around 2.5

A number of automata were generated randomly, according to the number of states, symbols, and transition density. The random generator makes sure that all states are reachable from the start state. For the first experiment, a number of automata was randomly generated, consisting of 15 symbols, and 15, 20, 25, 100 or 1000 states, using various densities (and no $ \epsilon$-moves). The results are summarised in the figures 3 and 4. Only a single result is given since each of the implementations works equally well in the absence of $ \epsilon$-moves.6

Figure 3: Deterministic transition density versus CPU-time in msec. The input automata have 25 states; no $ \epsilon$-moves.
\begin{figure}
\centerline{\psfig{file=dts25.ps,width=\textwidth}}\end{figure}

Figure 4: Deterministic transition density versus CPU-time in msec. The input automata have 100 states; no $ \epsilon$-moves.
\begin{figure}
\centerline{\psfig{file=dts100.ps,width=\textwidth}}\end{figure}

A new concept called absolute jump density is introduced to specify the number of $ \epsilon$-moves. It is defined as the number of $ \epsilon$-moves divided by the square of the number of states (i.e., the probability that an $ \epsilon$-move exists for a given pair of states). Furthermore, deterministic jump density is the number of $ \epsilon$-moves divided by the number of states (i.e., the average number of $ \epsilon$-moves which leave a given state). In order to measure the differences between the three implementations, a number of automata has been generated consisting of 15 states and 15 symbols, using various transition densities between 0.01 and 0.3 (for larger densities the automata tend to collapse to an automaton for $ \Sigma^{*}_{}$). For each of these transition densities, jump densities were chosen in the range 0.01 to 0.24 (again, for larger values the automaton collapses). In figure 5 the outcomes of this experiment are summarised by listing the average amount of CPU-time required per deterministic jump density (for each of the three algorithms). Thus, every dot represents the average for determinising a number of different input automata with various absolute transition densities and the same deterministic jump density. The figures 6, 7 and 8 summarise similar experiments using input automata with 20, 25 and 100 states.7

Figure 5: Average amount of CPU-time versus jump density for each of the three algorithms, and FSM. Input automata have 15 states. Absolute transition densities: 0.01-0.3.
\begin{figure}
\centerline{\psfig{file=15.ps}}\end{figure}

Figure 6: Average amount of CPU-time versus jump density for each of the three algorithms, and FSM. Input automata have 20 states. Absolute transition densities: 0.01-0.3.
\begin{figure}
\centerline{\psfig{file=20.ps}}\end{figure}

Figure 7: Average amount of CPU-time versus deterministic jump density for each of the three algorithms, and FSM. Input automata have 25 states. Absolute transition densities: 0.01-0.3.
\begin{figure}
\centerline{\psfig{file=25.ps}}\end{figure}

Figure 8: Average amount of CPU-time versus deterministic jump density for each of the three algorithms, and FSM. Input automata have 100 states. Absolute transition densities: 0.001-0.0035.
\begin{figure}
\centerline{\psfig{file=100.ps}}\end{figure}

The striking aspect of these experiments is that the per graph algorithm is more efficient for lower deterministic jump densities, whereas, if the deterministic jump density gets larger, the per subset algorithm is more efficient. The turning point is around a deterministic jump density between 1 and 1.5, where it seems that for larger automata the turning point occurs at a lower determinisic jump density. Interestingly, this generalisation is supported by the experiments on automata which were generated by approximation techniques (although the results for randomly generated automata are more consistent than the results for `real' examples).

Experiment: Automata generated by approximation algorithms

The automata used in the previous experiments were randomly generated, according to a number of criteria. However, it may well be that in practice the automata that are to be treated by the algorithm have typical properties which were not reflected in this test data. For this reason results are presented for a number of automata that were generated using approximation techniques for context-free grammars [12,11,4]. In particular, a number of automata has been used generated by Mark-Jan Nederhof using the technique described in [11]. In addition, a small number of automata have been used which were generated using the technique of [12] (as implemented by Nederhof).

The automata typically contain lots of jumps. Moreover, the number of states of the resulting automaton is often smaller than the number of states in the input automaton. Results are given in table 1. One of the most striking examples is the ygrim automaton consisting of 3382 states and 10571 jumps. For this example, the per graph implementation ran out of memory (after a long time), whereas the per subset algorithm produced the determinised automaton relatively quickly. The FSM implementation took much longer for this example (whereas for many of the other examples it performs better than our implementations). Note that this example has the highest number of jumps per number of states ratio.


Table 1: Results for automata generated by approximation algorithms. The dashes in the table indicate that the corresponding algorithm ran out of memory (after a long period of time) for that particular example.
input automaton CPU-time (msec)
Id #states # transitions # jumps graph subset state FSM
griml.n 238 43 485 2060 100 140 40
g9a 342 58 478 260 70 70 30
g7 362 424 277 180 240 200 60
g15 409 90 627 280 130 180 40
ovis5.n 417 702 130 290 320 380 190
g9 438 313 472 560 850 640 110
g11 822 78 1578 1280 160 160 60
g8 956 2415 330 500 500 610 140
g14 1048 403 1404 1080 1240 730 120
ovis4.n 1424 2210 660 2260 2220 2870 1310
g13 1441 1006 1404 2400 3780 2550 440
rene2 1800 2597 96 440 530 600 200
ovis9.p 1868 2791 3120 83340 80400 87040 52560
ygrim 3382 5422 10571 - 2710 70140 784910
ygrim.p 48062 63704 122095 - 1438960 - 8575850
java19 54369 28333 59394 130130 55290 64420 8470
java16 64210 43935 43505 67180 24200 31770 6370
zovis3 88156 78895 79437 - 968160 - 768440
zovis2 89832 80400 80935 - 1176650 - 938040


next up previous
Next: Conclusion Up: Treatment of -Moves in Previous: Three Variants for -Moves
Noord G.J.M. van
1998-09-24