Usage: fsa [Flag=Val]* [ActionOption]

The fsa program can be started with command-line arguments
(options). A command-line consists of a number of global variable
assignments, following by (at most one) **action** option, followed by
more global variable assignments. If no action option is provided,
then the system runs in interactive mode. If the variable
**interpreter** is set to **on**, then fsa runs in interactive mode after
the action indicated by the action option has been performed. If the
flag **interactive** has been set to **cmdint**, then fsa runs the FSA
command interpreter. Otherwise, you get the ordinary SICStus Prolog
prompt.

Typical actions that can be performed through the use of an action option are:

regular expression operations such as kleene closure, complementation for given automata

determinization and minimization of automata

construction of automaton on the basis of regular expression

visualization of given automaton

apply automaton to a string or set of strings

Aux is a file containing auxiliary regular expression operator definitions. It is loaded into module fsa_regex_aux, and will be used for compiling regular expressions.

Note that your file with definitions of regular expression operators is consulted with the special Prolog-syntax operators for regular expression notation loaded. Thus you can use * .. & etc. in your definitions. Drawback is that you cannot use operator notation for e.g. the is/2 predicate.

A typical auxiliary definition will be:

macro(vowel,{a,e,i,o,u}).

A slightly more interesting one:

macro(free(Expr), ~ $ Expr).

You can also explicitly construct an automaton yourself, e.g.:

rx(my_operator(Expr),Fa) :-

fsa_regex:rx(Expr,Fa0),

my_operator_definition(Fa0,Fa).

so you can call fsa_regex:rx/2 for further compilations.

There can be multiple -aux options.

File is supposed to contain the definition of a predicate module. The file
is loaded and moreover the global variable **pred_module** is set to the
name of the module defined in that file. There can be multiple -pm
options

The File is loaded, using use_module/1. There can be multiple -l options.

evaluates Prolog Goal; Goal is parsed as Prolog term. Example:

fsa -cmd 'listing(library_directory),halt').

There can be multiple -cmd options

Run interactively with the FSA6 command interpreter.

This option can be used to test a given string for acceptance by an automaton read from In (or standard input). Fsa prints `yes' or `no' to standard error. Example:

% fsa -r 'a+' | fsa -a aaa

Prints `yes'.

This option can be used to get all best matches for a given string and an automaton read from In (or standard input). In must contain an automaton.

% fsa -r '[a,a]+' | fsa -approx aaa

Prints:

aa

aaaa

If three file names are given, then the automaton read from the first
file is converted to an automaton in AT&T's **fsm** software
format. That automaton is written into Aut; Syms will contain
a mapping from the integers used in Aut to the actual symbols. If
less than three file names are given, then it is assumed that the
actual symbols can be ignored and no Syms is written.

Converts an automaton in AT&T's **fsm** software format into fsa5
format. Note that a separate symbol definition file is currently not
supported.

For a given automaton read from file In, a Prolog program is written to Out. For details, see the module fsa_compiler.

For a given automaton read from In a C program is written to Out. For details, see module compiler_to_c.

For a given automaton read from In, a JAVA program is written to Out. For details, see module fsa_java.

For a given automaton read from In, a C++ program is written to Out. For details, see module fsa_cpp.

For a given automaton read from In, a C++ program is written to Out. For details, see module fsa_cpp.

The transducers read from A and B are composed, and the result is written to Out. Equivalent to

fsa -r 'compose(file(A),file(B))' >Out

The complementation operator is applied to the automaton read from In, and the result is written to Out.

For the automaton read from In the number of transitions and symbols and
some other properties is written to Out. If **-short** is
specified, then the output is given as a single line consisting of the
number of states, start states, final states, transitions, jumps, and
symbols respectively. Otherwise a more elaborate message is printed
meant for human consumption.

For the automaton read from In various densities are reported to Out. Deterministic density is the number of transition divided by the number of states times the number of symbols; absolute density is the number of transitions dividided by the number of states squared times the number of symbols. Jump density is the number of jumps dividied by the squared number of states. Deterministic density can be used to characterize the difficulty of determinization. For deterministic densities of around 2, exponential blow-up of the output (and hence processing time) can be expected (Leslie 1995). Jump density can be used to estimate the most efficient subset construction algorithm (van Noord 1998).

For the automaton read from In a corresponding DaVinci term is written to Out. This can be used to visualize the automaton In using DaVinci:

fsa -davinci a.nd > a.davinci

daVinci a.davinci

For the automaton read from In a corresponding vcg term is written to Out. This can be used to visualize the automaton In using (x)vcg:

fsa -vcg a.nd | xvcg -

For the automaton read from In a corresponding dot term is written to Out. This can be used to visualize the automaton In using dot or dotty:

fsa -dot a.nd | dotty -

fsa -dot a.nd | dot -Tps | gv -

fsa -dot a.nd | dot -Tgif | xv -

The automaton read from In is determinized and written to Out. FSA6
supports four variants of the determinization algorithm. In the
first form, a heuristic is used (based on the jump density) to
select the variant of the determinization variant. The other
forms indicate the particular variant that is to be used. For details,
refer to the **fsa_determinizer** module.

For the automaton read from In an equivalent automaton without any epsilon transitions (jumps) is written to Out.

Equivalent to:

fsa -r 'ignore(file(A),file(B))' > Out

Equivalent to:

fsa -r 'difference(file(A),file(B))' > Out

The program checks each string read from standard input for acceptance by the automaton read from In. Depending on the type of the automaton, the system reports the transductions for each string, or simply `yes' or `no' (for recognizers). In the third form the recognizer is specified by regular expression Regex rather than by an automaton.

Evaluates Prolog goal. Example:

fsa -prolog 'listing(user:file_search_path).'

This option is used to generate random finite automata, using the algorithm of Leslie 1995. States is the number of states, Syms is the number of symbols, Dens is absolute density, and JDens is the jump density.

Equivalent to:

fsa -r 'intersect(file(A),file(B))' > Out

Equivalent to:

fsa -r 'file(In)*' > Out

Writes the path with the minimum weigth of the automaton read from In.

Equivalent to:

fsa -r 'file(In)+' > Out

Equivalent to:

fsa -r 'reverse(file(In))' > Out

Equivalent to:

fsa -r 'inverse(file(In))' > Out

Equivalent to:

fsa -r 'domain(file(In))' > Out

Equivalent to:

fsa -r 'range(file(In))' > Out

Equivalent to:

fsa -r 'cleanup(file(In))' > Out

Equivalent to:

fsa -r 'identity(file(In))' > Out

Equivalent to:

fsa -r 'option(file(In))' > Out

Equivalent to:

fsa -r 'union(file(A),file(B))' > Out

Equivalent to:

fsa -r 'concat(file(A),file(B))' > Out

Minimizes the automaton read from In. The first version uses the default minimization algorithm (by Brzozwski). The other options explicitly require the algorithms by, respectively, Brzozowski or Hopcroft. Refer to the fsa_regex module and the fsa_minimizer module for details.

Applies the minimization algorithm for transducers (by Mohri) to the transducer read from In. Refer to the fsa_t_determinizer module for details.

For the recognizer read from In strings accepted by In are written to Out.

For the automaton read from In strings or string pairs accepted by In are written to Out, using a sampling procedure based on the weights in In. If In is not a weighted automaton, then uniform weights are assumed. The size of the sample is determined by the flag nr_sol_max. Examples:

fsa -r '[a::0.3*,b::5*]' |

fsa nr_sol_max=10000 -sample |sort |uniq -c | sort -nr

produces:

5331 []

2519 a

1028 aa

456 aaa

187 aaaa

172 b

108 aaaaa

73 ab

32 aaaaaa

28 aab

24 aaaaaaa

17 aaab

8 aaaaaaaa

5 bb

3 abb

3 aaaab

3 aaaaab

2 aaaaaaaaa

1 aabb

fsa -r '[a*,b*]' |

fsa nr_sol_max=100 -sample |sort |uniq -c | sort -nr

produces:

32 []

16 b

11 a

9 bb

5 abb

5 ab

3 bbbb

3 bbb

3 aa

2 bbbbb

2 abbbb

2 aaa

1 bbbbbbb

1 abbbbbb

1 abbb

1 aab

1 aaabb

1 aaaabbb

1 aaaaaab

The automaton described by regular expression Regex are written to Out. If Regex is not specified, it is read from standard input.

Starts the graphical user interface on the automaton in File, or the automaton defined by the regular expression Regex.

Produces postscript version of the automaton read from In.

A minimal string-to-weight transducer will be written to Out,
transducing each of the lines read from In into its rank in alphabetic
ordering; in other words, the transducer computes a perfect hash for
the keys read from In. For more info, see module **fsa_dict**.

A minimal recognizer will be written to Out, recognizing each of the lines read from In.

Produces LaTeX code using PsTricks macro's for the automaton read from In. In the first variant a self-contained LaTeX document is produced; in the second variant a LaTeX picture is produced to be included in another document.

Copies the automaton from In to Out. Useful to convert between
different formats using the **read** and **write** global variables:

fsa read=fast write=normal -copy a.nd b.nd

A so-called prefix tree will be written to Out, which is a weighted recognizer of all the lines read from In, where the shape of the recognizer is the corresponding trie, and the weights are derived from the counts that each transition is used in In. Cf. Carrasco and Oncina, 1999.

A so-called trie will be written to Out, which is a deterministic recognizer of all the lines read from In, where no states have more than a single incoming transition.

The determinization algorithm for transducers (by Mohri) is applied to string-to-string transducer In and yields Out. For details, refer to module t_determinizer.

The determinization algorithm for transducers (by Mohri) is applied to string-to-weight transducer In and yields Out. For details, refer to module t_determinizer.

The minimization algorithm for transducers (by Mohri) is applied to string-to-weight transducer In and yields Out. For details, refer to module t_determinizer.

Reports to standard output whether the two automata read from In and In2 are identical. The algorithm used to determine this first determines both automata, and then renames the states of the automata in a canonical way. The automata are identical in case their canonical representations are identical. Note that two non-identical automata might still be equivalent in the sense that they define the same language or the same relation.