This paper describes the FSA Utilities toolbox: a collection of utilities to manipulate finite-state automata and finite-state transducers. Manipulations include determinization (both for finite-state acceptors and finite-state transducers), minimization, composition, complementation, intersection, Kleene closure, etc. Furthermore, various visualization tools are available to browse finite-state automata. The toolbox is implemented in SICStus Prolog.
The motivation for the FSA Utilities toolbox has been the rapidly growing interest in finite-state techniques for computational linguistics. In particular, finite-state techniques are being used in computational phonology and morphology (Kaplan and Kay ), efficient dictionary lookup and part-of-speech tagging (Roche and Schabes ), natural-language parsing (Voutilainen and Tapanainen ), techniques for parsing ill-formed input (Lang , van Noord ) and speech recognition (Oerder and Ney , Pereira and Riley ), etc. The FSA Utilities toolbox has been developed to experiment with the techniques presented in these publications.
The following reasons for the popularity of finite-state techniques can be identified. Finite-state automata provide a well-studied, simple and very efficient formalism. Moreover, finite-state automata can be combined in various ways to construct larger automata from smaller ones. Such combined automata can still be processed very efficiently.
In this paper, I will illustrate the use of the FSA Utilities toolbox by means of a number of examples in section 3. I will then describe each of the operations which is provided by the FSA Utilities toolbox in more detail in section 4. Section 5 describes the visualization tools. Finally I will discuss some of the implementational issues in section 6. I first introduce the format for finite automata that is being used by FSA Utilities.