Exercise 1

General remarks


Produce an FSA macro file. A simple example is syllable.pl. You can take this as a starting point and replace the definitions with something more adequate.

Macros and auxiliary files

Macros can be loaded by starting fsa (fsa tkconsol=on -tk), opening the File menu, and choosing LoadAux or Reconsult Aux. Select the corresponding filename (my_macros.pl) in the resulting dialog box. After that, you can use the macros you have just defined as if they were regular expressions. Thus, if a macro 'vowel' is defined, you can type 'vowel' in the Regexp line; it is expanded to the regular expression given in the macro's definition.
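As an illustration, a macro file might look roughly like the sketch below. It assumes that macros are written as macro(Name, Regex) facts, that {..} denotes union and [..] concatenation, and that the file is loaded through fsa (the postfix * and + are FSA regular-expression operators, so the file is not expected to parse in a bare Prolog system). The letter inventories are placeholders only and are deliberately too simple for Dutch.

```prolog
% my_macros.pl: a minimal sketch of an FSA auxiliary file.
% Assumption: macros are declared as macro(Name, Regex) facts, and the
% multifile declaration lets several files contribute macro/2 clauses.
:- multifile macro/2.

% Placeholder symbol classes; incomplete on purpose (for instance,
% 'y' is left out and no digraphs such as 'ij' are handled).
macro(vowel, {a,e,i,o,u}).
macro(consonant, {b,c,d,f,g,h,j,k,l,m,n,p,q,r,s,t,v,w,x,z}).

% The general shape of a monosyllabic word:
% any number of consonants, one or more vowels, any number of consonants.
macro(syllable, [consonant*, vowel+, consonant*]).
```

With such a file loaded, typing 'syllable' in the Regexp line should build the automaton for the expanded expression. A more adequate definition would restrict which consonant clusters may appear before and after the vowels.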


After making the definitions and checking them in fsa, you can test your work in two ways. The tests use the files 'monosyll' and eow.stem together with a Makefile, as described below.

Test 1: Recognize 'foreign' words

The file 'monosyll' consists of a list of 740 words of the form
[consonant*, vowel+, consonant*].  The Unix command

make not_accepted

produces a file 'not_accepted' which contains all words not recognized by 'syllable'. This list should contain only words that consist of more than a single syllable (aaien, beiaard,...) and non-native words (back, blues,...).

Test 2: Hyphenating simple words

The file eow.stem contains a list of 11,382 words. The command

make hyphen_errors

produces a file hyphen_errors which contains all wrongly hyphenated words (1st column: the wrong hyphenation; 2nd column: the correct pattern), and gives the percentage of correctly hyphenated words.


  1. Check your definitions before testing, for instance by loading them in fsa and trying out some examples.
  2. The Unix command make produces files as defined in a 'Makefile'. Rerunning a test sometimes leads to the message 'File up to date'. In that case, just remove File and run make again. If you want to start all over, run 'make clean': this removes all files made by make.

Reporting Results

  1. Mail the file syllable.pl and a brief report to your lab assistant.
  2. In the report, give the results of test 1 (how many words are not recognized, and what kinds of words are they?) and of test 2 (how many mistakes, and what kinds of mistakes?).
  3. Send your results to m.b.villada@let.rug.nl

Deadline: Thursday, April 17

Good luck!

N.B. This approach to hyphenation works less well for English, since the spelling of syllables is more irregular than in Dutch and morphological structure plays a more prominent role. It is still interesting to find out to what extent it can be made to work.