6. Validation

You can draw really nice maps with RuG/L04, but the question is: how well do those maps reflect the actual situation? The software can only display what is in the data. In the end, you will have to judge the results against other sources and other research.

You do have some choices available in RuG/L04. What comparison method should be applied? How should you use the data? What clustering method is the most appropriate? The last question is discussed elsewhere. As for the other questions, RuG/L04 has one tool available: Local incoherence.

6.1 Local incoherence

Local incoherence means something like the lack of coherence on a local scale. It is a formula that expresses the quality of a dialect measurement as a numeric value. It is based on the idea that the dialect in one location differs less from the dialect of a location in the near vicinity than from the dialect of a location that is still in the vicinity, but a bit further away. Differences between locations that are geographically far apart are discarded, because at such distances the measured differences are largely a matter of chance.

You can calculate the local incoherence with the linc program. How to use the program, and the exact definition of local incoherence, are explained in the manual of the program.

You can only use local incoherence to compare multiple measurements of one and the same area, because the result depends heavily on the geography of the area and on the exact geographic distribution of the locations. Besides, one dialect area is not like another. For instance, if you add locations to a previously analysed area, the local incoherence can go either up or down, but this says nothing about the relative reliability of the extended set of locations.

Finally, you should note that local incoherence is a simple method. Generally speaking, of two measurements, the one with the lower local incoherence will be the better measurement. But this doesn't have to be true in each and every case.

6.1.1 Pennsylvania: what type of method?

If you ran all the examples of part 2 and part 3 of this tutorial, then you now have four tables of dialect differences for the state of Pennsylvania. You can calculate the local incoherence of these like this:

    linc -L fon.dif PA.coo
    linc -L lex-lev.dif PA.coo
    linc -L lex-bin.dif PA.coo
    linc -L lex-giw.dif PA.coo

You will get these results:

    phonetic, Levenshtein:  0.728728
    lexical, Levenshtein:   1.32183
    lexical, binary:        1.31965
    lexical, G.I.W.:        1.2249

Smaller values mean better measurements. As you can see here, of the lexical methods, the Gewichteter Identitätswert (G.I.W.) gives the best result.
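If you have more than a handful of tables, you can let the shell do the ranking. The following is a minimal sketch and not part of RuG/L04: rank_linc is a hypothetical helper name, and the sketch assumes that linc -L writes the local incoherence as the last field of the last line of its standard output. Check the linc manual and adjust the extraction step if your version prints something else.

```shell
# rank_linc COORDFILE DIFFILE...
# Hypothetical helper, not part of RuG/L04.
# Prints "value<TAB>file" for each difference table, lowest value first.
# Assumption: the local incoherence is the last field of the last line
# that `linc -L` writes to standard output.
rank_linc() {
    coo=$1
    shift
    for dif in "$@"; do
        # Keep only the last field of the last output line
        value=$(linc -L "$dif" "$coo" | tail -n 1 | awk '{ print $NF }')
        printf '%s\t%s\n' "$value" "$dif"
    done | LC_ALL=C sort -n
}
```

For the four Pennsylvania tables you would call it like this:

    rank_linc PA.coo fon.dif lex-lev.dif lex-bin.dif lex-giw.dif

The table on the first line of the output is the one with the lowest local incoherence.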

The phonetic measurement has a much better score than any of the lexical measurements. For several reasons, it is quite likely that a phonetic comparison is much more precise than a lexical one. But that doesn't mean you should discard the lexical measurements. They may be less accurate than phonetic measurements, but they can still bring to light details that are not expressed as phonetic differences.

6.1.2 Pennsylvania: fine-tuning the lexical measurements

The local incoherence is a useful tool for fine-tuning a measurement, such as determining what parameter settings to use to get the best result. Here is an example.

Data contains noise. Impurities. You may assume that words that are extremely rare in the data set are largely noise. Suppose you only use words that occur at least twice. Does the result of the measurement improve? And if it does, how often should a word occur before you include it in your measurements? Twice? Thrice? Ten times?

The leven program has an option to exclude infrequent words. Let's do a measurement of lexical differences, using the Levenshtein method, including only words that occur at least twice in the data set (option: -f 2). Afterwards, we determine the local incoherence:

    leven -f 2 -n 67 -l PA.lbl -o lex-lev02.dif lex/*.lex
    linc -L lex-lev02.dif PA.coo

Local incoherence decreased from 1.32183 to 1.23576, an improvement of about 6.5%. Try higher limits. What limit gives the best result? Make a cluster map of the best result, and compare it to the original map. Are there visible differences?
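Trying higher limits by hand quickly becomes tedious, so you may want to loop over them. The following is a minimal sketch under stated assumptions: sweep_f is a hypothetical helper name, the leven options are exactly those of the example above, and the local incoherence is assumed to be the last field of the last line that linc -L writes to standard output.

```shell
# sweep_f MAX
# Hypothetical helper, not part of RuG/L04.
# For each limit from 2 to MAX, recompute the lexical Levenshtein
# differences with `leven -f LIMIT` and print "LIMIT<TAB>value".
# Assumption: the local incoherence is the last field of the last line
# that `linc -L` writes to standard output.
sweep_f() {
    max=$1
    limit=2
    while [ "$limit" -le "$max" ]; do
        # Same naming scheme as lex-lev02.dif in the example above
        dif=$(printf 'lex-lev%02d.dif' "$limit")
        # Results go to the -o file; discard anything leven may print
        # on standard output so it does not end up in the table
        leven -f "$limit" -n 67 -l PA.lbl -o "$dif" lex/*.lex > /dev/null || return 1
        value=$(linc -L "$dif" PA.coo | tail -n 1 | awk '{ print $NF }')
        printf '%s\t%s\n' "$limit" "$value"
        limit=$((limit + 1))
    done
}
```

To test limits 2 up to 10 and show the limit with the lowest local incoherence:

    sweep_f 10 | LC_ALL=C sort -k2,2n | head -n 1

The difference tables are kept as lex-lev02.dif, lex-lev03.dif and so on, so you can make a cluster map of the best one afterwards.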

Run some tests with removing infrequent words using binary measurements or G.I.W. What is the optimal limit in these cases?

Try the -F option (uppercase F), and see how this affects things. Does it always improve things, or never, or does it vary?

6.1.3 Pennsylvania: variations for phonetic measurement

There are a few variants of the Levenshtein algorithm that are generally applicable, including to phonetic differences. Try the effect of the options listed below. Pay attention to differences in local incoherence and to the visible effects in the cluster map and the MDS map.