Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics and data analysis.


The analysis of variance (ANOVA) is a tool to find those factors in a multidimensional model which influence the model most. This can be primarily reduced to the question whether the means of several samples are the same. The samples are, in general, not independent of each other and are often obtained from a designed experiment (factorial design).

In order to compare several means one could possibly use a two-sample t-test for each pair of samples. While this seems reasonable at first, closer inspection reveals at least four drawbacks to this approach:

  1. for n samples, the number of pairs is n*(n-1)/2, which quickly results in a large number of t-tests.
  2. the level of significance is automatically increased by performing multiple t-tests. If we, for example, define a level of significance of α = 0.01 for the individual tests, the probability of avoiding a type I error in a single test is 0.99. If we have to perform k independent tests, the overall probability of avoiding a type I error on all tests is (1-α)^k. This means that the probability of making at least one type I error (which is the level of significance for all tests taken together) is 1-(1-α)^k. For α = 0.01 and 10 means which have to be compared (resulting in 10*(10-1)/2 = 45 t-tests) this gives an overall level of significance of 1-0.99^45 = 0.364, which is rather poor.(1)
  3. the individual tests are not independent of each other. Suppose we have three samples and have to compare three means. If we know the differences between two pairs of means, we immediately know the difference between the means of the third pair, i.e. only two of the differences are independent. Again, this increases the probability of making a type I error (or, equivalently, increases the overall level of significance).
  4. the individual tests may produce contradictory results. In an n-sample problem it may happen that only one of the t-tests is significant. This means that two of the means are found to differ, while all other pairs of means are found to be equal. This is a contradictory result, since the difference tested by that particular t-test could be calculated from the differences tested by the other t-tests, and those differences have been found to be zero.
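The arithmetic behind drawbacks 1 and 2 can be checked with a few lines of Python. This is only a minimal sketch: the function names `n_pairs` and `overall_alpha` are chosen here for illustration and do not come from the text, and the formula assumes, as the text does, that the k tests are independent.

```python
from math import comb


def n_pairs(n_samples: int) -> int:
    """Number of distinct pairs of means among n_samples samples: n*(n-1)/2."""
    return comb(n_samples, 2)


def overall_alpha(alpha: float, k: int) -> float:
    """Probability of at least one type I error in k independent tests,
    each performed at the individual level of significance alpha."""
    return 1.0 - (1.0 - alpha) ** k


k = n_pairs(10)                           # comparing 10 means pairwise
print(k)                                  # 45
print(round(overall_alpha(0.01, k), 3))   # 0.364
```

Because of drawback 3 the pairwise tests are in fact not independent, so the value returned by `overall_alpha` should be read as an approximation under the independence assumption.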

In order to avoid these problems, R.A. Fisher introduced a method which is commonly called "analysis of variance". The idea behind the ANOVA is that any differences among population means should be reflected in the variance among the samples obtained from these populations.
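The idea can be illustrated with a small sketch that compares the variance among the sample means with the variance within the samples. The ratio computed here is what is commonly called the F statistic of a one-way layout; that name, the function `f_ratio`, and the two data sets are assumptions made for this illustration and are not taken from the text above.

```python
from statistics import mean


def f_ratio(groups):
    """Ratio of the between-sample variance to the within-sample variance.

    groups is a list of samples, each a list of numbers (hypothetical data).
    """
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # variation of the sample means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # variation of the observations around their own sample mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# made-up samples: equal means vs. clearly shifted means
similar = [[5.1, 4.9, 5.0], [5.0, 5.2, 4.8], [4.9, 5.1, 5.0]]
shifted = [[5.1, 4.9, 5.0], [6.0, 6.2, 5.8], [6.9, 7.1, 7.0]]

print(f_ratio(similar))   # close to zero: the sample means hardly differ
print(f_ratio(shifted))   # large: the sample means differ far more than the noise
```

A single ratio of this kind replaces the whole set of pairwise t-tests; judging whether it is significant requires comparing it with the appropriate F distribution, which is not shown in this sketch.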

(1) The following table shows how the overall level of significance increases with the number of comparisons:
N             alpha=0.001  alpha=0.01  alpha=0.05
1               0.0010       0.0100      0.0500
2               0.0020       0.0199      0.0975
3               0.0030       0.0297      0.1426
4               0.0040       0.0394      0.1855
5               0.0050       0.0490      0.2262
6               0.0060       0.0585      0.2649
7               0.0070       0.0679      0.3017
8               0.0080       0.0773      0.3366
9               0.0090       0.0865      0.3698
10              0.0100       0.0956      0.4013
20              0.0198       0.1821      0.6415
30              0.0296       0.2603      0.7854
40              0.0392       0.3310      0.8715
50              0.0488       0.3950      0.9231
N: number of comparisons; alpha: level of significance of the individual test
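The entries of the table follow from 1-(1-alpha)^N and can be recomputed with the short sketch below. The helper names `overall_alpha` and `table_rows` are made up for this example, and the calculation assumes independent comparisons.

```python
def overall_alpha(alpha, n):
    """Overall level of significance for n independent comparisons."""
    return 1.0 - (1.0 - alpha) ** n


def table_rows(alphas, counts):
    """One (N, [values rounded to 4 decimals]) pair per number of comparisons."""
    return [(n, [round(overall_alpha(a, n), 4) for a in alphas])
            for n in counts]


alphas = (0.001, 0.01, 0.05)
counts = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50)

# print one line per N with the three overall levels of significance
for n, values in table_rows(alphas, counts):
    print(f"{n:<6}" + "".join(f"{v:<12.4f}" for v in values))
```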