Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics and data analysis.

Exercise - Similar Mineral Waters

A problem often encountered in data analysis is finding the most similar observations in a data set. There are several ways to uncover similarity between individual observations. You can either use cluster analysis algorithms or rely on visual inspection, using principal component analysis to project the high-dimensional data onto a few dimensions. Using the correlation table may be misleading, since the correlation coefficient is insensitive to offset and scale and therefore does not reflect absolute values.
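The caveat about correlation can be illustrated with a small sketch. The two profiles below are invented for illustration (they are not taken from MINWATER); the second is simply ten times the first, so the correlation is perfect even though the absolute values differ widely.

```python
import numpy as np

# Two hypothetical observations (invented values, not from MINWATER):
# b has exactly ten times the values of a.
a = np.array([1.0, 2.0, 4.0, 8.0])
b = 10.0 * a

# Pearson correlation between the two profiles
r = np.corrcoef(a, b)[0, 1]

# Euclidean distance between the two profiles
d = np.linalg.norm(a - b)

print(f"correlation = {r:.3f}, distance = {d:.3f}")
```

The correlation comes out as 1 while the distance is large, which is why a distance-based measure is usually the safer choice when absolute levels matter.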

Use the data set MINWATER to find

(1) the two most similar brands of mineral water in the data set,
(2) the brand which is most similar to "Gasteiner", and
(3) the two most dissimilar brands.
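
One possible way to approach the three tasks is to compute a matrix of pairwise distances and search it. The following is a minimal sketch, not a prescribed solution: the function name `similarity_report`, the choice of Euclidean distance on standardized variables, and the tiny data table are all assumptions made for illustration. The values are invented placeholders, not MINWATER data, and the sketch assumes there are no missing values.

```python
import numpy as np

def similarity_report(names, X, target):
    """Return (closest pair, farthest pair, nearest neighbour of target)."""
    X = np.asarray(X, dtype=float)
    # Standardize each variable so that no single one dominates the distance
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Pairwise Euclidean distances between all rows
    diff = Z[:, None, :] - Z[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))
    # Upper triangle: each pair once, no self-distances
    iu = np.triu_indices(len(names), k=1)
    d = D[iu]
    lo, hi = d.argmin(), d.argmax()
    closest = (names[iu[0][lo]], names[iu[1][lo]])
    farthest = (names[iu[0][hi]], names[iu[1][hi]])
    # Nearest neighbour of the target, excluding the target itself
    t = names.index(target)
    row = D[t].copy()
    row[t] = np.inf
    nearest = names[int(row.argmin())]
    return closest, farthest, nearest

# Invented placeholder data: 4 hypothetical brands, 3 hypothetical variables
names = ["A", "B", "C", "D"]
X = [[1.0, 10.0, 100.0],
     [1.1, 11.0, 105.0],
     [4.0, 40.0, 400.0],
     [9.0, 90.0, 900.0]]

closest, farthest, nearest = similarity_report(names, X, "C")
print(closest, farthest, nearest)
```

Whether to standardize the variables first is a design decision: without it, variables with large numerical ranges dominate the distance.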

Do you have an idea what to do with the missing values?
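
As a hint, one of several possible options (others include dropping incomplete variables or observations) is to compare two observations only on the variables that are present in both. The sketch below is an assumption-laden illustration: the function name `pairwise_complete_distance` is hypothetical, missing values are assumed to be coded as NaN, and the numbers are invented.

```python
import numpy as np

def pairwise_complete_distance(x, y):
    """Root mean squared difference over variables present in both x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Keep only the variables measured in both observations
    shared = ~np.isnan(x) & ~np.isnan(y)
    if not shared.any():
        return np.nan  # no common variable, nothing to compare
    return float(np.sqrt(np.mean((x[shared] - y[shared]) ** 2)))

# Invented placeholder values; np.nan marks a missing measurement
u = [1.0, np.nan, 3.0]
v = [2.0, 5.0, np.nan]
w = [1.0, 2.0, 7.0]

d_uv = pairwise_complete_distance(u, v)
d_uw = pairwise_complete_distance(u, w)
print(d_uv, d_uw)
```

Averaging over the shared variables, rather than summing, keeps distances comparable between pairs that share different numbers of variables.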

You can now go directly to the DataLab in order to experiment with the data.

Last Update: 2012-10-08