Fundamentals of Statistics contains material from various lectures and courses by H. Lohninger on statistics and data analysis.

Exercise - Dependence of PC scores on scaling of data

This exercise will show you how the results of principal component analysis depend on the scaling of the original data. The data used in this exercise contains two classes that can be discriminated either by a combination of the first and second variables or by the third variable alone. Principal component plots are a useful way to look at the data in a "multidimensional" way. However, the directions of the principal components depend on the scaling of the data.
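The dependence of the principal component directions on scaling can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the data below is synthetic and made up for this sketch (it is not the data set used in DataLab), and the helper names make_data and pca are hypothetical. The synthetic data loosely follows the description above: the first two variables share a large common variation and differ between the classes only in their difference, while the third variable separates the classes on its own but has a very small numerical range.

```python
import numpy as np


def make_data(n_per_class=100, seed=0):
    """Synthetic three-variable, two-class data (an assumption for illustration)."""
    rng = np.random.default_rng(seed)
    labels = np.repeat([-1.0, 1.0], n_per_class)
    n = labels.size
    t = rng.normal(0.0, 10.0, n)                    # large variation shared by x1 and x2
    x1 = t + labels + rng.normal(0.0, 0.3, n)       # classes differ in x1 - x2 ...
    x2 = t - labels + rng.normal(0.0, 0.3, n)       # ... but not in x1 or x2 alone
    x3 = 0.02 * labels + rng.normal(0.0, 0.005, n)  # separates classes, tiny range
    return np.column_stack([x1, x2, x3]), labels


def pca(X, standardize=False):
    """PCA via SVD of the mean-centered (optionally standardized) data."""
    Xc = X - X.mean(axis=0)
    if standardize:
        Xc = Xc / Xc.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt.T                    # column j holds the direction of PC j+1
    scores = Xc @ loadings             # coordinates of the samples on the PCs
    explained = s**2 / np.sum(s**2)    # fraction of total variance per PC
    return scores, loadings, explained


X, labels = make_data()
for name, flag in [("mean-centered only", False), ("standardized", True)]:
    scores, loadings, explained = pca(X, standardize=flag)
    print(name)
    print("  explained variance ratio:", np.round(explained, 4))
    print("  PC1 loadings:", np.round(loadings[:, 0], 3))
    print("  PC2 loadings:", np.round(loadings[:, 1], 3))
```

Comparing the printed loadings for the two scaling options shows how much weight each original variable receives in each principal component. The signs of the loadings are arbitrary, so only their magnitudes should be compared.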

Go to the DataLab to experiment yourself. First, look at the data set using the 3D rotation option. Checking the "isometric axes" box draws all three axes to the same scale, so you can see how the ranges of the three variables compare.

Next, calculate the principal components using the different scaling options and compare the results. Why does the PC scores plot show the two classes as separate clusters only if the data is standardized prior to the PCA?

Last Update: 2012-10-08