Fundamentals of Statistics contains material from various lectures and courses by H. Lohninger on statistics and data analysis.

Data Compression by PCA

Principal component analysis (PCA) can be considered from the viewpoint of data compression. However, as the compression factor of PCA is comparatively low, data compression using PCA is only meaningful for educational purposes, where it gives additional insight into the mechanics of PCA. For "real" compression, much better specialized methods are available.

The scores of the first few principal components, together with the corresponding loading vectors, can be used to estimate the contents of a large data matrix. The idea is that reducing the number of eigenvectors used to reconstruct the original data matrix also reduces the amount of storage space required: instead of the full matrix, only the retained scores and loadings have to be kept. However, one should be careful, since this method of compression is only meaningful if the data matrix shows a high degree of correlation, both among the variables and among the objects.
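The reconstruction idea can be illustrated with a minimal sketch in Python using NumPy. This is not the author's implementation; the function names `pca_compress` and `pca_reconstruct`, the synthetic data, and the choice of two retained components are assumptions made purely for illustration. The sketch computes the PCA via a singular value decomposition of the mean-centered matrix, keeps only the first k scores and loading vectors, and compares the reconstruction error for strongly correlated data with that for uncorrelated noise.

```python
import numpy as np

def pca_compress(X, k):
    """Keep the first k principal components of X (rows = objects, columns = variables)."""
    mean = X.mean(axis=0)
    # SVD of the mean-centered data; the rows of Vt are the loading vectors
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = U[:, :k] * s[:k]   # n x k matrix of scores
    loadings = Vt[:k]           # k x p matrix of loading vectors
    return scores, loadings, mean

def pca_reconstruct(scores, loadings, mean):
    """Estimate the original data matrix from the retained scores and loadings."""
    return scores @ loadings + mean

def relative_error(X, X_hat):
    """Frobenius norm of the reconstruction error relative to the norm of X."""
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)

rng = np.random.default_rng(0)
n, p, k = 200, 50, 2

# Hypothetical, highly correlated data: two hidden factors plus a little noise
X_corr = rng.normal(size=(n, 2)) @ rng.normal(size=(2, p)) + 0.01 * rng.normal(size=(n, p))
# Hypothetical uncorrelated data: pure noise
X_noise = rng.normal(size=(n, p))

scores, loadings, mean = pca_compress(X_corr, k)
err_corr = relative_error(X_corr, pca_reconstruct(scores, loadings, mean))
err_noise = relative_error(X_noise, pca_reconstruct(*pca_compress(X_noise, k)))

# Numbers to store: scores, loadings and the column means
stored = scores.size + loadings.size + mean.size
ratio = X_corr.size / stored

print(f"stored values: {stored} of {X_corr.size} (factor {ratio:.1f})")
print(f"relative error, correlated data:   {err_corr:.4f}")
print(f"relative error, uncorrelated data: {err_noise:.4f}")
```

With these idealized synthetic data, the correlated matrix is recovered almost exactly from two components, whereas the uncorrelated matrix is not. Real data rarely have such a clean low-rank structure, so the attainable compression is usually far more modest.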
