Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics, data analysis and chemometrics.
See also: crossvalidation, Validation of Models  
Predictive Ability
For many procedures of multivariate statistics the degrees of freedom cannot be specified, which prevents the calculation of the standard error. To obtain a metric for the prediction error nevertheless, one can resort to the quadratic mean (root mean square) of the observed residuals.
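Written out, the quadratic mean of the residuals could look as follows. The notation is an assumption made for this sketch: y_i is the observed value of sample i, \hat{y}_i is the value predicted for that sample while it was left out during crossvalidation, and N is the number of predictions.

```latex
\mathrm{PRESS} = \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2
\qquad
\mathrm{RMSEP} = \sqrt{\frac{\mathrm{PRESS}}{N}}
```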
RMSEP stands for Root Mean Squared Error of Prediction. It is derived from PRESS, the predicted residual error sum of squares, which is calculated by summing all squared prediction errors during crossvalidation; RMSEP is the square root of PRESS divided by the number of predictions. Both are indicators of the reliability and predictive ability of the model: the lower the RMSEP value, the higher the predictive ability of the model.

PRESS (or RMSEP) can be used to find the optimum number of predictor variables or components, for example in a stepwise variable selection procedure. The "best" model consists of as few predictor variables as possible and shows the lowest (or almost the lowest) PRESS. The figure below shows an example of a hypothetical variable selection procedure, resulting in the "best" model of 5 predictor variables.

Note: a disadvantage of using PRESS or RMSEP is the enormous number of calculations necessary to obtain the PRESS value, because the model has to be recalculated for every crossvalidation step. This is especially true for calculation-intensive models (such as neural networks) and large data sets.
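The selection idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used for the figure: the model is ordinary least squares, crossvalidation is leave-one-out, variables are added by greedy forward selection, and "almost the lowest" is taken to mean within 5 % of the minimum PRESS. All function names (loo_press, rmsep, forward_selection, pick_model) and the synthetic data are hypothetical.

```python
import numpy as np


def loo_press(X, y):
    """PRESS of a least-squares model, obtained by leave-one-out crossvalidation."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        # Fit intercept + selected predictors on all samples except sample i.
        A_train = np.column_stack([np.ones(keep.sum()), X[keep]])
        coef, *_ = np.linalg.lstsq(A_train, y[keep], rcond=None)
        # Predict the left-out sample and accumulate its squared error.
        y_hat = np.concatenate([[1.0], X[i]]) @ coef
        press += (y[i] - y_hat) ** 2
    return press


def rmsep(press, n):
    """Root mean squared error of prediction from PRESS and n predictions."""
    return np.sqrt(press / n)


def forward_selection(X, y, max_vars):
    """Greedy forward selection: each step adds the variable giving the lowest PRESS."""
    selected, history = [], []
    remaining = list(range(X.shape[1]))
    for _ in range(max_vars):
        scores = [(loo_press(X[:, selected + [j]], y), j) for j in remaining]
        best_press, best_j = min(scores)
        selected.append(best_j)
        remaining.remove(best_j)
        history.append((list(selected), best_press))
    return history


def pick_model(history, tolerance=0.05):
    """Smallest model whose PRESS lies within `tolerance` of the minimum PRESS."""
    min_press = min(press for _, press in history)
    for variables, press in history:
        if press <= (1.0 + tolerance) * min_press:
            return variables, press


# Synthetic data (assumption): only predictors 0 and 3 carry information.
rng = np.random.default_rng(0)
n_samples = 60
X = rng.normal(size=(n_samples, 8))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=n_samples)

history = forward_selection(X, y, max_vars=5)
for variables, press in history:
    print(len(variables), variables, round(press, 3),
          round(rmsep(press, n_samples), 3))

best_vars, best_press = pick_model(history)
print("chosen variables:", best_vars)
```

Each call to loo_press refits the model once per sample, and forward_selection calls it once per candidate variable at every step. This is the computational burden mentioned in the note above, and it grows quickly for larger data sets or more expensive models.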


Last Update: 2012-10-08