Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics, data analysis and chemometrics.


Variable Selection - Pruning

Backward selection is the counterpart to forward selection. Forward selection starts with one variable and builds up a model by adding variables; backward selection starts with all available variables and removes the "unnecessary" ones step by step. This method is also known as "pruning" of variables.

The algorithm is defined as follows (specifically described here for multiple linear regression; however, this technique may be used for other modeling approaches, too):
 

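1. calculate a model including all variables currently in the set (initially all available variables)
2. calculate the partial F value of each independent variable in the model
3. remove the variable with the lowest partial F value, if it falls below a predefined limit; otherwise stop
4. if a variable was removed, proceed with step 1 using the remaining variables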
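
For reference, one common way to write the partial F value used in step 2 is given below. The symbols are assumptions introduced here, not notation from the text above: n is the number of observations, p the number of independent variables in the current model (which is assumed to include an intercept), SSE the residual sum of squares of the current model, and SSE_(-j) the residual sum of squares of the same model refitted without variable j.

```latex
F_j = \frac{SSE_{(-j)} - SSE}{SSE / (n - p - 1)}
```

A small F_j means that dropping variable j increases the unexplained variance only slightly, which is why the variable with the lowest value is the candidate for removal.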


This algorithm eventually removes all variables that do not contribute significantly to explaining the variance of the dependent variable Y, as judged by the predefined F limit.

Note: you have to recalculate all partial F values after removing a variable, since this changes the F values of the remaining variables.
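
The procedure can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `residual_ss` and `backward_selection`, the intercept term, and the default limit `f_limit=4.0` are all choices made for this example. The partial F value of a variable is computed here as the increase in the residual sum of squares when that variable is dropped, divided by the residual mean square of the current model. Note that the partial F values are recomputed from scratch in every pass of the loop.

```python
import numpy as np


def residual_ss(X, y):
    """Residual sum of squares of a least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def backward_selection(X, y, f_limit=4.0):
    """Return the column indices of X that survive pruning.

    f_limit is the predefined limit for the partial F value; the default
    of 4.0 is an illustrative assumption, not a recommended setting.
    """
    n = len(y)
    kept = list(range(X.shape[1]))
    while kept:
        # Step 1: fit the model with all variables currently kept.
        df_resid = n - len(kept) - 1
        if df_resid <= 0:
            raise ValueError("not enough observations for this many variables")
        sse_full = residual_ss(X[:, kept], y)

        # Step 2: partial F value of each variable, recomputed in every pass.
        f_values = []
        for j in kept:
            others = [k for k in kept if k != j]
            sse_reduced = residual_ss(X[:, others], y)
            f_values.append((sse_reduced - sse_full) / (sse_full / df_resid))

        # Step 3: remove the weakest variable only if it is below the limit.
        worst = int(np.argmin(f_values))
        if f_values[worst] >= f_limit:
            break
        kept.pop(worst)
        # Step 4: loop back to step 1 with the remaining variables.
    return kept


if __name__ == "__main__":
    # Synthetic data (made up for this example): only columns 0 and 2 affect y.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=200)
    print(backward_selection(X, y))
```

The sketch refits one reduced model per candidate variable, which is simple to follow but not efficient; for linear regression the same partial F values could also be obtained from the squared t statistics of the coefficients of the full model.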