Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics, data analysis and chemometrics.


Variable Selection - Forward Selection

Generally speaking, forward selection is a method for finding the "best" combination of variables by starting with a single variable and increasing the number of variables used, step by step. Which variable to add at each step is decided by a criterion that may vary from method to method. For linear regression, the partial F value is commonly used.
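
For reference, one common way to write the partial F value for adding a single candidate variable is sketched below. The symbols are notation introduced here, not taken from the text above: n is the number of observations, p the number of variables already in the model, and RSS_p and RSS_{p+1} the residual sums of squares before and after adding the candidate. The sketch assumes a linear model with an intercept term.

```latex
F_{\text{partial}} = \frac{RSS_{p} - RSS_{p+1}}{RSS_{p+1} / (n - p - 2)}
```

The denominator is the residual variance of the larger model, which has n - p - 2 residual degrees of freedom (p + 1 slopes plus the intercept). A large value indicates that the candidate removes a substantial part of the variance left unexplained by the variables already selected.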

The method starts by selecting the single variable that gives the best fit for the dependent variable Y. Next, this variable is combined with each of the remaining variables in turn in order to find the "best" pair of variables. In each further step, one more variable is added in the same way until either all variables are used up or some stopping criterion is met (e.g. the partial F value falls below a certain limit).
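
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes ordinary least squares with an intercept, uses NumPy only, and the function names (rss, forward_selection), the threshold f_to_enter = 4.0 and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit (with intercept) on the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def forward_selection(X, y, f_to_enter=4.0):
    """Add one variable per step until the partial F value drops below f_to_enter.

    f_to_enter is an assumed, illustrative limit; choose it to suit the problem.
    """
    n, p = X.shape
    selected = []
    remaining = list(range(p))
    rss_current = rss(X, y, selected)  # intercept-only model
    while remaining:
        # Fit every candidate together with the variables chosen so far.
        rss_new, best = min((rss(X, y, selected + [j]), j) for j in remaining)
        df_resid = n - (len(selected) + 1) - 1  # residual degrees of freedom
        if df_resid <= 0:
            break
        if rss_new > 0:
            partial_f = (rss_current - rss_new) / (rss_new / df_resid)
        else:
            partial_f = float("inf")  # perfect fit
        if partial_f < f_to_enter:
            break  # stopping criterion met
        selected.append(best)
        remaining.remove(best)
        rss_current = rss_new
    return selected

# Synthetic data (an assumption for this sketch): only columns 1 and 4 influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + rng.normal(scale=0.5, size=200)

selected = forward_selection(X, y)
print("selected columns, in order of entry:", selected)
```

With this data the two informative columns enter first. Whether any of the pure-noise columns enter afterwards depends on the chosen limit, which is why the stopping criterion deserves some care.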

Note that forward selection does not necessarily find the best combination of variables (out of all possible combinations), because a variable, once added, is never removed again. However, it often results in a combination which comes close to the optimum solution.
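
The following sketch illustrates this caveat with a deliberately constructed data set, in which the best pair of variables consists of two variables that are nearly useless on their own. Everything here is an assumption made for illustration: the helper names (rss, forward_k, best_subset_k), the synthetic data, and the use of a fixed number of variables instead of a partial F limit, so that the two searches can be compared directly.

```python
from itertools import combinations
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit (with intercept) on the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def forward_k(X, y, k):
    """Forward selection that stops after exactly k variables."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(k):
        best = min(remaining, key=lambda j: rss(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

def best_subset_k(X, y, k):
    """Exhaustive search over all subsets of size k."""
    return min(combinations(range(X.shape[1]), k), key=lambda cols: rss(X, y, cols))

# Constructed data: y is almost exactly 10 * (x2 - x1), yet x1 and x2 are each
# only weakly related to y. x3 is a noisy copy of y and is the best single variable.
rng = np.random.default_rng(1)
n = 200
u, v, w = rng.normal(size=(3, n))
x1 = u
x2 = u + 0.1 * v
y = v + 0.05 * rng.normal(size=n)
x3 = y + w
X = np.column_stack([x1, x2, x3])

forward_pair = forward_k(X, y, 2)
best_pair = best_subset_k(X, y, 2)
print("forward selection:", forward_pair, "RSS =", rss(X, y, forward_pair))
print("exhaustive search:", list(best_pair), "RSS =", rss(X, y, best_pair))
```

Forward selection commits to column 2 (x3) in the first step and can therefore never reach the pair (x1, x2) found by the exhaustive search. Cases this extreme are constructed rather than typical, but they show why the result should be read as a good candidate and not as a guaranteed optimum.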


