Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics and data analysis.

Exercise - Calculate a polynomial fit by means of MLR

Multiple linear regression (MLR) can be used to create a polynomial fit of arbitrary order between two variables. The concept behind this is to calculate different powers of the independent variable, treat each power as a separate predictor, and estimate the parameters by MLR. This works because the polynomial is linear in its coefficients, even though it is nonlinear in x. If you want to create a third-order fit, you calculate the square and the third power of the independent variable. In addition, you also need a variable which is constant, preferably 1, to account for any offset in the function to be estimated. So you end up with a matrix containing the following variables:

u (const.), x, x^2, x^3, and y

You can now apply MLR to estimate the coefficients of the polynomial fit (assuming that u equals 1):

y = a + bx + cx^2 + dx^3
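
The same calculation can be sketched outside DataLab. The following minimal Python example assumes NumPy is available and uses synthetic, noise-free data rather than the POLYFIT data set; the function name polyfit_mlr is hypothetical and only serves this illustration. It builds the matrix with the columns u, x, x^2, and x^3 and solves for a, b, c, and d by ordinary least squares.

```python
import numpy as np

def polyfit_mlr(x, y, order=3):
    """Fit a polynomial of the given order by ordinary least squares (MLR)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: columns u (= 1), x, x^2, ..., x^order
    X = np.column_stack([x**k for k in range(order + 1)])
    # Least-squares solution of X @ coeffs = y
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs  # ordered as [a, b, c, d] for order 3

# Synthetic data (an assumption for illustration, not the POLYFIT data set)
x = np.linspace(-2.0, 2.0, 21)
y = 1.0 + 2.0 * x - 0.5 * x**2 + 0.25 * x**3

a, b, c, d = polyfit_mlr(x, y, order=3)
print(a, b, c, d)
```

Because the data here contain no noise, the estimated coefficients should reproduce the values used to generate y up to rounding error. With real, noisy data the estimates will deviate from the true coefficients.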

Use the data set POLYFIT and go to the DataLab to create several fits of different order. Try to fit the data by a 2nd, 3rd, and 4th order polynomial. Which of the fits describes the data best? How can you avoid overfitting?

Hint: DataLab automatically calculates the constant coefficient, so you don't have to provide an extra (constant) variable for the calculations.
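
One possible way to approach the overfitting question is to compare the error on the fitted data with a cross-validated error. The sketch below is not the DataLab procedure; it assumes NumPy, uses synthetic noisy cubic data instead of the POLYFIT data set, and the helper names design_matrix, fit_rmse, and loo_rmse are hypothetical. The error on the fitted data can only decrease as the order grows, whereas the leave-one-out error typically stops improving once the extra terms start fitting noise.

```python
import numpy as np

def design_matrix(x, order):
    # Columns: 1, x, x^2, ..., x^order
    return np.column_stack([x**k for k in range(order + 1)])

def fit_rmse(x, y, order):
    """RMSE of the residuals on the same data used for fitting."""
    X = design_matrix(x, order)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sqrt(np.mean((y - X @ coeffs) ** 2)))

def loo_rmse(x, y, order):
    """Leave-one-out cross-validated RMSE for a polynomial of given order."""
    n = len(x)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # all samples except sample i
        coeffs, *_ = np.linalg.lstsq(
            design_matrix(x[keep], order), y[keep], rcond=None
        )
        pred = design_matrix(x[i:i + 1], order) @ coeffs
        errors[i] = y[i] - pred[0]  # error on the left-out sample
    return float(np.sqrt(np.mean(errors ** 2)))

# Synthetic noisy cubic data (an assumption, not the POLYFIT data set)
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 30)
y = 1.0 + 2.0 * x - 0.5 * x**2 + 0.25 * x**3 + rng.normal(0.0, 0.2, x.size)

orders = [1, 2, 3, 4, 5, 6]
train = {k: fit_rmse(x, y, k) for k in orders}
loo = {k: loo_rmse(x, y, k) for k in orders}
for k in orders:
    print(f"order {k}: fit RMSE = {train[k]:.3f}, LOO RMSE = {loo[k]:.3f}")
```

A reasonable rule of thumb under these assumptions is to prefer the lowest order beyond which the leave-one-out error no longer drops noticeably, rather than the order with the smallest error on the fitted data.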