Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics, data analysis and chemometrics.


Design Matrix

The term design matrix originates from experimental design, the field concerned with planning experiments in a statistically sound way. In this context the design matrix contains all values of the explanatory variables. Because these values taken together determine the experimental design, the matrix is used to characterize a particular design (hence "design matrix").

At the same time the design matrix may be used as a basis for calculating statistical models, since it contains all the necessary information. For the simplest linear model (multiple linear regression, MLR) this means that the design matrix is simply the matrix of the descriptors x_{j,i}. The system of equations to set up an MLR model reads as follows:
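
Written out under the usual MLR conventions, the system might look as shown here. This is a sketch, not the original formula: the number of observations n, the number of descriptors k, the intercept a_0, and the reading of x_{j,i} as the value of descriptor j for observation i are all assumptions.

```latex
\begin{aligned}
y_1 &= a_0 + a_1 x_{1,1} + a_2 x_{2,1} + \dots + a_k x_{k,1} + e_1 \\
y_2 &= a_0 + a_1 x_{1,2} + a_2 x_{2,2} + \dots + a_k x_{k,2} + e_2 \\
    &\;\;\vdots \\
y_n &= a_0 + a_1 x_{1,n} + a_2 x_{2,n} + \dots + a_k x_{k,n} + e_n
\end{aligned}
```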

If we denote this linear system in matrix notation, we arrive at the following equation:

Y = A X + E,

where the matrix X is called the design matrix. The vector Y contains the response variable, the vector A holds the coefficients and the vector E the residuals. The following diagram illustrates the matrix multiplication involved in the regression model:
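
As a numerical counterpart to this multiplication, here is a minimal Python sketch. The data, the coefficient values, and the function names `build_design_matrix`, `predict` and `residuals` are hypothetical. It also assumes that each observation is stored as one row of X and that a leading column of ones carries the intercept, so the product is computed row by row as X times A; with the transposed layout the same product would be written A X.

```python
def build_design_matrix(descriptors):
    """Prepend a constant 1.0 to every observation so that a[0] acts as the intercept."""
    return [[1.0] + list(row) for row in descriptors]


def predict(X, a):
    """Matrix-vector product: one fitted value per observation (row of X)."""
    return [sum(x_ij * a_j for x_ij, a_j in zip(row, a)) for row in X]


def residuals(y, y_hat):
    """Element-wise difference between measured and fitted responses (the vector E)."""
    return [y_i - yh_i for y_i, yh_i in zip(y, y_hat)]


# Hypothetical example: 4 observations, 2 descriptors (values made up for illustration)
descriptors = [[1.0, 2.0],
               [2.0, 0.0],
               [3.0, 1.0],
               [4.0, 3.0]]
a = [0.5, 2.0, -1.0]          # assumed coefficients: intercept, a_1, a_2
y = [0.6, 4.4, 5.5, 5.6]      # assumed measured responses

X = build_design_matrix(descriptors)   # 4 x 3 design matrix
y_hat = predict(X, a)                  # fitted values
e = residuals(y, y_hat)                # residuals, so that y = y_hat + e

for row, yi, ei in zip(X, y, e):
    print(row, yi, round(ei, 3))
```

In this sketch the coefficients are given rather than estimated; fitting them from X and Y (for example by least squares) is a separate step.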