Fundamentals of Statistics contains material from various lectures and courses by H. Lohninger on statistics, data analysis and chemometrics.
Contingency Coefficient
If we look at the contingency table of two uncorrelated nominal variables, we can calculate the expected frequency h_{ik} of a particular combination of features from the row total h_{i}, the column total h_{k}, and the total number of observations N:

h_{ik} = h_{i}h_{k}/N

If the two variables are correlated, the actual frequencies H_{ik} will deviate from the ideal uncorrelated frequencies h_{ik}. The difference D_{ik} between actual and ideal (uncorrelated) frequencies is

D_{ik} = H_{ik} - h_{ik} = H_{ik} - h_{i}h_{k}/N

For uncorrelated variables this difference will be close to zero for each cell of the table. The correlation of the two variables can therefore be measured by squaring the differences, dividing each square by the corresponding ideal frequency, and summing over all cells of the table:

χ^{2} = \sum_{i} \sum_{k} (H_{ik} - h_{ik})^{2} / h_{ik}
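The calculation of the ideal frequencies and of the sum of squared differences can be illustrated with a minimal Python sketch. The function names and the example table are hypothetical and chosen only for illustration. The sketch assumes that the table is given as a list of rows and that no row or column total is zero.

```python
def expected_frequencies(table):
    """Ideal (uncorrelated) frequencies h_ik = h_i * h_k / N."""
    row_totals = [sum(row) for row in table]        # h_i
    col_totals = [sum(col) for col in zip(*table)]  # h_k
    n = sum(row_totals)                             # N
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of squared differences relative to the ideal frequencies."""
    expected = expected_frequencies(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


# Hypothetical 2x2 table of actual frequencies H_ik
observed = [[30, 10],
            [10, 30]]

print(expected_frequencies(observed))  # [[20.0, 20.0], [20.0, 20.0]]
print(chi_square(observed))            # 20.0
```

In this example every cell deviates by 10 from its ideal frequency of 20, so each of the four cells contributes 100/20 = 5 to the sum.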
The resulting χ^{2} coefficient, however, has the disadvantage that its value depends both on the dimension of the contingency table and on the size of the sample. After eliminating the dependence on the sample size, we get Pearson's contingency coefficient C:

C = \sqrt{χ^{2} / (χ^{2} + N)}

As this coefficient C still depends on the dimension of the contingency table, it is normalized so that its range extends from 0.0 to 1.0. The corrected coefficient, written here as C_{corr}, is

C_{corr} = C \sqrt{m_{min} / (m_{min} - 1)}

with m_{min} = min(q,p), where q and p are the numbers of rows and columns of the contingency table.
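As a rough illustration, the two coefficients can be computed from a given χ^{2} value as in the following Python sketch. The function name, the parameter names and the numbers are hypothetical. The sketch assumes a table with at least two rows and two columns, so that m_{min} - 1 is not zero.

```python
import math


def contingency_coefficients(chi2, n, rows, cols):
    """Return Pearson's C and the normalized coefficient C_corr."""
    # Eliminate the dependence on the sample size N
    c = math.sqrt(chi2 / (chi2 + n))
    # Normalize by the table dimension so the range becomes 0.0 to 1.0
    m_min = min(rows, cols)
    c_corr = c * math.sqrt(m_min / (m_min - 1))
    return c, c_corr


# Hypothetical values: chi-square of 20 from a 2x2 table with N = 80
c, c_corr = contingency_coefficients(chi2=20.0, n=80, rows=2, cols=2)
print(round(c, 4))       # 0.4472
print(round(c_corr, 4))  # 0.6325
```

For a 2x2 table with perfect association, χ^{2} equals N, so C reaches only \sqrt{1/2} while the normalized coefficient reaches 1.0.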


Last Update: 2012-10-08