Histogram
Histograms are an efficient and common way to describe the distribution of a continuous variable. In general, a histogram plots the frequency with which observations fall within given fixed-width intervals. Histograms can be regarded as a kind of classification of data: each sample is sorted into one of several "bins" according to some property.
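The sorting of samples into fixed-width bins can be illustrated with a short sketch. This is a minimal example, not a reference implementation: the function name `histogram_counts`, the explicit `lower` and `upper` range, and the choice to ignore values outside that range are assumptions made here for illustration.

```python
def histogram_counts(samples, lower, upper, n_bins):
    """Count how many samples fall into each of n_bins equal-width bins.

    Bins cover [lower, upper]; every bin is half-open [a, b) except the
    last one, which also includes the upper edge. Samples outside the
    range are ignored (an assumption of this sketch).
    """
    if n_bins < 1 or upper <= lower:
        raise ValueError("need n_bins >= 1 and upper > lower")

    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for x in samples:
        if x < lower or x > upper:
            continue  # outside the histogram range
        # Index of the bin containing x; clamp so x == upper lands in the last bin.
        index = min(int((x - lower) / width), n_bins - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    data = [1.0, 1.5, 2.2, 2.8, 3.1, 4.0, 4.9, 5.0]
    print(histogram_counts(data, lower=1.0, upper=5.0, n_bins=4))
```

Each entry of the returned list is the frequency for one bin, which is what the height of the corresponding bar represents when all bins have the same width.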
The number of classes, n_{class}, can be estimated from the number of observations, n, by rules of thumb:

n_{class} \approx 2 ...
n_{class} \approx 10 \log_{10}(n)

The last equation is unsuitable for a low number of observations (<50). The question of the bin width does not arise with data measured on a nominal or ordinal level, because the number of classes follows naturally from the class assignments (the only exception would be an ordinal variable with many categories).

When constructing a histogram, one should take care to establish strict proportionality between the areas of the histogram bars and the underlying frequencies. Readers tend to misinterpret diagrams that do not exhibit this proportionality. In addition, one should avoid unequal bar widths: with equal widths, the frequencies can be read directly from the heights of the bars.

Histograms are, by definition, staircase functions. Frequency polygons offer a smoother alternative.
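The logarithmic rule of thumb, the area-proportionality requirement, and the frequency polygon can be sketched in a few lines. The function names below are hypothetical, rounding the rule's result to the nearest integer is an assumption, and placing the polygon's points at the bin midpoints is one common convention rather than something stated in the text.

```python
import math


def n_classes_log_rule(n):
    """Rule of thumb n_class ~ 10 * log10(n), rounded to an integer.

    The text notes this rule is unsuitable for fewer than about 50 observations.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return max(1, round(10 * math.log10(n)))


def bar_heights(counts, widths):
    """Bar heights chosen so that each bar's area equals its relative frequency.

    With equal widths the heights are simply proportional to the counts;
    with unequal widths a wider bin gets a lower bar for the same count.
    """
    total = sum(counts)
    return [c / (total * w) for c, w in zip(counts, widths)]


def frequency_polygon(edges, counts):
    """Points (bin midpoint, count) that a frequency polygon would connect."""
    midpoints = [(a + b) / 2 for a, b in zip(edges[:-1], edges[1:])]
    return list(zip(midpoints, counts))


if __name__ == "__main__":
    print(n_classes_log_rule(100))
    print(bar_heights([4, 4], [1.0, 2.0]))
    print(frequency_polygon([1, 2, 3, 4, 5], [2, 2, 1, 3]))
```

In the `bar_heights` example, both bins hold the same number of observations, so the bin that is twice as wide gets a bar half as high and the two areas stay equal.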


