Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics, data analysis and chemometrics......click here for more.


Optimization Methods - Genetic Algorithms

Various attempts have been made to combine the advantages of deterministic and random-search methods. One particular approach has been investigated in recent years: genetic algorithms.

The idea behind these methods is to exploit the principles of genetics for the optimization theory. First of all, a population of "explorers" is created. These explorers are positioned at random within the search space (phase space). Each "explorer" (in genetic algorithm terminology: individuum) detects the value of the response function at its own location and feeds it to a fitness function. The fitness is designed in a way that it maximizes when the search goal is reached. Depending on the value of the fitness function, several basic operations are performed:

Selection Select the k "fittest" explorers from the population. These selected explorers are further processed.
Mating The selected explorers are allowed to mate with other individuals of the population. The strategy for selecting the mating partners may vary from implementation to implementation. The offspring of the pairs replace the individuals of the population which perform worst (in terms of the fitness function).
Crossover Crossover means the exchange of genetic information between two mating individuals. The idea is to create a "child" which has better properties (in terms of the fitness function) than the two parents. If both parents are located close to an optimum, this strategy tends to perform a hill-climbing strategy.
Mutation Mutation means adding some random fluctuations to the location of the explorers in the phase space. The consequence is that random jumps to other locations become possible with a certain low probability.

These steps are repeated, until some termination criterion is fulfilled (e.g. the best explorer reached some defined threshold of fitness), or a defined number of generations have been calculated. The second strategy is often applied if no information about the global optimum is available.

The most important advantage of genetic algorithms is their ability to find an optimum in huge search spaces. In fact, genetic algorithms are efficient only in systems with very large search spaces. Among the disadvantages of genetic algorithms is their high demand for computational power. As a consequence of the high number of necessary evaluations of the fitness function, each single evaluation has to be cheap in terms of efforts to obtain the response value.