Genetic algorithms
Genetic algorithms apply the methodology of biological evolution to
optimisation problems.
They were invented by John Holland in the 1970s and gained little
recognition at first. Today this has changed dramatically.
Genetic algorithms (GAs) are applied to an ever increasing range of
problems, from engineering design, banking and finance,
management decision support and network design to many combinatorial problems.
The methodology is as follows:
given a function f(x) that is to be maximised over a given search space X,
every point x is represented as a chromosome. The algorithm establishes
an initial population of usually between 30 and 100 chromosomes.
Then new chromosomes (children) are produced, either by taking one
chromosome (one parent) from the present population and performing a
mutation operation on it, or by taking two chromosomes (two parents)
from the population and performing a crossover operation on them.
The new chromosomes are either collected to form a new population
(the next generation), or they are inserted one at a time into the
population, deleting another chromosome each time in order to keep
the population size constant.
Each chromosome is given a fitness value, for example f(x). It is crucial
that a chromosome with a good fitness value has a higher probability of
being picked as a parent than a chromosome with a lower fitness value.
This way, fit chromosomes reproduce more often than unfit ones, and the
population evolves into a fitter one. Since the fitness is linked to f(x),
this means the population evolves towards one with higher values of f(x).
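The process just described can be sketched in Python. This is a minimal steady-state variant with one-at-a-time insertion; the parameter values, function names and the epsilon guard are my own choices, and the fitness is assumed non-negative:

```python
import random

def genetic_algorithm(f, random_chromosome, mutate, crossover,
                      pop_size=50, steps=2000, p_mutation=0.5):
    """Steady-state GA sketch: each step produces one child and deletes
    the least fit chromosome to keep the population size constant."""
    population = [random_chromosome() for _ in range(pop_size)]
    for _ in range(steps):
        # Fitness-proportional parent selection: fitter chromosomes are
        # picked more often (assumes non-negative fitness; the epsilon
        # guards against an all-zero-fitness population).
        weights = [f(c) + 1e-9 for c in population]
        if random.random() < p_mutation:
            (parent,) = random.choices(population, weights=weights, k=1)
            child = mutate(parent)                 # one parent, mutation
        else:
            p1, p2 = random.choices(population, weights=weights, k=2)
            child = crossover(p1, p2)              # two parents, crossover
        population.append(child)
        population.remove(min(population, key=f))  # delete the least fit
    return max(population, key=f)
```

In the telecommunications example below, `f` would be the net-profit function; here any non-negative fitness works.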
Let's now look at a little example. A telecommunications company has 10
sites where it can build an antenna for mobile phones. Building antennas
on all 10 sites is too expensive, so the company has to choose a subset
of the sites. For every choice of sites the company can calculate the net
profit this choice will yield. Let f(x) be the net profit, taking into
account the revenues generated as well as the building and running costs,
where x is a choice of sites.
A choice x can be represented as a chromosome with 10 genes, each containing
a zero or a one: the i-th gene contains a one if an antenna is built on
site i, and a zero otherwise. For example, the chromosome

0110001011

stands for the choice of building antennas on sites 2, 3, 7, 9 and 10.
[A real-world problem would rather have around 100 or 1000 sites. However,
in order to be able to draw the whole chromosome representation we stick to
10 for the moment. If there are n sites then the number of possible choices
of sites is 2^n, i.e. the number of possible choices of sites
increases exponentially with the number of sites.]
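In code the encoding looks like this (illustrative only; list indices are 0-based while the sites are numbered from 1):

```python
# Chromosome for building antennas on sites 2, 3, 7, 9 and 10:
chromosome = [0, 1, 1, 0, 0, 0, 1, 0, 1, 1]

# Decode the chromosome back into the chosen sites (1-indexed).
chosen_sites = [i + 1 for i, gene in enumerate(chromosome) if gene == 1]
print(chosen_sites)   # [2, 3, 7, 9, 10]

# With n sites there are 2**n possible choices of sites.
n = 10
print(2 ** n)         # 1024
```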
A mutation operation on the above chromosome could be to pick a gene at
random and change it from a zero to a one or from a one to a zero,
depending on which number it contains.
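As a sketch (the function name is mine; the parent is copied so the chromosome in the population stays intact):

```python
import random

def bit_flip_mutation(parent):
    """Pick one gene at random and flip it: a zero becomes a one
    and a one becomes a zero."""
    child = list(parent)                  # copy -- the parent is kept
    i = random.randrange(len(child))      # gene picked uniformly at random
    child[i] = 1 - child[i]               # flip the chosen gene
    return child
```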
A crossover operation could be performed by the one-point crossover
operator, applied to the chromosome above and a second parent chromosome.
The one-point crossover operator picks a break point at random and generates
a child chromosome by copying everything to the left of the break point
from the first parent chromosome and everything to the right of the break
point from the second parent chromosome.
For example, if the break point is just after the fourth gene, the child
consists of the first four genes of the first parent followed by the
remaining six genes of the second parent.
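A sketch of this operator (the second parent is made up for the illustration; with `point=4` the break is just after the fourth gene):

```python
import random

def one_point_crossover(parent1, parent2, point=None):
    """Copy everything left of the break point from parent1 and
    everything right of it from parent2."""
    if point is None:
        point = random.randrange(1, len(parent1))  # random break point
    return parent1[:point] + parent2[point:]

p1 = [0, 1, 1, 0, 0, 0, 1, 0, 1, 1]   # chromosome for sites 2, 3, 7, 9, 10
p2 = [1, 1, 0, 0, 1, 1, 0, 0, 0, 1]   # a hypothetical second parent
child = one_point_crossover(p1, p2, point=4)
print(child)   # [0, 1, 1, 0, 1, 1, 0, 0, 0, 1]
```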
There are, of course, many other mutation and crossover operators, and it
is usually beneficial to design problem-specific operators. Designing a good
genetic algorithm requires experience.
For every problem it is very important to pick the right chromosome
representation. The telecommunications example above used a string of
zeros and ones as the representation.
Other often used representations are:
- a string of integers in a certain interval [a,b];
- a string of reals in a certain interval [a,b];
- a string of integers representing a permutation;
- a mix of the first three representations, i.e. a string of numbers
consisting of several substrings, each substring corresponding to one of
the first three representation types.
For any chosen representation one has to design mutation and crossover
operators that produce children conforming to that representation.
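For example, for the permutation representation a standard choice (not prescribed by the text above) is swap mutation, which exchanges two genes so that the child is again a valid permutation:

```python
import random

def swap_mutation(perm):
    """Swap two randomly chosen positions; the child remains a permutation."""
    child = list(perm)
    i, j = random.sample(range(len(child)), 2)   # two distinct positions
    child[i], child[j] = child[j], child[i]
    return child
```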
Genetic algorithms give a great deal of freedom in the algorithm design.
Writing a simple GA is fairly straightforward, and it might work quite well.
Writing a GA with advanced features, which generally increase the performance
considerably, takes time.
Here is a list of some advanced features:
- elitism: The chromosome with the best fitness is never
deleted from the population. If one works with generations rather than with
a continuously changing population then the fittest chromosome from the old
generation is carried over to the new generation.
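With generations, elitism can be sketched like this (`f` stands for the fitness function; replacing the worst child with the elite is one of several possible conventions):

```python
def next_generation(old_population, children, f):
    """Carry the fittest chromosome of the old generation over to the
    new one, replacing the worst child to keep the size constant."""
    elite = max(old_population, key=f)          # never lost
    survivors = sorted(children, key=f)[1:]     # drop the worst child
    return survivors + [elite]
```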
- restarts: One does a series of GA runs, one after another. Every
run after the first starts from a random initial population of chromosomes,
apart from one chromosome, which is set to the best chromosome
found by the previous run. This method can be very effective and makes it
possible to work with smaller population sizes.
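A sketch of the restart scheme (`run_ga` stands for one complete GA run returning its best chromosome; names and defaults are mine):

```python
import random

def ga_with_restarts(run_ga, random_chromosome, runs=5, pop_size=30):
    """Series of GA runs; every run after the first is seeded with the
    best chromosome found by the previous run."""
    best = None
    for _ in range(runs):
        population = [random_chromosome() for _ in range(pop_size)]
        if best is not None:
            population[0] = best    # carry over the best found so far
        best = run_ga(population)
    return best
```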
- dynamic operator selection: Typically, in a GA run
a reproduction operator performs differently well at different stages of the
run. An operator may perform well at the beginning of a run and badly towards
the end. Dynamic operator selection is a mechanism that gives an operator
a lot of credit when it produces a new all-over best chromosome and a little
credit when it produces a child better than its parents. Over time the credit
scores of the operators are discounted, and the selection probability of an
operator is proportional to its present credit score. This very effective
mechanism was proposed by Lawrence Davis. In my personal experience I have
never come across an example where it did not substantially improve
the performance of the GA.
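In the spirit of this mechanism, a credit-based selector might look as follows (the credit amounts and discount factor are invented for the sketch):

```python
import random

class OperatorSelector:
    """Pick reproduction operators with probability proportional to a
    credit score that is discounted over time."""

    def __init__(self, operators, discount=0.99):
        self.credit = {op: 1.0 for op in operators}   # start uniform
        self.discount = discount

    def pick(self):
        ops = list(self.credit)
        weights = [self.credit[op] for op in ops]
        return random.choices(ops, weights=weights, k=1)[0]

    def reward(self, op, new_best=False, better_than_parent=False):
        for o in self.credit:                 # discount all old credit
            self.credit[o] *= self.discount
        if new_best:
            self.credit[op] += 10.0           # a lot of credit
        elif better_than_parent:
            self.credit[op] += 1.0            # a little credit
```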
- exclusion of identical chromosomes: In a GA run the diversity of the
population is large at the beginning; towards the end of the run
the population converges and the chromosomes become similar to each other.
Without this mechanism the population often consists of identical chromosomes
at the end of the run. Excluding identical chromosomes prevents premature
convergence of the population.
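A sketch of this exclusion for a steady-state population, rejecting any child that already occurs verbatim (`f` is the fitness function):

```python
def insert_if_new(population, child, f):
    """Insert the child only if an identical chromosome is not already
    in the population; delete the least fit to keep the size constant."""
    if child in population:
        return False                              # duplicate rejected
    population.append(child)
    population.remove(min(population, key=f))     # keep size constant
    return True
```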
A nice feature of GAs is that it is possible to integrate other available
optimisation methods and heuristics.
Another nice feature is that they can easily be
parallelised when several processors are at hand.
Roughly speaking, GAs are a good choice for
hard problems for which no efficient exact algorithm exists.
For example, many combinatorial problems cannot be solved to guaranteed
global optimality in reasonable time. For such problems GAs are often able
to find very good near-optimal points in a relatively short time.
My three favourite books that I can recommend are:
- John Holland: "Adaptation in Natural and Artificial Systems".
This book gives a good mathematical introduction. In particular the
schemata argument, which explains why crossover operators work well and
are so important, is presented clearly. The book is quite theoretical.
- Lawrence Davis: "Handbook of Genetic Algorithms".
The first half of this book explains GAs for readers interested in
applying them; no mathematical background is assumed. The second half
consists of a collection of real-world examples to which GAs have been
applied.
- Zbigniew Michalewicz:
"Genetic Algorithms + Data Structures = Evolution Programs".
This book is also suitable for readers interested in applying GAs
themselves, and it contains many types of examples different from those
in Lawrence Davis's book.