Maths projects  Project III topics

 

Project III 2015-2016

 

Genetic Algorithms

 

Steven Abel

email

 

Description

 

Suppose one wanted to find the highest point of a complicated function with many maxima. Taking the approach of “walking up the slope” can leave one trapped at a local maximum that is not the highest one. On the other hand, if the function has multi-variables, it may be too costly in terms of computing time to work out the height at each point and simply scan the whole surface.

                                                     

This problem can be treated by using genetic algorithms. In this approach, one mimics natural selection by metaphorically covering the curve with a population of “creatures” whose DNA (usually referred to as the genotype) is constructed from their coordinates: that is one takes the coordinates, and literally writes them out as a string of data, perhaps a series of ones and zeroes. Every creature’s genotype gives rise to physical characteristics (phenotype), which in this example is the height of the surface at that point. The creatures are then allowed to breed over several generations with a frequency related to their “goodness of fit” (in this example how high they are). Breeding is done by splicing the DNA: literally mixing the strings of data. Unfit creatures die without breeding, and random mutations are introduced in the data in order to prevent the population getting stuck at a local maximum. Gradually, over a few generations, the unfit members of the population die out and only the fittest survive, namely those that have evolved to the desired solution. An example is shown below (written in Mathematica). A population of only 20 breeding individuals can find the correct global maximum of this function (order one parameters constructed from polynomials and sines), to about 0.01 accuracy over 20 generations: using a more efficient programming language such as Python or C++ one could extend the population to thousands.

 

GA-notebook-fig1.gif        GA-notebook-fig2.gif

 

That example is simple to understand but doesn’t show the real power of genetic algorithms. So here is another one (taken from Mitchell - below). Suppose you want to search for a protein—a sequence of amino acids—that folds up to a particular three−dimensional shape so that it can be used, say, to fight a specific virus. The search space is the collection of all possible protein sequences—an infinite set of possibilities. To constrain it, let us restrict the search to all possible sequences of length 100 or less. But it is still a huge search space, since there are 20 possible amino acids at each position in the sequence. If we represent the 20 amino acids by letters of the alphabet, candidate solutions will look like: A G G M C G B L… and there are of order 20^100 possible solutions!! Given that the goodness of fit will involve some modelling of the physical properties of the protein, it is unlikely that any systematic search could find the right solution, but genetic algorithms can.

 

These techniques have been successfully applied to many similar problems, including financial forecasting (where they are able to “beat the market” and supposedly outperform neural networks), particle physics, and so forth. In this project, which is open ended, you will identify an interesting optimization problem, and attempt to develop the genetic algorithm code to solve it computationally. The precise optimization problem(s) you choose to explore is unspecified, and may involve any area, from for example finding the solution to a variational principle problem in mathematical physics, to predicting the exchange rate of bitcoins. (GA codes already exist for that incidentally). The project will require significant computing ability.

 

Prerequisites

 

Experience with programming in C, C++, Fortran, Python or Mathematica.

 

Resources and references

 

There are a large number of papers and introductory books; 

 

 

Genetic algorithms in financial forecasting http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.86.9698&rep=rep1&type=pdf

Goldberg is one of the better ones apparently http://books.google.co.uk/books/about/Genetic_Algorithms.html?id=6gzS07Sv9hoC&redir_esc=y

Mitchell is a great introduction: http://library.dur.ac.uk/search~S1?/Ymitchell+melanie&searchscope=1&SORT=D/Ymitchell+melanie&searchscope=1&SORT=D&SUBKEY=mitchell+melanie/1%2C4%2C4%2CE/frameset&FF=Ymitchell+melanie&searchscope=1&SORT=D&1%2C1%2C