Introduction to Machine Learning and Data Mining




Program Discovery

Kyle I S Harrington / kyle@eecs.tufts.edu




## Evolutionary Algorithms

Evolutionary algorithms (EAs) are algorithms inspired by evolution, generally used to solve optimization problems.

![Evolution of Horses](http://api.ning.com/files/E7QmJz0nibmlVobpbSh-eEccvj536Tq7Y1i03De1Ag*jOPTZLKD-VbHdBDZ0-YdAONbHt05hGPSLczCp3wsTy0OyBWeEC88S/HorseyChart.jpg)
## What is evolution?

- Natural selection: differential survival/reproduction of individuals based upon heritable phenotypic characteristics (fitness)
- Sexual selection: selection of phenotypic characteristics through mate choice

## A Simple Evolutionary Algorithm


Generate a random solution, s (may be a model, parameters, etc.)

Loop until s is correct:

  Create an alternative solution, m, by mutating s

  Compare the fitness/quality/accuracy of s and m

  If m is better, then replace s with m

  Otherwise, do nothing

  Repeat
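
The loop above can be sketched in Python. This is a minimal sketch, not the course notebook: it assumes a bitstring solution, uses the number of 1s as fitness, treats "s is correct" as "every bit is 1", and adds a `max_steps` cap so the loop always terminates. All function names and parameter values are illustrative.

```python
import random

def fitness(bits):
    """Assumed fitness: the number of 1s in the bitstring."""
    return sum(bits)

def mutate(bits, rate, rng):
    """Copy `bits`, flipping each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def hillclimb(n_bits=20, rate=0.1, max_steps=10_000, seed=0):
    rng = random.Random(seed)
    s = [rng.randint(0, 1) for _ in range(n_bits)]  # random initial solution
    steps = 0
    # "s is correct" here means every bit is 1; max_steps guards against looping forever
    while fitness(s) < n_bits and steps < max_steps:
        m = mutate(s, rate, rng)       # alternative solution
        if fitness(m) > fitness(s):    # replace s only if m is better
            s = m
        steps += 1
    return s, steps

solution, steps = hillclimb()
print(f"fitness {fitness(solution)} after {steps} steps")
```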
## How do we mutate a solution?

Consider a bitstring solution (this could represent a subset of features for use by a classification algorithm).

![Example of bitstring mutation](images/bitstring_mutation.png)
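
One common way to mutate a bitstring is to flip each bit with a small probability. The sketch below assumes that scheme (the figure may show a different variant) and uses made-up feature names to show how a bitstring can act as a feature mask.

```python
import random

def flip_mutation(bits, rate, rng):
    """Copy `bits`, flipping each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

rng = random.Random(1)
features = ["age", "height", "weight", "income", "zip"]  # hypothetical feature names
parent = [1, 0, 1, 1, 0]                                 # 1 = feature is used
child = flip_mutation(parent, rate=1 / len(parent), rng=rng)

# Decode the mutated bitstring back into a feature subset
selected = [name for name, bit in zip(features, child) if bit]
print(parent, "->", child, selected)
```

A rate of 1/length flips one bit per mutation on average, which is a common default rather than a rule.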
## How does selection work?

For the OneMax problem (maximize the sum of a bitstring):

A population of solutions before selection:

![Population before selection](images/selection_before.png)

The next generation of the population after selection:

![Population after selection](images/selection_after.png)
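
The slides do not say which selection scheme the figures use. As one possibility, here is a minimal sketch of tournament selection on the sum-of-bits fitness: pick `k` individuals at random and copy the fittest into the next generation. The population size, bitstring length, and `k` are arbitrary choices.

```python
import random

def onemax(bits):
    """Fitness for this problem: the sum of the bitstring."""
    return sum(bits)

def tournament_select(population, fitness, k, rng):
    """Pick k distinct individuals at random and return the fittest of them."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)

rng = random.Random(2)
population = [[rng.randint(0, 1) for _ in range(8)] for _ in range(10)]

# Fill the next generation with tournament winners (no mutation or crossover here)
next_generation = [tournament_select(population, onemax, k=3, rng=rng)
                   for _ in population]

mean = lambda pop: sum(map(onemax, pop)) / len(pop)
print("mean fitness before:", mean(population), "after:", mean(next_generation))
```

Selection alone only copies existing solutions, so it raises average fitness without creating anything new.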
## Show us an EA in action!

# OK!

(See notebook/Lecture22.py)

This particular method has many names:

- Monte Carlo optimization
- (1+1) evolution strategy
- Hillclimber
## Alright, we're kind of sold, but evolution seems wasteful

Most mutations are bad, especially for harder problems.

**Enter stage right:** the Genetic Algorithm
## Genetic Algorithms

Now, sometimes, instead of mutating a solution, let us take two solutions and mix them (crossover)!

![Example of bitstring crossover](images/bitstring_crossover.png)
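
A minimal sketch of mixing two bitstrings, assuming one-point crossover (the figure may show a different variant): cut both parents at the same random point and swap the tails.

```python
import random

def one_point_crossover(a, b, rng):
    """Cut both parents at one random interior point and swap the tails."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    point = rng.randint(1, len(a) - 1)  # cut strictly inside the string
    return a[:point] + b[point:], b[:point] + a[point:]

rng = random.Random(3)
mom = [1] * 8
dad = [0] * 8
child1, child2 = one_point_crossover(mom, dad, rng)
print(child1, child2)
```

With all-ones and all-zeros parents, the cut point is easy to see in the children.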
## Genetic operators

- Mutation and crossover are "genetic operators"
- Mutation is a type of global search
- Crossover is a type of local search
- Many other genetic operators have been proposed; the majority are problem-specific
## But you promised us computer programs!

A program is just a genome with a specific type of representation!

How would you represent a program in a way that you could use mutation/crossover?
## Representation of computer programs

![Example of a program's representation](images/gp_program.png)
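
One common representation is an expression tree. As a rough Python sketch (the tuple encoding, the function set, and the example program are all assumptions, not taken from the figure), a program can be a nested tuple `(function, arg, arg)` with a small recursive evaluator:

```python
import operator

# Assumed function set: binary arithmetic only
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(node, env):
    """Evaluate an expression tree; `env` maps variable names to values."""
    if isinstance(node, tuple):          # internal node: (function, arg1, arg2)
        op, *args = node
        return FUNCTIONS[op](*(evaluate(arg, env) for arg in args))
    if isinstance(node, str):            # leaf: variable
        return env[node]
    return node                          # leaf: constant

program = ("+", ("*", "x", "x"), 1)      # hypothetical program: x*x + 1
print(evaluate(program, {"x": 3}))       # 10
```

Because the program is plain data, mutation and crossover reduce to editing or swapping subtrees.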
## Lisp

- Functional programming language
- Second-oldest high-level language, behind Fortran
- Prefix (Polish) notation is *consistent*, e.g., (+ 1 1)
- Homoiconicity (code **is** data)

## Why evolve Lisp code?

Because this is easy:

    (def program '(+ 1 1))
    (eval program)                              ; => 2
    (def new-program (cons '- (rest program)))  ; new-program is now (- 1 1)
    (eval new-program)                          ; => 0

Note: the single quote tells the interpreter not to evaluate the code that follows

## Genetic Programming

Genetic programming is the application of genetic algorithms to computer programs.

Popularized by John Koza. The first application of evolution to computer programs was by Germans (incl. Schmidhuber).
## How do you mutate a program?

![Example of subtree mutation](images/subtree_mutation.png)
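
A minimal sketch of subtree mutation, assuming programs are nested tuples of the form `(function, arg, arg)`: pick a random node and replace the subtree rooted there with a freshly generated random subtree. The function and terminal sets, the depth limit, and the 0.3 leaf probability are arbitrary illustrative choices.

```python
import random

FUNCTIONS = ["+", "-", "*"]   # assumed function set, all binary
TERMINALS = ["x", 1, 2]       # assumed terminal set

def random_tree(depth, rng):
    """Grow a random expression tree of at most `depth` levels of functions."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS), random_tree(depth - 1, rng), random_tree(depth - 1, rng))

def paths(tree, prefix=()):
    """Yield the index path to every node; the root is the empty path."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def replace_at(tree, path, new_subtree):
    """Return a copy of `tree` with the node at `path` replaced."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]

def subtree_mutation(tree, rng, max_depth=2):
    path = rng.choice(list(paths(tree)))             # random mutation point
    return replace_at(tree, path, random_tree(max_depth, rng))

rng = random.Random(4)
parent = ("+", ("*", "x", "x"), 1)
child = subtree_mutation(parent, rng)
print(parent, "->", child)
```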
## How do you crossover programs?

![Example of subtree crossover](images/subtree_crossover.png)
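
A minimal sketch of subtree crossover, again assuming programs are nested tuples of the form `(function, arg, arg)`: pick one random node in each parent and swap the subtrees rooted there. The helper names and example parents are illustrative.

```python
import random

def paths(tree, prefix=()):
    """Yield the index path to every node; the root is the empty path."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get_at(tree, path):
    """Return the subtree found by following `path` from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace_at(tree, path, new_subtree):
    """Return a copy of `tree` with the node at `path` replaced."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]

def subtree_crossover(a, b, rng):
    """Swap one randomly chosen subtree of `a` with one of `b`."""
    path_a = rng.choice(list(paths(a)))
    path_b = rng.choice(list(paths(b)))
    child_a = replace_at(a, path_a, get_at(b, path_b))
    child_b = replace_at(b, path_b, get_at(a, path_a))
    return child_a, child_b

rng = random.Random(5)
mom = ("+", ("*", "x", "x"), 1)
dad = ("-", "x", ("+", 2, "x"))
child1, child2 = subtree_crossover(mom, dad, rng)
print(child1, child2)
```

Unlike bitstring crossover, the children can differ in size from their parents, which is one reason program size tends to need watching in practice.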
## Is this practical, or just cool?

Genetic programming has been shown to be **competitive** with humans.

- Satellite antennae (Lohn, Hornby, Linden 2004)
- Optical systems (Koza)
- Quantum circuits (Barnum, Bernstein, Spector 2000)
- Patentable inventions (analog circuits, PID tuning algorithms; for more, see genetic-programming.com)
- Discovery of natural laws (Schmidt, Lipson 2008)
## What Next?

The Future